Sequence data for CHC 2018 project, filtering and parsing