Filtering Reads
WARNING: This documentation is outdated and will soon be updated.
Eoulsan defines a ReadFilter interface for filtering reads.
The following example uses IlluminaFilterFlagReadFilter, which filters reads from an Illumina sequencer based on the filter flag in the read id.
ReadSequenceReader reader = new FastqReader(new File("in.fastq"));
ReadFilter filter = new IlluminaFilterFlagReadFilter();
for (ReadSequence read : reader) {
  // Print only the reads that pass the Illumina filter flag
  if (filter.accept(read)) {
    System.out.println(read);
  }
}
reader.close();
The ReadFilter interface defines two methods for read filtering:
- boolean accept(ReadSequence read) for single-end filtering
- boolean accept(ReadSequence read1, ReadSequence read2) for paired-end/mate-pair filtering
Developers usually do not need to implement the accept(ReadSequence read1, ReadSequence read2) method themselves, because calling accept(ReadSequence read) on each end of the sequenced cluster is often enough to filter reads. The abstract class AbstractReadFilter provides a ready-to-use accept method for this case.
@Override
public boolean accept(ReadSequence read1, ReadSequence read2) {
return accept(read1) && accept(read2);
}
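To illustrate this delegation pattern outside of Eoulsan, here is a minimal, self-contained sketch. Read, Filter and NO_N are hypothetical stand-ins, not Eoulsan classes; only the shape of the two accept methods follows the description above.

```java
public class PairedAcceptSketch {

  // Hypothetical stand-in for Eoulsan's ReadSequence (not the real class).
  static final class Read {
    final String sequence;

    Read(String sequence) {
      this.sequence = sequence;
    }
  }

  // Hypothetical stand-in for the two-method filter contract.
  interface Filter {
    boolean accept(Read read);

    // A pair is kept only when both of its ends pass the single-end test.
    default boolean accept(Read read1, Read read2) {
      return accept(read1) && accept(read2);
    }
  }

  // Toy single-end rule: reject null reads and reads containing an 'N'.
  static final Filter NO_N =
      read -> read != null && !read.sequence.contains("N");

  // Convenience wrapper that applies the pair test to two raw sequences.
  static boolean pairPasses(String seq1, String seq2) {
    return NO_N.accept(new Read(seq1), new Read(seq2));
  }

  public static void main(String[] args) {
    System.out.println(pairPasses("ACGT", "TTGA")); // both ends pass
    System.out.println(pairPasses("ACGT", "TNGA")); // second end fails
  }
}
```

With this design, a filter author only writes the single-end rule and gets the pair behaviour for free.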
Several filters can be executed with only one call to the accept() method using the MultiReadFilter class:
ReadSequenceReader reader = new FastqReader(new File("in.fastq"));
// Collect the filters to apply to each read
List<ReadFilter> filtersList = new ArrayList<ReadFilter>();
filtersList.add(new IlluminaFilterFlagReadFilter());
filtersList.add(new ValidReadFilter());
ReadFilter filter = new MultiReadFilter(filtersList);
for (ReadSequence read : reader) {
  if (filter.accept(read)) {
    System.out.println(read);
  }
}
reader.close();
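As a rough illustration of what combining filters means, the sketch below accepts a read only when every filter in a list accepts it. This is an assumption about the general idea, not the actual MultiReadFilter implementation; reads are modelled as plain strings, and both toy filters are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class CompositeFilterSketch {

  // Builds a filter that accepts a read only if every filter in the list accepts it.
  static Predicate<String> allOf(List<Predicate<String>> filters) {
    return read -> {
      for (Predicate<String> filter : filters) {
        if (!filter.test(read)) {
          return false; // stop at the first rejection
        }
      }
      return true;
    };
  }

  // Hypothetical toy filters: at least 5 bases, and no 'N' base.
  static final Predicate<String> MIN_LENGTH = read -> read.length() >= 5;
  static final Predicate<String> NO_N = read -> !read.contains("N");

  // Applies the combined filter to one read.
  static boolean keep(String read) {
    return allOf(Arrays.asList(MIN_LENGTH, NO_N)).test(read);
  }

  public static void main(String[] args) {
    for (String read : Arrays.asList("ACGTACGT", "ACG", "ACGTN")) {
      System.out.println(read + " kept: " + keep(read));
    }
  }
}
```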
The following ReadFilter implementations are available:
- PairCheckReadFilter checks that both reads of a pair come from the same cluster,
- PairEndReadFilter removes paired-end or single-end reads (useful ???),
- QualityReadFilter filters reads with a bad mean quality,
- TrimReadFilter trims the polyN tail of reads and removes reads that are too short,
- ValidReadFilter filters reads that don't pass the ReadSequence validate() method,
- IlluminaFilterFlagReadFilter filters reads that don't pass the Illumina filter.
Writing a plug-in for the filterreads step of Eoulsan is straightforward. In this section we will write a MyQualityReadFilter class that filters on mean quality.
- First, add the getName() and getDescription() methods to your new filter:
package com.example;
public class MyQualityReadFilter extends AbstractReadFilter {
@Override
public String getName() {
return "myquality";
}
@Override
public String getDescription() {
return "My quality threshold ReadFilter";
}
}
- Then add a setParameter() method that allows the filter to be configured:
private double qualityThreshold = -1.0;
@Override
public void setParameter(final String key, final String value)
throws EoulsanException {
if (key == null || value == null)
return;
if ("threshold".equals(key.trim())) {
try {
this.qualityThreshold = Double.parseDouble(value.trim());
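} catch (NumberFormatException e) {
// Report an unparsable value instead of silently ignoring it
throw new EoulsanException("Invalid threshold value: " + value);
}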
if (this.qualityThreshold < 0.0)
throw new EoulsanException("Invalid qualityThreshold: "
+ qualityThreshold);
} else
throw new EoulsanException("Unknown parameter for "
+ getName() + " read filter: " + key);
}
- Then add an init() method to initialize the plug-in once all the parameters have been set. In our example, it throws an exception if no threshold has been set.
@Override
public void init() {
if (this.qualityThreshold < 0.0)
throw new IllegalArgumentException("Quality threshold is not set for "
+ getName() + " read filter.");
}
- Now we can add the accept() method:
@Override
public boolean accept(final ReadSequence read) {
if (read == null)
return false;
return mean(read.qualityScores()) > this.qualityThreshold;
}
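The mean() helper called in accept() is not shown here. One possible version is sketched below, assuming the quality scores are available as an int array (the actual return type of qualityScores() may differ). The class name, the accept(int[], double) wrapper and the choice to return 0.0 for an empty array are all hypothetical.

```java
public class MeanQualitySketch {

  // Arithmetic mean of the scores; a null or empty array yields 0.0
  // (a convention chosen for this sketch only).
  static double mean(int[] scores) {
    if (scores == null || scores.length == 0) {
      return 0.0;
    }
    long sum = 0;
    for (int score : scores) {
      sum += score;
    }
    return (double) sum / scores.length;
  }

  // Mirrors the comparison in accept(): strictly greater than the threshold.
  static boolean accept(int[] scores, double threshold) {
    return mean(scores) > threshold;
  }

  public static void main(String[] args) {
    int[] scores = {30, 40, 20};
    System.out.println("mean = " + mean(scores));
    System.out.println("accepted at 25.0: " + accept(scores, 25.0));
  }
}
```

Because the comparison is strict, a read whose mean quality equals the threshold exactly is rejected.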
- Now our ReadFilter compiles and can be used in a standalone program, but not yet as a filterreads plug-in. To enable our ReadFilter as a plug-in, we must register it by adding the fully qualified name of the class to the fr.ens.transcriptome.eoulsan.bio.readsfilters.ReadFilter text file in the META-INF/services directory. See the Writing Step Plugin for more information:
com.example.MyQualityReadFilter
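The META-INF/services directory is the convention used by Java's java.util.ServiceLoader. Assuming Eoulsan discovers filters through this mechanism, the sketch below shows why registration matters: an implementation that exists on the classpath but is not listed in a services file is not found. NamedFilter and UnregisteredFilter are hypothetical stand-ins, not Eoulsan types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class FilterDiscoverySketch {

  // Hypothetical stand-in for the service interface.
  public interface NamedFilter {
    String getName();
  }

  // An implementation that is deliberately NOT listed in META-INF/services.
  public static class UnregisteredFilter implements NamedFilter {
    @Override
    public String getName() {
      return "unregistered";
    }
  }

  // Collects the names of all providers registered for NamedFilter.
  static List<String> registeredNames() {
    List<String> names = new ArrayList<>();
    for (NamedFilter filter : ServiceLoader.load(NamedFilter.class)) {
      names.add(filter.getName());
    }
    return names;
  }

  public static void main(String[] args) {
    // Without a matching services file, UnregisteredFilter is not discovered.
    System.out.println("discovered filters: " + registeredNames());
  }
}
```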