Releases: HumanCellAtlas/sctools
Metric computation speed-up
Metric computation is now done directly on the BAM files.
Updated fastqpreprocess to use uint64.
We disabled the BGZF file EOF check to accommodate files that do not conform to this requirement.
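For context, the SAM/BAM specification defines a 28-byte empty BGZF block as an end-of-file marker, and an EOF check looks for that block at the end of the file. The following is a minimal sketch of such a check, not the sctools implementation; `has_bgzf_eof` is a hypothetical helper name used only for illustration.

```python
import os

# The 28-byte empty BGZF block defined as the EOF marker in the SAM/BAM specification.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600" "4243" "0200"
    "1b00" "0300" "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF EOF marker.

    Hypothetical helper for illustration; it is not part of sctools.
    """
    size = os.path.getsize(path)
    if size < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Read only the last 28 bytes and compare them to the marker.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

A file that fails this test is not necessarily unreadable, which is why a tool may choose to skip the check, as this release does.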
FastqProcessing
This release adds the FastqProcessing step, written in C++.
FastqProcessing code provides more information on output
Input options are checked for valid values, and input files are checked for existence.
SplitBamByCellBarcode writes the intermediate file in SAM instead of BAM
SplitBamByCellBarcode now writes its intermediate files in SAM format instead of BAM. This speeds up the step but requires machines with more disk space.
Added cell metrics for mitochondrial gene
Added cell metrics related to mitochondrial genes.
For each cell, the following metrics are added:
- n_mitochondrial_genes: the number of mitochondrial genes
- n_mitochondrial_molecules: the number of molecules from mitochondrial genes, i.e., sum of the counts from mitochondrial genes
- pct_mitochondrial_molecules: n_mitochondrial_molecules as a percentage of the total number of molecules for the cell across all genes
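The relationship between the three metrics can be sketched as follows. This is an illustration only, not the sctools code: the function name, the input shapes (a gene-to-count mapping for one cell and a caller-supplied set of mitochondrial gene names), and the choice to count only genes with at least one molecule are all assumptions.

```python
def mitochondrial_metrics(gene_counts, mitochondrial_genes):
    """Compute the three mitochondrial metrics for a single cell.

    gene_counts: dict mapping gene name -> molecule count (assumed input shape).
    mitochondrial_genes: set of gene names treated as mitochondrial (assumed input).
    """
    # Assumption: a mitochondrial gene is counted only if it has at least one molecule.
    mito_counts = {
        gene: count
        for gene, count in gene_counts.items()
        if gene in mitochondrial_genes and count > 0
    }
    n_molecules = sum(mito_counts.values())
    total_molecules = sum(gene_counts.values())
    # Guard against cells with no molecules at all.
    pct = 100.0 * n_molecules / total_molecules if total_molecules else 0.0
    return {
        "n_mitochondrial_genes": len(mito_counts),
        "n_mitochondrial_molecules": n_molecules,
        "pct_mitochondrial_molecules": pct,
    }


# Example cell: 4 of 10 molecules come from 2 mitochondrial genes.
example = mitochondrial_metrics(
    {"MT-CO1": 3, "MT-ND1": 1, "MT-ATP6": 0, "ACTB": 6},
    {"MT-CO1", "MT-ND1", "MT-ATP6"},
)
```

For the example cell above, the sketch yields 2 genes, 4 molecules, and 40.0 percent.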
v0.3.7
Added functionality to ignore multi-gene annotations, introduced by Drop-seq tools 2.3.0, required for snRNA-seq
To annotate intronic alignments, Optimus uses a newer version of Drop-seq tools, 2.3.0. In this version, the gn tag (which was GE in earlier versions) can hold multiple gene names as its value, as a single string with the names separated by commas. The CreateCountMatrix command in sctools must ignore such alignments when building the count matrix. This release of sctools implements that change.
This release corrects the Dockerfile associated with v0.3.6.
v0.3.6
Added functionality to ignore multi-gene annotations, introduced by Drop-seq tools 2.3.0, required for snRNA-seq
To annotate intronic alignments, Optimus uses a newer version of Drop-seq tools, 2.3.0. In this version, the gn tag (which was GE in earlier versions) can hold multiple gene names as its value, as a single string with the names separated by commas. The CreateCountMatrix command in sctools must ignore such alignments when building the count matrix. This release of sctools implements that change.
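The filtering rule described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the CreateCountMatrix implementation: alignments are modeled as plain (cell barcode, molecule barcode, gene tag value) tuples, and both function names are hypothetical.

```python
from collections import Counter


def is_single_gene(gene_tag_value):
    """Return True if the gene tag value names exactly one gene."""
    # A comma-separated value means the alignment is annotated with several genes.
    return bool(gene_tag_value) and "," not in gene_tag_value


def count_molecules(alignments):
    """Count unique molecules per (cell, gene), skipping multi-gene alignments.

    alignments: iterable of (cell_barcode, molecule_barcode, gene_tag_value)
    tuples, an assumed simplification of real alignment records.
    """
    unique_molecules = {
        (cell, molecule, gene)
        for cell, molecule, gene in alignments
        if is_single_gene(gene)
    }
    return Counter((cell, gene) for cell, _molecule, gene in unique_molecules)


# The second record carries two gene names and is ignored.
counts = count_molecules([
    ("CELL1", "UMI1", "ACTB"),
    ("CELL1", "UMI2", "ACTB,GAPDH"),
    ("CELL1", "UMI3", "ACTB"),
    ("CELL1", "UMI3", "ACTB"),
])
```

In this example, only the two distinct single-gene molecules for ACTB in CELL1 contribute to the count.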