layout | title | date | categories | tags |
---|---|---|---|---|
post |
CEABiGR Kallisto Run for lncRNA Count Matrix |
2023-02-17 |
genomics expression |
analysis lncRNA |
Completed
-
Run trimmed RNA-seq data in kallisto using the lncRNA fasta file as reference
-
Code can be found here
-
Follow up at pub-a-thon to merge abundance.tsv files for each library.
-
Create new repo for this using the CEABiGR code as a reference
Using kallisto to get count data for lncRNA...
Need fasta and fq
Grab .fq files
wget -r \
--no-check-certificate \
--quiet \
--no-directories --no-parent \
-P /home/shared/8TB_HDD_02/zbengt/data/oyster-lnc/kallisto \
-A *fq.gz \
https://gannet.fish.washington.edu/Atumefaciens/20220224_cvir_gonad_RNAseq_fastp_trimming/
Downloading lncRNA fasta
wget --no-check-certificate \
-P /home/shared/8TB_HDD_02/zbengt/data/oyster-lnc/01-lncRNA-kallisto/fasta \
https://gannet.fish.washington.edu/Atumefaciens/20220217-cvir-lncRNA_subsetting/GCF_002022765.2_C_virginica-3.0_lncRNA.fa
Creating an index
/home/shared/kallisto/kallisto \
index -i /home/shared/8TB_HDD_02/zbengt/data/oyster-lnc/01-lncRNA-kallisto/index/GCF_002022765.2_C_virginica-3.0_lncRNA.index \
/home/shared/8TB_HDD_02/zbengt/data/oyster-lnc/01-lncRNA-kallisto/fasta/GCF_002022765.2_C_virginica-3.0_lncRNA.fa
find /home/shared/8TB_HDD_02/zbengt/data/oyster-lnc/01-lncRNA-kallisto/fq/*_R1.fastp* \
| xargs basename -s _R1.fastp-trim.20bp-5prime.20220224.fq.gz | xargs -I{} /home/shared/kallisto/kallisto \
quant -i /home/shared/8TB_HDD_02/zbengt/data/oyster-lnc/01-lncRNA-kallisto/index/GCF_002022765.2_C_virginica-3.0_lncRNA.index \
-o /home/shared/8TB_HDD_02/zbengt/github/oyster-lnc/output/01-lncRNA-kallisto/{} \
-t 40 \
/home/shared/8TB_HDD_02/zbengt/data/oyster-lnc/01-lncRNA-kallisto/fq/{}_R1.fastp-trim.20bp-5prime.20220224.fq.gz /home/shared/8TB_HDD_02/zbengt/data/oyster-lnc/01-lncRNA-kallisto/fq/{}_R2.fastp-trim.20bp-5prime.20220224.fq.gz \
2>&1 | tee ../output/01-lncRNA-kallisto/{}.out