Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Main dev branch #2

Open
wants to merge 12 commits into
base: master
Choose a base branch
from
Open

Main dev branch #2

wants to merge 12 commits into from

Conversation

yuxuanlan
Copy link
Collaborator

Update nextflow pipeline from DSL1 to DSL2.
Update README for current version.
Other changes:

  1. For building the gene level expression matrix, the pipeline now loads a prebuilt transcript to geneID lookup table. A biomart DB is no longer needed.
  2. scqc_from_matrix step used to generate a QC_meanexp_vs_freq.pdf file that is often challenging for computer rendering. The current pipeline converts the PDF to PNG to overcome this issue.

@yuxuanlan yuxuanlan requested a review from gemygk April 26, 2023 15:39
Copy link
Contributor

@gemygk gemygk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuxuanlan
Thanks for the PR 👍
I had a quick look and have added my comments to the files I have reviewed. I will check again once you resolve these.

README.md Outdated
* singularity 2.4.2


* kallisto 0.44.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are going back to an earlier version here for Kallisto (0.45.1 to 0.44.0). Are we sure this is the case? This should be 0.45.1 or above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now updated to kallisto 0.48.0

done
fi
done
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bash lines can be converted into a script (python or bash) by taking a sample sheet as input and then moving to the /Scripts, folder for e.g, /Scripts/format_reads.py. The script can then be mentioned here in the README.md

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feature will be included in a later version.

PIP-3144.config Outdated
executor.pollInterval='5 min'
executor.queueStatInterval='8 min'
executor.exitReadTimeout='5 min'
executor.dumpInterval='10 min'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this config required as part of the installation? If yes, you can move it to a new sample_data/ folder

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now moved.

--mail-type=ALL [email protected] \
--wrap "source /ei/cb/common/Scripts/singlecellQC/v1.0/scqc_reqs-1.0 && \
nextflow run scqc_nf.sh -c config_file -with-report -resume"
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The software should be under /ei/software/cb location, i.e, /ei/cb/common/Scripts/singlecellQC/v1.0/scqc_reqs-1.0 should be here /ei/software/cb/singlecellQC/1.0/x86_64/bin

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed.

README.md Outdated
* mtnamefile - In case of non-human species, .rds file for mitochondrial gene. Leave empty ('') if human.
This is a vector in R, containing the list of Ensembl transcript IDs, saved as .rds (using saveRDS())
* species - Kept for legacy code, this parameter is only used if is 'Hsapiens' or 'Bos_taurus'.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change: 'if is' > 'if it is'
Also, what about support for 'Mmusculus'?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed.

* nextflow 19.04.1
* singularity 2.4.2

- kallisto 0.44.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as earlier, the Kallisto version requires an update.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

scqc_nf.sh Outdated
k_scrape_script='/ei/cb/development/lany/git/singlecellQC/main_dev_branch/Scripts/kallisto_mapping_scrape.R'
doc_script='/ei/cb/development/lany/git/singlecellQC/main_dev_branch/Scripts/QCreport.Rmd'
plate_matrix_merge_script='/ei/cb/development/lany/git/singlecellQC/main_dev_branch/Scripts/plate_merge.R'
tx2g_script='/ei/cb/development/lany/git/singlecellQC/main_dev_branch/Scripts/est_counts_tx2gene.R'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lines 12-17, the path should be changed from
/ei/cb/development/lany/git/singlecellQC/main_dev_branch
to
/ei/cb/common/Scripts/scqc
OR
/ei/software/cb/singlecellQC/1.0/x86_64/bin.

we should not use /ei/cb/development for hosting production scripts as this location will be cleaned up and files will get removed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scripts are now under /ei/software/cb/singlecellQC/1.0/x86_64/bin/Scripts

scqc_nf.sh Outdated
singularity exec ${image_scater} Rscript ${plate_matrix_merge_script} ${params.quantificationsoutdir} \'${id_list}\';
echo \$(ls -d -1 ${params.quantificationsoutdir}*.* | grep all) > matrix_location.txt
singularity exec ${image_doc} Rscript ${plate_matrix_merge_script} ${params.quantificationsoutdir} ${q_merge_col} \'${doc_complete_check}\';
ln -s -v -f ${params.quantificationsoutdir}all_plates.tsv all_plates.tsv
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not run the pipeline myself, but here you are using force (-f, with ln -s -v ) to create a symlink. It will overwrite an existing file. Is that safe in this instance?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it overwrites symbolic link file in the nextflow temporary run directory.

scqc_reqs-1.0 Outdated
#!/bin/bash

PATH=/ei/software/cb/eipap/2.14.2/x86_64/bin:/ei/software/cb/eimap/2.1.4/x86_64/bin:/ei/software/cb/python_miniconda/4.9.2_py3.8_pap/x86_64/bin:/ei/software/cb/git/2.26.2_CBG/x86_64/bin:/ei/cb/common/Scripts:/tgac/software/production/bin/core/../..//hpccore/5/x86_64/bin:/usr/users/ga002/tgacpap/bin:/ei/software/cb/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/tgac/software/production/bin:/tgac/software/production/libraries/bin \
source nextflow-22.04.0_CBG
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need this file here. The PATH's are from PAP node, with eipap and eimap.

This should be part of the wrapper and sit under /ei/software/cb location.

cd /ei/software/cb/bin

cat > singlecellQC-1.0
#!/bin/bash
tool="singlecellQC/1.0"
location="/ei/software/cb"
echo "${tool} is sourced from ${location} location"

source nextflow-22.04.0_CBG

echo "Usage:"
echo "  nextflow run scqc_nf.sh -c config_file"

export PATH=/ei/software/cb/singlecellQC/1.0/x86_64/bin:$PATH

And scqc_nf.sh should be located under /ei/software/cb/singlecellQC/1.0/x86_64/bin

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants