diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
new file mode 100644
index 0000000..c712c2f
--- /dev/null
+++ b/.github/workflows/ci.yml
@@ -0,0 +1,29 @@
+name: ci
+on:
+  push:
+    branches:
+      - master
+      - main
+permissions:
+  contents: write
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Configure Git Credentials
+        run: |
+          git config user.name github-actions[bot]
+          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
+      - uses: actions/setup-python@v5
+        with:
+          python-version: 3.x
+      - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
+      - uses: actions/cache@v4
+        with:
+          key: mkdocs-material-${{ env.cache_id }}
+          path: .cache
+          restore-keys: |
+            mkdocs-material-
+      - run: pip install mkdocs-material
+      - run: mkdocs gh-deploy --force
\ No newline at end of file
diff --git a/docs/devops.md b/docs/devops.md
new file mode 100644
index 0000000..10e30d9
--- /dev/null
+++ b/docs/devops.md
@@ -0,0 +1,40 @@
+# Build platforms
+
+
+## Configure to use posit package manager
+
+[source](https://packagemanager.posit.co/client/#/repos/bioconductor/setup?bioconductor_version=3.18)
+
+```
+# Configure BioCManager to use Posit Package Manager:
+options(BioC_mirror = "https://packagemanager.posit.co/bioconductor")
+options(BIOCONDUCTOR_CONFIG_FILE = "https://packagemanager.posit.co/bioconductor/config.yaml")
+
+# Configure a CRAN snapshot compatible with Bioconductor 3.18:
+options(repos = c(CRAN = "https://packagemanager.posit.co/cran/2024-05-01"))
+```
+
+### Installing sc packages in O2
+
+### CELLCHAT
+
+```
+module load miniconda3/23.1.0
+
+conda create -p /n/app/bcbio/R4.3.1_python -c conda-forge umap-learn
+
+BiocManager::install("BiocNeighbors")
+install.packages('NMF')
+install.packages("circlize")
+devtools::install_github("jinworks/CellChat")
+
+library(reticulate)
+# create a new environment
+virtualenv_create("/n/app/bcbio/R4.3.1_python_sc_20240522snapshot")
+# install umap
+virtualenv_install("/n/app/bcbio/R4.3.1_python_sc_20240522snapshot", "umap-learn")
+# indicate that we want to use a specific virtualenv
+use_virtualenv("/n/app/bcbio/R4.3.1_python_sc_20240522snapshot")
+py_module_available(module = 'umap')
+reticulate::import(module = "umap", delay_load = TRUE)
+```
\ No newline at end of file
diff --git a/docs/index.md b/docs/index.md
index 5c3ab36..c8a7f6c 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -1,15 +1,19 @@
 # Welcome to HCBC Platform
 
-## Stable R env
+Welcome to our Platform guidelines web page. Here, analysts and developers will find guidelines on how to work with our most common environments.
+
+## Environments
 
 * O2 open OnDemand:
     * For RNAseq and scRNAseq analysis use `/n/app/bcbio/R4.3.1`
-    * With these modules: `gcc/9.2.0 imageMagick/7.1.0 geos/3.10.2 cmake/3.22.2 R/4.3.1 fftw/3.3.10 gdal/3.1.4 udunits/2.2.28`
+    * With these modules loaded: `gcc/9.2.0 imageMagick/7.1.0 geos/3.10.2 cmake/3.22.2 R/4.3.1 fftw/3.3.10 gdal/3.1.4 udunits/2.2.28 boost/1.75.0`
     * With no `bcbio` in your `PATH`
 
 ## Supported Templates
 
-Install `bcbioR` as indicated here: `https://github.com/bcbio/bcbioR/tree/main`
+We use `bcbioR` to deploy folders and code to our project directories, which makes our analyses more robust.
+
+You can install `bcbioR` as indicated here: `https://github.com/bcbio/bcbioR/tree/main`
 
 RNAseq ![](https://img.shields.io/badge/status-alpha-blue)
 TEASeq ![](https://img.shields.io/badge/status-concept-yellow)
diff --git a/docs/pipelines.md b/docs/pipelines.md
index c649f7c..6cc1e5f 100644
--- a/docs/pipelines.md
+++ b/docs/pipelines.md
@@ -1,5 +1,30 @@
 # Introduction to HCBC pipelines
 
+## Nextflow in Seqera platform
+
+- Create a user here: https://cloud.seqera.io/login
+- Ask the Platform team to add you to the HCBC workspace
+- Transfer data to HCBC S3: Ask Alex/Lorena. Files will be in the `input/rawdata` folder of our S3 bucket
+- Prepare the CSV file according to these [instructions](https://nf-co.re/rnaseq/3.14.0/docs/usage#multiple-runs-of-the-same-sample). The file should look like this:
+
+```csv
+sample,fastq_1,fastq_2,strandedness
+CONTROL_REP1,s3path/AEG588A1_S1_L002_R1_001.fastq.gz,s3path/AEG588A1_S1_L002_R2_001.fastq.gz,auto
+CONTROL_REP1,s3path/AEG588A1_S1_L003_R1_001.fastq.gz,s3path/AEG588A1_S1_L003_R2_001.fastq.gz,auto
+CONTROL_REP1,s3path/AEG588A1_S1_L004_R1_001.fastq.gz,s3path/AEG588A1_S1_L004_R2_001.fastq.gz,auto
+```
+
+Use `bcbio_nfcore_check(csv_file)` to check that the file is correct.
+
+You can add more columns with extra metadata to this file, and use it as the `coldata` file in the templates.
+
+- Save the file under the `meta` folder
+- Upload this file to our `Datasets` in Seqera using the name of the project, starting with `rnaseq-pi_lastname-hbc_code`
+- Go to `Launchpad`, select the `nf-core_rnaseq` pipeline, and select the previously created `Dataset` in the `input` parameter after clicking on `Browser`
+    - Select an output directory with the same name used for the `Dataset` inside the `results` folder in S3
+- When the pipeline is done, data will be copied to the scratch system of our on-premise HPC, under the `scratch/groups/hsph/hbc/bcbio/` folder
+
+
 ## Nextflow in O2
 
 Example of running in single node Nextflow/nf-core/rnaseq in O2.
@@ -23,4 +48,49 @@ export NXF_APPTAINER_CACHEDIR=/n/app/singularity/containers/shared/bcbio/
 export NXF_SINGULARITY_LIBRARYDIR=/n/app/singularity/containers/shared/bcbio/
 
 ./nextflow run nf-core/rnaseq -profile singularity,test --outdir here
-```
\ No newline at end of file
+```
+
+## Nextflow in FAS
+
+
+```
+module load jdk/21.0.2-fasrc01
+```
+
+Use nextflow at `/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow`
+
+Use config file at `/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow/fas.config`
+
+Example command to run in an interactive job:
+
+```
+/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow run nf-core/rnaseq -profile test,singularity --outdir tmp -c /n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow/fas.config
+```
+
+For non-test data, this is the head job you need to submit:
+
+
+```
+#!/bin/bash
+
+#SBATCH --job-name=Nextflow      # Job name
+#SBATCH --partition=shared       # Partition name
+#SBATCH --time=0-48:59           # Runtime in D-HH:MM format
+#SBATCH --nodes=1                # Number of nodes (keep at 1)
+#SBATCH --ntasks=1               # Number of tasks per node (keep at 1)
+#SBATCH --mem=16G                # Memory needed per node (total)
+#SBATCH --error=jobid_%j.err     # File to which STDERR will be written, including job ID
+#SBATCH --output=jobid_%j.out    # File to which STDOUT will be written, including job ID
+#SBATCH --mail-type=ALL          # Type of email notification (BEGIN, END, FAIL, ALL)
+
+module load jdk/21.0.2-fasrc01
+
+export NXF_APPTAINER_CACHEDIR=/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow/nfcore-rnaseq
+export NXF_SINGULARITY_LIBRARYDIR=/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow/nfcore-rnaseq
+
+OUTPUT=path_to_results
+
+/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow run nf-core/rnaseq -profile singularity \
+    -c analysis.config \
+    --outdir $OUTPUT -c /n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow/fas.config
+```
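
The samplesheet layout in the Seqera section of `docs/pipelines.md` can be sanity-checked on the command line before upload. The following is a minimal sketch, not a replacement for `bcbio_nfcore_check`: the function name `check_samplesheet` is hypothetical and is not part of `bcbioR` or nf-core. It assumes only what the diff shows (the first four columns are `sample,fastq_1,fastq_2,strandedness`, and extra metadata columns are allowed), plus the nf-core/rnaseq rule that FASTQ files end in `.fastq.gz` or `.fq.gz`. It does not check that the files exist in S3.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration only; not part of bcbioR or nf-core.
# Structural check of an nf-core/rnaseq-style samplesheet:
#   - the header starts with sample,fastq_1,fastq_2,strandedness
#   - every data row has at least 4 columns and a non-empty sample name
#   - fastq_1 (and fastq_2, when present) end in .fastq.gz or .fq.gz
# Returns 0 if the sheet passes, 1 if problems were found, 2 if unreadable.
check_samplesheet() {
  local csv="$1"
  if [ ! -r "$csv" ]; then
    echo "cannot read samplesheet: $csv" >&2
    return 2
  fi
  awk -F',' '
    { sub(/\r$/, "") }                       # tolerate Windows line endings
    NR == 1 {
      if ($1 != "sample" || $2 != "fastq_1" || $3 != "fastq_2" || $4 != "strandedness") {
        print "line 1: header must start with sample,fastq_1,fastq_2,strandedness"
        bad = 1
      }
      next
    }
    $0 == "" { next }                        # ignore blank lines
    NF < 4 {
      print "line " NR ": expected at least 4 columns, found " NF
      bad = 1
      next
    }
    $1 == "" {
      print "line " NR ": empty sample name"
      bad = 1
    }
    $2 !~ /\.(fastq|fq)\.gz$/ {
      print "line " NR ": fastq_1 must end in .fastq.gz or .fq.gz"
      bad = 1
    }
    $3 != "" && $3 !~ /\.(fastq|fq)\.gz$/ {
      print "line " NR ": fastq_2 must end in .fastq.gz or .fq.gz"
      bad = 1
    }
    END { exit (bad ? 1 : 0) }
  ' "$csv"
}
```

A possible use, with `meta/samplesheet.csv` as a placeholder path: `check_samplesheet meta/samplesheet.csv && echo "structure looks OK"`. An empty `fastq_2` is accepted so that single-end rows pass.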