add info for fas nextflow

hbc · Jun 14, 2024 · 79e34d1
1 parent 28929d4
commit 79e34d1
Showing 4 changed files with 147 additions and 4 deletions.
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -0,0 +1,29 @@
+name: ci 
+on:
+  push:
+    branches:
+      - master 
+      - main
+permissions:
+  contents: write
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Configure Git Credentials
+        run: |
+          git config user.name github-actions[bot]
+          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
+      - uses: actions/setup-python@v5
+        with:
+          python-version: 3.x
+      - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV 
+      - uses: actions/cache@v4
+        with:
+          key: mkdocs-material-${{ env.cache_id }}
+          path: .cache
+          restore-keys: |
+            mkdocs-material-
+      - run: pip install mkdocs-material 
+      - run: mkdocs gh-deploy --force
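
Note on the cache key in `ci.yml`: the `cache_id` step uses `date --utc '+%V'`, which prints the ISO 8601 week number, so the cache key changes once a week and older caches are still matched through `restore-keys`. A minimal sketch of the key the workflow would build, assuming GNU `date` as on `ubuntu-latest` (the `cache_key` variable is only for illustration and does not appear in the workflow):

```shell
# Compute the same weekly identifier the workflow writes to GITHUB_ENV.
# %V is the ISO 8601 week number (01-53), evaluated in UTC.
cache_id="$(date --utc '+%V')"

# Illustrative only: the workflow builds this string inline in the `key:` field.
cache_key="mkdocs-material-${cache_id}"
echo "$cache_key"
```

Running this locally shows which key the current week's CI runs would read and write.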
diff --git a/docs/devops.md b/docs/devops.md
@@ -0,0 +1,40 @@
+# Build platforms
+
+
+## Configure R to use Posit Package Manager
+
+[source](https://packagemanager.posit.co/client/#/repos/bioconductor/setup?bioconductor_version=3.18)
+
+```
+# Configure BiocManager to use Posit Package Manager:
+options(BioC_mirror = "https://packagemanager.posit.co/bioconductor")
+options(BIOCONDUCTOR_CONFIG_FILE = "https://packagemanager.posit.co/bioconductor/config.yaml")
+
+# Configure a CRAN snapshot compatible with Bioconductor 3.18:
+options(repos = c(CRAN = "https://packagemanager.posit.co/cran/2024-05-01"))
+```
+
+### Installing single-cell packages in O2
+
+#### CellChat
+
+```
+# In the shell: load conda and create an environment with umap-learn
+module load miniconda3/23.1.0
+
+conda create -p /n/app/bcbio/R4.3.1_python -c conda-forge umap-learn
+
+# In R: install CellChat and its dependencies
+BiocManager::install("BiocNeighbors")
+install.packages("NMF")
+install.packages("circlize")
+devtools::install_github("jinworks/CellChat")
+
+library(reticulate)
+# create a new environment 
+virtualenv_create("/n/app/bcbio/R4.3.1_python_sc_20240522snapshot")
+# install umap
+virtualenv_install("/n/app/bcbio/R4.3.1_python_sc_20240522snapshot", "umap-learn")
+# indicate that we want to use a specific virtualenv
+use_virtualenv("/n/app/bcbio/R4.3.1_python_sc_20240522snapshot")
+py_module_available(module = 'umap')
+reticulate::import(module = "umap", delay_load = TRUE)
+```
diff --git a/docs/index.md b/docs/index.md
@@ -1,15 +1,19 @@
 # Welcome to HCBC Platform
 
-## Stable R env
+Welcome to our Platform guidelines page. Here, analysts and developers will find guidance on working with our most common environments.
+
+## Environments
 
 * O2 open OnDemand: 
     * For RNAseq and scRNAseq analysis use `/n/app/bcbio/R4.3.1`
-    * With these modules: `gcc/9.2.0 imageMagick/7.1.0 geos/3.10.2 cmake/3.22.2 R/4.3.1 fftw/3.3.10 gdal/3.1.4 udunits/2.2.28`
+    * With these modules loaded: `gcc/9.2.0 imageMagick/7.1.0 geos/3.10.2 cmake/3.22.2 R/4.3.1 fftw/3.3.10 gdal/3.1.4 udunits/2.2.28 boost/1.75.0`
     * With no `bcbio` in your `PATH`
 
 ## Supported Templates
 
-Install `bcbioR` as indicated here: `https://github.com/bcbio/bcbioR/tree/main`
+We use `bcbioR` to deploy folders and code to our project directories, which makes our analyses more robust.
+
+You can install `bcbioR` as indicated here: `https://github.com/bcbio/bcbioR/tree/main`
 
 RNAseq ![](https://img.shields.io/badge/status-alpha-blue)
 TEASeq ![](https://img.shields.io/badge/status-concept-yellow)

diff --git a/docs/pipelines.md b/docs/pipelines.md
@@ -1,5 +1,30 @@
 # Introduction to HCBC pipelines
 
+## Nextflow in Seqera platform
+
+- Create a user here: https://cloud.seqera.io/login
+- Ask the Platform team to add you to the HCBC workspace
+- Transfer data to HCBC S3: ask Alex/Lorena. Files will be in the `input/rawdata` folder of our S3 bucket
+- Prepare the CSV file according to these [instructions](https://nf-co.re/rnaseq/3.14.0/docs/usage#multiple-runs-of-the-same-sample). The file should look like this:
+
+```csv
+sample,fastq_1,fastq_2,strandedness
+CONTROL_REP1,s3path/AEG588A1_S1_L002_R1_001.fastq.gz,s3path/AEG588A1_S1_L002_R2_001.fastq.gz,auto
+CONTROL_REP1,s3path/AEG588A1_S1_L003_R1_001.fastq.gz,s3path/AEG588A1_S1_L003_R2_001.fastq.gz,auto
+CONTROL_REP1,s3path/AEG588A1_S1_L004_R1_001.fastq.gz,s3path/AEG588A1_S1_L004_R2_001.fastq.gz,auto
+```
+
+Use `bcbio_nfcore_check(csv_file)` to check that the file is correct.
+
+You can add more columns to this file with more metadata, and use this file as the `coldata` file in the templates.
+
+- Save the file under the `meta` folder
+- Upload this file to our `Datasets` in Seqera, naming it after the project and starting with `rnaseq-pi_lastname-hbc_code`
+- Go to `Launchpad`, select the `nf-core_rnaseq` pipeline, click `Browser`, and select the previously created `Dataset` for the `input` parameter
+  - Select an output directory inside the `results` folder in S3, with the same name used for the `Dataset`
+- When the pipeline is done, data will be copied to our on-premise HPC in the scratch system, under the `scratch/groups/hsph/hbc/bcbio/` folder
+
+
 ## Nextflow in O2
 
 Example of running Nextflow nf-core/rnaseq on a single node in O2.
@@ -23,4 +48,49 @@ export NXF_APPTAINER_CACHEDIR=/n/app/singularity/containers/shared/bcbio/
 export NXF_SINGULARITY_LIBRARYDIR=/n/app/singularity/containers/shared/bcbio/
 
 ./nextflow run nf-core/rnaseq -profile singularity,test --outdir here
-```
+```
+
+## Nextflow in FAS
+
+
+```
+module load jdk/21.0.2-fasrc01
+```
+
+Use nextflow at `/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow`
+
+Use config file at `/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow/fas.config`
+
+Example command to run in an interactive job:
+
+```
+/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow run nf-core/rnaseq -profile test,singularity --outdir tmp -c /n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow/fas.config
+```
+
+For non-test data, this is the head job you need to submit:
+
+
+```
+#!/bin/bash
+
+#SBATCH --job-name=Nextflow      # Job name
+#SBATCH --partition=shared            # Partition name
+#SBATCH --time=0-48:59                 # Runtime in D-HH:MM format
+#SBATCH --nodes=1                      # Number of nodes (keep at 1)
+#SBATCH --ntasks=1                     # Number of tasks per node (keep at 1)
+#SBATCH --mem=16G                     # Memory needed per node (total)
+#SBATCH --error=jobid_%j.err           # File to which STDERR will be written, including job ID
+#SBATCH --output=jobid_%j.out          # File to which STDOUT will be written, including job ID
+#SBATCH --mail-type=ALL                # Type of email notification (BEGIN, END, FAIL, ALL)
+
+module load jdk/21.0.2-fasrc01
+
+export NXF_APPTAINER_CACHEDIR=/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow/nfcore-rnaseq
+export NXF_SINGULARITY_LIBRARYDIR=/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow/nfcore-rnaseq
+
+OUTPUT=path_to_results
+
+/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow run nf-core/rnaseq -profile singularity \
+  -c analysis.config \
+  --outdir "$OUTPUT" -c /n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow/fas.config
+```
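
Note on the samplesheet in `docs/pipelines.md`: the Seqera section asks for a CSV whose header is `sample,fastq_1,fastq_2,strandedness`, optionally followed by extra metadata columns. As a quick shell-side check before upload, a rough sketch could look like the following. The `check_samplesheet` function is hypothetical and is not part of `bcbioR` or nf-core; it does not replace `bcbio_nfcore_check`. The assumption that `fastq_1` ends in `.fastq.gz` or `.fq.gz` comes from the example rows and may need adjusting.

```shell
#!/bin/bash

# check_samplesheet FILE
# Hypothetical helper: returns 0 if FILE looks like the documented samplesheet,
# non-zero otherwise, printing one message per problem found.
check_samplesheet() {
  local file="$1"
  awk -F',' '
    NR == 1 {
      # The first four columns must match the documented header;
      # additional metadata columns after them are allowed.
      if ($1 != "sample" || $2 != "fastq_1" || $3 != "fastq_2" || $4 != "strandedness") {
        print "unexpected header: " $0
        bad = 1
      }
      next
    }
    # Every data row needs at least the four required fields.
    NF < 4 { print "line " NR ": fewer than 4 fields"; bad = 1; next }
    # Assumption: the first read file is a gzipped FASTQ.
    $2 !~ /\.(fastq|fq)\.gz$/ { print "line " NR ": fastq_1 is not a gzipped FASTQ"; bad = 1 }
    END { exit bad }
  ' "$file"
}
```

For example, `check_samplesheet meta/samplesheet.csv && echo "samplesheet looks fine"` would print the message only when no problem is reported; the file name here is a placeholder.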