add info for fas nextflow
lpantano committed Jun 14, 2024
1 parent 28929d4 commit 79e34d1
Showing 4 changed files with 147 additions and 4 deletions.
29 changes: 29 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,29 @@
name: ci
on:
  push:
    branches:
      - master
      - main
permissions:
  contents: write
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Configure Git Credentials
        run: |
          git config user.name github-actions[bot]
          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
      - uses: actions/setup-python@v5
        with:
          python-version: 3.x
      - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
      - uses: actions/cache@v4
        with:
          key: mkdocs-material-${{ env.cache_id }}
          path: .cache
          restore-keys: |
            mkdocs-material-
      - run: pip install mkdocs-material
      - run: mkdocs gh-deploy --force
40 changes: 40 additions & 0 deletions docs/devops.md
@@ -0,0 +1,40 @@
# Build platforms


## Configure R to use Posit Package Manager

[source](https://packagemanager.posit.co/client/#/repos/bioconductor/setup?bioconductor_version=3.18)

```
# Configure BiocManager to use Posit Package Manager:
options(BioC_mirror = "https://packagemanager.posit.co/bioconductor")
options(BIOCONDUCTOR_CONFIG_FILE = "https://packagemanager.posit.co/bioconductor/config.yaml")
# Configure a CRAN snapshot compatible with Bioconductor 3.18:
options(repos = c(CRAN = "https://packagemanager.posit.co/cran/2024-05-01"))
```

### Installing sc packages in O2

#### CellChat

```
# In the shell: load conda and create an environment with umap-learn
module load miniconda3/23.1.0
conda create -p /n/app/bcbio/R4.3.1_python -c conda-forge umap-learn

# In R: install CellChat and its dependencies
BiocManager::install("BiocNeighbors")
install.packages('NMF')
install.packages("circlize")
devtools::install_github("jinworks/CellChat")

# In R: point reticulate at a Python virtualenv that provides umap
library(reticulate)
# create a new environment
virtualenv_create("/n/app/bcbio/R4.3.1_python_sc_20240522snapshot")
# install umap
virtualenv_install("/n/app/bcbio/R4.3.1_python_sc_20240522snapshot", "umap-learn")
# indicate that we want to use a specific virtualenv
use_virtualenv("/n/app/bcbio/R4.3.1_python_sc_20240522snapshot")
# check that the umap module can be found, then import it lazily
py_module_available(module = 'umap')
reticulate::import(module = "umap", delay_load = TRUE)
```
10 changes: 7 additions & 3 deletions docs/index.md
@@ -1,15 +1,19 @@
# Welcome to HCBC Platform

## Stable R env
Welcome to our Platform guidelines web-page. Here analysts and developers will find guidelines on how to work with our most common environments.

## Environments

* O2 open OnDemand:
* For RNAseq and scRNAseq analysis use `/n/app/bcbio/R4.3.1`
* With these modules: `gcc/9.2.0 imageMagick/7.1.0 geos/3.10.2 cmake/3.22.2 R/4.3.1 fftw/3.3.10 gdal/3.1.4 udunits/2.2.28`
* With these modules on: `gcc/9.2.0 imageMagick/7.1.0 geos/3.10.2 cmake/3.22.2 R/4.3.1 fftw/3.3.10 gdal/3.1.4 udunits/2.2.28 boost/1.75.0`
* With no `bcbio` in your `PATH`
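
One way to verify the last requirement is to list any `PATH` entries that mention `bcbio`. This is a minimal sketch that assumes "no `bcbio` in your `PATH`" means no `PATH` entry contains that string; the function name `bcbio_path_entries` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the entries of a PATH-like string that mention bcbio.
# Prints nothing when the value is clean.
bcbio_path_entries() {
  # Split on ':' so each directory is checked on its own line.
  printf '%s\n' "$1" | tr ':' '\n' | grep -i 'bcbio' || true
}
```

Run it as `bcbio_path_entries "$PATH"` before starting R; empty output means no entry matched.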

## Supported Templates

Install `bcbioR` as indicated here: `https://github.com/bcbio/bcbioR/tree/main`
We use `bcbioR` to deploy folders and code to our project directories, which makes our analyses more robust.

You can install `bcbioR` as indicated here: `https://github.com/bcbio/bcbioR/tree/main`

RNAseq ![](https://img.shields.io/badge/status-alpha-blue)
TEASeq ![](https://img.shields.io/badge/status-concept-yellow)
72 changes: 71 additions & 1 deletion docs/pipelines.md
@@ -1,5 +1,30 @@
# Introduction to HCBC pipelines

## Nextflow in Seqera platform

- Create a user account here: https://cloud.seqera.io/login
- Ask the Platform team to add you to the HCBC workspace
- Transfer data to the HCBC S3 bucket (ask Alex/Lorena). Files will be in the `input/rawdata` folder of our S3 bucket
- Prepare the CSV file according to these [instructions](https://nf-co.re/rnaseq/3.14.0/docs/usage#multiple-runs-of-the-same-sample). The file should look like this:

```csv
sample,fastq_1,fastq_2,strandedness
CONTROL_REP1,s3path/AEG588A1_S1_L002_R1_001.fastq.gz,s3path/AEG588A1_S1_L002_R2_001.fastq.gz,auto
CONTROL_REP1,s3path/AEG588A1_S1_L003_R1_001.fastq.gz,s3path/AEG588A1_S1_L003_R2_001.fastq.gz,auto
CONTROL_REP1,s3path/AEG588A1_S1_L004_R1_001.fastq.gz,s3path/AEG588A1_S1_L004_R2_001.fastq.gz,auto
```

Use `bcbio_nfcore_check(csv_file)` to check that the file is correct.

You can add more columns to this file with more metadata, and use this file as the `coldata` file in the templates.
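
For a quick structural check from the shell before running `bcbio_nfcore_check`, something like the sketch below may help. It is not the bcbioR implementation: the function name `check_samplesheet` is hypothetical, and it only checks that the four required columns exist and that each row has a sample name and a `fastq_1` ending in `.fastq.gz` or `.fq.gz`. Extra metadata columns are accepted.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of bcbioR or nf-core): structural check of a samplesheet.
check_samplesheet() {
  awk -F',' '
    { sub(/\r$/, "") }  # tolerate Windows line endings
    NR == 1 {
      # Map each header name to its column index; extra columns are allowed.
      for (i = 1; i <= NF; i++) col[$i] = i
      n = split("sample fastq_1 fastq_2 strandedness", required, " ")
      for (j = 1; j <= n; j++) {
        if (!(required[j] in col)) { print "missing column: " required[j]; bad = 1 }
      }
      if (bad) exit 1
      next
    }
    NF == 0 { next }  # ignore blank lines
    $(col["sample"]) == "" || $(col["fastq_1"]) !~ /\.(fastq|fq)\.gz$/ {
      print "line " NR ": missing sample name or fastq_1 is not .fastq.gz/.fq.gz"
      bad = 1
    }
    END { exit bad + 0 }
  ' "$1"
}
```

Calling `check_samplesheet meta/samplesheet.csv` (the file name is a placeholder) returns a non-zero exit status and prints the offending lines when a problem is found.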

- Save the file under the `meta` folder
- Upload this file to `Datasets` in Seqera, naming it after the project and starting the name with `rnaseq-pi_lastname-hbc_code`
- Go to `Launchpad`, select the `nf-core_rnaseq` pipeline, and select the previously created `Dataset` in the `input` parameter after clicking `Browser`
- Select an output directory with the same name used for the `Dataset` inside the `results` folder in S3
- When the pipeline is done, data will be copied to the scratch system of our on-premises HPC, under the `scratch/groups/hsph/hbc/bcbio/` folder


## Nextflow in O2

Example of running Nextflow nf-core/rnaseq on a single node in O2.
@@ -23,4 +48,49 @@ export NXF_APPTAINER_CACHEDIR=/n/app/singularity/containers/shared/bcbio/
export NXF_SINGULARITY_LIBRARYDIR=/n/app/singularity/containers/shared/bcbio/
./nextflow run nf-core/rnaseq -profile singularity,test --outdir here
```
```

## Nextflow in FAS


```
module load jdk/21.0.2-fasrc01
```

Use nextflow at `/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow`

Use config file at `/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow/fas.config`

Example command to run in an interactive job:

```
/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow run nf-core/rnaseq -profile test,singularity --outdir tmp -c /n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow/fas.config
```

For non-test data, this is the head job you need to submit:


```
#!/bin/bash
#SBATCH --job-name=Nextflow # Job name
#SBATCH --partition=shared # Partition name
#SBATCH --time=0-48:59 # Runtime in D-HH:MM format
#SBATCH --nodes=1 # Number of nodes (keep at 1)
#SBATCH --ntasks=1 # Number of tasks per node (keep at 1)
#SBATCH --mem=16G # Memory needed per node (total)
#SBATCH --error=jobid_%j.err # File to which STDERR will be written, including job ID
#SBATCH --output=jobid_%j.out # File to which STDOUT will be written, including job ID
#SBATCH --mail-type=ALL # Type of email notification (BEGIN, END, FAIL, ALL)
module load jdk/21.0.2-fasrc01
export NXF_APPTAINER_CACHEDIR=/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow/nfcore-rnaseq
export NXF_SINGULARITY_LIBRARYDIR=/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow/nfcore-rnaseq
OUTPUT=path_to_results
/n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow run nf-core/rnaseq -profile singularity \
-c analysis.config \
--outdir "$OUTPUT" -c /n/holylfs05/LABS/hsph_bioinfo/Lab/shared_resources/nextflow/fas.config
```
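
Submitting the head job could look like the sketch below. The wrapper name `submit_head_job` and the script file name are hypothetical; the log file names follow the `--output` and `--error` patterns set in the script above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: submit the head-job script and report where its logs will go.
submit_head_job() {
  local script="$1" jobid
  # --parsable makes sbatch print only the job ID (optionally followed by ";cluster").
  jobid=$(sbatch --parsable "$script") || return 1
  jobid=${jobid%%;*}
  echo "Submitted head job ${jobid}; logs: jobid_${jobid}.out and jobid_${jobid}.err"
}
```

For example, `submit_head_job nextflow_head.sh` submits the script, and `squeue -u "$USER"` then shows the head job alongside the jobs Nextflow launches.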
