From a8b11d5381c5c039fe3f4ef60026c7ce470a5664 Mon Sep 17 00:00:00 2001
From: ymoon06 <y.moon@unibas.ch>
Date: Tue, 22 Aug 2023 13:25:13 +0200
Subject: [PATCH] removed unnecessary  README

removed unnecessary  README
---
 src/README.md | 129 --------------------------------------------------
 1 file changed, 129 deletions(-)
 delete mode 100644 src/README.md

diff --git a/src/README.md b/src/README.md
deleted file mode 100644
index f68d951..0000000
--- a/src/README.md
+++ /dev/null
@@ -1,129 +0,0 @@
-# SCINPAS (Single Cell Identification of Novel PolyA Sites)
-
-## Description
-SCINPAS is a Nextflow pipeline that identifies previously known and novel polyA sites
-directly from single-cell RNA sequencing data.
-
-## Workflow
-  ### general workflow
-  ![](overall_workflow.png)
-  
-  ### read classification into 5 categories
-  ![](classification.png)
-
-  ### analyses
-  There are different layers of analysis, so you need to pass the relevant parameters when running the pipeline.
-  Please refer to this figure: 
-  ![](analysis.png)
-
-	
-## Requirements
-0) The default directory structure is as follows:
-![](directory.png)
-
-1) Install Nextflow and its dependencies:
-
-```bash
-mamba create -n nf-env nextflow
-```
-
-2) Data must be single-cell 3' end RNA sequencing data.
-At the moment, the pipeline supports 10x Genomics 3' end sequencing data.
-
-3) Make sure all scripts (python, nextflow) are located in the "src" folder.
-
-4) Make sure all mouse data (sample, negative controls, gtf and fasta) are located in "data/mouse" folder
-
-5) Make sure all human data (sample, negative controls, gtf and fasta) are located in "data/human" folder
-
-6) Make sure `canonical_motives.csv` is located in the "data" folder (common to both human and mouse).
-
-7) Input files must be named `10X_A_B.bam` (with index `10X_A_B.bam.bai`), where A and B are sample name parts.
-
-8) The GTF file must be named `genes.gtf`.
-
-9) The reference genome must be named `genome.fa` (with index `genome.fa.fai`).
-
-10) The raw negative control must be named `*10X_A_BUmiRaw.bam` (with index `*10X_A_BUmiRaw.bam.bai`).
-At least one character must precede `10X` to distinguish the control from the input file; by default "A" is used.
-
-11) The deduplicated negative control must be named `*10X_A_BUmiDedup.bam` (with index `*10X_A_BUmiDedup.bam.bai`).
-At least one character must precede `10X` to distinguish the control from the input file; by default "A" is used.
-
-The raw negative control is the raw BAM file of one of the samples (same data, named differently).
-The deduplicated negative control is the UMI-tools deduplicated version of one of the samples.
-
-12) The `type1` and `type2` parameters in the nextflow.config file refer to cell type 1 and cell type 2 in your dataset.
-Type1 is the default cell type, e.g. spermatocyte.
-Type2 is the cell type in which you expect changes in average terminal exon length and/or the number of intronic polyA sites, e.g. elongating spermatid.
-These parameters are only relevant if you run "cell_type_analysis".
-
-13) `catalog.bed` is needed to compute the overlap between polyA sites identified by SCINPAS and an existing catalog of known polyA sites.
-
-**Note: folder structures/locations, the gtf file, the catalog, the reference genome and the result folder can be changed in the nextflow.config file.
-However, the input file format and negative control format variable names in nextflow.config should not be changed, because
-downstream processes expect those names.**
-
-**Note: if you do not have some input files (e.g. control, celltype annotation, catalog.bed), processes which need those files will not be executed. The rest of the processes will run.**
-
-## Command line
-
-> Note: the commands below are shown for a Slurm cluster.
-> Create and select another profile as needed.
-
-Once you have created and activated the conda environment (`conda activate nf-env`), change into the src folder and run the nextflow command as follows:
-
-1. Running mouse samples:
-
-	1.1. if you do not want to run analysis:
-	nextflow run main.nf -profile slurm -resume --sample_type "mouse"
-
-	1.2. if you want to run the analysis without the cell type specific, overlap and gene coverage analyses: 
-	nextflow run main.nf -profile slurm -resume --sample_type "mouse" --analysis "yes" 
-
-	1.3. if you want to do analysis including cell type specific analysis: 
-	nextflow run main.nf -profile slurm -resume --sample_type "mouse" --analysis "yes"  --cell_type_analysis "yes"
-
-	1.4. if you want to do analysis including analysis related to overlap (comparison between SCINPAS-identified PAS and the pre-existing catalog): 
-	nextflow run main.nf -profile slurm -resume --sample_type "mouse" --analysis "yes"  --overlap "yes"
-
-	1.5. if you want to do analysis including analysis related to gene coverage: 
-	nextflow run main.nf -profile slurm -resume --sample_type "mouse" --analysis "yes"  --g_coverage "yes"
-
-	1.6. if you want to do all analysis: 
-	nextflow run main.nf -profile slurm -resume --sample_type "mouse" --analysis "yes" --cell_type_analysis "yes" --overlap "yes" --g_coverage "yes"
-
-2. Running human samples:
-
-	2.1. if you do not want to run analysis:
-	nextflow run main.nf -profile slurm -resume --sample_type "human"
-
-	2.2. if you want to run the analysis without the cell type specific, overlap and gene coverage analyses: 
-	nextflow run main.nf -profile slurm -resume --sample_type "human" --analysis "yes" 
-
-	2.3. if you want to do analysis including cell type specific analysis: 
-	nextflow run main.nf -profile slurm -resume --sample_type "human" --analysis "yes"  --cell_type_analysis "yes"
-
-	2.4. if you want to do analysis including analysis related to overlap (comparison between SCINPAS-identified PAS and the pre-existing catalog): 
-	nextflow run main.nf -profile slurm -resume --sample_type "human" --analysis "yes"  --overlap "yes"
-
-	2.5. if you want to do analysis including analysis related to gene coverage: 
-	nextflow run main.nf -profile slurm -resume --sample_type "human" --analysis "yes"  --g_coverage "yes"
-
-	2.6. if you want to do all analysis: 
-	nextflow run main.nf -profile slurm -resume --sample_type "human" --analysis "yes" --cell_type_analysis "yes" --overlap "yes" --g_coverage "yes"
-
-3. background running of the pipeline:
-	
-	By default, nextflow prints its progress report to the screen. If you do not want that,
-	you can prefix the command with "nohup" so that the progress report is written to a log file (`nohup.out` by default). An example command line is: 
-
-	nohup nextflow run main.nf -profile slurm -resume --sample_type "mouse" --analysis "yes" --cell_type_analysis "yes" --overlap "yes" --g_coverage "yes"
-
-4. Note:
-	
-	Running the SCINPAS pipeline on the login node is not recommended, even though it submits its jobs to compute nodes.
-	This is because nextflow prints its progress report to the screen, which can consume I/O extensively on the login node.
-	Hence, it is recommended to log in to a compute node and run the pipeline there.
-
-For more nextflow command-line options, refer to this website: https://www.nextflow.io/docs/latest