Merge branch 'master' into master
jluebeck authored Jul 12, 2022
2 parents eb5d935 + 446950f commit d0db133
Showing 5 changed files with 14 additions and 4 deletions.
11 changes: 8 additions & 3 deletions README.md
@@ -14,6 +14,7 @@ To use with an existing installation please extract and place the mm10 directory
Upstream and downstream tools (PrepareAA, AmpliconClassifier, CycleViz) are also enabled to accept the `--ref mm10`
argument.


**[Older update descriptions are available here.](https://docs.google.com/document/d/1jqnCs46hrpYGBGrZQFop31ezskyludxNJEQdZONwFdc/edit?usp=sharing)**

## Introduction
@@ -46,7 +47,7 @@ AmpliconArchitect was developed by Viral Deshpande, and is maintained by Jens Lu
* `export MOSEKLM_LICENSE_FILE=<Parent directory of mosek.lic> >> ~/.bashrc && source ~/.bashrc`
3. Download AA data repositories and set environment variable AA_DATA_REPO:
* Download from ` https://datasets.genepattern.org/?prefix=data/module_support_files/AmpliconArchitect/`

#### Usage:

`$AA --bam {input_bam} --bed {bed file} --out {prefix_of_output_files} <optional arguments>`
@@ -71,6 +72,7 @@ AA can be installed in 2 ways:
* `export MOSEKLM_LICENSE_FILE=<Parent directory of mosek.lic> >> ~/.bashrc && source ~/.bashrc`
3. Download AA data repositories and set environment variable AA_DATA_REPO:
* Download from ` https://datasets.genepattern.org/?prefix=data/module_support_files/AmpliconArchitect/`

* Set enviroment variable AA_DATA_REPO to point to the data_repo directory:
```bash
mkdir data_repo && cd data_repo
@@ -119,6 +121,7 @@ sudo python2 get-pip.py


Note that 0.15.2 is the last version of pysam which appears to support pip2 installation, however AA itself supports the more recent versions.

4. Mosek optimization tool version 8.x (https://www.mosek.com/). **Due to breaking changes in the newer versions of Mosek, we require version 8 to be used**:
```bash
wget http://download.mosek.com/stable/8.0.0.60/mosektoolslinux64x86.tar.bz2
@@ -135,6 +138,7 @@ cd $PWD/mosek/8/tools/platform/linux64x86/python/3/
# sudo python2 setup.py install #(--user)

sudo python3 setup.py install #(--user) [can also build locally with "pip install -e ."]

cd -
source ~/.bashrc
```
@@ -176,7 +180,7 @@ AA can also be run through Nextflow, using the [nf-core/circdna pipeline](https:

#### PrepareAA:
We provide a wrapper for jumping off at any intermediate step including generating the prerequisite BAM alignments with BWA, CNV calls for seeding and CNV seed selection. PrepareAA is available at https://github.com/jluebeck/PrepareAA. We recommmend this for users who are less experienced with AA as it greatly simplifies the process of selecting CNV seed regions to feed to AA.
PrepareAA can directly invoke AA if installed.
PrepareAA can directly invoked AA if installed.


### 1) Input data:
@@ -194,8 +198,9 @@ AA requires 2 input files:
- CNVs from CNV caller ReadDepth (with parameter file `$AA_SRC/src/read_depth_params`), Canvas and CNVkit
- Select CNVs with copy number > 5x and size > 100kbp (default) and merge adjacent CNVs into a single interval using:

`python $AA_SRC/amplified_intervals.py --bed {read_depth_folder}/output/alts.dat --out {outFileNamePrefix} --bam {BamFileName}`
`python2 $AA_SRC/amplified_intervals.py --bed {read_depth_folder}/output/alts.dat --out {outFileNamePrefix} --bam {BamFileName} --ref {ref}`
- ***Note that this preprocessing step is critical to AA as it removes low-mappability and low-complexity regions.
- Optional argument `--ref` should match the name of the folder in `data_repo` which corresponds to the version of human reference genome used in the BAM file.
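
The selection step described above (keep CNV calls above a copy-number and size threshold, then merge adjacent calls into one interval) can be sketched as follows. This is only an illustration under stated assumptions, not the logic of `amplified_intervals.py`: the function name `select_and_merge`, the tuple layout, and the `max_gap` parameter are hypothetical, and the thresholds are taken from the README wording (copy number > 5, size > 100 kbp). The constants visible in this commit's diff of `src/amplified_intervals.py` are `GAIN = 4.5` and `CNSIZE_MIN = 50000`, so the script's actual defaults may differ. The real script also removes low-mappability and low-complexity regions, which this sketch does not attempt.

```python
# Hypothetical sketch: threshold CNV calls, then merge adjacent survivors.
# Each CNV call is a tuple (chrom, start, end, copy_number).

def select_and_merge(cnvs, min_cn=5.0, min_size=100000, max_gap=0):
    """Keep calls with copy_number > min_cn and length > min_size,
    then merge calls on the same chromosome that are at most max_gap bp apart."""
    kept = sorted(
        (c for c in cnvs if c[3] > min_cn and (c[2] - c[1]) > min_size),
        key=lambda c: (c[0], c[1]),
    )
    merged = []
    for chrom, start, end, _cn in kept:
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap:
            # Extend the previous interval instead of starting a new one.
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


if __name__ == "__main__":
    calls = [
        ("chr1", 0, 200000, 8.0),
        ("chr1", 200000, 450000, 6.0),
        ("chr1", 900000, 950000, 9.0),   # too short, dropped
        ("chr2", 0, 300000, 3.0),        # copy number too low, dropped
        ("chr2", 500000, 700000, 7.5),
    ]
    for interval in select_and_merge(calls):
        print("\t".join(str(x) for x in interval))
```

Filtering before merging follows the order in which the README lists the two operations; whether the script applies the size threshold before or after merging is not stated in the text shown here.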

### 2) Usage:
`$AA --bam {input_bam} --bed {bed file} --out {prefix_of_output_files} <optional arguments>`
1 change: 1 addition & 0 deletions docker/Dockerfile
@@ -42,6 +42,7 @@ RUN fc-cache -f

RUN pip2 install -r /home/requirements/pip2_requirements.txt


RUN cd /home/programs && wget http://download.mosek.com/stable/8.0.0.60/mosektoolslinux64x86.tar.bz2
RUN cd /home/programs && tar xf mosektoolslinux64x86.tar.bz2
# ADD mosek.lic /home/programs/mosek/8/licenses/mosek.lic
3 changes: 2 additions & 1 deletion src/AmpliconArchitect.py
@@ -1,5 +1,6 @@
#!/usr/bin/env python


# This software is Copyright 2017 The Regents of the University of California. All Rights Reserved. Permission to copy, modify, and distribute this software and its documentation for educational, research and non-profit purposes, without fee, and without a written agreement is hereby granted, provided that the above copyright notice, this paragraph and the following three paragraphs appear in all copies. Permission to make commercial use of this software may be obtained by contacting:
#
# Office of Innovation and Commercialization
@@ -42,7 +43,6 @@
else:
from cStringIO import StringIO


import global_names

__version__ = "1.3_r1"
@@ -188,6 +188,7 @@ def process(self, msg, kwargs):
else:
logging.debug("#TIME " + '%.3f\t'%(time() - TSTART) + "cstats not found, generating coverage statistics... ")


coverage_windows=None
if cbed is not None:
coverage_windows=hg.interval_list(cbed, 'bed')
2 changes: 2 additions & 0 deletions src/amplified_intervals.py
@@ -35,6 +35,7 @@
GAIN = 4.5
CNSIZE_MIN = 50000


parser = argparse. \
ArgumentParser(description="Filter and merge amplified intervals")
parser.add_argument('--bed', dest='bed',
@@ -111,6 +112,7 @@

coverage_stats_file.close()


bamFileb2b = b2b.bam_to_breakpoint(bamFile, coverage_stats=cstats)
pre_int_list = []
for r in rdList:
1 change: 1 addition & 0 deletions src/bam_to_breakpoint.py
@@ -1037,6 +1037,7 @@ def edge_has_high_entropy(self, read_list):
bp2_entropy = max([stats.entropy(np.unique([x for x in rr[1].query_alignment_sequence.upper() if x != 'N'], return_counts=True)[1]) for rr in read_list])

logging.debug("#TIME " + '%.3f\t'%(time() - TSTART) + " breakpoint_entropy: %.3f %.3f" % (bp1_entropy, bp2_entropy))

if bp1_entropy < self.breakpoint_entropy_cutoff:
return False
if bp2_entropy < self.breakpoint_entropy_cutoff:
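
For readers of this hunk: the `bp2_entropy` expression counts the non-`N` bases in each read's aligned sequence, computes the Shannon entropy of those counts, and takes the maximum over the reads; an edge is rejected when that maximum falls below `breakpoint_entropy_cutoff`. Assuming `stats` here is `scipy.stats`, its `entropy` function normalizes the counts to probabilities and uses the natural logarithm by default. A minimal standard-library sketch of the same calculation for one side of a breakpoint is below. The names `base_entropy` and `passes_entropy_filter` are hypothetical, and plain strings stand in for the `query_alignment_sequence` of each read.

```python
import math
from collections import Counter


def base_entropy(seq):
    """Shannon entropy (in nats) of the base composition of seq, ignoring N."""
    counts = Counter(b for b in seq.upper() if b != "N")
    total = sum(counts.values())
    if total == 0:
        # No informative bases; treat as zero entropy (an assumption of this sketch).
        return 0.0
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def passes_entropy_filter(seqs, cutoff):
    """True if the most complex sequence reaches the cutoff, mirroring the
    max-over-reads comparison in the hunk above for a single breakpoint side."""
    return max(base_entropy(s) for s in seqs) >= cutoff


if __name__ == "__main__":
    # A homopolymer run has zero entropy; a uniform mix of four bases has log(4).
    print(round(base_entropy("AAAAAAAA"), 3))
    print(round(base_entropy("ACGTACGT"), 3))
```

Taking the maximum means a single high-complexity read is enough for that side to pass, so the filter only rejects edges where every supporting read is low-complexity.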
