Commit

Merge pull request #3 from simpsonlab/add_cyto_fix_script
Add cyto fix script
rdeborja authored Feb 2, 2022
2 parents 2176b7b + b031312 commit c3f487d
Showing 3 changed files with 54 additions and 0 deletions.
5 changes: 5 additions & 0 deletions README.md
@@ -66,6 +66,11 @@ There are two required files for running breakpoint analysis:
* `metadata` which is a tab separated file containing `sample`, band for `region1` and band for `region2`
* `cytobands` which contains the genomic regions and their corresponding cytogenetic bands

The cytobands file that is downloaded from the UCSC site requires an
additional column to work with the analysis pipeline. The script
`workflow/scripts/fix_cytoband_file.py` can be used to append the
additional column.
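
As a rough illustration of what appending the column means, the sketch below mirrors the logic of `fix_cytoband_file.py` as committed: it reads the five tab-separated UCSC columns and emits a sixth column made of the literal prefix `chr` plus the band name. The helper name `append_chr_band` and the two sample rows are hypothetical and exist only for this example.

```python
import csv
import io


def append_chr_band(rows):
    """Yield each five-column cytoband row with a sixth column: 'chr' + band."""
    for chrom, start, end, band, stain in rows:
        yield [chrom, start, end, band, stain, 'chr' + band]


# Illustrative rows in the five-column UCSC cytoband layout (assumed sample data).
sample = (
    "chr1\t0\t2300000\tp36.33\tgneg\n"
    "chr1\t2300000\t5300000\tp36.32\tgpos25\n"
)

reader = csv.reader(io.StringIO(sample), delimiter='\t')
fixed = ['\t'.join(row) for row in append_chr_band(reader)]
print('\n'.join(fixed))
```

The script itself takes the input path through `-f`/`--file` and prints the six-column rows to standard output, so its output would need to be redirected to a new file before being used as the `cytobands` input.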

### Run the workflow
The basecaller uses `guppy` and a GPU to convert FAST5 files to FASTQ files.
```
Empty file modified workflow/scripts/createivf_breakpoints.py
100644 → 100755
Empty file.
49 changes: 49 additions & 0 deletions workflow/scripts/fix_cytoband_file.py
@@ -0,0 +1,49 @@
#!/usr/bin/env python

"""
Append a column to the UCSC cytoband file
"""

import os
import sys
import argparse
import csv


def init_args():
    """
    Initialize command line arguments
    """
    description = 'Append chromosome identifier column to UCSC cytoband file'
    parser = argparse.ArgumentParser(description=description)
    parser.add_argument('-f', '--file', required=True,
                        help='full path to UCSC cytoband file')
    return parser.parse_args()


def main():
    """
    Main program
    """
    args = init_args()
    fieldnames = ['chrom', 'start', 'end', 'band', 'desc']
    with open(args.file, 'r') as ifh:
        reader = csv.DictReader(ifh, delimiter='\t', fieldnames=fieldnames)
        for line in reader:
            chr_cyto = ''.join(['chr', line['band']])
            print('\t'.join([
                line['chrom'],
                line['start'],
                line['end'],
                line['band'],
                line['desc'],
                chr_cyto
            ]))
    ifh.close()


if __name__ == '__main__':
    main()


#__END__
