lumc-python/day3_assignments

Finding the most common 7-mer in a FASTA file

Your task

Write a script to print out the most common 7-mer and its GC percentage from all the sequences in data/records.fa. You are free to reuse your existing toolbox.

  • The example FASTA file was adapted from: Genome Biology DNA60 Bioinformatics Challenge.
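One possible way to approach the task is sketched below. It is only a starting point, not the reference solution: the function names `read_fasta`, `count_kmers`, `gc_percentage` and `report` are hypothetical, and the sketch assumes a plain-text FASTA file in which header lines start with `>` and a sequence may span several lines. K-mers are counted within each record, never across record boundaries.

```python
from collections import Counter


def read_fasta(lines):
    """Yield one upper-cased sequence string per FASTA record."""
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if chunks:
                yield "".join(chunks)
            chunks = []
        elif line:
            chunks.append(line.upper())
    if chunks:
        yield "".join(chunks)


def count_kmers(sequences, k=7):
    """Count every overlapping k-mer in each sequence."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def gc_percentage(seq):
    """Return the percentage of G and C bases in seq (0.0 if empty)."""
    if not seq:
        return 0.0
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def report(path="data/records.fa", k=7):
    """Print the most common k-mer in the file, its count and its GC percentage."""
    with open(path) as handle:
        counts = count_kmers(read_fasta(handle), k=k)
    kmer, n = counts.most_common(1)[0]
    print(f"{kmer}\t{n}\t{gc_percentage(kmer):.1f}%")
```

Calling `report()` from the repository root would process `data/records.fa`. Note that `Counter.most_common` returns only one of the candidates when several k-mers share the highest count; decide for yourself whether ties should be reported.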

Hints

Challenges

  • Find out how to change your script so that it can read from data/challenge.fa.gz without unzipping the file first (hint: check the standard library).
  • Can you add a command line argument parser so that you can specify the path to the input file from the command line?
  • Can you change the parser so that there is an option flag to tell the program whether the input file is gzipped or not?
  • Can you change your script so that it works for any N-mer instead of just 7-mers?
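
If you get stuck on the challenges, the sketch below shows one way the pieces could fit together using only the standard library (`gzip` and `argparse`). Everything in it is an assumption made for illustration: the option names `--gzipped` and `--size`, and the helper names `build_parser`, `open_fasta`, `most_common_kmer` and `main`, are hypothetical and not prescribed by the assignment.

```python
import argparse
import gzip
from collections import Counter


def build_parser():
    """Build the command line parser (option names are illustrative)."""
    parser = argparse.ArgumentParser(
        description="Report the most common N-mer in a FASTA file.")
    parser.add_argument("fasta", help="path to the input FASTA file")
    parser.add_argument("-z", "--gzipped", action="store_true",
                        help="treat the input file as gzip-compressed")
    parser.add_argument("-n", "--size", type=int, default=7,
                        help="length of the N-mers to count (default: 7)")
    return parser


def open_fasta(path, gzipped=False):
    """Open a FASTA file in text mode, decompressing on the fly if asked."""
    if gzipped:
        # Mode "rt" makes gzip.open return str lines instead of bytes.
        return gzip.open(path, "rt")
    return open(path)


def most_common_kmer(lines, k):
    """Return (kmer, count) for the most frequent k-mer, or None if there is none."""
    counts = Counter()
    chunks = []

    def flush():
        # Count the k-mers of the record collected so far.
        seq = "".join(chunks)
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
        chunks.clear()

    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            flush()
        elif line:
            chunks.append(line.upper())
    flush()
    return counts.most_common(1)[0] if counts else None


def main(argv=None):
    args = build_parser().parse_args(argv)
    with open_fasta(args.fasta, gzipped=args.gzipped) as handle:
        result = most_common_kmer(handle, args.size)
    if result is None:
        print(f"No {args.size}-mers found.")
        return
    kmer, n = result
    gc = 100.0 * sum(base in "GC" for base in kmer) / len(kmer)
    print(f"{kmer}\t{n}\t{gc:.1f}%")
```

With a call to `main()` added at the bottom of the script, and assuming the script is saved as `kmers.py` (a made-up name), it could be run as `python kmers.py data/challenge.fa.gz --gzipped --size 7`. An explicit flag is what the challenge asks for; detecting compression from the file extension would be an alternative design.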
