Photosystem genes in cyanophages

Background

This project is based on the results published in the article "Photosystem I gene cassettes are present in marine virus genomes" by Sharon et al. The authors identified a number of photosystem I genes in marine virus genomes by running a tBLASTx search for photosynthesis genes against the contigs of the Global Ocean Sampling (GOS) project.

Data

The analysis in the article was performed on publicly available data from the Global Ocean Sampling project. The sequences used were contigs, so the assembly had already been done by the GOS researchers. The data can be downloaded from the CAMERA portal. You can also use your own data if you like.

Goal

Here we will try to reproduce the results from the published article.

Run the analysis

If you want to run the analysis yourself, on the same or on your own data, you can clone this repository.

Dependencies

  • Python (I use 2.7)
  • A local installation of the NCBI BLAST+ tools, with the executables on your PATH
  • GNU Make
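
Before running anything, it can help to confirm that the tools listed above are actually reachable. The following is a minimal sketch, not part of this repository: the list of executables is an assumption about what the Makefile calls, and `shutil.which` requires Python 3.3 or later, whereas the project itself was written with Python 2.7.

```python
import shutil

# Assumption: these are the executables the Makefile needs.
# Check the Makefile itself for the authoritative list.
REQUIRED_TOOLS = ["make", "makeblastdb", "tblastx"]


def missing_tools(tools, which=shutil.which):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

The `which` parameter is only there so the lookup can be swapped out when testing; in normal use the default is enough.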

What is in this repository?

  • results/ : directory containing results of experiments. You'll find experiment-specific code there.
  • README file: this file
  • notes: notes taken while developing this project
  • Makefile: contains all commands needed to run this analysis. To run everything, use make all.
  • data/: directory containing data. This mainly includes a collection of photosynthesis genes that are used as query for BLAST searches.
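
To illustrate how the query genes in data/ relate to the BLAST search, here is a hedged sketch of a tBLASTx command line assembled in Python. It is not taken from the Makefile: the file names in the example, the e-value cutoff, and the tabular output format are all assumptions chosen for illustration. Only standard BLAST+ options are used (`-query`, `-db`, `-out`, `-evalue`, `-outfmt`).

```python
def tblastx_command(query_fasta, database, out_path, evalue=1e-5):
    """Build a tblastx command as an argument list.

    The e-value cutoff and the tabular output format (6) are
    illustrative assumptions, not values taken from the article
    or from this repository's Makefile.
    """
    return [
        "tblastx",
        "-query", query_fasta,   # nucleotide query sequences
        "-db", database,         # nucleotide BLAST database
        "-out", out_path,        # where the hits are written
        "-evalue", str(evalue),  # expectation value threshold
        "-outfmt", "6",          # tab-separated hit table
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = tblastx_command(
        "data/photosystem_genes.fasta",
        "data/input.fasta",
        "results/hits.tsv",
    )
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.check_call` once a BLAST database exists for the contigs. The Makefile remains the reference for the commands the analysis actually runs.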

What is not in this repository?

  • The Global Ocean Sampling dataset, as these files are too large to include here. You can download them yourself from the CAMERA portal and create a symlink called input.fasta in the data/ folder of this repo, pointing to the Global Ocean Sampling FASTA file. The one I use contains the assembled sequences. There is a rule in the Makefile that will create a BLAST database from that file.
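
The symlink step can be sketched as follows. This is an illustration under stated assumptions rather than code from the repository: the helper names are hypothetical, and the `makeblastdb` invocation shown is only a plausible form of what the Makefile rule does (a nucleotide database, since the input is assembled DNA contigs). The actual rule may use different options.

```python
import os


def link_input(fasta_path, data_dir="data", link_name="input.fasta"):
    """Create data/input.fasta as a symlink to the downloaded FASTA file.

    An existing symlink with the same name is replaced. A regular file
    with that name is left alone, in which case os.symlink raises an error.
    """
    link_path = os.path.join(data_dir, link_name)
    if os.path.islink(link_path):
        os.remove(link_path)
    os.symlink(os.path.abspath(fasta_path), link_path)
    return link_path


def makeblastdb_command(fasta_path):
    """Build a makeblastdb command for a nucleotide database.

    Assumption: the Makefile rule does something equivalent to this.
    """
    return ["makeblastdb", "-in", fasta_path, "-dbtype", "nucl"]
```

For example, `link_input("/path/to/gos_contigs.fasta")` (a hypothetical download location) creates data/input.fasta, after which make can build the database from it.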

Get the code

git clone thisrepo

Run the analysis

make all

View the Makefile for more documentation.

Contribute

Please feel free to contribute to this project. You can track my progress, fork this repository, or propose changes.