Pipeline for extending conventional genome-scale models

Pranas Grigaitis, 2019-2020

Dept. Modelling of Biological Processes, BioQuant/COS Heidelberg, Heidelberg University, Heidelberg, Germany & Systems Biology Lab, AIMMS, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands

SBML3 model

A fully working SBML3 FBCv2 model generated with this pipeline is available via this link or on request by email (the file is too large to upload here).
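Before loading a large model file into CBMPy, it can help to confirm that the file really is SBML Level 3 with the FBC version 2 package. The following is a minimal sketch using only the Python 3 standard library. The function name `sbml_header` is hypothetical and not part of this pipeline, and the tiny in-memory document stands in for a real model file.

```python
import io
import xml.etree.ElementTree as ET

# Namespace URI of the SBML Level 3 "fbc" package, version 2.
FBC_V2_NS = "http://www.sbml.org/sbml/level3/version1/fbc/version2"


def sbml_header(source):
    """Return (level, version, uses_fbc_v2) read from the root <sbml> element.

    `source` is a binary file object. Iteration stops at the root element,
    so the rest of the (possibly very large) model is not processed.
    """
    namespaces = []
    root = None
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            # payload is a (prefix, uri) tuple declared on the upcoming element
            namespaces.append(payload[1])
        else:
            root = payload  # first "start" event is the root element
            break
    return int(root.get("level")), int(root.get("version")), FBC_V2_NS in namespaces


# Illustrative stand-in for a model file (not the published model).
demo = io.BytesIO(
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" '
    b'xmlns:fbc="http://www.sbml.org/sbml/level3/version1/fbc/version2" '
    b'level="3" version="1" fbc:required="false"><model/></sbml>'
)
print(sbml_header(demo))
```

For a file on disk, open it in binary mode (`open(path, "rb")`) and pass the file object to the function.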

Dependencies

NCBI BLAST+, used to search for homologous protein sequences in the SwissProt protein sequence database.
CBMPy and its dependencies.

On a fresh system, the following commands install the required packages through apt (Debian-based GNU/Linux) and pip (Python packages):

# apt install ncbi-blast+ python-pip

$ update_blastdb --decompress swissprot

$ pip install --user numpy pandas scipy sympy cbmpy python-libsbml matplotlib xlrd xlwt

An LP solver is also required. CBMPy supports both GLPK and CPLEX. GLPK is open source, and academic users can obtain CPLEX free of charge. Note: to use GLPK, follow the installation instructions provided in the CBMPy repository.
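The installation can be checked before starting a long run. The sketch below, written for Python 3, tests whether the Python packages from the pip command are importable and whether a BLAST+ executable is on the PATH. It assumes that the pipeline calls `blastp` (the README only states that BLAST+ is used for protein homology search), and the function name `missing_dependencies` is hypothetical. The LP solver is not checked here.

```python
import importlib.util
import shutil

# Import names of the packages from the pip command above
# (python-libsbml is imported as "libsbml").
PYTHON_MODULES = [
    "numpy", "pandas", "scipy", "sympy", "cbmpy",
    "libsbml", "matplotlib", "xlrd", "xlwt",
]

# Assumption: the pipeline uses the protein-protein search tool of BLAST+.
EXECUTABLES = ["blastp"]


def missing_dependencies(modules=PYTHON_MODULES, executables=EXECUTABLES):
    """Return (missing_modules, missing_executables) as two lists of names."""
    missing_modules = [
        name for name in modules if importlib.util.find_spec(name) is None
    ]
    missing_tools = [name for name in executables if shutil.which(name) is None]
    return missing_modules, missing_tools


modules, tools = missing_dependencies()
print("Missing Python packages:", modules or "none")
print("Missing executables:", tools or "none")
```

An empty result for both lists suggests the packages are in place; it does not verify that the SwissProt BLAST database was downloaded.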

Setup

The main.py script also holds all the settings needed to retrieve the proteome information and kinetic data. The following organism-dependent parameters must be specified:

organism =: the name of the biological species
proteomeID =: the proteome ID from the UniProt proteome database. Note: the kcats.py file builds a string named proteome from the proteome ID. This may need to be modified when working with species other than Escherichia coli.

The pipeline also uses the BRENDA database to fetch kinetic data. To use it, register with BRENDA (free of charge) and provide the account e-mail and password in the following variables:

brendaEmail =: the registration e-mail of the user
brendaPassword =: the password. Note: because the password must be supplied as plain text, these scripts should only be run from a strictly local directory.
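As an illustration, the four parameters might be filled in as sketched below. The values are examples only: `UP000000625` is the UniProt reference proteome for Escherichia coli K-12, which may or may not be the proteome used by the author. Reading the BRENDA credentials from environment variables is an assumption of this sketch, not how main.py is written; the variable names `BRENDA_EMAIL` and `BRENDA_PASSWORD` and the helper `unset_parameters` are hypothetical.

```python
import os

# Organism-dependent parameters (example values for E. coli K-12).
organism = "Escherichia coli"
proteomeID = "UP000000625"

# BRENDA credentials. Taking them from the environment keeps the password
# out of the script file; this is a suggestion, not the pipeline's behaviour.
brendaEmail = os.environ.get("BRENDA_EMAIL", "")
brendaPassword = os.environ.get("BRENDA_PASSWORD", "")


def unset_parameters(params):
    """Return the names of parameters whose value is empty."""
    return [name for name, value in params.items() if not value]


missing = unset_parameters({
    "organism": organism,
    "proteomeID": proteomeID,
    "brendaEmail": brendaEmail,
    "brendaPassword": brendaPassword,
})
if missing:
    print("Please set:", ", ".join(missing))
```

Checking for empty values up front avoids starting a run of several hours that would fail at the first BRENDA request.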

Usage

The main pipeline is started with: $ python main.py <modelLocation>, where <modelLocation> is the location of the stoichiometric model in SBML (.xml) format. The output is written to the same folder as the model.

Depending on the machine, the process may take several hours, but it only has to be run once to generate a working model.
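Because a run can take hours, it may be worth validating the model path before launching. The following Python 3 sketch builds the launch command described above and reports the folder where the output is expected. The helper `build_command` is hypothetical and not part of the repository; it assumes main.py is in the current working directory.

```python
import sys
from pathlib import Path


def build_command(model_location, script="main.py"):
    """Validate the SBML model path and return (command, output_folder).

    The command mirrors `python main.py <modelLocation>`; the output folder
    is the folder that contains the model, as stated in the usage notes.
    """
    model = Path(model_location).expanduser().resolve()
    if not model.is_file():
        raise FileNotFoundError(f"SBML model not found: {model}")
    if model.suffix.lower() != ".xml":
        raise ValueError(f"Expected an SBML (.xml) file, got: {model.name}")
    command = [sys.executable, script, str(model)]
    return command, model.parent
```

The returned command can be passed to `subprocess.run(command, check=True)` from the folder that contains main.py.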
