The first thing you probably want to do is build a model for your genome. Because of the tight interplay between PyFBA, PATRIC, RAST, SEED, and Model SEED, the easiest way to get started is to run your genome through RAST or PATRIC.
We are going to assume that you have annotated your genome through the PATRIC annotation pipeline. Once you have your annotated genome, download the features file, which will be called _<GENOME NAME>_.features.txt
This file should have six columns: feature_id, location, type, function, aliases, and protein_md5.
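Before converting anything, it can help to check that the features file looks the way you expect. The following is a minimal sketch, not part of PyFBA: it assumes the file is tab-separated and starts with a header row holding the column names listed above. The helper name `read_functions` and the sample rows are made up for illustration.

```python
import csv
import io

# Column names as listed above; the tab-separated layout with a header
# row is an assumption about the downloaded file.
EXPECTED_COLUMNS = ["feature_id", "location", "type",
                    "function", "aliases", "protein_md5"]


def read_functions(handle):
    """Return {feature_id: function} for every row with a non-empty function."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError("features file is missing columns: " + ", ".join(missing))
    functions = {}
    for row in reader:
        function = (row.get("function") or "").strip()
        if function:
            functions[row["feature_id"]] = function
    return functions


# Made-up rows standing in for a real _<GENOME NAME>_.features.txt file.
sample = (
    "feature_id\tlocation\ttype\tfunction\taliases\tprotein_md5\n"
    "fig|0.0.peg.1\tcontig_1_100\tpeg\tHexokinase (EC 2.7.1.1)\t\tabc\n"
    "fig|0.0.peg.2\tcontig_200_300\tpeg\t\t\tdef\n"
)
functions = read_functions(io.StringIO(sample))
```

To use it on a real file, pass an open file handle instead of the `io.StringIO` sample.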
If you have done that and built the model using the Model SEED, you can download the SBML file from RAST and try either run_fba_sbml.py or sbml_to_fba.py. Both of these should give similar, but not identical, answers to the one that you got from the Model SEED [1].
If you have downloaded the annotation, there are two essential steps that you need to take to create a model:
- Convert the genome annotation to reactions
- Gapfill the reactions on different media.
You can use the example code to get started. First, we will create a set of reactions from your genome. Download the assigned_functions file from the genome directory, or get a list of all functional roles in your genome. Next, convert those to a list of reactions using one of two commands:
If you have an assigned_functions file that has [peg, functional role]:
python example_code/assigned_functions_to_reactions.py -a assigned_functions > reactions.list
or, if you just have a list of functional roles, one per line:
python example_code/assigned_functions_to_reactions.py -r functional_roles > reactions.list
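If you have an assigned_functions file but would rather work from a plain list of roles, the sketch below shows one way to produce the one-per-line list. It is an illustration only and makes two assumptions: that the two columns [peg, functional role] are separated by a tab, and that each function string is treated as a single role. The function name `roles_from_assigned_functions` is hypothetical.

```python
def roles_from_assigned_functions(lines):
    """Return the sorted, de-duplicated functional roles from [peg, role] lines."""
    roles = set()
    for line in lines:
        # Assumption: peg and functional role are separated by one tab.
        parts = line.rstrip("\n").split("\t", 1)
        if len(parts) == 2 and parts[1].strip():
            roles.add(parts[1].strip())
    return sorted(roles)


# Made-up lines standing in for a real assigned_functions file.
example_lines = [
    "fig|0.0.peg.1\tHexokinase (EC 2.7.1.1)\n",
    "fig|0.0.peg.2\tPyruvate kinase (EC 2.7.1.40)\n",
    "fig|0.0.peg.3\tHexokinase (EC 2.7.1.1)\n",
    "\n",
]
roles = roles_from_assigned_functions(example_lines)
```

Writing `"\n".join(roles)` to a file gives a list with one functional role per line.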
We build a model from the reactions and then try to gap fill it using all of our approaches:
python scripts/gapfill_from_reactions.py -r reactions.list -m MOPS_NoC_Alpha-D-Glucose.txt > out.txt 2> out.err
This will use the media file MOPS_NoC_Alpha-D-Glucose.txt and try to gap fill the model so that it grows on this medium.
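Since the second step is to gap fill on different media, you may want to repeat the command above for several media files. Here is a minimal sketch of a wrapper, assuming you run it from the directory that contains scripts/ and that the script accepts only the -r and -m options shown above. The function names and the output file naming are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def gapfill_command(reactions_file, media_file):
    """Build the command line shown above for one media file."""
    return [sys.executable, "scripts/gapfill_from_reactions.py",
            "-r", str(reactions_file), "-m", str(media_file)]


def run_gapfill(reactions_file, media_file, out_dir="."):
    """Run gap filling once, writing stdout and stderr to per-media files."""
    stem = Path(media_file).stem
    out_path = Path(out_dir) / (stem + ".out.txt")
    err_path = Path(out_dir) / (stem + ".out.err")
    with open(out_path, "w") as out, open(err_path, "w") as err:
        result = subprocess.run(gapfill_command(reactions_file, media_file),
                                stdout=out, stderr=err)
    # A non-zero return code means the script reported an error.
    return result.returncode
```

For example, `run_gapfill("reactions.list", "MOPS_NoC_Alpha-D-Glucose.txt")` would write MOPS_NoC_Alpha-D-Glucose.out.txt and MOPS_NoC_Alpha-D-Glucose.out.err; calling it in a loop over a list of media files covers several media in one go.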
[1] The reason that the answers are similar, but not identical, is that the linear solvers give slightly different answers. The Model SEED uses a commercial linear solver, but you are probably using GLPK.