SBGN Creation and Validation
We searched the web for an available, reusable SBGN map representing the Bachmann model. First, we checked the model chosen by group 2 and the model belonging to the original publication in the BioModels database for SBGN maps or SBGN-ML files.
Since neither model supplied an SBGN map or an SBGN-ML file, we consulted the SBGN website to find databases and archives of SBGN maps.
Figure 1: Screenshot list of SBGN databases and collections
All of the databases and collections listed on the SBGN website featured the chosen Process Description language. As we were interested in modifiable maps, we only queried databases that supported SBGN-ML export. The AsthmaMap, Metabolism Regulation Maps, MetaCrop and Rheumatoid Arthritis Map were omitted, since the original publication does not mention any connection to the focus of these databases and collections.
In the Atlas of Cancer Signalling Networks we checked the Cell Survival Map for the involved proteins JAK2, STAT5, EpoR, SOCS3, and CIS, but could not find any of these entities on the provided map.
The pathway section of PANTHER Pathway was searched for the terms Bachmann, JAK2, STAT5, EpoR, SOCS3 and CIS. This yielded only more general representations or other aspects of the JAK-STAT pathway, none of which included the dual feedback of interest.
The Reactome database yielded similar results for the search terms Bachmann, JAK2, STAT5, EpoR, SOCS3 and CIS. The search results included some maps highlighting other aspects of the JAK-STAT pathway, e.g. erythropoietin activating STAT5.
Next, we searched PathWhiz with the aforementioned keywords and, once again, found only maps featuring other aspects, such as the EPO Signaling Pathway.
Last, we queried Pathway Commons with the following terms: Bachmann, JAK2, STAT5, EpoR, SOCS3 and CIS. Going through the various pathways, we could identify neither a map of the Bachmann model itself nor a satisfactory map to serve as a starting point for adding the key message of the Bachmann model. For example, the map 'EPO signaling pathway' featured important compartments and phosphorylation stages, but failed to convey the modulation of reactions and was rather confusing.
Since the search did not provide any useful results, we tried to re-use the SBML representation of the Bachmann model researched by group 2 by importing it into several SBGN editors.
Figure 2: Screenshot SBML Import in Newt Editor
As shown above, the import into the Newt Editor yielded a complex and confusing network. Among other drawbacks, there was no distinction between the different types of modulation. The nuclear compartment was not placed inside the cytoplasm, and the source for production and the sink for degradation were not associated with either of the compartments.
Next, we tried the import into VANTED using the SBGN-ED add-on, which produced an even worse starting point. As shown below, the problems included certain macromolecules being displayed as unspecified entities and process nodes being displayed as rounded rectangles. Correcting or specifying each node and each arc seemed error-prone.
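A missing compartment assignment of this kind can also be spotted in the SBGN-ML text itself. The following is a minimal sketch, not part of our original workflow. It assumes that the imported map has been saved as SBGN-ML, that compartment membership is stored in the compartmentRef attribute of a glyph, and that source/sink glyphs carry the class "source and sink" (or "empty set" in newer files). The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def glyphs_without_compartment(sbgnml_text,
                               classes=("source and sink", "empty set")):
    """Return the ids of top-level glyphs of the given classes that have
    no compartmentRef attribute (class names are an assumption, see above)."""
    root = ET.fromstring(sbgnml_text)
    missing = []
    for sbgn_map in root.iter():
        if local_name(sbgn_map.tag) != "map":
            continue
        # Only direct children of <map>; nested glyphs (state variables,
        # complex members) are not inspected in this sketch.
        for child in sbgn_map:
            if local_name(child.tag) != "glyph":
                continue
            if child.get("class") in classes and child.get("compartmentRef") is None:
                missing.append(child.get("id"))
    return missing
```

Calling glyphs_without_compartment(open("map.sbgn").read()) on a hypothetical file map.sbgn would list the affected glyph IDs, which can then be corrected in the editor.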
Figure 3: Screenshot SBML Import in VANTED Editor using SBGN-ED add-on
Also, importing a BioPAX Level 3 model found in the BioModels Database into different SBGN-compliant tools did not produce a comprehensive SBGN map (results not shown). Since this step did not yield any satisfactory results, the SBGN network was developed from scratch.
The web tool Newt was used to develop the SBGN network representing the Bachmann model [1, 2]. To facilitate the process and to improve readability and reusability, the 10 tips published by Touré et al. were used as a guideline [3].
To create the first draft of the Bachmann model as an SBGN map, all necessary biological components were identified from the differential equations provided in the supplement to the Bachmann model publication and subsequently added to the map. Then, all important reactions were represented by adding appropriate arcs.
While creating the diagram we encountered the problem that membrane-spanning macromolecules can only be assigned to one cell compartment, whereas their parts are actually located in different compartments. The current Level 1 specification of the Process Description language does not provide a solution and postpones this issue to a future specification level. However, the Level 1 specification describes three workarounds, each of which comes with a trade-off [4]. We decided to assign the receptor complexes to the cytoplasm compartment. Unfortunately, Newt did not allow us to place them on the compartment boundary.
To check the validity and integrity of the drafted SBGN map, we first used the semantic validation feature of Newt. This feature is based on the LibSBGN JavaScript library [1, 5]. The validation feature declared the draft map valid. Initial attempts to export and validate the SBGN map in SBGN-ML version 0.3 were unsuccessful, because import into any of the chosen tools was not possible. Thus, we decided to work with the earlier SBGN-ML specification and exported the SBGN map in SBGN-ML 0.2 format. Remarkably, despite declaring the SBGN map valid, Newt was not able to export the map in CellDesigner format and returned the exception that the conversion service was not available.
To ensure the interoperability of the exported SBGN-ML file, we aimed at importing the file into three different tools supporting SBGN-ML (compare section on tools).
The import of the Newt-exported SBGN-ML into VANTED with the SBGN-ED plugin failed on the first attempt, reporting that the file was not a valid SBGN file. The exception (cvc-datatype-valid.1.2.1) stated that a given value was not a valid value for 'NCName'. We found that the value in question was a glyph ID automatically assigned by Newt. Therefore, we hypothesized that the two platforms handle XML standards differently. As a workaround, we renamed the corresponding glyph ID in the SBGN-ML code to manualID1 using Notepad++ (compare Toolbox). Iteratively, we manually renamed 20 glyph IDs to manualIDn (with n being an incrementing number), which subsequently allowed the import into VANTED. The 20 glyph IDs that triggered the exception had in common that their first character was a digit, whereas our replacement IDs start with a letter. This is consistent with the XML rule that an NCName must not begin with a digit.
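The manual renaming could in principle be automated. The following is a minimal sketch, not the procedure we actually used (we edited the file by hand). It works on the raw SBGN-ML text and makes several assumptions: attribute values are double-quoted, IDs may be referenced from the attributes source, target and compartmentRef, and a simplified ASCII-only NCName pattern is sufficient. The function name fix_ids and the manualID prefix handling are illustrative.

```python
import re

# Attributes that define or reference an ID (assumed list, see lead-in).
ATTR = re.compile(r'\b(id|source|target|compartmentRef)="([^"]+)"')

# Simplified NCName: a letter or underscore first, then letters, digits,
# '.', '-' or '_'. The full XML rule also allows many non-ASCII characters.
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def fix_ids(sbgnml_text, prefix="manualID"):
    """Replace IDs that are not valid NCNames and keep references in sync.

    Returns the modified text and the old-to-new ID mapping. The sketch does
    not check whether a generated ID collides with an existing one.
    """
    mapping = {}

    def replace(match):
        attr, value = match.group(1), match.group(2)
        if NCNAME.fullmatch(value):
            return match.group(0)  # already valid, leave untouched
        # Reuse the same replacement for every occurrence of an invalid ID.
        new_id = mapping.setdefault(value, f"{prefix}{len(mapping) + 1}")
        return f'{attr}="{new_id}"'

    return ATTR.sub(replace, sbgnml_text), mapping
```

The returned mapping can be kept alongside the file to document which IDs were changed. Because the sketch also rewrites references, arcs that pointed at a renamed glyph stay connected, which is something to verify by re-importing the result into Newt.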
Next, the modified SBGN-ML was re-imported into Newt and the map was visually checked for correctness. As the SBGN map was correctly reproduced despite the manual changes of some glyph IDs, we exported the map as Scalable Vector Graphics and Portable Network Graphics files. Afterwards, we imported the manually changed SBGN-ML into VANTED/SBGN-ED, Krayon for SBGN and SBGNViz. Apart from some minimal errors that did not adversely affect the biologically correct representation, our SBGN map was reproducible and editable in three different tools using the created SBGN-ML file.
We noticed differences in the SBGN-ML code depending on the tool from which the SBGN-ML file was exported. In particular, Krayon uses a specific notation with format extensions. A more precise comparison of tool-specific characteristics of SBGN-ML would be valuable for the community, but was clearly beyond the scope of this work.
After the created SBGN PD map had successfully passed validation with various tools (cf. above), its visual attractiveness was still limited.
Figure 4: SBGN map in Newt Editor after validation
We decided to keep the comprehensive structure of the SBGN map without further reduction of biological components or reactions, to enable readers to retrace the complex ordinary differential equation (ODE) model. To make the SBGN map visually appealing and improve readability, we manually enhanced it through the following steps:
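As a possible starting point for such a comparison, the sketch below counts glyph and arc classes in two SBGN-ML exports and reports where the counts differ. It was not part of this work. It assumes that both files use the glyph and arc elements with a class attribute, and it deliberately ignores tool-specific extension elements, layout coordinates and IDs. The function names are hypothetical.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix so files with different SBGN-ML
    namespace versions can be compared."""
    return tag.rsplit("}", 1)[-1]


def class_counts(sbgnml_text):
    """Count (element name, class) pairs for all glyphs and arcs."""
    root = ET.fromstring(sbgnml_text)
    counts = Counter()
    for elem in root.iter():
        name = local_name(elem.tag)
        if name in ("glyph", "arc"):
            counts[(name, elem.get("class"))] += 1
    return counts


def diff_counts(a, b):
    """Return {(element, class): (count in a, count in b)} where they differ."""
    keys = set(a) | set(b)
    return {k: (a[k], b[k]) for k in sorted(keys, key=str) if a[k] != b[k]}
```

An empty result from diff_counts only means that both exports contain the same number of each glyph and arc class. It says nothing about layout or extension content, which is where the tool-specific differences we observed are most likely to appear.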
- Removal of ports
- Rearrangement of proteins and compartments, creation of submaps where meaningful
- Horizontal and/or vertical alignment of entity pool nodes, process nodes and logical operators
- Adaptation of label sizes
- Redirection of connecting arcs
As pointed out by Touré et al. [3], the network design should be in line with the message and the scientific question it aims to communicate. After alignment within the full team, we decided to highlight the roles of the two transcriptional negative feedback regulators of the suppressor of cytokine signaling (SOCS) family, CIS and SOCS3, with color.
Figure 5: SBGN map beautified in Newt Editor
[1] Balci, H. et al. Newt: a comprehensive web-based tool for viewing, constructing and analyzing biological maps. Bioinformatics 37, 1475–1477 (2021). https://doi.org/10.1093/bioinformatics/btaa850
[2] Sari, M. et al. SBGNViz: A Tool for Visualization and Complexity Management of SBGN Process Description Maps. PLoS ONE 10, e0128985 (2015). https://doi.org/10.1371/journal.pone.0128985
[3] Touré, V., Le Novère, N., Waltemath, D. & Wolkenhauer, O. Quick tips for creating effective and impactful biological pathways using the Systems Biology Graphical Notation. PLoS Comput Biol 14, e1005740 (2018). https://doi.org/10.1371/journal.pcbi.1005740.g001
[4] Rougny, A. et al. Systems Biology Graphical Notation: Process Description language Level 1 Version 2.0. Journal of Integrative Bioinformatics 16, (2019). https://dx.doi.org/10.1515/jib-2019-0022
[5] van Iersel, M. P., Villéger, A. C., Czauderna, T. et al. Software support for SBGN maps: SBGN-ML and LibSBGN. Bioinformatics 28, 2016–2021 (2012). https://doi.org/10.1093/bioinformatics/bts270