-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathchangelog.txt
188 lines (156 loc) · 8.87 KB
/
changelog.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
Version 1.5.1
- Improved the Gene report feature
- Improved the installer
- Changed how multi-SAVs are processed, will no longer generate mutant protein objects (like indels still do)
- Adjustments for newest StructGuy version 1.0.0
Version 1.5.0
- Gene report function is now functional
- Reorganized features into classes, making adding new features in the future easier
This how features are stored and retrieved from the database, thus a new major version change was necessary.
- Improved the installer
Version 1.4.1
- Paralellized th MicroMiner lookup pipeline
- Continued implementation of the gene report function
- Multiple minor gub fixes
Version 1.4.0
- Major update preparing the public version installable outside of a container
- Implemented a installation script to fully setup structman in a conda environment
- New outputs that support the direct usage of structguy after using structman
- Added support for sqlite3 as a local database option, hence removed the lite version
- Integrated MicroMiner into the pipeline (usage requires to configure an executable)
- Switched to ray 2.9.1
- Many smaller bug fixes
Version 1.3.0
- Needed to add an Interface hashing system to the database to retrace different interfaces from the same protein
- Updated the automated Modelling pipeline, improved the template selection to focus on mutated sites
- New functionality: now one can give tags to proteins that annotate a gene identifier (for example: "gene:P53"). Multiple protein isoforms with same gene tag will then lead to an isoform analysis.
- switched to ray 2.2.0
- Added support for Ensembl transcript identifiers
Version 1.2.2
- added the feature to parse HGVS protein mutation identifiers (for example: p.Arg12His)
Version 1.2.1
- updated Uniprot ID mapping service to REST API
Version 1.2.0
- added database support for interfaces, aggregated interface, protein-protein-interaction, position-position-interactions
- switched to ray 1.13.0
Version 1.1.1
- added aggregated contact matrix calculation
- added aggregated interface calculation
Version 1.1.0
- database makes now version checks
- switched to ray 1.11.0
- added model database support (specifically alphafold DB)
Version 1.0.3
- database create command now checks if already an instance exists first
- if database is not properly configured and structman got still called in db mode, now it switches automatically to lite mode
Version 1.0.2
- fasta input parsing checks now for irregular characters
- improved some bug logging
- added a possibility to run the alignment in serial mode without ray for debugging purposes
- switched to ray 1.10.0
Version 1.0.1
- removed some commented code
- slight changes of file structure to prevent circular imports
- multiple small bug fixes
Version 1.0.0
- first main version
- switched to ray 1.9.0
- Multiple smaller bug fixes
Version 0.9.2
- changed how imports are organized
- switched to ray 1.8.0
Version 0.9.1
- Various smaller bug fixes
- Multiple model structures (not NMR) are now fused internally into one model
- obsolete uniprot entries are now supported
Version 0.9.0
- Moved Modeller to version 10.1
- Fixed multiple modelling bugs
- Implemented aggregated indel analysis
- Indel analysis results are now a compressed row in the database
- Added automated script that creates/updates the mapping/sequence database
- mapping/sequence database got a new structure
- default mode now doesn't model indel structures. Can be activated by config or flag
- Updated disclaimers
Version 0.8.1
- The size of the database row for position data and alignment is now bigger
- Fixed a bug that prevented disordered region annotation to happen
- Upgraded disorder prediction method from IUpred2A to IUpred3
- Unpacking of position data in the ouput creation is now paralellized
Version 0.8.0
- Bigger data fields are now stored as binaries in the database
Version 0.7.3
- The update script now addtionally loads the .../data/status directory
- The RINdb update functions checks and writes (into meta.txt in the RINdb main directory) the date of the last update
- The RINdb compares the date of the last update with the dates of structure modifications in the /data/status directory of the PDB
- Added a function to split sequences of structures with unknown portions; to align them individually; and to fuse the alignments afterwards
- Configured custom output format of MMseqs2
- Use sequence length of the structure to better estimate the cost for an alignment
Version 0.7.2
- Added a nested parallelization for the classifcation. Triggers only, when we have many cores available and the classifcation task is large enough.
- updated the docker update script
Version 0.7.1
- Fixed a bug that kept the program using a local instance of the PDB
- Improved the protein splitting scheduling in the alignment parallelization, now works better with few proteins that are mapped to many structures
- Improved the performance of the packaging in the parallelized classification
Version 0.7.0
- Fixed some bugs regarding the parsing of asymmetric unit structures for cases them being given as input
- Parallelization if filter_structures is now chunked
- Improved chunking of alignment tasks
- Ray reinit for each major core loop to prevent memory leaks
- added custom object serialization functions to utils
- Fixed bugs regarding the parsing of PDB-type inputs
- added '--mem_limit [any%]' flag to limit maximal memory usage for testing purposes
- extensive overhaul of the parallelization of the annotation and the classification
Version 0.6.1
- Added output util 'RAS' that Retrieves all Annotated Structures (and their protein interaction partner) from a specific session.
Version 0.6.0:
- Added output util `suggest` command that ranks structural annotation templates
- Added associated database table `Suggestion` for rankings
- Add pandas requirement to structman conda environment file
Version 0.5.1
- increased radius for intra chain interface score sphere
Version 0.5.0:
- Location (formerly surface/core) gets now divided into surface/buried/core and into total/mainchain/sidechain
- Classification is based on sidechain location (means also a new class 'buried')
- Changed location and surface values had to be stored in the database -> version change to 0.5.x
- Added all types of locations and surface values to classification and feature table outputs
- Implemented an interface milieu analysis (inspired by core/rim interface distinction, but goes into more detail)
Version 0.4.1:
- Unspecified fasta inputs now add all positions as query
Version 0.4.0:
* Major Changes
- Add input checksums to see if input file has changed, triggering new session if so
- Adds new Checksum column to Session table in DB
- Separate output.py into multiple modules
* Enhancements
- Allow C++ style // line and trailing comments in smlf format
- Search user home folder for config file when no others found
- Added new structman.utils module (less structman specific than structman.lib.sdsc.utils)
* Misc Changes/Fixes
- Fix Mutated Sequence off-by-1 indexing error
- Move several functions from structman.lib.sdsc.utils to structman.utils
- Check that database_source_path exists before creating DB to avoid missing tables and ensuing failure of structman database create
- Delete duplicated sql script from /lib (resides in /scripts)
- Prefer os.path.join to avoid errors with double slashes in path
- Use package paths set in settings.py for ray_init()
- Use surface_threshold from config in database.getClassWT, database.getClass, database.diffSurfs
- Rename class Output_generator to OutputGenerator to conform to pep8
- Fix input/output issues when using relative paths
- Import fixes
- Autopep8 fixes
- Updated Readme
Version 0.3.2:
- Modelling now builds up a database of models that can be used to save computation costs for later runs
- Positions filtered by sanity checks now do not delete the whole protein
- Indel analysis can now be used with stored information from the database
- Added '--only_wt' option to run the pipeline without the generation of mutated proteins
- Multiple smaller bug fixes
- Improved remote classification
Version 0.3.1:
- Improved the store handling of remote classification
- Implemented a modelling output framework. Currently produced one model per input protein and model for each given mutation.
- Multiple smaller bug fixes
Version 0.3.0:
- Begin of changelog
- Acts as first master version of structman as pip package