- Correct stage-based conditions mentioned in notebook tutorials #92
- Add stage-based conditions to `setup` in `ProteinDataModule` #72
- Improves support for datamodules with multiple test sets. Generalises this to support GO and FOLD. Also adds multiple sequence-identity-based splits for GO. #72
- Add redownload checks for already downloaded datasets and harmonise pdb download interface #86
- Remove remaining errors from PDB dataset change
- Add option to create pdb datasets with sequence-based splits #88 as well as time-based splits #89
- Adds missing `pos` attribute to GearNet `required_batch_attributes` (fixes #73) #74
- Fixes PDB download failure due to missing protein data #77
- Add support for handling training/validation OOMs gracefully #81
- Add support for handling backward OOMs gracefully #83
- Update GCPNet paper link #85
- Adds `InverseSquareRoot` LR scheduler #71
- Adds `--force-cuda-version` to `workshop install` #78
- Fix `sequence_edges` behaviour when argument `b` is a `Data` object #80
- Update ICLR paper link and citation #82
- Add an optional group for installing plotting and analysis specific libraries to lighten the install of the core framework #90
- Adds two antibody-specific datasets using the IGFold corpora for paired OAS and Jaffe 2022 #53
- Set `in_memory=True` as default for most (small) datasets for improved performance #53
- Fix `num_classes` for GO datamodules
- Set `in_memory=True` as default for most (downstream) datasets for improved performance #53
- Fixes GO labelling #53
- Improves positional encoding performance by adding a `seq_pos` attribute on `Data`/`Protein` objects in the base dataset getter. #53
- Ensure correct batched computation of orientation features. #58
- Implement ESM embedding encoder (#33, #41)
- Adds CDConv implementation #53
- Adds tuned hparams for models #53
- Refactors beartype/jaxtyping to use latest recommended syntax #53
- Adds explainability module for performing attribution on a trained model #53
- Change default finetuning features in config: `ca_base` -> `ca_seq` #53
- Add optional hparam entry point to finetuning config #53
- Fixes GPU memory accumulation for some metrics #53
- Updates zenodo URL for processed datasets to reflect upstream API change #53
- Adds multi-hot label encoding transform #53
- Fixes auto PyG install for `torch>2.1.0` #53
- Adds `proteinworkshop.model_io` containing utils for loading trained models #53
- Add script for plotting UMAP embeddings of any dataset given a pre-trained encoder model
- Fixes error in Metal3D processed download link (#28)
- Fixes typo in wandb run name setting (#30)
- Fixes paths for models and datasets when testing instantiation of each module (#32)
- Improvements to TFN, MACE and EGNN models and layers, including DiffDock-style intermediate edge feature creation (TFN), dropout, gaussian RBF, mean global pooling (#38)
- Minor patch; adds missing `overwrite` attribute to `CATHDataModule`, `FoldClassificationDataModule` and `GeneOntologyDataModule`. (#25)
- Fixes raw data download triggered by absence of PDB when using pre-processed datasets (#24)
- Fixes bug where batches created from `in_memory=True` data were not correctly formatted (#24)
- Consistently exposes the `overwrite` argument for datamodules to users (#24)
- Fixes bug where downloading FoldComp datasets into directories with the same name as the dataset throws an error (#24)
- Increments `graphein` dependency to `1.7.3` (#24)
- Fixes incorrect lookup of `DATA_PATH` env var (#19)
- First public release