Mesoscale cif and bcif clean-up #565

Open
wants to merge 17 commits into
base: main


corredD
Contributor

@corredD corredD commented Aug 7, 2024

Use the Biotite CIF and BCIF parsers
Add support for a color file, given as a JSON dictionary
Needs a patch to Biotite to support YASARA PetWorld files (attached)
cif.txt
To apply the patch, replace Blender\4.2\extensions\.local\lib\python3.11\site-packages\biotite\structure\io\pdbx\cif.py with the attached file
The default node shows a clipped fraction of the model as points in the viewer

image

@BradyAJohnston
Owner

Thanks for putting this together! I won't be able to do a thorough review for a week or so. Outside of that though, the fix for the Biotite parsing will have to go upstream into Biotite. The add-on ships with the .whl that is built and supplied on PyPI.

If the fix is not suitable to go into Biotite completely, then we can try to implement it in a different way for parsing those files.

@corredD
Contributor Author

corredD commented Aug 8, 2024

Please refer to the Biotite pull request #633. You will need to update the Biotite dependency to their next released version once it becomes available.

In the meantime, the workaround is to either use the patched CIF file or manually modify the PetWorld CIF files by removing the space character in front of the category names.

Example:

loop_
 _pdbx_struct_assembly.id
 _pdbx_struct_assembly.details

becomes

loop_
_pdbx_struct_assembly.id
_pdbx_struct_assembly.details
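
The manual workaround described above can be sketched in a few lines of Python. This is only an illustration, not the code in this pull request: the function name `strip_category_indent` is hypothetical, and it assumes that the only problem in the file is whitespace in front of `loop_`, `data_` and `_category.item` lines. Lines inside `;`-delimited multi-line text fields are left untouched.

```python
def strip_category_indent(cif_text: str) -> str:
    """Remove leading whitespace in front of CIF keywords and data names.

    Hypothetical helper: assumes the file is otherwise valid CIF.
    """
    cleaned = []
    in_text_field = False  # True while inside a ';'-delimited multi-line value
    for line in cif_text.splitlines():
        if line.startswith(";"):
            # A ';' in the first column opens or closes a text field.
            in_text_field = not in_text_field
            cleaned.append(line)
            continue
        stripped = line.lstrip()
        if not in_text_field and stripped.startswith(("_", "loop_", "data_")):
            cleaned.append(stripped)
        else:
            cleaned.append(line)
    return "\n".join(cleaned) + "\n"


if __name__ == "__main__":
    example = "loop_\n  _pdbx_struct_assembly.id\n  _pdbx_struct_assembly.details\n1 'x'\n"
    print(strip_category_indent(example))
```

The cleaned text could then be written to a temporary file and handed to the regular CIF reader.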

@BradyAJohnston
Owner

I've revisited this to get it merged without relying on Biotite's upstream change. Even after I strip all whitespace from the lines and re-write it, it fails to load some example files I downloaded from their website. Are you able to give specific example files?

tities\ensemble\ui.py", line 81, in load_cellpack
    ensemble = CellPack(file_path)
               ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\BradyJohnston\AppData\Roaming\Blender Foundation\Blender\4.2\extensions\vscode_development\molecularnodes\entities\ensemble\cellpack.py", line 19, in __init__
    self.data = self._read(self.file_path)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\BradyJohnston\AppData\Roaming\Blender Foundation\Blender\4.2\extensions\vscode_development\molecularnodes\entities\ensemble\cellpack.py", line 91, in _read
    data = CIF(file_path)
           ^^^^^^^^^^^^^^
  File "C:\Users\BradyJohnston\AppData\Roaming\Blender Foundation\Blender\4.2\extensions\vscode_development\molecularnodes\entities\ensemble\cif.py", line 18, in __init__
    if "PDB_model_num" in categories["pdbx_struct_assembly_gen"]:
                          ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\BradyJohnston\AppData\Roaming\Blender Foundation\Blender\4.2\extensions\.local\lib\python3.11\site-packages\biotite\structure\io\pdbx\cif.py", line 647, in __getitem__
    category = self._categories[key]
               ~~~~~~~~~~~~~~~~^^^^^
KeyError: 'pdbx_struct_assembly_gen'
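
The traceback ends in a `KeyError` because the `pdbx_struct_assembly_gen` category is looked up without first checking that it was parsed. A defensive version of that check might look like the sketch below. It assumes only that `categories` behaves like a mapping from category names to mappings of column names (plain dictionaries stand in for the parser's objects here), and the function name `has_model_numbers` is hypothetical.

```python
def has_model_numbers(categories) -> bool:
    """Return True if the assembly_gen category exists and has a PDB_model_num column.

    `categories` is assumed to be dict-like; this is a sketch, not the add-on's code.
    """
    try:
        assembly_gen = categories["pdbx_struct_assembly_gen"]
    except KeyError:
        # Category missing, e.g. because the parser skipped an indented loop_ block.
        return False
    return "PDB_model_num" in assembly_gen


if __name__ == "__main__":
    parsed = {"pdbx_struct_assembly_gen": {"assembly_id": ["1"], "PDB_model_num": ["1"]}}
    print(has_model_numbers(parsed))
    print(has_model_numbers({"atom_site": {}}))
```

A guard like this would not fix the parsing itself, but it would turn the crash into a condition that can be reported to the user.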

@corredD
Contributor Author

corredD commented Sep 6, 2024

I am confused: which parser are you using in data = CIF(file_path)? Is it pdbx.CIFFile.read?
Also, you said you strip out the whitespace. What does your file look like after the changes? In the original YASARA files there are two spaces in front of the keywords:
loop_
  _pdbx_struct_assembly_gen.assembly_id
  _pdbx_struct_assembly_gen.oper_expression
  _pdbx_struct_assembly_gen.asym_id_list
  _pdbx_struct_assembly_gen.PDB_model_num
Afterwards there should be none:
loop_
_pdbx_struct_assembly_gen.assembly_id
_pdbx_struct_assembly_gen.oper_expression
_pdbx_struct_assembly_gen.asym_id_list
_pdbx_struct_assembly_gen.PDB_model_num

If you can share with me how you strip the files, I can look at it.
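
One quick way to compare files before and after stripping is to list every data name or `loop_` keyword that still has whitespace in front of it. The sketch below is a hypothetical diagnostic (the name `find_indented_names` is made up for illustration). It does not track `;`-delimited text fields, so an indented line inside such a field that happens to start with `_` would be reported as well.

```python
def find_indented_names(cif_text: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs for indented data names or loop_ keywords."""
    hits = []
    for number, line in enumerate(cif_text.splitlines(), start=1):
        stripped = line.lstrip()
        # Report the line only if something was stripped and a keyword follows.
        if stripped != line and stripped.startswith(("_", "loop_")):
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    sample = "loop_\n  _pdbx_struct_assembly_gen.assembly_id\n_pdbx_struct_assembly_gen.oper_expression\n1 2\n"
    for number, line in find_indented_names(sample):
        print(f"line {number}: {line!r}")
```

An empty result on the rewritten file would suggest that the leading whitespace is no longer the cause of a failed load.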

…r other cif files that is not formatted properly.
@corredD
Contributor Author

corredD commented Sep 6, 2024

I removed the upstream Biotite CIF fix and added a function to manually rewrite the file without the leading spaces. I tested it and it seems to work well.

# Conflicts:
#	molecularnodes/entities/molecule/pdbx.py
#	molecularnodes/entities/molecule/ui.py
@corredD
Contributor Author

corredD commented Sep 6, 2024

After the merge I had to manually install the package tqdm, a dependency of MDAnalysis.

@corredD
Contributor Author

corredD commented Sep 6, 2024

image

@BradyAJohnston
Owner

Okay thanks for the update! I'll go in and fix the tests and should be able to get this merged :)

2 participants