Read gzipped graphml files #1315

Open
wants to merge 12 commits into
base: main

Conversation

@fabmazz fabmazz commented Nov 15, 2024

Add a Python function that loads a gzip-compressed GraphML file directly, without having to decompress it first.
This makes reading large GraphML graphs much faster, and the files take up less space.

  • I ran rustfmt locally
  • I ran cargo clippy and it suggests no changes
  • I have added the tests to cover my changes.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.

@CLAassistant

CLAassistant commented Nov 15, 2024

CLA assistant check
All committers have signed the CLA.

@IvanIsCoding
Collaborator

Thanks for submitting this. I think this is a very welcome addition.

However, I don’t think we should have a new method to support compressed files. I’d refactor the existing method to try to decompress the file based on the file extension, and perhaps add an optional argument like “force_decompression” to decompress anyway even if the file extension is not what we expect.
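A minimal sketch of the extension-based decision suggested above, using only the Rust standard library. The helper name should_decompress and the choice of ".gz" as the only recognized extension are assumptions for illustration; the thread does not say which extensions the PR accepts, and the actual decompression step is not shown here.

```rust
use std::path::Path;

/// Hypothetical helper (not from the PR): decide whether a GraphML path
/// should be treated as gzip-compressed.
///
/// `force_decompression` mirrors the optional argument suggested in the
/// review: when true, the extension check is skipped entirely.
fn should_decompress(path: &str, force_decompression: bool) -> bool {
    if force_decompression {
        return true;
    }
    // Assumption: only a final ".gz" extension (case-insensitive) counts.
    Path::new(path)
        .extension()
        .and_then(|ext| ext.to_str())
        .map_or(false, |ext| ext.eq_ignore_ascii_case("gz"))
}
```

With this shape, a path such as "graph.graphml.gz" is decompressed automatically, while "graph.graphml" is read as plain XML unless the caller forces decompression.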

@fabmazz
Author

fabmazz commented Nov 15, 2024

OK, I have now merged everything into a single function, with an optional compression argument to force decompression.

Collaborator

@IvanIsCoding IvanIsCoding left a comment

Awesome, this is looking good. I left a minor comment.

Lastly, check CONTRIBUTING.md for how to update the type stubs, since you added a new argument to an existing function. Our CI should fail in the “Stubs” section until you update them.

pub fn read_graphml(py: Python, path: &str) -> PyResult<Vec<PyObject>> {
let graphml = GraphML::from_file(path)?;
#[pyo3(signature=(path, compression=""),text_signature = "(path, compression=\"\", /)")]
pub fn read_graphml(py: Python, path: &str, compression: &str) -> PyResult<Vec<PyObject>> {
Collaborator

Prefer Option<&str> over an empty string. On the Python side the default then becomes None, which is converted to None in Rust as well.

Author

Yeah I thought about that too, but apparently it's being phased out:
https://pyo3.rs/main/function/signature#trailing-optional-arguments

Collaborator

@IvanIsCoding IvanIsCoding Nov 15, 2024

> Yeah I thought about that too, but apparently it's being phased out: https://pyo3.rs/main/function/signature#trailing-optional-arguments

You just need to specify None in the signature, the same way you specified the empty string. We still use that in lots of places. What is deprecated is None being inserted as the default implicitly, without being declared in the signature.
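A rough sketch of how the Option<&str> variant discussed in this thread might look on the Rust side, written against the standard library only so it compiles without PyO3. The Compression enum, the resolve_compression helper, and the accepted values "gzip" and "gz" are all hypothetical; the thread does not show which values the PR accepts. In the real binding, the explicit default would be declared in the attribute, as noted in the comment.

```rust
/// Hypothetical enum (not from the PR) describing how the file is read.
#[derive(Debug, PartialEq)]
enum Compression {
    Plain,
    Gzip,
}

// In the PyO3 binding, the default is declared explicitly in the signature,
// for example:
//   #[pyo3(signature = (path, compression = None))]
// so that Python's `None` arrives in Rust as `Option::None`.
fn resolve_compression(path: &str, compression: Option<&str>) -> Result<Compression, String> {
    match compression {
        // Assumption: these two spellings request gzip explicitly.
        Some("gzip") | Some("gz") => Ok(Compression::Gzip),
        Some(other) => Err(format!("unsupported compression: {}", other)),
        // No explicit request: fall back to the file extension.
        None if path.ends_with(".gz") => Ok(Compression::Gzip),
        None => Ok(Compression::Plain),
    }
}
```

Compared with an empty-string default, the Option makes the "not specified" case a distinct value, so an empty or misspelled string can be reported as an error instead of being silently treated as "no compression".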

3 participants