Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: data clustering workflow #94

Merged
merged 20 commits into from
Apr 29, 2024
Merged

Feat: data clustering workflow #94

merged 20 commits into from
Apr 29, 2024

Conversation

d-schindler
Copy link
Collaborator

I've implemented a DataClustering class that constructs a graph from data and then applies PyGenStability. The implementation follows sklearn conventions. An example with synthetic data sampled from two multiscale circles is provided.

Proper docstrings are missing because I wanted to wait for your first review @arnaudon and looking forward to your feedback.

@d-schindler d-schindler added the enhancement New feature or request label Apr 4, 2024
@d-schindler
Copy link
Collaborator Author

d-schindler commented Apr 4, 2024

Later we will also have to update the readme and documentation, and the tests.

Copy link
Collaborator

@arnaudon arnaudon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can also run tox -e format and tox -e lint to check if the style is correct, I'll try to run the code later today

src/pygenstability/data_clustering.py Outdated Show resolved Hide resolved
src/pygenstability/data_clustering.py Outdated Show resolved Hide resolved
src/pygenstability/data_clustering.py Outdated Show resolved Hide resolved
src/pygenstability/data_clustering.py Outdated Show resolved Hide resolved
src/pygenstability/data_clustering.py Outdated Show resolved Hide resolved
src/pygenstability/data_clustering.py Show resolved Hide resolved
src/pygenstability/data_clustering.py Outdated Show resolved Hide resolved
src/pygenstability/data_clustering.py Outdated Show resolved Hide resolved
src/pygenstability/data_clustering.py Outdated Show resolved Hide resolved
@d-schindler
Copy link
Collaborator Author

I'm working on the revised code and will commit later. I use black for formatting, which style is required to pass the tests?

@arnaudon
Copy link
Collaborator

you need tox -e format, and tox -e lint (first improves formation), second gives you errors it could not fix, it has black inside and a bit more, you'll see

@d-schindler
Copy link
Collaborator Author

I addressed all your requests and will look into tox later. Will we also have to write tests for this part of the software?

@arnaudon
Copy link
Collaborator

I addressed all your requests and will look into tox later. Will we also have to write tests for this part of the software?

No worries, I can run tox when I'll try the code. For the tests, we could write some to improve coverage, yes. What is your plan? You are ok to use this branch, or you prefer to merge it asap to share it more easily? We can polish it a bit more, merge, and re-open PR if there is something to fix later

@d-schindler
Copy link
Collaborator Author

I think we can work a bit longer on this branch and polish it, write tests, update documentation etc. No need to merge ASAP.

@d-schindler
Copy link
Collaborator Author

d-schindler commented Apr 19, 2024

I've added plot_sankey as a method and started to update the readme. Will work on improving the documentation later.

@arnaudon , could you help with the test?

@d-schindler d-schindler requested a review from arnaudon April 24, 2024 06:38
@d-schindler
Copy link
Collaborator Author

Open tasks are now improving the docstrings and documentation, and writing tests. Once this is done we could merge this branch and assign a new version number and release updated online documentation?

@d-schindler
Copy link
Collaborator Author

The documentation is ready. Only the tests are left now.

@d-schindler
Copy link
Collaborator Author

I accidentally changed the formatting of pygenstability.py when I improved a small bit of the docstring

@arnaudon arnaudon changed the title data clustering with PyGenStability Feat: data clustering workflow Apr 29, 2024
@arnaudon arnaudon merged commit 88b8122 into master Apr 29, 2024
2 checks passed
@arnaudon arnaudon deleted the data-clustering branch April 29, 2024 12:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants