The great refactoring #221

Open
MatteoMagnini opened this issue Apr 9, 2023 · 0 comments

Some actions that should be taken to make the repository more usable and maintainable:

  1. Test. Currently the tests depend on an external repo. That repo should be retired and the tests rewritten. Tests should follow the same structure across all extractors. Suggestion: define a class TestExtractor that has all the methods needed to properly test an extractor; then each extractor gets one file test_extractor_name.py with a test class that extends TestExtractor. We can use a fake predictor that simply outputs the class or the real value of a sample with a given accuracy. We should avoid complex predictors such as NNs in tests. What to test? Generation of rules and reproducibility for sure. Possibly other properties.

  2. Demo. Avoid Jupyter notebooks in this repository. There will be a separate repository that uses the psyke package from PyPI, similarly to https://github.com/psykei/demo-psyki-python.

  3. GUI. Following the separation of concerns, as in point 2, we can use another repository to create a version of psyke with a graphical interface.

  4. API. All extractors should have default values for their parameters. Obviously the predictor is the only parameter without a default value; every other algorithm-specific parameter should have one. This makes both tests and everyday usage straightforward.

  5. Preprocessing. All operations that affect the dataset used during the extraction should be done elsewhere, not inside the extractor itself. For instance, it may be useful to have a dedicated class that handles data transformations, applied before the extractor is created. The common use case of psyke is the following scenario: a scientist who has a trained predictor wants to extract symbolic knowledge from it to better understand its behavior. In this case they have probably already done some preprocessing on the dataset used to train the predictor, so we should allow knowledge extraction without requiring any further data processing operations to be specified.
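
To make points 1 and 4 concrete, here is a rough sketch of what the test layout could look like. Everything in it is an assumption for illustration only: `FakePredictor`, `ExtractorTestBase`, `MajorityExtractor`, and the `extract` method are hypothetical names, not the current psyke API. The shared base plays the role of the `TestExtractor` class suggested above; it is written as a mixin with a name that does not start with `Test`, so that test runners do not try to collect and run it on its own.

```python
import random
import unittest


class FakePredictor:
    """Hypothetical stand-in for a trained model: returns the true label of a
    sample with probability `accuracy`, otherwise a different label."""

    def __init__(self, labels_by_sample, classes, accuracy=0.9, seed=0):
        self.labels_by_sample = labels_by_sample  # maps sample tuple -> true label
        self.classes = list(classes)
        self.accuracy = accuracy
        self.seed = seed

    def predict(self, samples):
        # Reseed on every call so the same input always yields the same output.
        rng = random.Random(self.seed)
        predictions = []
        for sample in samples:
            truth = self.labels_by_sample[tuple(sample)]
            if rng.random() < self.accuracy:
                predictions.append(truth)
            else:
                wrong = [c for c in self.classes if c != truth]
                predictions.append(rng.choice(wrong))
        return predictions


class MajorityExtractor:
    """Toy extractor, NOT part of psyke: one rule per value of a single feature.
    Only `predictor` is mandatory; the algorithm-specific parameter has a default."""

    def __init__(self, predictor, feature_index=0):
        self.predictor = predictor
        self.feature_index = feature_index

    def extract(self, samples):
        votes = {}
        for sample, label in zip(samples, self.predictor.predict(samples)):
            votes.setdefault(sample[self.feature_index], []).append(label)
        # Majority label per feature value; ties resolved by sorted label order.
        return [
            f"x{self.feature_index} = {value} -> "
            f"{max(sorted(set(labels)), key=labels.count)}"
            for value, labels in sorted(votes.items())
        ]


class ExtractorTestBase:
    """Shared checks; subclasses only say how to build their extractor."""

    def make_extractor(self, predictor):
        raise NotImplementedError

    def setUp(self):
        self.samples = [(0, 0), (0, 1), (1, 0), (1, 1)] * 5
        labels = {sample: sample[0] for sample in self.samples}
        self.predictor = FakePredictor(labels, classes=[0, 1], accuracy=0.9)

    def test_generates_rules(self):
        rules = self.make_extractor(self.predictor).extract(self.samples)
        self.assertGreater(len(rules), 0)

    def test_reproducible(self):
        first = self.make_extractor(self.predictor).extract(self.samples)
        second = self.make_extractor(self.predictor).extract(self.samples)
        self.assertEqual(first, second)


# This class would live in its own file, e.g. test_majority_extractor.py.
class TestMajorityExtractor(ExtractorTestBase, unittest.TestCase):
    def make_extractor(self, predictor):
        # Works with the predictor alone because all other parameters have defaults.
        return MajorityExtractor(predictor)
```

With this layout, adding tests for a new extractor means writing only the `make_extractor` override, and the suite can be run with `python -m unittest`.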

If you need help, or if some parts are not clear, feel free to write to me.
