rebase staging/seed-validation onto dev #967
Closed: jacob-buehler wants to merge 5 commits into capitalone:staging/dev/seed-validation from jacob-buehler:rebase_seed_validation_to_dev
…apitalone#954)

A method signature that uses *args: Any, **kwargs: Any is compatible with any set of arguments in mypy, despite being an LSP violation. This lets us assert that subclasses of BaseDataProcessor should have some process() method with an arbitrary signature.

We also add to the return type of BaseDataPreprocessor so that it is inclusive of all of its subclasses.

Co-authored-by: JGSweets <[email protected]>
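The pattern described in this commit message can be sketched as follows. This is a minimal illustration, not the project's code: the class names Processor, UpperCaser, and Joiner are hypothetical stand-ins. Per the commit message, mypy treats a base method declared with *args: Any, **kwargs: Any as compatible with overrides that narrow the signature.

```python
from abc import ABCMeta, abstractmethod
from typing import Any, List


class Processor(metaclass=ABCMeta):
    """Hypothetical base class; stands in for BaseDataProcessor."""

    @abstractmethod
    def process(self, *args: Any, **kwargs: Any) -> Any:
        # The catch-all signature only asserts that some process() exists.
        raise NotImplementedError


class UpperCaser(Processor):
    # Override with a narrower, concrete signature.
    def process(self, text: str) -> str:
        return text.upper()


class Joiner(Processor):
    # A second override with a completely different signature.
    def process(self, parts: List[str], sep: str = ",") -> str:
        return sep.join(parts)
```

The trade-off is the one the commit message names: a caller holding only a Processor reference cannot know which arguments are valid, so the base class checks for the method's existence rather than its call shape.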
Inside the BaseDataProcessor class definition, references to __subclasses are automatically replaced with _BaseDataProcessor__subclasses. This remains the case even in the static methods _register_subclass() and get_class(). The same applies to BaseModel and its __subclasses field, so we do not have to write out the full name-mangled identifiers inside the class definitions. Also, mypy doesn't seem to be able to handle the return type of BaseDataProcessor.get_class() being a TypeVar, so it was changed to type[BaseDataProcessor]. This does not affect the functionality of get_class(), since it always returns a subclass of BaseDataProcessor.
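A small sketch of the name-mangling behaviour described here, using a hypothetical Registry class in place of BaseDataProcessor. The methods are written as classmethods for brevity, which is an assumption of this example; the point is only that __subclasses written inside the class body resolves to _Registry__subclasses.

```python
from typing import Dict, Optional, Type


class Registry:
    """Hypothetical stand-in for a base class that tracks its subclasses."""

    # Stored on the class under the mangled name _Registry__subclasses.
    __subclasses: Dict[str, Type["Registry"]] = {}

    @classmethod
    def _register_subclass(cls) -> None:
        # Inside the class body, __subclasses is rewritten by the compiler
        # to _Registry__subclasses, so the short spelling is enough.
        cls.__subclasses[cls.__name__.lower()] = cls

    @classmethod
    def get_class(cls, class_name: str) -> Optional[Type["Registry"]]:
        # Returns a concrete class type rather than a TypeVar.
        return cls.__subclasses.get(class_name.lower())


class Child(Registry):
    pass


Child._register_subclass()
```

Outside the class body, the attribute is reachable only under its mangled name, for example Registry._Registry__subclasses.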
The mypy errors addressed here occur because variables label_mapping (in CharPreprocessor), unstructured_labels, and unstructured_label_set (in StructCharPreprocessor.process()) have optional types when they're used. This is fixed by checking that they are not None prior to the operation, which mypy recognizes as removing the None type from them. This should have no effect on functionality because we are already checking that labels is not None, and the variables above all depend on labels such that they are None only if labels is None.
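The narrowing described above can be illustrated with a short sketch. The function encode_labels and its parameters are hypothetical and are not taken from CharPreprocessor; the example only shows how an explicit is-not-None check lets mypy drop None from an Optional type before the value is used.

```python
from typing import Dict, List, Optional


def encode_labels(
    labels: Optional[List[str]],
    label_mapping: Optional[Dict[str, int]],
) -> Optional[List[int]]:
    """Hypothetical helper: map string labels to integer ids."""
    if labels is None:
        return None
    # By construction label_mapping is None only when labels is None,
    # but mypy cannot infer that dependency. The explicit check narrows
    # Optional[Dict[str, int]] to Dict[str, int] for the lines below.
    if label_mapping is None:
        raise ValueError("label_mapping is required when labels are given")
    return [label_mapping[label] for label in labels]
```

At runtime the extra check never fires on valid input, which matches the commit message's claim that functionality is unchanged.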
capitalone#965)
* Changed release option to only release branches named 'release/<version-tag>'.
* Reverted types
capitalone#959)
* abstracted rng creation 23/07/11 14:32
* updated profile_builder random number generation
* renamed dp_rng() to get_random_number_generator()
* updated data_utils random number generation, added warning back to get_random_number_generator()
* removed erroneous print statement
* added tests of get_random_number_generator() to test_data_utils and test_utils
* removed unnecessary int dtype conversion
* edited seed declaration statement
* added setUp function to get_random_number_generator() testing
* fixed duplicate variable declaration in test_data_utils.py and test_utils.py
* moved generator function to root of dataprofiler dir; added test_generator.py; reverted test_data_utils and test_utils
* moved and renamed utils_global; cleaned up unused imports
* additional tests of get_random_number_generator()
* added test of utils_global for DATAPROFILER_SEED not in os.environ and settings._seed==None
* added the last four unit tests in Taylor's requested changes to test_utils_global.py
* removed unneeded tests and declarations; changed to relative imports; updated assertWarnsRegex in test_utils_global
* changed two more imports to relative imports
* updated rng.integers call
* removed unnecessary slicing/indexing
* removed unnecessary slicing/indexing
* cleaned up os.environ mocks in test_utils_global
* mocked expected values in unit tests
* simplified mocks
* removed unnecessary test
* added more descriptive mock names; ensured that rng generator uses proper seed
* cleaned up mock names; improved docstrings
* removed unnecessary clear=True clauses; removed duplicate assert statement
* made clear=True statements consistent
* removed one variable declaration; added clear=True to one mock
* removed clear=True statement
* removed unused imports and variable declarations
* renamed utils_global -> rng_utils and corresponding test; renamed utils.py -> profiler_utils.py and corresponding test
* fixed import error
* renamed utils.py and utils_global.py
* replaced imports of profilers.utils with profilers.profiler_utils
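The commit list above describes centralizing random number generator creation behind one function that consults a configured seed and the DATAPROFILER_SEED environment variable. A rough sketch of that idea follows. It is not the project's implementation: make_rng and configured_seed are hypothetical names, the precedence order and the warning text are assumptions, and the standard library's random.Random is swapped in for the NumPy generator the commits refer to so that the example has no third-party dependencies.

```python
import os
import random
import warnings
from typing import Optional

SEED_ENV_VAR = "DATAPROFILER_SEED"  # name taken from the commit messages


def make_rng(configured_seed: Optional[int] = None) -> random.Random:
    """Hypothetical single entry point for creating seeded generators."""
    # Assumption: an explicitly configured seed takes precedence.
    if configured_seed is not None:
        return random.Random(configured_seed)
    raw = os.environ.get(SEED_ENV_VAR)
    if raw is not None:
        try:
            return random.Random(int(raw))
        except ValueError:
            # Assumption: a non-integer seed warns and falls through.
            warnings.warn("Seed should be an integer", RuntimeWarning)
    # No usable seed: return a generator seeded from system entropy.
    return random.Random()
```

Routing every caller through one function like this keeps seeding behaviour in a single place, which is what makes it practical to mock os.environ and the settings object in tests, as several of the commits above do.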
jacob-buehler requested review from JGSweets, ksneab7, taylorfturner, micdavis and tyfarnan as code owners July 24, 2023 16:14
No description provided.