-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Little refactor and minor bug #41
base: main
Are you sure you want to change the base?
Conversation
change pandas loading engine='c'
# Conflicts: # folktables/load_acs.py
…fillna of pandas to solve multiple nan encodings; creating directory for cached data when not available.
Could the proposed fix for multiple nan encoding also help in the generate_categories function? Currently the code uses a placeholder value (e.g. -99999999999999.0) because using a nan value in a python dictionary doesn't really work as each nan is seen as a different key. |
I'm sorry, but given that I'm external to the project, I'm fixing only the things that are giving problems to my experiments. |
@@ -65,6 +65,9 @@ def get_definitions(self, download=False): | |||
return load_definitions(root_dir=self._root_dir, year=self._survey_year, horizon=self._horizon, | |||
download=download) | |||
|
|||
def fillna_safe(x, value=-1): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you describe what the intended behavior of this function should be?
Note that np.nan_to_num
will convert nans to 0.0
unless you set the keyword nan=value
. See here.
So, currently, I believe this function will first convert all nans to zero and then pass an array without any nans to pandas.
Am I getting this wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there may be an issue with fillna_safe
.
See also discussion in #39 about the |
fix: refactor fillna and mobility_filter lambda functions with functions, fillna_safe also applies fillna from pandas to solve multiple nan encodings; creating directory for cached data when not available.