Polars feature updates #1119
  len_df = len(df_series)

- is_row_datetime = pd.Series(np.full((len(df_series)), False))
+ is_row_datetime = pd.Series(np.full((len_df), False))
nice catch
if type(data) is pl.DataFrame:
    words = (
        [
            w.strip(string.punctuation)
            for w in row.str.to_lowercase().str.split(by=" ")
        ]
        for row in data
    )
else:
    words = (
        [w.strip(string.punctuation) for w in row.lower().split()]
        for row in data
    )
Did pre-commit format it like this?
Yep
if type(data) is pl.DataFrame:
    words = (
        [w.strip(string.punctuation) for w in row.str.split(by=" ")]
        for row in data
    )
else:
    words = (
        [w.strip(string.punctuation) for w in row.split()] for row in data
    )
90% of this is similar, so it could be made more modular. Maybe in a follow-up PR, or in this one if you can crank it out quickly.
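One possible shape for that refactor, as a rough sketch only: a single helper that takes any iterable of strings plus a flag for lowercasing, so the two near-identical generator expressions collapse into one. The name `tokenize_rows` and the `lowercase` parameter are hypothetical and not from this PR. The sketch also assumes a polars column would first be turned into plain Python strings (for example with `Series.to_list()`), rather than going through the `.str` namespace as the diff does.

```python
import string
from typing import Iterable, Iterator, List


def tokenize_rows(rows: Iterable[str], lowercase: bool = False) -> Iterator[List[str]]:
    """Yield one list of punctuation-stripped words per input row.

    Hypothetical helper: `rows` is any iterable of Python strings.
    """
    for row in rows:
        text = row.lower() if lowercase else row
        # str.split() with no argument splits on any run of whitespace,
        # which differs slightly from splitting on a single " ".
        yield [w.strip(string.punctuation) for w in text.split()]


# Illustrative input only, not data from the PR.
rows = ["Hello, World!", "Polars is fast."]
tokens_lower = list(tokenize_rows(rows, lowercase=True))
tokens_raw = list(tokenize_rows(rows))
```

With something like this, each call site would only decide how to get strings out of `data` and whether to lowercase; the stripping logic would live in one place.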
Added polars to datetime_column_profile and unstructured_labeler