[Python][C++] Calling Table.from_pandas with a dataframe that contains a map column of sufficient size causes SIGABRT and process crash #44643
Comments
really weird:
I'll have to debug a bunch to understand where this mismatch is coming from :)
Intuitively, I think what happens is that the
It looks like the rewind-on-overflow in
Thanks @pitrou, that was also what I thought, but I am confused after some debugging, because the mismatch remains even if I decide not to rewind (just for testing purposes):
- length_ -= 1;
- offset -= 1;
+ // length_ -= 1;
+ // offset -= 1;
+ std::cout << "Let's not rewind" << '\n';
The two builders still have a mismatch. Below is some logging from my tests when not rewinding:
Yes, I don't think the explicit rewinding is the problem. The problem is that, when one field of a struct builder fails to append, the other fields are not rewound.
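To make the failure mode described above concrete, here is a minimal pure-Python sketch. It is a toy model, not Arrow's builder code: `ToyBuilder`, `ToyStructBuilder`, the `capacity` limit, and both append methods are hypothetical names invented for illustration. It only shows how a failed append on one child can leave sibling children with different lengths, and how restoring every child to its pre-append length avoids that. Whether such a rollback is the right fix for this issue is an assumption, not something established in this thread.

```python
class ToyBuilder:
    """Hypothetical stand-in for a child array builder with a hard capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.values = []

    def append(self, value):
        # Simulate an overflow-style failure once the capacity is reached.
        if len(self.values) >= self.capacity:
            raise OverflowError("capacity exceeded")
        self.values.append(value)

    def __len__(self):
        return len(self.values)


class ToyStructBuilder:
    """Hypothetical struct builder that appends one value per child field."""

    def __init__(self, children):
        self.children = children

    def append_naive(self, row):
        # If a later child fails, earlier children keep the value they
        # already appended, so the children end up with unequal lengths.
        for child, value in zip(self.children, row):
            child.append(value)

    def append_with_rollback(self, row):
        # Remember every child's length, then truncate all of them back
        # to that length if any single append fails.
        lengths = [len(child) for child in self.children]
        try:
            self.append_naive(row)
        except OverflowError:
            for child, length in zip(self.children, lengths):
                del child.values[length:]
            raise


def demo(use_rollback):
    """Return the (keys, items) child lengths after a failing append."""
    keys, items = ToyBuilder(capacity=3), ToyBuilder(capacity=2)
    struct = ToyStructBuilder([keys, items])
    append = struct.append_with_rollback if use_rollback else struct.append_naive
    try:
        for row in [("a", 1), ("b", 2), ("c", 3)]:
            append(row)
    except OverflowError:
        pass
    return len(keys), len(items)


if __name__ == "__main__":
    print("naive:", demo(use_rollback=False))
    print("rollback:", demo(use_rollback=True))
```

In the naive variant the third row is appended to the first child before the second child fails, so the two children disagree on length; the rollback variant leaves both at the length they had before the failing row.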
The error is coming from here: https://github.com/apache/arrow/blob/main/python/pyarrow/src/arrow/python/python_to_arrow.cc#L1059-L1062
Describe the bug, including details regarding any error messages, version, and platform.
Related to #44640
When attempting to convert a pandas dataframe that has a dict type column to a pyarrow table with a map column, if the dataframe and column are of sufficient size, the conversion fails with:
This is immediately followed by SIGABRT and the process crashing.
When the dataframe is smaller, the conversion succeeds without error. See below for reproduction code: when dataframe_size is set to a small value (e.g. 1M rows) there is no error, but at a certain size (e.g. 10M rows) the error condition occurs.
Environment Details:
Component(s)
Python