pyarrow.lib.ArrowInvalid: Failed casting from large_string to string: input array too large #48
Comments
The problem was that I accidentally tried to create a 1-shard dataset. Perhaps we should put a warning somewhere to prevent it.
@rvandewater Can you provide a reproducible example using a dataset I could get access to? This is a weird error message that should get fixed.
I think if you would try to map the entirety of MIMIC-IV to one shard you would get the same error. Will see if I have the time to try
@rvandewater I'll try that. That's a good suggestion.
@EthanSteinberg and @rvandewater, is this fixed, or still an outstanding issue?
IMO it is not that important a bug, as I was trying to map everything to one shard (which is highly inadvisable). Adding warnings for this case that depend on the dataset could be helpful.
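For reference, a minimal sketch of what such a dataset-dependent warning might look like. It relies on one known fact: Arrow's `string` type uses signed 32-bit offsets, so a single array can hold at most 2**31 - 1 bytes of character data, whereas `large_string` uses 64-bit offsets. Everything else is an assumption for illustration: `check_shard_count` and its parameters are hypothetical and not part of meds_etl, and the average bytes per shard is only a rough heuristic, since whether a shard actually hits the limit depends on how its data is chunked.

```python
import warnings

# Arrow's `string` type stores offsets as signed 32-bit integers, so one
# array can address at most 2**31 - 1 bytes of character data.
MAX_STRING_ARRAY_BYTES = 2**31 - 1


def check_shard_count(total_string_bytes: int, num_shards: int) -> bool:
    """Hypothetical helper: warn if an average shard may overflow `string`.

    Returns True when the shard count looks safe, False otherwise.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")

    # Ceiling division: average string bytes that land in each shard.
    bytes_per_shard = -(-total_string_bytes // num_shards)

    if bytes_per_shard > MAX_STRING_ARRAY_BYTES:
        warnings.warn(
            f"About {bytes_per_shard} bytes of string data per shard exceeds "
            f"the {MAX_STRING_ARRAY_BYTES}-byte limit of Arrow's `string` "
            "type; casting from large_string to string may fail. "
            "Consider using more shards.",
            stacklevel=2,
        )
        return False
    return True
```

With these assumptions, `check_shard_count(10 * 2**30, 1)` emits the warning and returns `False`, while `check_shard_count(10 * 2**30, 100)` returns `True`.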
Ok, then I'm going to close this issue for now, and we can re-open it or create a new one more targeted to documentation as needed pending the resolution plan for #51 |
Getting this error after my adjustments at https://github.com/rvandewater/meds_etl