Performance issue with the unflattening process #89
Comments
it's a single-threaded implementation, so it doesn't surprise me that it's not super scalable. off the top of my head, you might be able to do a parallel map-reduce type of computation. first, in parallel apply ... happy to take a closer look if you share a snippet of your data or at least something that looks very similar to it.
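One possible reading of the map-reduce suggestion, as a minimal sketch rather than anything the library provides: treat each row as an independent flat dict, unflatten the rows in parallel worker processes (the "map"), and gather the results back in order (the "reduce"). The names `unflatten_rows` and `simple_unflatten` are hypothetical, and `simple_unflatten` is only a stand-in that rebuilds nested dicts so the sketch runs on its own. It also assumes the rows do not depend on each other.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import Any, Callable, Dict, Iterable, List, Optional

Row = Dict[str, Any]


def simple_unflatten(flat: Row, separator: str = "_") -> Row:
    """Hypothetical stand-in for the real unflatten call (nested dicts only)."""
    nested: Row = {}
    for key, value in flat.items():
        *parents, leaf = key.split(separator)
        node = nested
        for part in parents:
            # Walk down, creating intermediate dicts as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


def unflatten_rows(
    rows: Iterable[Row],
    unflatten: Callable[[Row], Row] = simple_unflatten,
    workers: Optional[int] = None,
    chunksize: int = 256,
) -> List[Row]:
    """Unflatten every row, spreading the work over several processes.

    workers=1 runs serially, which is handy for debugging and small inputs.
    workers=None lets the executor pick a default based on the CPU count.
    """
    rows = list(rows)
    if workers == 1:
        return [unflatten(row) for row in rows]
    # `unflatten` must be a picklable, module-level function.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order; chunksize cuts per-row IPC overhead.
        return list(pool.map(unflatten, rows, chunksize=chunksize))
```

With the library installed, `unflatten_list` from `flatten_json` could be passed as the `unflatten` argument in place of the stand-in. On platforms that start workers with spawn (Windows, macOS), the call with more than one worker should sit under an `if __name__ == "__main__":` guard. Whether this helps depends on how much of the 6 hours is spent per row; if a single row with 1051 columns is already slow, parallelism only divides that cost by the number of cores.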
The issue concerns performance on large datasets.
The `unflatten_list` method was used for the unflattening process.
The dataset has 80750 rows and 1051 columns.
Unflattening took 6 hours and 5 minutes.
Have you faced this issue? How might the unflattening process be optimized?