[Doubts] Custom spark physical operator Just like AlreadySortedExec #5
Comments
Can you put together a minimal example to reproduce this problem? E.g. as a gist at gist.github.com?
Hi, apologies for my late response. When I wrote my doExecute() method like this, only 1 record took part in the shuffle stage (it seems HashAggregate was getting the same hash value for every row).
But when I changed my code to use the .copy() method, it started working correctly. Below are the few changes that made the above code work:
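As a separate illustration (this is not the code from this comment), here is a minimal sketch of why .copy() can matter. Spark operators commonly reuse one mutable row object for every output row, so a downstream consumer that buffers rows without copying them ends up holding many references to the same object, all showing the last value written. The sketch assumes that was the cause here, which is consistent with the fix described above. It uses no Spark classes: `MutableRow`, `RowReuseDemo`, `reusedRows` and `distinctKeys` are hypothetical names that only mimic the behavior.

```scala
// Hypothetical stand-in for a mutable row such as Spark's UnsafeRow.
// This is NOT a Spark class; it only mimics the "one reused buffer" behavior.
final class MutableRow(var key: String) {
  // Returns an independent snapshot of the current contents.
  def copy(): MutableRow = new MutableRow(key)
}

object RowReuseDemo {
  // Mimics an operator that writes every output row into one shared object.
  def reusedRows(keys: Seq[String]): Iterator[MutableRow] = {
    val shared = new MutableRow("")
    keys.iterator.map { k =>
      shared.key = k // overwrite in place, as a reused row buffer would be
      shared         // the same instance is returned for every element
    }
  }

  // Buffers rows the way an aggregation or shuffle writer might,
  // then reports which distinct keys the buffer actually contains.
  def distinctKeys(keys: Seq[String], copyRows: Boolean): Set[String] = {
    val buffered = reusedRows(keys)
      .map(r => if (copyRows) r.copy() else r)
      .toVector
    buffered.map(_.key).toSet
  }

  def main(args: Array[String]): Unit = {
    val keys = Seq("a", "b", "c")
    // Without copy: every buffered reference points at the same row.
    println(distinctKeys(keys, copyRows = false)) // Set(c)
    // With copy: each buffered row keeps its own value.
    println(distinctKeys(keys, copyRows = true))  // Set(a, b, c)
  }
}
```

In a real operator the equivalent step would be calling copy() on each row before it is handed to anything that holds on to it. Whether that is needed depends on how the rows are produced, so treat this as a sketch of the mechanism rather than a diagnosis of the original code.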
Hello Vladimir Prus, I read your blog on Medium. It was very interesting and I learned a lot from it. Following it, I tried to write a custom operator that derives an extra column (i.e. appends an extra UTF8 string at the end of each InternalRow). After that I apply a groupBy aggregation. Everything seems to work, but I see that only 1 element takes part in the groupBy aggregation, whereas when I just use mapPartitions to derive that column, many elements take part in the shuffle stage and the output is correct.
I need your help and suggestions to resolve this issue.