Add parquet-mr test #56
@@ -9,3 +9,11 @@ deploy:
tags: true
npm_api_key:
secure: HK/tFvgj/TtYTJ3s2Bszc1/yJWvbSkLcfY3ki3GEuudMpfzcq134/2fbdZLb+B7Ukg31rdRVFCrSg8k6a1KhztkRr9SnMts5WO2ZGulmzNQ+XsBwdd0Bf7KYamAtqft5qBnSvh+ypBloQJQqq5qazb31971Fwvg5pdkYTQgCQxyIfZlH8nUbOxcYyl4w6Mvz5zsQp2c4OKOdq0FgeU3OqJ05i5lWL/CZWRO9L7+f0Uih5Jr9CuRzBUcVVxIopn1uOX1czug+OudIuUMLxbJwJt69ZpWdTbywLg6wVvA58ozbyialuEx8S1UaehsqHFj29JJWcOw+6TCi5+512DrBZMguiyTkjq5I5kmRcPNPY8dcqJUZUD6eDpKYQemFeg+6vKIvT3spK53VXNoEOIqAAiNTpmfY6JQ17S31gy1TqZldMtWr1HXf95LGlLC+czgMHPi1m6YiUgdDx5N7MFXumdOxiyHNdoitQFyyyS57RS7BG8/5ZMeKIXEfhQ9KU/D5L3KpgNCBmwVR72vF3nb89aVETrvNIbZEgc/cTdYWquezfPibGoGjWVJ4c38nd30s6rmoMBwoDwznaDg87ameoHUKSCSMx3uVXRZ5uR2C4SmTqVbWNKLXszL4iIW54EaLf3M+AYjoAb+EupaPMuEonJukdzkalp03RekYVeIY23U=
0094133 to 8764ac4
Here is a failing branch: https://github.com/ZJONSSON/parquetjs/tree/parquet-mr-fail
* Bitpacking should work for data of any length, not just multiples of 8 (the last packed group is padded if it holds fewer than 8 values)
* Improve run estimation: only start a new run if we are at a position where mod 8 === 0, otherwise use bitpacking
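The two points above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `packBitsPadded` and `planRuns` are hypothetical names, the LSB-first bit order follows the Parquet RLE/bit-packing hybrid encoding, and the minimum repeat count of 8 for starting a run is an assumption.

```javascript
// Sketch only: hypothetical helpers, not parquetjs functions.

// Pack non-negative integers LSB-first, `bitWidth` bits each, in groups of 8.
// The last group is zero-padded when it holds fewer than 8 values.
function packBitsPadded(values, bitWidth) {
  const groups = Math.ceil(values.length / 8);
  const padded = values.concat(new Array(groups * 8 - values.length).fill(0));
  // 8 values * bitWidth bits = bitWidth bytes per group
  const bytes = new Uint8Array(groups * bitWidth);
  padded.forEach((value, i) => {
    for (let b = 0; b < bitWidth; b++) {
      if ((value >> b) & 1) {
        const bit = i * bitWidth + b;
        bytes[bit >> 3] |= 1 << (bit & 7);
      }
    }
  });
  return { groups, bytes };
}

// Split values into bit-packed and RLE runs. An RLE run only starts when the
// pending bit-packed values fill whole groups of 8 (length mod 8 === 0);
// otherwise the value goes into the bit-packed buffer.
function planRuns(values) {
  const runs = [];
  let packed = [];
  let i = 0;
  while (i < values.length) {
    // Count how many times values[i] repeats from position i.
    let repeat = 1;
    while (i + repeat < values.length && values[i + repeat] === values[i]) {
      repeat++;
    }
    if (packed.length % 8 === 0 && repeat >= 8) { // threshold of 8 is assumed
      if (packed.length > 0) {
        runs.push({ type: 'bitpacked', values: packed });
        packed = [];
      }
      runs.push({ type: 'rle', value: values[i], count: repeat });
      i += repeat;
    } else {
      packed.push(values[i]);
      i++;
    }
  }
  if (packed.length > 0) {
    runs.push({ type: 'bitpacked', values: packed });
  }
  return runs;
}
```

Only the final bit-packed run can end up with a partial group here, which is the case the padding in `packBitsPadded` is meant to cover.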
This PR has been rebased on #57 to include fixes for RLE in dlevels and rlevels, plus more tests added to verify that the results are correct as seen from parquet-mr.
I seem to be running into this issue as well. Are there any outstanding items on this PR that I might be able to help with to get it merged in?
Do your problems go away when you use this branch? The only outstanding thing here is a code review afaik.
NPM install per this comment does the trick for me:
Here is a very basic example of how we can use dockerized parquet-tools (from parquet-mr) to test on Travis whether files created by parquetjs can be read by parquet-mr (and therefore Spark etc.).
The basic test succeeds but more advanced tests fail. I will add a failing branch that we can use as a guide for fixing any errors.
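For illustration, a test helper along these lines could shell out to the dockerized parquet-tools. This is a sketch under assumptions, not the script in this PR: the image name `example/parquet-tools` is a placeholder, the image is assumed to use parquet-tools as its entrypoint, and `parquetToolsArgs` and `readWithParquetMr` are hypothetical names.

```javascript
const { execFileSync } = require('child_process');

// Placeholder image name (assumption); substitute the parquet-tools image used in CI.
const PARQUET_TOOLS_IMAGE = 'example/parquet-tools';

// Build the `docker run` arguments that mount hostDir at /data and run a
// parquet-tools command (for example `cat`, `schema` or `meta`) on one file.
function parquetToolsArgs(command, hostDir, fileName) {
  return [
    'run', '--rm',
    '-v', `${hostDir}:/data`,
    PARQUET_TOOLS_IMAGE,
    command, `/data/${fileName}`
  ];
}

// Run parquet-tools in docker and return its stdout as a string.
// execFileSync throws if the container exits with a non-zero status,
// so an unreadable file fails the test.
function readWithParquetMr(command, hostDir, fileName) {
  return execFileSync('docker', parquetToolsArgs(command, hostDir, fileName), {
    encoding: 'utf8'
  });
}
```

A test could then write a file with parquetjs, call `readWithParquetMr('cat', dir, 'test.parquet')`, and compare the printed rows with the rows that were written.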