Writer conversion #80

Open
wants to merge 30 commits into
base: master

mehtaishita

  • Added Int64 or int32 for relevant number types
  • Created StreamOptions and other config types to replace anonymous objects
  • Several Buffer readers
  • Parquet Schema file type to serialize and write metadata
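To illustrate the second bullet, here is a minimal sketch of how a named config type can replace an anonymous options object. The field names (`start`, `highWaterMark`, `rowGroupSize`), the `WriterOptions` type, the default value, and the `resolveWriterOptions` helper are all hypothetical; the PR itself only names `StreamOptions`.

```typescript
// Hypothetical shapes: field names are illustrative, not taken from this PR.
interface StreamOptions {
  /** Byte offset at which to start reading or writing. */
  start?: number;
  /** Buffer size hint for the underlying stream, in bytes. */
  highWaterMark?: number;
}

interface WriterOptions extends StreamOptions {
  /** Number of rows buffered before a row group is flushed. */
  rowGroupSize?: number;
}

// Same shape as WriterOptions, but with the defaulted field guaranteed present.
type ResolvedWriterOptions = WriterOptions & { rowGroupSize: number };

// Assumed default, chosen only for this example.
const DEFAULT_ROW_GROUP_SIZE = 4096;

// Fill in defaults once, so callers downstream never see `undefined`.
function resolveWriterOptions(opts: WriterOptions = {}): ResolvedWriterOptions {
  return { ...opts, rowGroupSize: opts.rowGroupSize ?? DEFAULT_ROW_GROUP_SIZE };
}
```

With a named type, a misspelled option such as `rowGroupSise` becomes a compile-time error instead of being silently ignored.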

wilwade and others added 30 commits April 27, 2021 16:49
feat(bloom-filter): add support

Add the ability to create bloom filters on columns, following the Apache Parquet [spec](https://github.com/apache/parquet-format/blob/master/BloomFilter.md).


Co-authored-by: shannonwells <[email protected]>
Co-authored-by: Wil Wade <[email protected]>
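For readers unfamiliar with the linked spec, the following is a rough sketch of a split-block bloom filter. It is not the code from this commit. It assumes the 64-bit hash (xxHash in the spec) is computed elsewhere and passed in as a `bigint`. The class and method names are made up for illustration, and the salt constants are reproduced from the spec as best recalled, so check them against BloomFilter.md before relying on them.

```typescript
// Salt constants as given in the Parquet BloomFilter.md spec (verify before use).
const SALT = new Uint32Array([
  0x47b6137b, 0x44974d91, 0x8824ad5b, 0xa2b7289d,
  0x705495c7, 0x2df1424b, 0x9efc4947, 0x5c6bfb31,
]);

// One block is 8 x 32-bit words = 256 bits.
const WORDS_PER_BLOCK = 8;

// Hypothetical class name; the hash function itself is out of scope here.
class SplitBlockBloomFilter {
  private readonly words: Uint32Array;

  constructor(private readonly numBlocks: number) {
    if (!Number.isInteger(numBlocks) || numBlocks < 1) {
      throw new RangeError("numBlocks must be a positive integer");
    }
    this.words = new Uint32Array(numBlocks * WORDS_PER_BLOCK);
  }

  // The upper 32 bits of the hash select the block.
  private blockOffset(hash: bigint): number {
    const upper = BigInt.asUintN(64, hash) >> 32n;
    return Number((upper * BigInt(this.numBlocks)) >> 32n) * WORDS_PER_BLOCK;
  }

  // The lower 32 bits, multiplied by each salt, select one bit per word.
  private static bitFor(hash: bigint, i: number): number {
    const lower = Number(BigInt.asUintN(32, hash));
    return (1 << (Math.imul(lower, SALT[i]) >>> 27)) >>> 0;
  }

  insert(hash: bigint): void {
    const offset = this.blockOffset(hash);
    for (let i = 0; i < WORDS_PER_BLOCK; i++) {
      this.words[offset + i] |= SplitBlockBloomFilter.bitFor(hash, i);
    }
  }

  // false means "definitely absent"; true means "possibly present".
  mightContain(hash: bigint): boolean {
    const offset = this.blockOffset(hash);
    for (let i = 0; i < WORDS_PER_BLOCK; i++) {
      const bit = SplitBlockBloomFilter.bitFor(hash, i);
      if (((this.words[offset + i] & bit) >>> 0) !== bit) return false;
    }
    return true;
  }
}
```

Because every lookup touches a single 256-bit block, a check costs at most eight word comparisons regardless of filter size.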
* Remove travis

* Fix tiny syntax bug

* Add org to name in package.json

* Add templates for PRs and issues

* Add GitHub Actions for testing and publishing

* Linting not yet implemented

* Commit the package-lock file for consistent builds
chore(README): update to include bloom filter

Update README file to include bloom filter feature.

Co-authored-by: Shannon Wells <[email protected]>
* use xxhash-wasm
* updates after rebase
* update tests due to new xxhash-wasm package
* remove reference to old xxhash
* Cleanup package.json packages

* Fix readme up

* Thrift types

* Update parquet thrift

* Class requires new

* Export Parquetjs Types
Add browser/node support for fetching http request.
* Configuration for bundling parquetjs that works in browser, via esbuild
* Add to build targets
* Fix all the things that got broken
* Example server showing how it can be used
* Updates to README
* Disable LZO and Brotli in browser
* Disable LZO in Node as a consequence of this attempt (see notes)
* Use wasm-brotli:
  * Use async import to load wasm before loading compression.js
  * requires async loading to get the wasm instance
  * bubble up all the asyncs
  * make the tests pass again
* Webpack config file for possibly later
* check in example parquet files
Co-authored-by: enddy <[email protected]>
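The "bubble up all the asyncs" note above describes a common consequence of loading a wasm module lazily: every caller of the codec becomes async. A minimal sketch of that pattern follows, under stated assumptions. The `Codec` interface, `lazyCodec`, `fakeLoader`, and `compressPage` are hypothetical names, and the loader is a stand-in that copies bytes rather than the real wasm-brotli import.

```typescript
// Hypothetical codec shape; the real wasm-brotli API may differ.
interface Codec {
  compress(data: Uint8Array): Uint8Array;
  decompress(data: Uint8Array): Uint8Array;
}

type CodecLoader = () => Promise<Codec>;

// Memoize the in-flight load so the wasm module is instantiated at most once,
// even when several callers request it concurrently.
function lazyCodec(load: CodecLoader): () => Promise<Codec> {
  let pending: Promise<Codec> | undefined;
  return () => {
    if (pending === undefined) {
      pending = load().catch((err) => {
        pending = undefined; // allow a retry after a failed load
        throw err;
      });
    }
    return pending;
  };
}

// Stand-in loader for this example; it copies bytes instead of compressing.
let loadCount = 0;
const fakeLoader: CodecLoader = async () => {
  loadCount += 1;
  return {
    compress: (data) => data.slice(),
    decompress: (data) => data.slice(),
  };
};

const getBrotli = lazyCodec(fakeLoader);

// Any function that needs the codec must itself be async, which is why
// the change propagates up through the call chain.
async function compressPage(data: Uint8Array): Promise<Uint8Array> {
  const codec = await getBrotli();
  return codec.compress(data);
}
```

In a real build the loader would wrap a dynamic `import()` of the wasm package, so the cost is paid only when Brotli is first used.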
…ck (#19)

* Add parquet.js explicitly to the "include" list in tsconfig.
* Also regenerated package-lock.json to bump some patch versions.
Update bson package from 2.0.8 to 4.4.0 due to a vulnerability in v2. The interface changed, so update the files that use the package.
Remove an unused TypeArray import.
Remove a stray comment.
Co-authored-by: acruikshank <[email protected]>
* Added test cases for types and fixed bigint precision

* missed one check on if statement

* Added more test cases for invalid values to throw

* Added test cases for real this time

* Reorganized test cases and added helper function for readability and try catch for better error message

* Fixed spacing on function

* Added final tests to int64 and int96

* changed throw error code and fixed type on tests

Co-authored-by: Wayland Li <[email protected]>
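The bigint precision fix mentioned in this commit addresses a general JavaScript limitation: a `number` cannot represent every 64-bit integer exactly. The sketch below shows one way to validate INT64 input with `BigInt`. The `toInt64` helper and its error messages are hypothetical, not the code from this commit.

```typescript
// Bounds of a signed 64-bit integer.
const INT64_MIN = -(1n << 63n);
const INT64_MAX = (1n << 63n) - 1n;

// Hypothetical helper: convert user input to a bigint that fits in INT64,
// throwing on values that are invalid or would lose precision.
function toInt64(value: number | bigint | string): bigint {
  if (typeof value === "number" && !Number.isSafeInteger(value)) {
    // Non-integers, NaN, and magnitudes above 2^53 - 1 cannot be trusted as numbers.
    throw new RangeError(`unsafe number for INT64: ${value}`);
  }
  if (typeof value === "string" && value.trim() === "") {
    // BigInt("") would silently yield 0n, so reject it explicitly.
    throw new TypeError("invalid INT64 value: empty string");
  }
  let result: bigint;
  try {
    result = BigInt(value);
  } catch {
    throw new TypeError(`invalid INT64 value: ${String(value)}`);
  }
  if (result < INT64_MIN || result > INT64_MAX) {
    throw new RangeError(`INT64 out of range: ${result}`);
  }
  return result;
}
```

For example, `toInt64("9007199254740993")` keeps the value exact, whereas `Number("9007199254740993")` rounds to `9007199254740992`.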
* Converted Independent files into typescript

Co-authored-by: Wayland Li <[email protected]>
Co-authored-by: Sidney Keese <[email protected]>
* Converted schema.js into typescript without using type any

Co-authored-by: Wayland Li <[email protected]>
* Converted shred.js into typescript without using type any

Co-authored-by: Wayland Li <[email protected]>
Converted util.js into typescript

Co-authored-by: Wayland Li <[email protected]>
* Converted reader file to typescript and made fixes dependent on it

Co-authored-by: Wayland Li <[email protected]>

Wip

wip dump

4 errors left

Wip

parquet envelope writer and bloom filter placeholder type

number conversion

wip dump

user meta data is a keyvalue array across the board
so cleaning up how metadata values are pushed/concat
simplified encoding enum calls

next: clean up some function signatures and calls
next: add options for column chunk data

re commit?

draft commit
more consistent parameters, number types, buffer types
* Converted reader file to typescript and made fixes dependent on it

Co-authored-by: Wayland Li <[email protected]>
@wilwade wilwade deleted the writer-conversion branch May 12, 2022 12:07
6 participants