Demo: Add feature for Spark ORC writer to not persist field ids in files, using a new table property #133

rzhang10 · 2022-12-12T04:26:26Z

Adds a new table property "write.orc.no-field-ids.enabled" that controls whether the Spark ORC writer persists field ids in the written file schema; when it is enabled, the field ids are left out. This feature will be useful for Gobblin when ingesting custom Hive/Iceberg hybrid tables that share underlying files.
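To make the intended behavior concrete, here is a minimal sketch of how a writer might consult the property when building per-column ORC attributes. It is an illustrative model only, not the code in this PR: the function name `orc_column_attributes`, the default value of `false`, and the use of an `iceberg.id` attribute key are assumptions made for the example.

```python
# Illustrative model only; the PR's actual implementation lives in the Java writer.
# Property name is taken from the PR description.
NO_FIELD_IDS_PROP = "write.orc.no-field-ids.enabled"
# Assumed default: field ids are persisted unless the property is set to true.
NO_FIELD_IDS_DEFAULT = "false"
# Assumed attribute key used to carry the field id on an ORC column.
FIELD_ID_ATTRIBUTE = "iceberg.id"


def orc_column_attributes(field_id, table_properties):
    """Return the attributes to attach to one ORC column (hypothetical helper).

    When the table property is enabled, no field-id attribute is emitted,
    so the written file schema carries no field ids.
    """
    raw = table_properties.get(NO_FIELD_IDS_PROP, NO_FIELD_IDS_DEFAULT)
    no_field_ids = str(raw).strip().lower() == "true"
    if no_field_ids:
        return {}
    return {FIELD_ID_ATTRIBUTE: str(field_id)}


# Property unset: the field id is persisted.
with_ids = orc_column_attributes(1, {})
# Property enabled: the field id is omitted.
without_ids = orc_column_attributes(1, {NO_FIELD_IDS_PROP: "true"})
```

On an existing table the property would presumably be set like any other table property, for example with Spark SQL `ALTER TABLE db.tbl SET TBLPROPERTIES ('write.orc.no-field-ids.enabled'='true')`, where `db.tbl` is a placeholder table name.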

rzhang10 added 2 commits December 11, 2022 21:17

[LI] DEMO: Add Spark-ORC writer feature to write files without field-ids

5888d5a

Style fix

7970ace

github-actions bot added CORE ORC SPARK labels Dec 12, 2022

Enhance unit test

44d2181

rzhang10 force-pushed the spark_orc_do_not_persist_field_ids branch from c3c4a87 to 44d2181 December 12, 2022 23:33
