feat(table): add conversion from Arrow Schema to Iceberg #155

zeroshade · 2024-09-26T23:11:21Z

In preparation for adding data reading, I'm splitting the work into smaller PRs so they are easier to review.

This PR updates the Arrow Go library version and implements conversion from Arrow schemas to Iceberg schemas, including name mapping so that field IDs are handled properly.

Tests are also included for this.
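
For readers unfamiliar with why name mapping matters here: Iceberg identifies columns by field ID, while an Arrow schema may or may not carry IDs in its field metadata. The following is a minimal, self-contained sketch of one plausible resolution order (metadata first, then a name mapping), not the code in this PR. The types `arrowField` and `nameMapping` and the functions `resolveFieldID` and `resolveFieldIDs` are hypothetical stand-ins, and using `PARQUET:field_id` as the metadata key is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// fieldIDKey is assumed to be the metadata key that carries a field ID.
const fieldIDKey = "PARQUET:field_id"

// arrowField is a hypothetical, simplified stand-in for an Arrow schema field.
type arrowField struct {
	Name     string
	Metadata map[string]string
}

// nameMapping is a hypothetical, flat stand-in for an Iceberg name mapping:
// it maps a column name to the field ID that column should receive.
type nameMapping map[string]int

// resolveFieldID prefers an explicit ID stored in the field metadata and
// falls back to the name mapping when the metadata has no ID.
func resolveFieldID(f arrowField, nm nameMapping) (int, error) {
	if raw, ok := f.Metadata[fieldIDKey]; ok {
		id, err := strconv.Atoi(raw)
		if err != nil {
			return 0, fmt.Errorf("field %q: invalid field id %q: %w", f.Name, raw, err)
		}
		return id, nil
	}
	if id, ok := nm[f.Name]; ok {
		return id, nil
	}
	return 0, fmt.Errorf("field %q: no field id in metadata or name mapping", f.Name)
}

// resolveFieldIDs resolves an ID for every field and fails on the first gap.
func resolveFieldIDs(fields []arrowField, nm nameMapping) (map[string]int, error) {
	ids := make(map[string]int, len(fields))
	for _, f := range fields {
		id, err := resolveFieldID(f, nm)
		if err != nil {
			return nil, err
		}
		ids[f.Name] = id
	}
	return ids, nil
}
```

In this sketch, a field with `PARQUET:field_id` set to `"1"` resolves to ID 1 regardless of the mapping, a field without metadata takes its ID from the mapping, and a field covered by neither produces an error rather than a guessed ID.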

.github/workflows/go-ci.yml

nastra · 2024-10-02T14:27:22Z

table/scanner_test.go

 		{"test_partitioned_by_hours", iceberg.GreaterThanEqual(iceberg.Reference("ts"), "2023-03-05T00:00:00+00:00"), 8},
 		{"test_partitioned_by_truncate", iceberg.GreaterThanEqual(iceberg.Reference("letter"), "e"), 8},
 		{"test_partitioned_by_bucket", iceberg.GreaterThanEqual(iceberg.Reference("number"), int32(5)), 6},
-		// for some reason when I run the provisioning locally i get 5 data files
-		// but GHA CI running spark provisioning ends up with only 4 files?
-		// anyone know why?


I forgot, but which is correct here: 4 or 5 data files?

My local runs now match the CI runs and generate 4 files, so 4 seems correct to me.

zeroshade added 2 commits September 26, 2024 19:10

feat(table): add conversion from Arrow Schema to Iceberg

92839b0

cleanup

f10722a

github-actions bot added the INFRA label Sep 26, 2024

zeroshade added 5 commits September 27, 2024 09:26

use appropriate staticcheck

c944ad9

cleanups

5d813e4

update go.mod

d2665ef

fix some tests

3f70186

fix test

188b431

nastra approved these changes Oct 2, 2024

zeroshade and others added 2 commits October 7, 2024 16:36

Update .github/workflows/go-ci.yml

e84d4b9

Co-authored-by: Eduard Tudenhoefner <[email protected]>

Merge branch 'main' into convert-to-arrow-schema

014a765

nastra merged commit 4929eea into apache:main Oct 9, 2024
13 checks passed

zeroshade mentioned this pull request Oct 9, 2024

feat(table): Implement converting Iceberg schema and types to Arrow #168

Merged

zeroshade commented Sep 26, 2024

nastra Oct 2, 2024

zeroshade Oct 9, 2024
