Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: Add destinations support #257

Merged
merged 142 commits into from
Jul 30, 2024
Merged
Show file tree
Hide file tree
Changes from 136 commits
Commits
Show all changes
142 commits
Select commit Hold shift + click to select a range
3d0dce4
refactor: move common code from Source class to ConnectorBase
aaronsteers May 22, 2024
0957f7b
add Destinations class
aaronsteers May 22, 2024
2cab67c
refactor common executable creation code from get_source() to get_con…
aaronsteers May 22, 2024
25c1b61
add get_destination() factory function
aaronsteers May 22, 2024
da2eaab
feat: load data and finalize state after each "stream complete" signa…
aaronsteers May 24, 2024
c0fa516
remove unused 'force_full_refresh' flag
aaronsteers May 24, 2024
0ad2af8
slight increment (wip)
aaronsteers May 30, 2024
6b4db6a
add message generator classes
aaronsteers Jun 18, 2024
604b7f5
fix type
aaronsteers Jun 18, 2024
ad41ff4
clean up destination implementation
aaronsteers Jun 18, 2024
6c0a160
improve stdin message generator
aaronsteers Jun 18, 2024
e89cc9f
initial iceberg writer scaffold
aaronsteers Jun 18, 2024
ad96af3
chore: multiple lint and typing fixes
aaronsteers Jul 6, 2024
1bca531
chore: more type fixes
aaronsteers Jul 6, 2024
fd3081d
merge from main
aaronsteers Jul 8, 2024
3c45195
update from main
aaronsteers Jul 9, 2024
07a141d
poetry add pyiceberg
aaronsteers Jul 9, 2024
d3cee30
type as str instead of bytes
aaronsteers Jul 9, 2024
80c32e3
add overrides decorators
aaronsteers Jul 9, 2024
135df49
explicit utf-8
aaronsteers Jul 15, 2024
68d4306
Merge remote-tracking branch 'origin/main' into aj/feat/add-destinati…
aaronsteers Jul 15, 2024
ed66c69
merge in `main` changes for get_connector_executor()
aaronsteers Jul 15, 2024
2c2c5f1
fix: state file path call
aaronsteers Jul 15, 2024
6c484ad
chore: added new destination integration tests (failing), refactored …
aaronsteers Jul 15, 2024
c2a040d
chore: replace deprecated 'parse_raw' with 'model_validate_json'
aaronsteers Jul 15, 2024
6891727
fix: improve file and pipe handling
aaronsteers Jul 16, 2024
e59e0b8
remove commented-code
aaronsteers Jul 16, 2024
8c31883
chore: lint fixes
aaronsteers Jul 16, 2024
e9c9e86
chore: lint fixes, remove commented code
aaronsteers Jul 16, 2024
52d9a27
refactor: move logic to _peek_airbyte_message()
aaronsteers Jul 16, 2024
e3c6196
chore: clean up types
aaronsteers Jul 16, 2024
c1b54f5
perf: optimize `_ensure_schema_exists()`
aaronsteers Jul 16, 2024
e3dc4c4
chore: add viztracer dev dependency
aaronsteers Jul 16, 2024
374b54f
chore: add viztracer config
aaronsteers Jul 16, 2024
0569bd4
chore: add perf-test example scripts
aaronsteers Jul 16, 2024
15225bb
feat(perf): improve precision of progress prints
aaronsteers Jul 17, 2024
88e522d
perf: cache string normalization (2.5x speed boost! 🔥)
aaronsteers Jul 17, 2024
30b8300
perf: increase batch size from 10K to 100K
aaronsteers Jul 17, 2024
613327f
perf(snowflake): upload using up to 8 concurrent threads
aaronsteers Jul 17, 2024
2963b4e
fix(drive-by): 'dataset_id' no longer part of secret config for bigquery
aaronsteers Jul 17, 2024
4ff3efb
feat(perf): robust test script for perf
aaronsteers Jul 17, 2024
c189ce7
chore(perf): reasonable max entries for viztracer
aaronsteers Jul 17, 2024
d3a191e
add comment
aaronsteers Jul 17, 2024
56c252b
chore: remove unneeded cortex declarations
aaronsteers Jul 17, 2024
057ac13
update from `main`
aaronsteers Jul 17, 2024
c0c09a5
drive-by: improve Snowflake example script
aaronsteers Jul 17, 2024
56f1c8f
Merge remote-tracking branch 'origin/main' into aj/feat/add-destinati…
aaronsteers Jul 18, 2024
d97a380
chore: tidy test
aaronsteers Jul 18, 2024
529104c
feat: working load to destinations
aaronsteers Jul 19, 2024
3a620ce
chore: format logs
aaronsteers Jul 19, 2024
668b7b2
chore: update example script
aaronsteers Jul 19, 2024
4f6fef6
chore: misc fixes
aaronsteers Jul 19, 2024
a4d1548
massive refactor
aaronsteers Jul 20, 2024
617a626
chore: compile tests as part of `poe check`
aaronsteers Jul 20, 2024
746c96e
chore: update cortex script
aaronsteers Jul 20, 2024
bc00f11
feat: new state backend for destinations
aaronsteers Jul 20, 2024
a991cf4
fix some tests
aaronsteers Jul 20, 2024
41c89ee
checkpoint: working sync tests
aaronsteers Jul 20, 2024
4ec619a
chore: fix test, add back perf tracer
aaronsteers Jul 21, 2024
ffa61dc
revert back to prune_extra_fields=True
aaronsteers Jul 21, 2024
89df31a
add missing progress_tracker refs
aaronsteers Jul 21, 2024
3da33de
fix tests
aaronsteers Jul 21, 2024
5218922
skip destination tests on windows (requires docker)
aaronsteers Jul 21, 2024
55b9062
add 'test-fast' poe task
aaronsteers Jul 21, 2024
faf0820
remove unused test fixtures
aaronsteers Jul 21, 2024
69976f3
remove stale comment
aaronsteers Jul 21, 2024
9543726
fix tests, update viztracer filenames
aaronsteers Jul 21, 2024
f946a86
delete redundant example file
aaronsteers Jul 21, 2024
8b87ed3
remove iceberg code (moved to other branch)
aaronsteers Jul 21, 2024
d29dbff
improve arg check
aaronsteers Jul 21, 2024
97c82bd
refactor `ReadProgress` as `ProgressTracker`
aaronsteers Jul 22, 2024
3a2389f
refactor StateWriter with `StateProvider` as superclass
aaronsteers Jul 22, 2024
7a725af
chore: print reason for skipped tests
aaronsteers Jul 22, 2024
a29956f
ci[fix]: docker tests were always being skipped
aaronsteers Jul 22, 2024
91ddf04
fix: stream name resolution
aaronsteers Jul 22, 2024
150fdd3
fix: sql state backend update
aaronsteers Jul 22, 2024
3433d0f
fix: excess log statement
aaronsteers Jul 22, 2024
2c009ea
fix: don't send connector logs to stderr
aaronsteers Jul 22, 2024
b1d08e1
add connector write error
aaronsteers Jul 22, 2024
3e2ae0c
update test and examples
aaronsteers Jul 22, 2024
cbdaf90
chore: extend ruff lint rule exclusions across 'tests' and 'examples'…
aaronsteers Jul 22, 2024
00fdcb0
chore: bring back 'preview=True' in ruff subdir settings
aaronsteers Jul 22, 2024
486b600
begin work to tally pending and confirmed writes
aaronsteers Jul 22, 2024
a4f28f4
chore: remove accidentally committed file
aaronsteers Jul 22, 2024
3f44f76
fix: logic for elapsed seconds
aaronsteers Jul 22, 2024
6d4f11d
fix: restore prev logic for elapsed_read_seconds
aaronsteers Jul 22, 2024
09b94fc
fix format
aaronsteers Jul 22, 2024
a83f9b9
apply pr suggestion
aaronsteers Jul 22, 2024
cb6ccc2
rename: `log_source_check_result()` -> `log_connector_check_result()`
aaronsteers Jul 23, 2024
4723798
adding destination support to progress tracker, move logging methods …
aaronsteers Jul 23, 2024
70d21ed
refactor: `AirbyteMessageIterator`
aaronsteers Jul 23, 2024
5189ed1
add TODO for emulated executor
aaronsteers Jul 23, 2024
37c9dce
feat(QOL): print the log file path when exception is created.
aaronsteers Jul 23, 2024
6ce705d
fix: live view shutdown and log_success()
aaronsteers Jul 23, 2024
1fb116f
test: mark as broken
aaronsteers Jul 23, 2024
a272e3f
fix progress bugs
aaronsteers Jul 23, 2024
945b4f5
all the bugfixes!
aaronsteers Jul 23, 2024
2bc72c7
fix more tests
aaronsteers Jul 24, 2024
46faf80
chore: ruff fix
aaronsteers Jul 24, 2024
281d2a0
Merge branch 'main' into aj/feat/add-destinations-support
aaronsteers Jul 24, 2024
eefd6c5
fix telemetry calls
aaronsteers Jul 24, 2024
eb452ae
fix docker executable command
aaronsteers Jul 24, 2024
0050663
fix docker tests
aaronsteers Jul 24, 2024
5501980
update docker tests
aaronsteers Jul 24, 2024
5fa6093
simplified example scripts
aaronsteers Jul 24, 2024
88d816b
add destination support to perf testing script
aaronsteers Jul 24, 2024
038a56d
fix: don't fail if streams are missing a state message
aaronsteers Jul 24, 2024
6e94f24
fix type check
aaronsteers Jul 24, 2024
c8f75e6
add perf profiling note
aaronsteers Jul 24, 2024
c639b33
add stream-level read counts
aaronsteers Jul 24, 2024
36fa532
improve stream-level progress prints
aaronsteers Jul 24, 2024
426b705
clean up stream-level progress prints
aaronsteers Jul 24, 2024
bfa9419
fix list type
aaronsteers Jul 24, 2024
f7f8f1b
make _executors a private module again
aaronsteers Jul 24, 2024
0284e4b
feat: add 'processed_records' property to WriteResult class
aaronsteers Jul 24, 2024
a891cf9
update example scripts
aaronsteers Jul 24, 2024
6ac4939
add WriteResult to top-level module
aaronsteers Jul 24, 2024
b57055e
fix missing import
aaronsteers Jul 24, 2024
43b834a
update state writer docstrings
aaronsteers Jul 25, 2024
f476c26
fix lint warning
aaronsteers Jul 26, 2024
4d30825
chore: pin 'ruff', fast-fail on 'test-fast' poe task
aaronsteers Jul 26, 2024
6401b2f
fix tests
aaronsteers Jul 26, 2024
235c1c5
fix imports
aaronsteers Jul 26, 2024
2a7fead
lint fixes
aaronsteers Jul 26, 2024
39c4269
fix(source-test fixture): undeclared primary keys
aaronsteers Jul 26, 2024
5d2848b
fix stale test
aaronsteers Jul 26, 2024
8451e20
remove unused emulated class
aaronsteers Jul 26, 2024
7ba3080
fix tests
aaronsteers Jul 26, 2024
65fe39e
lint auto-fix
aaronsteers Jul 26, 2024
1debaeb
fix test
aaronsteers Jul 26, 2024
98415f7
update 'experimental' docs
aaronsteers Jul 26, 2024
b895fb9
fix state handling
aaronsteers Jul 29, 2024
6ca7652
remove commented code
aaronsteers Jul 29, 2024
daef1ce
lint fixes
aaronsteers Jul 29, 2024
8c5df82
switch to log in tmp folder
aaronsteers Jul 30, 2024
9835373
remove unnecessary class factory
aaronsteers Jul 30, 2024
51e849b
lint cleanup
aaronsteers Jul 30, 2024
d1eb8ff
fix tests
aaronsteers Jul 30, 2024
1893d38
fix postgres test fixture
aaronsteers Jul 30, 2024
33ec03d
fix xkcd classification
aaronsteers Jul 30, 2024
1d1c482
skip cloud tests (temporary)
aaronsteers Jul 30, 2024
5be2bd8
skip docker destinations in windows
aaronsteers Jul 30, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,3 +1,10 @@
# temp files
temp
.temp

# logs
logs/

# Viztracer log files
viztracer_report.json

Expand Down
9 changes: 8 additions & 1 deletion airbyte/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@
caches,
cloud,
datasets,
destinations,
documents,
exceptions, # noqa: ICN001 # No 'exc' alias for top-level module
experimental,
Expand All @@ -25,8 +26,10 @@
from airbyte.caches.duckdb import DuckDBCache
from airbyte.caches.util import get_default_cache, new_local_cache
from airbyte.datasets import CachedDataset
from airbyte.destinations.base import Destination
from airbyte.destinations.util import get_destination
from airbyte.records import StreamRecord
from airbyte.results import ReadResult
from airbyte.results import ReadResult, WriteResult
from airbyte.secrets import SecretSourceEnum, get_secret
from airbyte.sources import registry
from airbyte.sources.base import Source
Expand All @@ -39,6 +42,7 @@
"cloud",
"caches",
"datasets",
"destinations",
"documents",
"exceptions",
"experimental",
Expand All @@ -50,17 +54,20 @@
# Factories
"get_available_connectors",
"get_default_cache",
"get_destination",
"get_secret",
"get_source",
"new_local_cache",
# Classes
"BigQueryCache",
"CachedDataset",
"Destination",
"DuckDBCache",
"ReadResult",
"SecretSourceEnum",
"Source",
"StreamRecord",
"WriteResult",
]

__docformat__ = "google"
6 changes: 3 additions & 3 deletions airbyte/_batch_handles.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,15 +19,15 @@ def __init__(
stream_name: str,
batch_id: str,
files: list[Path],
file_opener: Callable[[Path], IO[bytes]],
file_opener: Callable[[Path], IO[str]],
) -> None:
"""Initialize the batch handle."""
self._stream_name = stream_name
self._batch_id = batch_id
self._files = files
self._record_count = 0
assert self._files, "A batch must have at least one file."
self._open_file_writer: IO[bytes] = file_opener(self._files[0])
self._open_file_writer: IO[str] = file_opener(self._files[0])

# Marker for whether the batch has been finalized.
self.finalized: bool = False
Expand Down Expand Up @@ -57,7 +57,7 @@ def increment_record_count(self) -> None:
self._record_count += 1

@property
def open_file_writer(self) -> IO[bytes] | None:
def open_file_writer(self) -> IO[str] | None:
"""Return the open file writer, if any, or None."""
return self._open_file_writer

Expand Down
Loading
Loading