An AirbyteCatalog is a struct that is produced by the discover action of a source. It is a list of AirbyteStreams. Each AirbyteStream describes the data available to be synced from the source. After a source produces an AirbyteCatalog or AirbyteStream, they should be treated as read-only. A ConfiguredAirbyteCatalog is a list of ConfiguredAirbyteStreams. Each ConfiguredAirbyteStream describes how to sync an AirbyteStream.
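The relationship between the four structs can be sketched as plain Python dataclasses. This is only an illustration under stated assumptions: the `name` attribute and the use of tuples and frozen classes are choices made here and are not taken from the protocol definition. The remaining field names are the ones documented in the sections that follow.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass(frozen=True)  # frozen: a discovered stream is treated as read-only
class AirbyteStream:
    name: str  # assumption: a stream identifier, not described in this section
    json_schema: Dict[str, Any]
    supported_sync_modes: Tuple[str, ...] = ()
    source_defined_cursor: bool = False
    default_cursor_field: Optional[Tuple[str, ...]] = None


@dataclass(frozen=True)
class AirbyteCatalog:
    # Produced by the discover action of a source.
    streams: Tuple[AirbyteStream, ...]


@dataclass
class ConfiguredAirbyteStream:
    # Describes how to sync one AirbyteStream.
    stream: AirbyteStream
    sync_mode: str
    cursor_field: Optional[Tuple[str, ...]] = None


@dataclass
class ConfiguredAirbyteCatalog:
    streams: Tuple[ConfiguredAirbyteStream, ...]


# Hypothetical discovered catalog with a single stream.
users = AirbyteStream(name="users", json_schema={"type": "object"})
catalog = AirbyteCatalog(streams=(users,))

# The configured catalog wraps each discovered stream without modifying it.
configured = ConfiguredAirbyteCatalog(
    streams=tuple(
        ConfiguredAirbyteStream(stream=s, sync_mode="FULL_REFRESH")
        for s in catalog.streams
    )
)
```

Making the discovered types frozen is one way to enforce the read-only rule: any attempt to assign to a field of `users` raises an error, while the configured wrapper stays editable.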
- The cursor is how sources track which records are new or updated since the last sync.
- A "cursor field" is the field whose value is compared to make this determination.
- If a configuration requires a cursor field, it requires an array of strings that serves as a path to the desired field. For example, if the structure of a stream is { value: 2, metadata: { updated_at: 2020-11-01 } }, the default_cursor_field might be ["metadata", "updated_at"].
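To make the path idea concrete, here is a minimal sketch of following a cursor field path into a record. The helper name `get_cursor_value` and the `last_synced` value are hypothetical, and the comparison assumes the cursor values are ISO-8601 date strings, which sort correctly as plain strings.

```python
from typing import Any, Mapping, Sequence


def get_cursor_value(record: Mapping[str, Any], cursor_field: Sequence[str]) -> Any:
    """Follow cursor_field key by key into a nested record."""
    value: Any = record
    for key in cursor_field:
        value = value[key]  # raises KeyError if the path is not in the record
    return value


record = {"value": 2, "metadata": {"updated_at": "2020-11-01"}}
cursor_value = get_cursor_value(record, ["metadata", "updated_at"])

# Hypothetical cursor value saved by a previous sync.
last_synced = "2020-10-15"

# A record counts as new or updated when its cursor value is past the saved one.
is_new_or_updated = cursor_value > last_synced
```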
This section documents the meaning of each field in an AirbyteStream.
- json_schema - This field contains a JsonSchema representation of the schema of the stream.
- supported_sync_modes - The sync modes that the stream supports. By default, all sources support FULL_REFRESH. Even if this array is empty, it can be assumed that a source supports FULL_REFRESH. The allowed sync modes are FULL_REFRESH and INCREMENTAL.
- source_defined_cursor - If a source supports the INCREMENTAL sync mode, and it sets this field to true, it is responsible for determining internally how it tracks which records in a source are new or updated since the last sync.
- default_cursor_field - If a source supports the INCREMENTAL sync mode, it may, optionally, set this field. It is an array of keys to a field in the schema. If this field is set, and the user does not override it with the cursor_field attribute in the ConfiguredAirbyteStream (described below), this field will be used as the cursor.
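The rule that FULL_REFRESH can always be assumed could be applied as follows. This is a sketch, not the protocol's own code: the function name `effective_sync_modes` is hypothetical, streams are represented as plain dicts, and treating FULL_REFRESH as implied even when the array lists only INCREMENTAL is one reading of the text above.

```python
from typing import Any, List, Mapping

FULL_REFRESH = "FULL_REFRESH"
INCREMENTAL = "INCREMENTAL"


def effective_sync_modes(stream: Mapping[str, Any]) -> List[str]:
    """Return the declared sync modes, with FULL_REFRESH always included."""
    modes = list(stream.get("supported_sync_modes") or [])
    if FULL_REFRESH not in modes:
        # Assumed reading: every source supports FULL_REFRESH, declared or not.
        modes.insert(0, FULL_REFRESH)
    return modes


# Hypothetical discovered stream that declares only INCREMENTAL.
stream = {
    "json_schema": {"type": "object"},
    "supported_sync_modes": [INCREMENTAL],
    "source_defined_cursor": False,
    "default_cursor_field": ["metadata", "updated_at"],
}
modes = effective_sync_modes(stream)
```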
This section documents the meaning of each field in a ConfiguredAirbyteStream.
- stream - This field contains the AirbyteStream that is being configured.
- sync_mode - The sync mode that will be used to sync that stream. The value in this field MUST be present in the supported_sync_modes array for the discovered AirbyteStream of this stream.
- cursor_field - This field is an array of keys to a field in the schema that in the INCREMENTAL sync mode will be used to determine if a record is new or updated since the last sync.
  - If an AirbyteStream has source_defined_cursor set to true, then the cursor_field attribute in ConfiguredAirbyteStream will be ignored.
  - If an AirbyteStream defines a default_cursor_field, then the cursor_field attribute in ConfiguredAirbyteStream is not required, but if it is set, it will override the default value.
  - If an AirbyteStream does not set source_defined_cursor to true and does not define a default_cursor_field, then ConfiguredAirbyteStream must define a cursor_field.
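The sync_mode constraint lends itself to a small check. The sketch below assumes configured streams are plain dicts shaped like the fields described above; the function name `check_sync_mode` is hypothetical, and an empty supported_sync_modes array is treated as supporting FULL_REFRESH only.

```python
from typing import Any, Mapping


def check_sync_mode(configured_stream: Mapping[str, Any]) -> None:
    """Raise ValueError if sync_mode is not supported by the discovered stream."""
    stream = configured_stream["stream"]
    # An empty or missing array still implies FULL_REFRESH.
    supported = stream.get("supported_sync_modes") or ["FULL_REFRESH"]
    sync_mode = configured_stream["sync_mode"]
    if sync_mode not in supported:
        raise ValueError(
            f"sync_mode {sync_mode!r} is not in supported_sync_modes {supported!r}"
        )


# Hypothetical configured stream that asks for a supported mode.
valid = {
    "stream": {"supported_sync_modes": ["FULL_REFRESH", "INCREMENTAL"]},
    "sync_mode": "INCREMENTAL",
    "cursor_field": ["metadata", "updated_at"],
}
check_sync_mode(valid)  # passes silently

# Hypothetical configured stream that asks for an unsupported mode.
invalid = {"stream": {"supported_sync_modes": []}, "sync_mode": "INCREMENTAL"}
try:
    check_sync_mode(invalid)
    rejected = False
except ValueError:
    rejected = True
```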
This section lays out how a cursor field is determined in the case of a stream that is doing an incremental sync.
- If source_defined_cursor in AirbyteStream is true, then the source determines the cursor field internally. It cannot be overridden. If it is false, continue...
- If cursor_field in ConfiguredAirbyteStream is set, then the source uses that field as the cursor. If it is not set, continue...
- If default_cursor_field in AirbyteStream is set, then the source uses that field as the cursor. If it is not set, continue...
- Illegal - If source_defined_cursor, cursor_field, and default_cursor_field are all falsey, this is an invalid configuration.
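The resolution order above can be written as a short function. This is a sketch under assumptions: configured streams are plain dicts, the name `resolve_cursor_field` is hypothetical, and returning None to mean "the source decides internally" is a convention chosen here for illustration.

```python
from typing import Any, List, Mapping, Optional


def resolve_cursor_field(configured_stream: Mapping[str, Any]) -> Optional[List[str]]:
    """Return the cursor path for an incremental sync, or None if source-defined."""
    stream = configured_stream["stream"]

    # 1. A source-defined cursor cannot be overridden.
    if stream.get("source_defined_cursor"):
        return None

    # 2. A cursor_field set in the configured stream takes precedence.
    if configured_stream.get("cursor_field"):
        return list(configured_stream["cursor_field"])

    # 3. Otherwise fall back to the default declared by the stream.
    if stream.get("default_cursor_field"):
        return list(stream["default_cursor_field"])

    # 4. Nothing is set: the configuration is invalid.
    raise ValueError("no cursor field could be resolved for an incremental sync")


source_defined = resolve_cursor_field(
    {"stream": {"source_defined_cursor": True}, "cursor_field": ["id"]}
)
overridden = resolve_cursor_field(
    {"stream": {"default_cursor_field": ["created_at"]}, "cursor_field": ["updated_at"]}
)
defaulted = resolve_cursor_field(
    {"stream": {"default_cursor_field": ["created_at"]}}
)
```

Here `source_defined` is None even though the configuration supplies a cursor_field, `overridden` is `["updated_at"]`, and `defaulted` is `["created_at"]`. A configured stream with none of the three values set raises ValueError.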