-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(plugin-server): Remove Postgres-based plugin error logging in favor of existing ClickHouse-based approaches #18764
Conversation
…improves the chances we'll hit our delivery targets (and also makes this easier to test.)
…ith `APP_METRICS_FLUSH_FREQUENCY_MS=0`.
Size Change: 0 B Total Size: 2 MB ℹ️ View Unchanged
|
plugin-server/README.md
Outdated
@@ -47,7 +47,7 @@ testing: | |||
|
|||
1. run docker `docker compose -f docker-compose.dev.yml up` (in posthog folder) | |||
1. setup the test DBs `pnpm setup:test` | |||
1. start the plugin-server with `CLICKHOUSE_DATABASE='default' DATABASE_URL=postgres://posthog:posthog@localhost:5432/test_posthog RELOAD_PLUGIN_JITTER_MAX_MS=0 pnpm start:dev` | |||
1. start the plugin-server with `APP_METRICS_FLUSH_FREQUENCY_MS=0 CLICKHOUSE_DATABASE='default' DATABASE_URL=postgres://posthog:posthog@localhost:5432/test_posthog RELOAD_PLUGIN_JITTER_MAX_MS=0 pnpm start:dev` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having this here (and in plugin-server/bin/ci_functional_tests.sh
) feels like a hack.
I think the ideal approach would be to have the functional test setup/teardown control the server itself and provide any ad hoc test-specific configurations like this — but without taking the time to better increase test isolation here, I'm not sure that's worth doing. Not sure if there is a better compromise approach here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you, this smells but we can live with it a bit more, then do a spring cleaning.
BTW PLUGINS_DEFAULT_LOG_LEVEL=0
should also be added here, as it's in the script, and needed by waitForPluginToLoad
, could you please add it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this in 3392e15.
const appMetric = await waitForExpect(async () => { | ||
const appMetrics = await fetchPluginAppMetrics(pluginConfig.id) | ||
expect(appMetrics.length).toEqual(1) | ||
return appMetrics[0] | ||
}) | ||
|
||
expect(error.message).toEqual('error thrown in plugin') | ||
}) | ||
expect(appMetric.successes).toEqual(1) | ||
expect(appMetric.failures).toEqual(0) | ||
|
||
const errorLogEntry = await waitForExpect(async () => { | ||
const errorLogEntries = (await fetchPluginLogEntries(pluginConfig.id)).filter( | ||
(entry) => entry.type == PluginLogEntryType.Error | ||
) | ||
expect(errorLogEntries.length).toBe(1) | ||
return errorLogEntries[0] | ||
}) | ||
|
||
expect(errorLogEntry.message).toContain('error thrown in plugin') | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a subtle distinction here that I hadn't originally considered until making this change: it's possible for a plugin execution to succeed (and be marked as such in app metrics), but also log an error at some later time in a case similar to this one where an un-awaited promise is later rejected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All plugins will be non-async in the future ... that's my current plan at least.
export async function processEvent(event) { | ||
void new Promise((_, rejects) => { rejects(new Error('error thrown in plugin')) }).then(() => {}) | ||
return event | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding of the Promise
API is that this test and the one below it ("plugin method tests: creates error on unhandled promise errors") are functionally identical, so I consolidated them into a single test.
expect(appMetric.successes).toEqual(0) | ||
expect(appMetric.failures).toEqual(1) | ||
expect(appMetric.error_type).toEqual('Error') | ||
expect(JSON.parse(appMetric.error_details as string)).toMatchObject({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like there has to be a cleaner way to write this JSON.parse(…)
to avoid getting a TypeScript error.
In Python, I'd do something similar to this to get both the explicit runtime check validation as well as type narrowing, rather than relying on JSON.parse
to do something sensible on undefined
:
assert error_details is not None
assert json.loads(error_details) == ...
Is there some equivalent to that here?
Edit: I guess in this case, JSON.parse(appMetric.error_details!)
is a more terse way to express the same idea — but that still doesn't really have the same runtime behavior.
await hub.db.postgres.query( | ||
PostgresUse.COMMON_WRITE, | ||
'UPDATE posthog_pluginconfig SET error = $1 WHERE id = $2', | ||
// NOTE: In theory `onEvent` shouldn't be seeing events that still have the null byte, but | ||
// it's better to be safe than sorry and sanitize the value here as well. | ||
[sanitizeJsonbValue(pluginError), typeof pluginConfig === 'object' ? pluginConfig?.id : pluginConfig], | ||
'updatePluginConfigError' | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the important part that was removed.
(The queuePluginLogEntry
call below was moved up to the original call site for setError
in plugin-server/src/utils/db/error.ts
.)
@@ -53,6 +53,21 @@ interface QueuedMetric { | |||
metric: AppMetricIdentifier | |||
} | |||
|
|||
/** An aggregated AppMetric, as written to/read from a ClickHouse row. */ | |||
export interface RawAppMetric { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw some other cases of Raw
- prefix in the codebase for this type of intermediate representation, so just followed that pattern. ClickHouseAppMetric
or AppMetricRow
or something like that would also work.
if (now - this.lastFlushTime > this.flushFrequencyMs || this.queueSize > this.maxQueueSize) { | ||
await this.flush() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was moved down to the end of this method to ensure that this block would reliably be entered during the test suite.
def get_error(self, plugin_config: PluginConfig) -> None: | ||
# Reporting the single latest error is no longer supported: use app | ||
# metrics (for fatal errors) or plugin log entries (for all errors) for | ||
# error details instead. | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This just turns this block effectively into a constant with the non-error state which is probably good enough for now but also not ideal.
If we wanted to preserve the existing behavior, we could check the log entries table in a similar fashion to how delivery rate is added to this serializer. We wouldn't have the same degree of detail here from log entry data (though we weren't actually using the data we were writing other than as a truthy value), so the type of this would need to change on the frontend to accommodate that.
I also didn't start to remove the error
field on PluginConfigType
and all the related bits (PluginErrorType
, etc.) since I don't really know my way around the frontend yet. Happy to do that as a follow up to learn more.
Moved this to draft, still a couple of tests that are using the plugin config error that I didn't come across during my previous searching around. |
…or` to appease the ESLint gods.)
…what is happening here.
3061bd0
to
afbe6b7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ended up being a lot more than I expected. 🙂
beforeEach(async () => { | ||
// Use fake timers to ensure that we don't need to wait on e.g. retry logic. | ||
jest.useFakeTimers({ advanceTimers: true }) | ||
}) | ||
|
||
beforeAll(async () => { | ||
jest.useFakeTimers({ advanceTimers: true }) | ||
;[hub, closeHub] = await createHub() | ||
redis = await hub.redisPool.acquire() | ||
await hub.postgres.query(PostgresUse.COMMON_WRITE, POSTGRES_DELETE_TABLES_QUERY, null, 'deleteTables') // Need to clear the DB to avoid unique constraint violations on ids | ||
}) | ||
|
||
afterEach(() => { | ||
jest.clearAllTimers() | ||
jest.useRealTimers() | ||
jest.clearAllMocks() | ||
}) | ||
|
||
afterAll(async () => { | ||
afterEach(async () => { | ||
await hub.redisPool.release(redis) | ||
await teardownPlugins(hub) | ||
await closeHub() | ||
jest.clearAllTimers() | ||
jest.useRealTimers() | ||
jest.restoreAllMocks() | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's some sort of state leakage happening between these tests due to the Kafka producer that is mocked over:
posthog/plugin-server/tests/main/ingestion-queues/run-async-handlers-event-pipeline.test.ts
Lines 109 to 111 in e96da9d
jest.spyOn(hub.kafkaProducer.producer, 'produce').mockImplementation( | |
(topic, partition, message, key, timestamp, headers, cb) => cb(error) | |
) |
I couldn't get to the bottom of the actual cause in a reasonable time so I'm not exactly sure what happened. Resetting everything seems to fix it, though.
import { PluginEvent } from '@posthog/plugin-scaffold' | ||
|
||
import { waitForExpect } from '../../functional_tests/expectations' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was already imported like this in plugin-server/tests/main/ingestion-queues/session-recording/session-recordings-consumer.test.ts
so I don't feel too bad about it, but it seems like this helper could use a different home? (Or is there some generic retry that we could use instead?)
// verify the teardownPlugin code runs -- since we're reading from | ||
// ClickHouse, we need to give it a bit of time to have consumed from | ||
// the topic and written everything we're looking for to the table | ||
await waitForExpect(async () => { | ||
const logEntries = await getLogEntriesForPluginConfig(hub!, pluginConfig39.id) | ||
|
||
const systemErrors = logEntries.filter( | ||
(logEntry) => | ||
logEntry.source == PluginLogEntrySource.System && logEntry.type == PluginLogEntryType.Error | ||
) | ||
expect(systemErrors).toHaveLength(1) | ||
expect(systemErrors[0].message).toContain('Plugin failed to unload') | ||
|
||
const pluginErrors = logEntries.filter( | ||
(logEntry) => | ||
logEntry.source == PluginLogEntrySource.Plugin && logEntry.type == PluginLogEntryType.Error | ||
) | ||
expect(pluginErrors).toHaveLength(1) | ||
expect(pluginErrors[0].message).toContain('This Happened In The Teardown Palace') | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test (and the one below) was kind of weird before in how it uses the error state of the plugin to check that some code was/was not run, but it's even more weird now that it relies on the whole Kafka and ClickHouse plugin log pipeline. There's surely some more reasonable approach here, but I didn't take the time to look into what that would be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests (moreso the ones below, I guess) are also going to be impacted by #18816, probably worth figuring out a less brittle approach at this point. (Ignore this, missed this part: d61ae7b#diff-60e0eee9e3c167ae9c1edd68c3e7fbc753e2eb3b37b5aa0441f89240a972b9a7R19)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good on a backend side. Happy with the stubbing you're doing in django, you'll want to sync with @tiina303 about how we'll surface it in the new app management screens
plugin-server/README.md
Outdated
@@ -47,7 +47,7 @@ testing: | |||
|
|||
1. run docker `docker compose -f docker-compose.dev.yml up` (in posthog folder) | |||
1. setup the test DBs `pnpm setup:test` | |||
1. start the plugin-server with `CLICKHOUSE_DATABASE='default' DATABASE_URL=postgres://posthog:posthog@localhost:5432/test_posthog RELOAD_PLUGIN_JITTER_MAX_MS=0 pnpm start:dev` | |||
1. start the plugin-server with `APP_METRICS_FLUSH_FREQUENCY_MS=0 CLICKHOUSE_DATABASE='default' DATABASE_URL=postgres://posthog:posthog@localhost:5432/test_posthog RELOAD_PLUGIN_JITTER_MAX_MS=0 pnpm start:dev` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you, this smells but we can live with it a bit more, then do a spring cleaning.
BTW PLUGINS_DEFAULT_LOG_LEVEL=0
should also be added here, as it's in the script, and needed by waitForPluginToLoad
, could you please add it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice 🥳
I'll take care of the frontend part ;)
const appMetric = await waitForExpect(async () => { | ||
const errorMetrics = await fetchPluginAppMetrics(pluginConfig.id) | ||
expect(errorMetrics.length).toEqual(1) | ||
return errorMetrics[0] | ||
}) | ||
|
||
test.concurrent(`plugin method tests: creates error on unhandled rejection`, async () => { | ||
const plugin = await createPlugin({ | ||
organization_id: organizationId, | ||
name: 'test plugin', | ||
plugin_type: 'source', | ||
is_global: false, | ||
source__index_ts: ` | ||
export async function processEvent(event) { | ||
void new Promise((_, rejects) => { rejects(new Error('error thrown in plugin')) }).then(() => {}) | ||
return event | ||
} | ||
`, | ||
}) | ||
const teamId = await createTeam(organizationId) | ||
const pluginConfig = await createAndReloadPluginConfig(teamId, plugin.id) | ||
expect(appMetric.successes).toEqual(0) | ||
expect(appMetric.failures).toEqual(1) | ||
expect(appMetric.error_type).toEqual('Error') | ||
expect(JSON.parse(appMetric.error_details!)).toMatchObject({ | ||
error: { message: 'error thrown in plugin' }, | ||
event: { properties: event.properties }, | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've learned to really like doing a single check, because then if the test fails I see directly what it was instead and it's quicker for me to know what broke, i.e.
await waitForExpect(async () => {
const errorMetrics = await fetchPluginAppMetrics(pluginConfig.id)
expect(errorMetrics).toBe([
successes: 0
failures: 1
error_type: 'Error'
error_details: ...
])
})
Does this make sense? Not expecting you to change an already merged PR more for maybe let's do this in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that makes sense! (I do need to spend some time getting more familiar with the different expect
matchers so that this comes to mind more naturally.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure my code there is valid btw - I let copilot write it for me normally 😅
const appMetric = await waitForExpect(async () => { | ||
const appMetrics = await fetchPluginAppMetrics(pluginConfig.id) | ||
expect(appMetrics.length).toEqual(1) | ||
return appMetrics[0] | ||
}) | ||
|
||
expect(error.message).toEqual('error thrown in plugin') | ||
}) | ||
expect(appMetric.successes).toEqual(1) | ||
expect(appMetric.failures).toEqual(0) | ||
|
||
const errorLogEntry = await waitForExpect(async () => { | ||
const errorLogEntries = (await fetchPluginLogEntries(pluginConfig.id)).filter( | ||
(entry) => entry.type == PluginLogEntryType.Error | ||
) | ||
expect(errorLogEntries.length).toBe(1) | ||
return errorLogEntries[0] | ||
}) | ||
|
||
expect(errorLogEntry.message).toContain('error thrown in plugin') | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All plugins will be non-async in the future ... that's my current plan at least.
…or of existing ClickHouse-based approaches (#18764)
…18949) * fix: re-add background to tertiary button * final color * Update UI snapshots for `chromium` (1) * Fix button styles to allow secondary to be transparent * Update UI snapshots for `chromium` (1) * Update UI snapshots for `chromium` (2) * all styles in default state * button active states * fix full width * remove custom nav styling as buttons now 3px shorter * tertiary danger hover highlight * muted stealth when not active * button sizes * move min-height to chrome * side buttons in primary * change sidepanel buttons to tertiary * Update UI snapshots for `chromium` (1) * Update UI snapshots for `chromium` (2) * fix border radius * dark mode colors * remove unused styles * fix: set metadata when level change (#18899) * set user analytics metadata when membership level is changed * also set whenever user is updated since that includes updating current org * fix tests * fix other tests * style(3000): Align border radius to 6px in 3000 (#18866) * style: Align border radius to 6px in 3000 * Only affect .posthog-3000 * Also tune some non-4px radii * Add radius comment Co-authored-by: David Newell <[email protected]> * Fix missing warehouse mocks and strictly prevent toasts * Don't affect utility classes --------- Co-authored-by: David Newell <[email protected]> * fix(3000): various ui fixes (#18861) * subtle links in the funnel legend * tooltip styles * funnel glyphs and line * retention table * gray lettermark * breakdown tag * Update UI snapshots for `webkit` (2) * Update UI snapshots for `chromium` (1) * consistent bg naming * Update UI snapshots for `chromium` (2) * Update UI snapshots for `chromium` (1) * Update UI snapshots for `chromium` (1) * fix retention table css --------- Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> * fix(plugin-server): Remove Postgres-based plugin error logging in favor of existing ClickHouse-based approaches (#18764) * fix: delay usage report sending on sunday (#18900) delay usage report sending on sunday * feat(hogql): run filter based insights via hogql (undo revert) (#17645) * chore(data-warehouse): fix close button (#18918) fix close button * feat: Only show other recordings if there are no pinned ones by default (#18915) * fix(trends): convert any missing property operators to the default of exact (#18895) * Convert any missing property operators to the default of exact * Handle undefined values * Check for undefined values * feat(3000): Make nav usable on mobile (#18912) * chore(3000): updated tag colors (#18786) * updated tag colors * darkened warning text * Enable 3000 snapshots * Update UI snapshots for `webkit` (2) * Update UI snapshots for `chromium` (1) * Update UI snapshots for `chromium` (2) * Lint CSS --------- Co-authored-by: Michael Matloka <[email protected]> Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> * chore(plugin server): add OS version to person properties (#18924) * feat: Concise titles for the top nav (#18856) * fix(groups): add debounce to search (#18927) * style(3000): Clean up segmented buttons (#18897) * style(3000): Make segmented buttons feel like IRL radio buttons * Fix font size * Revert full press * Update query snapshots * Update UI snapshots for `chromium` (1) * Update UI snapshots for `chromium` (2) --------- Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> * chore: update es build (#18917) * chore: update ESBuild * chore: update ESBuild * more logging collapsing * Fix * fiddling * Update UI snapshots for `chromium` (2) * Update UI snapshots for `chromium` (2) * end when finished if not --dev --------- Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> * chore(plugin-server): add deprecated imports list, log if they are us… (#18726) * chore(3000): More styling tweaks (#18923) * Align more rounding * Align switch. select, input sizing with buttons * Resolve extra UI snags * Fix 3000-dependency * Fix one more stray divider * Remove legacy `description` prop of `PageHeader` * Update UI snapshots for `webkit` (2) * Update UI snapshots for `chromium` (1) * Update UI snapshots for `chromium` (2) * Make "New insight" button regular size * Update UI snapshots for `chromium` (2) --------- Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> * feat(surveys): be able to drag and reorder questions (#18763) * reorder questions wip * reorder questions * address comments and open question box to where it's dragged * import linter fix * key prop error * rotate drag handle icon * add back auto disappear * chore(plugin-server): only inject imports that are used by the plugin and remove unused plugin imports (#18934) * fix: Swap canvas state var and remove it on unmount (#18922) * fix: Support modal url changes (#18937) * feat(3000): Relabeled apps to data pipeline, re-ordered nav (#18919) * relabeled apps to data pipeline, re-ordered nav * Update UI snapshots for `chromium` (1) * RIP connectors → apps * Update UI snapshots for `chromium` (1) * Update UI snapshots for `chromium` (1) * Update UI snapshots for `chromium` (2) * Update UI snapshots for `chromium` (2) --------- Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> * chore(plugin-server): bump node-rdkafka to 2.18.0 (librdkafka 2.3.0) (#18420) * revert(plugin-server): bump node-rdkafka to 2.18.0 (librdkafka 2.3.0) (#18420) (#18941) * feat: Fixed up feature previews panel (#18931) * fix: PersonDisplay links should be the new subtle style (#18932) * fix: Added missing 3000 parts for exporter (#18913) * fix: e2e tests flakiness with unit image (#18933) * Change celery metrics port so it doesn't clash with webserver * Change back to unit image * chore(deps): bump aiohttp from 3.8.6 to 3.9.0 (#18920) * chore(deps): bump aiohttp from 3.8.6 to 3.9.0 Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.8.6 to 3.9.0. - [Release notes](https://github.com/aio-libs/aiohttp/releases) - [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst) - [Commits](aio-libs/aiohttp@v3.8.6...v3.9.0) --- updated-dependencies: - dependency-name: aiohttp dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> * update properly --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Paul D'Ambra <[email protected]> * reset snapshots * reset button snapshots * merge cleanup * final merge * remove old snapshots * Update UI snapshots for `chromium` (1) * Update UI snapshots for `chromium` (2) * reset snapshots * change onboaring to primary-alt in 3000 * add back in upwards motion on hover * fix segmented button styles * restore button heights * padding adjacent to icons in buttons * side action hover state * Update UI snapshots for `chromium` (1) * reset snapshots * danger colors and unused color cleanup * darken icons on hover * restore all chrome hover / press depths * increase navbar padding * fix clearable select * fix inspector button in session replay * Update UI snapshots for `chromium` (1) * Update UI snapshots for `chromium` (2) * fix rebase issues * retain legacy lemon select styles * improve active / hover states on segmented button * Update UI snapshots for `chromium` (1) * Update UI snapshots for `chromium` (2) * Update UI snapshots for `chromium` (1) * Update UI snapshots for `chromium` (2) * Update UI snapshots for `chromium` (1) * Avoid extra margin between 3D buttons This allows for a slightly nicer overlaying effect * Update UI snapshots for `chromium` (1) * Update query snapshots * Update query snapshots * Update UI snapshots for `chromium` (1) * Update query snapshots * Update UI snapshots for `chromium` (1) * Update query snapshots * reset snapshots * always show danger icon opacity as 1 * stop content overflow * set full width horizontal padding * Update UI snapshots for `chromium` (1) * Update UI snapshots for `chromium` (2) * rotate shadows on sidepanel buttons * Update UI snapshots for `chromium` (1) --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Ben White <[email protected]> Co-authored-by: Raquel Smith <[email protected]> Co-authored-by: Thomas Obermüller <[email protected]> Co-authored-by: ted kaemming <[email protected]> Co-authored-by: Eric Duong <[email protected]> Co-authored-by: Tom Owers <[email protected]> Co-authored-by: Michael Matloka <[email protected]> Co-authored-by: Cory Watilo <[email protected]> Co-authored-by: Michael Matloka <[email protected]> Co-authored-by: Juraj Majerik <[email protected]> Co-authored-by: Paul D'Ambra <[email protected]> Co-authored-by: Brett Hoerner <[email protected]> Co-authored-by: Li Yi Yu <[email protected]> Co-authored-by: Frank Hamand <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Problem
When a plugin fails repeatedly, the large volume of rapid updates to a single row caused by updating the
error
column can lead to lock contention which negatively impacts the performance of the ingestion pipeline.Fatal errors in plugin execution are already recorded in app metrics, and all errors (including unhandled promise rejections) are already recorded in plugin logs. These destinations are more tolerant of this type of write workload, so we can use them instead.
How did you test this code?
Updated functional tests.