
SOR: downward-compatible bulkUpdate #165434

Closed
pgayvallet opened this issue Sep 1, 2023 · 11 comments · Fixed by #171245
Assignees
Labels
Epic:ZDTmigrations Zero downtime migrations Feature:Saved Objects Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

@pgayvallet
Contributor

pgayvallet commented Sep 1, 2023

In #152807 / #161822 we implemented the BWC update algorithm on the single-object update API.

We now need to do the same for bulkUpdate.

Ideally, the logic should be factorized and reused between the two APIs as much as possible (similar to what we do with internalBulkResolve). This may require refactoring how we implemented update in #161822. We cannot implement single-object update on top of an internalBulkUpdate because the bulk API doesn't support upsert. However, we should share code between the two APIs wherever they perform similar/identical operations (calls to get docs, etc.).
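For illustration, here's a minimal sketch of what that factorization could look like, with single update delegating to the shared bulk path. All names here are hypothetical, not the actual SOR internals:

```typescript
// Hypothetical sketch of factorizing update/bulkUpdate: both APIs share the
// merge step, and single update is just a bulk of one. Names are illustrative.

interface UpdateTarget {
  type: string;
  id: string;
  attributes: Record<string, unknown>;
}

// Shared internal helper: merge the requested changes into the current docs.
function applyUpdates(
  current: Map<string, Record<string, unknown>>,
  targets: UpdateTarget[]
): Array<{ key: string; doc: Record<string, unknown> }> {
  return targets.map(({ type, id, attributes }) => {
    const key = `${type}:${id}`;
    const existing = current.get(key) ?? {};
    return { key, doc: { ...existing, ...attributes } };
  });
}

// Single-object update reuses the same code path.
function applyUpdate(current: Map<string, Record<string, unknown>>, target: UpdateTarget) {
  return applyUpdates(current, [target])[0];
}
```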

@pgayvallet pgayvallet added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Feature:Saved Objects Epic:ZDTmigrations Zero downtime migrations labels Sep 1, 2023
@elasticmachine
Contributor

Pinging @elastic/kibana-core (Team:Core)

@TinaHeiligers
Contributor

Roger!
Let's discuss after testing week.

@TinaHeiligers TinaHeiligers self-assigned this Sep 10, 2023
rudolf added a commit that referenced this issue Sep 26, 2023
## Summary

We've fixed the update limitation (#152807), so it now only applies to bulkUpdate (#165434).

### Checklist

Delete any items that are not applicable to this PR.

- [ ] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [ ] Any UI touched in this PR is usable by keyboard only (learn more
about [keyboard accessibility](https://webaim.org/techniques/keyboard/))
- [ ] Any UI touched in this PR does not create any new axe failures
(run axe in browser:
[FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/),
[Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
- [ ] If a plugin configuration key changed, check if it needs to be
allowlisted in the cloud and added to the [docker
list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [ ] This renders correctly on smaller devices using a responsive
layout. (You can test this [in your
browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))
- [ ] This was checked for [cross-browser
compatibility](https://www.elastic.co/support/matrix#matrix_browsers)


### Risk Matrix

Delete this section if it is not applicable to this PR.

Before closing this PR, invite QA, stakeholders, and other developers to
identify risks that should be tested prior to the change/feature
release.

When forming the risk matrix, consider some of the following examples
and how they may potentially impact the change:

| Risk | Probability | Severity | Mitigation/Notes |
|------|-------------|----------|------------------|
| Multiple Spaces—unexpected behavior in non-default Kibana Space. | Low | High | Integration tests will verify that all features are still supported in non-default Kibana Space and when user switches between spaces. |
| Multiple nodes—Elasticsearch polling might have race conditions when multiple Kibana nodes are polling for the same tasks. | High | Low | Tasks are idempotent, so executing them multiple times will not result in logical error, but will degrade performance. To test for this case we add plenty of unit tests around this logic and document manual testing procedure. |
| Code should gracefully handle cases when feature X or plugin Y are disabled. | Medium | High | Unit tests will verify that any feature flag or plugin combination still results in our service operational. |
| [See more potential risk examples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx) | | | |


### For maintainers

- [ ] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
TinaHeiligers added a commit that referenced this issue Oct 16, 2023
Part of #165434
Moves the unit tests to their own file

Co-authored-by: Kibana Machine <[email protected]>
dej611 pushed a commit to dej611/kibana that referenced this issue Oct 17, 2023
Part of elastic#165434
Moves the unit tests to their own file

Co-authored-by: Kibana Machine <[email protected]>
@TinaHeiligers
Contributor

@pgayvallet bulkUpdate seems to only use _index. Does that mean that every saved object will be upserted if it doesn't exist when using bulkUpdate?
I'm working through the changes needed to invoke bulkUpdate for updating a single object.

@pgayvallet
Contributor Author

pgayvallet commented Oct 26, 2023

bulkUpdate seems to only use _index

We're using the bulk API with update operations:

```typescript
bulkUpdateParams.push(
  {
    update: {
      _id: serializer.generateRawId(getNamespaceId(objectNamespace), type, id),
      _index: commonHelper.getIndexForType(type),
      ...versionProperties,
    },
  },
  {
    doc: {
      ...documentToSave,
      [type]: await encryptionHelper.optionallyEncryptAttributes(
        type,
        id,
        objectNamespace || namespace,
        documentToSave[type]
      ),
    },
  }
);
```

(The bulk API is a bit particular: each operation consists of two parts in the payload, the first being the action metadata (operation, _id, _index...) and the second being the actual doc the operation has to be performed on.)
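To make that two-part shape concrete, here's a minimal sketch of assembling such a _bulk body. The metadata fields follow the Elasticsearch bulk API; the helper name and input shape are made up for illustration:

```typescript
// Each update contributes two consecutive entries to the _bulk body:
// first the action metadata, then the partial document to merge.
type BulkEntry =
  | { update: { _id: string; _index: string } }
  | { doc: Record<string, unknown> };

function buildBulkUpdateBody(
  updates: Array<{ rawId: string; index: string; doc: Record<string, unknown> }>
): BulkEntry[] {
  const body: BulkEntry[] = [];
  for (const { rawId, index, doc } of updates) {
    body.push({ update: { _id: rawId, _index: index } }); // action metadata
    body.push({ doc }); // the doc the operation applies to
  }
  return body;
}
```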

I'm working through the changes needed to invoke bulkUpdate for updating a single object.

Great! Please ping me if I can be of any help 🙂

@TinaHeiligers
Contributor

We're using the bulk API with update operations

Yea, I realized later I was looking at the mget request params.

@TinaHeiligers
Contributor

@pgayvallet I've worked on a few approaches and thought I'd double check with you before attempting yet another.

  1. Why do you suggest using bulkUpdate for updating a single object? The underlying es API may be more performant but we're only dealing with a single object. We'd also need to interpret the result from the bulk request response before responding to the caller. I'm not against it, but I'd like to hear your rationale on that approach first.
  2. I'm interpreting bulkUpdate to do the client-side object updates in memory. We'd need to restrict the API to only allow a certain payload size or number of docs to update, depending on the number of fields requested to change. Alternatively, should we consider restricting the payload size?
  3. How would we handle situations where Kibana suddenly shuts down or loses connection with ES? We might be halfway through client-side updates and lose all of it. I guess this isn't necessarily a data loss but it's not a great UX.
  4. How do we handle a situation where several updates to the same object come from different browsers? Last in first out or first in last out?

I'm sure more questions are gonna come to me but for now, this is what comes to mind.

@pgayvallet
Contributor Author

pgayvallet commented Nov 3, 2023

Why do you suggest using bulkUpdate for updating a single object? The underlying es API may be more performant but we're only dealing with a single object. We'd also need to interpret the result from the bulk request response before responding to the caller. I'm not against it, but I'd like to hear your rationale on that approach first.

Code factorization, mostly. I assumed the overall logic of single update and bulk update was close enough that factorizing, by having the two APIs execute the same underlying logic, would be the best approach in terms of maintainability.

Now, if there are things I overlooked that make this factorization harder than I thought, I'm totally fine keeping the two implementations dissociated, as is the case today. If so, I'd just like to understand the reasons.

If it's "only" about the additional effort of refactoring update (once again), then I would also be totally fine keeping those two tasks (implementing BWC bulk update AND refactoring the BWC update to use the same code path) separate and doing the latter as a follow-up.

I'm interpreting bulkUpdate to do the client-side object updates in memory. We'd need to restrict the API to only allow a certain payload size or number of docs to update, depending on the number of fields requested to change. Alternatively, should we consider restricting the payload size?

This is a very good point, and we absolutely need to put a safeguard in place somewhere.

However, given the current usages of the bulkUpdate API, which AFAIK is only used to update a few documents (a few dozen at most), I don't think the risks are very high, so I think it would be acceptable to keep that as a follow-up enhancement (but please tell me if you think I'm wrong).
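Purely as an illustration of what such a safeguard could look like (the limit, function name, and placement are all hypothetical, not part of the actual SOR):

```typescript
// Hypothetical safeguard (illustrative only): reject oversized bulkUpdate
// calls before doing any client-side migration work.
const MAX_BULK_UPDATE_OBJECTS = 100; // illustrative limit, not a real config key

function assertBulkUpdateWithinLimit(
  objectCount: number,
  max: number = MAX_BULK_UPDATE_OBJECTS
): void {
  if (objectCount > max) {
    throw new Error(
      `bulkUpdate supports at most ${max} objects per call, received ${objectCount}`
    );
  }
}
```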

How would we handle situations where Kibana suddenly shuts down or loses connection with ES? We might be halfway through client-side updates and lose all of it. I guess this isn't necessarily a data loss but it's not a great UX.

As long as we're performing a single _bulk call to update all documents "atomically", I don't think it would be any different from any other scenario where that kind of outage could occur?

Unless you're thinking of something I missed?

How do we handle a situation where several updates to the same object come from different browsers? Last in first out or first in last out?

Not sure I follow. We don't have a "queue" of operations to perform from the repository. If multiple update calls are performed concurrently, they will be executed concurrently too.

We just need to make sure we're properly using the version properties in the update payload to protect in case of conflicts, as it is done today in bulkUpdate:

```typescript
{
  update: {
    _id: serializer.generateRawId(getNamespaceId(objectNamespace), type, id),
    _index: commonHelper.getIndexForType(type),
    ...versionProperties,
  },
},
```

and as we did in update:

```typescript
// using version from the source doc if not provided as option to avoid erasing changes in case of concurrent calls
...decodeRequestVersion(version || migrated!.version),
```

Note that you made me doubt myself, so I checked whether the index bulk operation supports version fields as the update bulk operation does, and if I trust the client typings, it does.
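For context, here's a simplified sketch of how those version properties provide conflict protection: the saved object version string decodes to a seq_no/primary_term pair, and passing those as if_seq_no/if_primary_term makes ES reject the write if the document changed in between. This is a simplification of the real encodeVersion/decodeRequestVersion helpers, so treat the exact encoding as an assumption:

```typescript
// Simplified sketch (assumed encoding): a saved object version is roughly
// base64(JSON([_seq_no, _primary_term])). Decoding it yields the
// if_seq_no / if_primary_term params used for optimistic concurrency control.

function encodeVersion(seqNo: number, primaryTerm: number): string {
  return Buffer.from(JSON.stringify([seqNo, primaryTerm])).toString("base64");
}

function decodeRequestVersion(version: string): {
  if_seq_no: number;
  if_primary_term: number;
} {
  const [seqNo, primaryTerm] = JSON.parse(
    Buffer.from(version, "base64").toString("utf8")
  ) as [number, number];
  return { if_seq_no: seqNo, if_primary_term: primaryTerm };
}
```

With these params on an update (or bulk update) operation, ES fails the write with a version conflict if the doc's current seq_no no longer matches.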

@TinaHeiligers
Contributor

Now, if there are things I overlooked that make this factorization somehow harder than I thought, I'm totally fine keeping the two implementations dissociated as it is the case today. If so, I'd just like to understand the reasons.

  1. bulkUpdate doesn't appear to support the upsert action, which we need for update. Now I may be overlooking some hidden option somewhere but it's not obvious to me. We would need to extend the options to handle upsert for each object to support a single doc update, which should be doable.
  2. bulkUpdate also doesn't appear to check if an object is shared, at least not obviously either.

These were the initial questions I had. I'll follow up on your other responses shortly.

@pgayvallet
Contributor Author

bulkUpdate doesn't appear to support the upsert action, which we need for update

I absolutely overlooked that, and it seems like a very good reason not to factorize the implementation of the two APIs.

@TinaHeiligers
Contributor

I absolutely overlooked that, and that seems like a very good reason to not want to factorize the implementation of the two APIs

It's decided then: factorize update internals only as far as needed to reuse the internal helpers in bulkUpdate, rather than trying to work around bulkUpdate not supporting upsert.

@TinaHeiligers
Contributor

TinaHeiligers commented Nov 20, 2023

Progress update:

Based on the discussions in this issue, #171245 only required a few minor changes to ensure BWC for the SO bulkUpdate API.

The main changes are:

  • Fetch all documents, regardless of the namespace type
  • Migrate them to the client version
  • Update each document client-side
  • Migrate the updated documents to the version the server is on
  • Index the updated docs
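The steps above can be sketched as a small pipeline. The doc shape and the migrateDown/migrateUp helpers are stand-ins for the real migrator and ES client, so treat all names as assumptions:

```typescript
// Sketch of the BWC bulkUpdate flow: docs are fetched upstream, migrated down
// to the request (client) version, updated in memory, migrated back up, then
// indexed by the caller. Types and helper names are illustrative.

type Doc = { id: string; attributes: Record<string, unknown> };

function bwcBulkUpdate(
  fetched: Doc[],
  changes: Map<string, Record<string, unknown>>,
  migrateDown: (d: Doc) => Doc, // to the request (client) version
  migrateUp: (d: Doc) => Doc // back to the server's version
): Doc[] {
  return fetched
    .map(migrateDown) // migrate to the client version
    .map((doc) => ({
      ...doc,
      // apply the requested changes client-side
      attributes: { ...doc.attributes, ...(changes.get(doc.id) ?? {}) },
    }))
    .map(migrateUp); // migrate back up; the caller indexes the result
}
```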

Other minor changes include type updates and adapting unit test helpers to take the additional changes into account.

Test status for these changes to bulkUpdate is as follows:

  • bulkUpdate unit tests refactored & passing. The PR includes additional tests for migrations

  • src/core/server/integration_tests/saved_objects/routes/bulk_update.test.ts passing locally

  • src/core/server/integration_tests/saved_objects/service/lib/bulk_update.test.ts added (version changes)

  • test/api_integration/apis/saved_objects/bulk_update.ts

  • x-pack/test/saved_object_api_integration/security_and_spaces (basic & trial license configs)

  • x-pack/test/saved_object_api_integration/spaces_only

  • (TODO): Ensure Kibana runs and updates documents in all supported environments

Open questions:

  • How can we monitor how much memory is consumed by client-side updates?
  • How can we prevent (or minimize) the chance of data loss if Kibana can't index updated documents (e.g. during temporary connectivity issues or ES being unavailable)?
  • ...other unknowns
