Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update UpdateForV9 in AttachmentProcessor #118186

Conversation

PeteGillinElastic
Copy link
Member

We are not going to make this change in V9. We may do it in V10. This change just bumps the annotation to remind us to revisit.

Since we are living with this for a while, it seems worth improving the documentation. This now encourages explicitly setting the option one way or the other, since you get a warning if you omit it. It also changes the existing examples to use true rather than false, as that's our recommendation. And it adds a new section with an example where it's true, and moves the content previously in a note into that section.

Copy link
Contributor

github-actions bot commented Dec 6, 2024

Documentation preview:

@PeteGillinElastic PeteGillinElastic force-pushed the v9-attachment-processor-remove-binary-cleanup branch from ffe8647 to c13b7c7 Compare December 9, 2024 09:11
@PeteGillinElastic PeteGillinElastic added >non-issue :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP v8.18.0 labels Dec 9, 2024
We are not going to make this change in V9. We may do it in V10. This
change just bumps the annotation to remind us to revisit.

Since we are living with this for a while, it seems worth improving
the documentation. This now encourages explicitly setting the option
one way or the other, since you get a warning if you omit it. It also
changes the existing examples to use true rather than false, as that's
our recommendation. And it adds a new section with an example where
it's true, and moves the content previously in a note into that
section.
@PeteGillinElastic PeteGillinElastic force-pushed the v9-attachment-processor-remove-binary-cleanup branch from c13b7c7 to ebc64bc Compare December 9, 2024 09:37
@PeteGillinElastic PeteGillinElastic marked this pull request as ready for review December 9, 2024 10:42
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Dec 9, 2024
| `indexed_chars_field` | no | `null` | Field name from which you can overwrite the number of chars being used for extraction. See `indexed_chars`.
| `properties` | no | all properties | Array of properties to select to be stored. Can be `content`, `title`, `name`, `author`, `keywords`, `date`, `content_type`, `content_length`, `language`
| `ignore_missing` | no | `false` | If `true` and `field` does not exist, the processor quietly exits without modifying the document
| `remove_binary` | encouraged | `false` | If `true`, the binary `field` will be removed from the document. This option is not required, but setting it explicitly is encouraged, and omitting it will result in a warning.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure of the best way to convey this state of affairs... Alternative suggestions welcome!

@UpdateForV9(owner = UpdateForV9.Owner.DATA_MANAGEMENT)
// update the [remove_binary] default to be 'true' assuming enough time has passed. Deprecated in September 2022.
@UpdateForV10(owner = UpdateForV10.Owner.DATA_MANAGEMENT)
// Revisit whether we want to update the [remove_binary] default to be 'true' - would need to find a way to do this safely
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

N.B. I'm not currently changing the deprecation message (which says that the property "will be set to 'true' in a future release", when actually we're not committed to that) because we don't know for sure what we're going to do in V10 and it seems better to (a) stay with the status quo, and (b) err on the side of making an overly-strong warning. I'm open to suggestions for rewording, though.

Copy link
Member

@masseyke masseyke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we could change the default, but given that we can't, LGTM.

@PeteGillinElastic
Copy link
Member Author

Thanks @masseyke!

@PeteGillinElastic PeteGillinElastic merged commit bc25a73 into elastic:main Dec 9, 2024
16 checks passed
@PeteGillinElastic PeteGillinElastic deleted the v9-attachment-processor-remove-binary-cleanup branch December 9, 2024 14:28
@PeteGillinElastic
Copy link
Member Author

💚 All backports created successfully

Status Branch Result
8.x

Questions ?

Please refer to the Backport tool documentation

PeteGillinElastic added a commit to PeteGillinElastic/elasticsearch that referenced this pull request Dec 9, 2024
We are not going to make this change in V9. We may do it in V10. This
change just bumps the annotation to remind us to revisit.

Since we are living with this for a while, it seems worth improving
the documentation. This now encourages explicitly setting the option
one way or the other, since you get a warning if you omit it. It also
changes the existing examples to use true rather than false, as that's
our recommendation. And it adds a new section with an example where
it's true, and moves the content previously in a note into that
section.

(cherry picked from commit bc25a73)

# Conflicts:
#	modules/ingest-attachment/src/main/java/org/elasticsearch/ingest/attachment/AttachmentProcessor.java
@PeteGillinElastic
Copy link
Member Author

💚 All backports created successfully

Status Branch Result
8.x

Questions ?

Please refer to the Backport tool documentation

PeteGillinElastic added a commit that referenced this pull request Dec 9, 2024
Improve the docs around `remove_binary` in `attachment`

Since we are living with this for a while, it seems worth improving
the documentation. This now encourages explicitly setting the option
one way or the other, since you get a warning if you omit it. It also
changes the existing examples to use true rather than false, as that's
our recommendation. And it adds a new section with an example where
it's true, and moves the content previously in a note into that
section.

(cherry picked from commit bc25a73)

# Conflicts:
#	modules/ingest-attachment/src/main/java/org/elasticsearch/ingest/attachment/AttachmentProcessor.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >non-issue Team:Data Management Meta label for data/management team v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants