[BUG] OpenSearch metadata log doesn't sanitize index name before generating log entry id #555

Closed
seankao-az opened this issue Aug 9, 2024 · 1 comment
Assignees
Labels
0.5 bug Something isn't working

Comments

@seankao-az
Collaborator

What is the bug?
The metadata log doesn't sanitize the index name before generating the log entry id. As a result, SHOW FLINT INDEX and FlintSpark.describeIndexes cannot find the log entry for an index whose name contains special characters.

This mostly happens when a custom datasource allows special characters in table names. For example, if a custom datasource has a table test/special, the names of its skipping index and covering index both contain the special character /. #215 allows such an index to be created in OpenSearch by percent-encoding the special characters.

In FlintSpark.describeIndexes, getAllIndexMetadata returns a Map of (sanitized index name -> metadata), and the metadata log entry is then fetched using the returned (sanitized) index name. However, when the metadata log entry for an index is created, the entry id is generated from the unsanitized index name. The two ids differ, so the lookup cannot find the corresponding log entry.
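
To make the mismatch concrete, here is a minimal Python sketch of the behavior described above. It is not the Flint code: `sanitize_index_name`, `log_entry_id`, `write_log_entry`, and `describe_index` are hypothetical names, the percent encoding is approximated with `urllib.parse.quote`, and the Base64 id scheme is only an assumption (any deterministic function of the name shows the same problem).

```python
import base64
from urllib.parse import quote


def sanitize_index_name(name: str) -> str:
    """Hypothetical stand-in for the percent encoding from #215."""
    return quote(name, safe="")


def log_entry_id(index_name: str) -> str:
    """Assumed id scheme: a deterministic function of the index name."""
    return base64.b64encode(index_name.encode("utf-8")).decode("ascii")


def write_log_entry(log: dict, index_name: str, state: str, sanitize: bool) -> None:
    # sanitize=False models the bug: the id is derived from the raw name.
    name = sanitize_index_name(index_name) if sanitize else index_name
    log[log_entry_id(name)] = state


def describe_index(log: dict, sanitized_name: str) -> str:
    # The lookup uses the sanitized name, as returned by the metadata map.
    return log.get(log_entry_id(sanitized_name), "unavailable")


raw_name = "flint_mys3_default_http_logs_test/special_index"
stored_name = sanitize_index_name(raw_name)

# Buggy path: entry written under the unsanitized id, read with the sanitized id.
buggy_log = {}
write_log_entry(buggy_log, raw_name, "active", sanitize=False)
buggy_status = describe_index(buggy_log, stored_name)

# Fixed path: both write and read derive the id from the sanitized name.
fixed_log = {}
write_log_entry(fixed_log, raw_name, "active", sanitize=True)
fixed_status = describe_index(fixed_log, stored_name)

print(buggy_status, fixed_status)
```

In this sketch the fix is simply to sanitize the name before deriving the id on the write path, so both sides agree on the key.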

How can one reproduce the bug?
Spark doesn't allow creating a table whose name contains special characters, so we can't test this directly. We can simulate the behavior by creating a covering index or materialized view with special characters in its name and then running SHOW FLINT INDEX. The returned index status will be unavailable:

  1. CREATE INDEX `test/special` ON mys3.default.http_logs (status)
  2. SHOW FLINT INDEX IN mys3
[
	"flint_mys3_default_http_logs_test/special_index",
	"covering",
	"default",
	"http_logs",
	"test/special",
	false,
	"unavailable" // should be "active"
],
@seankao-az seankao-az added bug Something isn't working untriaged labels Aug 9, 2024
@seankao-az seankao-az self-assigned this Aug 9, 2024
@seankao-az seankao-az added 0.5 and removed untriaged labels Aug 9, 2024
@seankao-az seankao-az changed the title [BUG] Metadata log doesn't sanitize index name before generating log entry id [BUG] OpenSearch metadata log doesn't sanitize index name before generating log entry id Aug 9, 2024
@dai-chen
Collaborator

dai-chen commented Aug 26, 2024

@seankao-az Is this already fixed in the PR above?
