
Pluggable alternatives to translog functionality using existing market solutions #1277

Closed
anasalkouz opened this issue Sep 23, 2021 · 3 comments
Labels
discuss (Issues intended to help drive brainstorming and decision making), distributed framework, feature (New feature or request), Indexing & Search

Comments

@anasalkouz
Member

Instead of saving the translog to the file system and taking care of replicating it to replica nodes, or of backing up the files (once segment replication is introduced), we can extend the functionality and give OpenSearch's clients the flexibility to choose one of the existing solutions in the market to back up or store OpenSearch operations. For example, we could use Kinesis or Kafka.
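To make the idea concrete, here is a minimal sketch of the shape such a plugin point could take. All names here (`TranslogStore`, `append`, `replay`) are hypothetical illustrations, not an actual OpenSearch API; a Kafka- or Kinesis-backed implementation would slot in behind the same interface.

```python
# Hypothetical pluggable-translog sketch; these names do not exist in
# OpenSearch and only illustrate the plugin-point shape being discussed.
from abc import ABC, abstractmethod


class TranslogStore(ABC):
    """Abstract destination for translog operations
    (file system, Kafka, Kinesis, object store, ...)."""

    @abstractmethod
    def append(self, op: bytes) -> int:
        """Durably record one operation; return its sequence number."""

    @abstractmethod
    def replay(self, from_seq_no: int) -> list[bytes]:
        """Return all operations at or after from_seq_no, for recovery."""


class InMemoryTranslogStore(TranslogStore):
    """Trivial in-memory implementation, standing in for a real backend."""

    def __init__(self):
        self._ops: list[bytes] = []

    def append(self, op: bytes) -> int:
        self._ops.append(op)
        return len(self._ops) - 1

    def replay(self, from_seq_no: int) -> list[bytes]:
        return self._ops[from_seq_no:]


store: TranslogStore = InMemoryTranslogStore()
store.append(b'{"index": {"_id": "1"}}')
store.append(b'{"index": {"_id": "2"}}')
print(store.replay(1))  # operations a replica would need from seq_no 1 onward
```

Recovery then becomes "replay everything after the last persisted sequence number," regardless of which backend stores the log.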

@anasalkouz anasalkouz changed the title Extend translog functionality to use other technologies Extend translog functionality to existing solutions in the market Sep 23, 2021
@nknize nknize changed the title Extend translog functionality to existing solutions in the market Pluggable alternatives to translog functionality using existing market solutions Sep 29, 2021
@nknize nknize added discuss Issues intended to help drive brainstorming and decision making feature New feature or request labels Sep 29, 2021
@muralikpbhat

Isn't an object store (like S3) also an option? Queues can be really expensive at scale, especially since these messages are throwaway unless the primary fails and we need them for recovery. Cheaper storage for logging is the better choice. However, making it work for every single document write is challenging, and we might have to suggest a bigger bulk size or a grouping of documents before the translog is committed (to avoid an object-storage write for every single doc).
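The grouping idea above can be sketched as a small buffer that flushes one object per batch instead of one per document. This is a hypothetical illustration, assuming `put_object` is some wrapper around an object-store upload (e.g. S3 PutObject); nothing here is OpenSearch code.

```python
class BatchingTranslog:
    """Buffers operations and flushes them as a single object-store write
    once the batch reaches batch_size, amortizing the per-request cost
    that would make a PUT-per-document translog prohibitively expensive."""

    def __init__(self, put_object, batch_size: int = 100):
        self.put_object = put_object      # assumed upload callable (e.g. S3 PutObject wrapper)
        self.batch_size = batch_size
        self.buffer: list[bytes] = []
        self.flushes = 0                  # number of object-store writes issued

    def append(self, op: bytes) -> None:
        self.buffer.append(op)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """One object-store write for the whole batch."""
        if self.buffer:
            self.put_object(b"\n".join(self.buffer))
            self.flushes += 1
            self.buffer.clear()
```

The trade-off is latency: a write is only durable once its batch flushes, which is exactly why the comment suggests committing the translog per batch rather than per document.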

@muralikpbhat

We should also consider whether we want to just replace the current translog with other solutions from the market, or rethink the whole durability story. It might be useful to have a persistent stream in front, so that clients can simply keep streaming data. We can internally consume from the queue and index at whatever rate is possible. This would also remove the need for the current translog, since we can checkpoint on this queue and re-drive the messages in case they fail to persist durably in the index.
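The checkpoint-and-re-drive loop described above can be sketched as follows. This is a language-agnostic illustration with hypothetical callables, not a real Kafka/Kinesis consumer: the key point is that the checkpoint only advances after a message is durably indexed, so a crash re-drives everything after the last checkpoint.

```python
def consume(queue, index_op, load_checkpoint, save_checkpoint):
    """Consume from a persistent stream, advancing a durable checkpoint
    only after each message is successfully indexed. On failure, the
    checkpoint is left behind the failed message, so the next run
    re-drives it -- replacing the recovery role of the translog."""
    offset = load_checkpoint()
    for i, msg in enumerate(queue[offset:], start=offset):
        index_op(msg)           # may raise; checkpoint is NOT advanced then
        save_checkpoint(i + 1)  # indexed durably => safe to move past msg
```

With at-least-once delivery like this, indexing must be idempotent (e.g. keyed by document ID), since a crash between `index_op` and `save_checkpoint` replays the last message.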

@anasalkouz
Member Author

Closing this issue, since we are tracking pluggable translog in issue #1319.
