Store size APIs should be updated to reflect when all data is not local #7332
Comments
There are several possible REST API changes mentioned in issue #6528:
They are separate requirements, so they need to be treated separately. Below are my thoughts on the requirement raised by @andrross in this issue: exposing local file size usage in the REST APIs. There are 2 possible directions for calculating the file size:
Because the file cache is stored in a fixed location, and each segment file of each shard is visible directly from the file path, the requirement can be satisfied by reading the size of each file or directory. The list of all file cache paths can be collected through the method "List collectFileCacheDataPath(NodePath fileCacheNodePath)", where the directory structure is also documented. An example to illustrate what the cache folder looks like:
The folder
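The "read the size from the file path" approach described above could be sketched roughly as follows. This is only an illustration under assumptions, not the actual implementation: the class name `LocalCacheSize` and the method `sizeOf` are hypothetical, and the sketch simply sums regular files under whatever directory it is given, using standard `java.nio.file` calls rather than any OpenSearch-specific API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper, not part of the codebase: sums the bytes of all
// regular files below a directory (for example, one shard's cache folder).
final class LocalCacheSize {
    private LocalCacheSize() {}

    // Returns the total size in bytes of regular files under root,
    // or 0 if root does not exist (nothing is cached locally).
    static long sizeOf(Path root) throws IOException {
        if (!Files.exists(root)) {
            return 0L;
        }
        // Files.walk returns a lazily populated stream that must be closed.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .mapToLong(p -> {
                    try {
                        return Files.size(p);
                    } catch (IOException e) {
                        // Assumption: a file removed between listing and
                        // sizing (e.g. evicted) is counted as 0 bytes.
                        return 0L;
                    }
                })
                .sum();
        }
    }
}
```

Calling `LocalCacheSize.sizeOf(...)` on a shard directory would give a per-shard figure, and calling it on a single segment file's path would give that file's size, since `Files.walk` also visits the starting path itself.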
There are 2 possible ways to obtain the statistics for the REST APIs:
My preference is to calculate the file size on demand, when the REST API is called. I see little reason to choose otherwise, for the following reasons:
Additional context:
Implementation
Cache size calculation for each shard and its segments:
- The variable to store the local cache size for each shard
- The variable to store the local cache size for each segment
- The location of the code that calculates the file cache size for each shard and segment
For more context, see #6528 (comment)
Also partially related to #7033
Existing APIs that report on "store size" will need to be expanded to account for the fact that data isn't all resident in local storage. It is still useful to know the total size of indexes and shards, but a new dimension will be needed to report the size of the data resident in local storage. This includes APIs like _cat/indices, _cat/shards, _cat/segments. Ultimately the goal here is to give users visibility into the resources being consumed by an index or shard.