
[BUG] Index size on OpenSearch is bigger than Elasticsearch #3769

Open
oded-dd opened this issue Jul 5, 2022 · 13 comments

oded-dd commented Jul 5, 2022

Describe the bug
OpenSearch - the Lucene stored fields file (_.fdt) is much bigger than for the same index on Elasticsearch.

To Reproduce

While comparing document sizes on OpenSearch (2.0.1) vs Elasticsearch (7.10.2), we got the following results:

Scenario: indexing an identical dataset.

The dataset comprises the same data, the same index mapping, and the same settings.

In this scenario there was a large increase in index size on OpenSearch (2.0.1) compared to Elasticsearch (7.10.2).

With source (_source):

| engine | index name | index uuid | version | lucene version | segments | shards | size in bytes | docs | _.fdt file size |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| opensearch | index_with_source_16 | 9qxTja-gTg22GPuhutmSig | 2.0.1 | 9.1.0 | 1 | 1 (p) | 33945392 | 400000 | (32.3 mb) |
| elasticsearch | index_with_source_16 | KJbVbHWRSEGY9T7ctDPvqg | 7.10.2 | 8.7.0 | 1 | 1 (p) | 21926748 | 400000 | (20.9 MB) |

We noticed that the _.fdt file is much larger on OpenSearch 2.0.1.

See the attached filesystem statistics and file sizes.

I have also attached the script that ingests the data into Elasticsearch and OpenSearch (a minimal sketch of such an ingestion script is shown below).

We also took a snapshot from Elasticsearch 7.10.2 and restored it to OpenSearch 2.0.1; in that case the sizes were close to each other.
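
The original script is not reproduced here; the following is a minimal sketch of an equivalent bulk-ingestion script. The endpoints, index name, and document fields are placeholder assumptions, not the reporter's actual workload:

```python
import json
import requests

ES_URL = "http://localhost:9200"   # assumed Elasticsearch 7.10.2 endpoint
OS_URL = "http://localhost:9201"   # assumed OpenSearch 2.0.1 endpoint
INDEX = "index_with_source_16"

def bulk_index(base_url, docs, batch_size=1000):
    """Send the same documents to a cluster through the _bulk API."""
    for start in range(0, len(docs), batch_size):
        lines = []
        for doc in docs[start:start + batch_size]:
            lines.append(json.dumps({"index": {"_index": INDEX}}))
            lines.append(json.dumps(doc))
        body = "\n".join(lines) + "\n"
        resp = requests.post(f"{base_url}/_bulk", data=body,
                             headers={"Content-Type": "application/x-ndjson"})
        resp.raise_for_status()

# Placeholder documents; the real dataset uses the mappings attached to this issue.
docs = [{"field_a": f"value-{i}", "field_b": i} for i in range(400_000)]
for url in (ES_URL, OS_URL):
    bulk_index(url, docs)
    requests.post(f"{url}/{INDEX}/_refresh")
```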

Index with source stats on Elasticsearch 7.10.2

_cat/indices/index_with_source_16?format=json&bytes=b

[
{
"health": "green",
"status": "open",
"index": "index_with_source_16",
"uuid": "KJbVbHWRSEGY9T7ctDPvqg",
"pri": "1",
"rep": "0",
"docs.count": "400000",
"docs.deleted": "0",
"store.size": "21926748",
"pri.store.size": "21926748"
}
]
/usr/share/elasticsearch/data/nodes/0/indices/KJbVbHWRSEGY9T7ctDPvqg/0/index
sh-4.4# ls -l --block-size=M
total 21M
-rw-rw-r-- 1 elasticsearch root 1M Jul  5 10:47 _u.fdm
-rw-rw-r-- 1 elasticsearch root 8M Jul  5 10:47 _u.fdt
-rw-rw-r-- 1 elasticsearch root 1M Jul  5 10:47 _u.fdx
-rw-rw-r-- 1 elasticsearch root 1M Jul  5 10:47 _u.fnm
-rw-rw-r-- 1 elasticsearch root 3M Jul  5 10:47 _u.kdd
-rw-rw-r-- 1 elasticsearch root 1M Jul  5 10:47 _u.kdi
-rw-rw-r-- 1 elasticsearch root 1M Jul  5 10:47 _u.kdm
-rw-rw-r-- 1 elasticsearch root 1M Jul  5 10:47 _u.nvd
-rw-rw-r-- 1 elasticsearch root 1M Jul  5 10:47 _u.nvm
-rw-rw-r-- 1 elasticsearch root 1M Jul  5 10:47 _u.si
-rw-rw-r-- 1 elasticsearch root 1M Jul  5 10:47 _u_Lucene80_0.dvd
-rw-rw-r-- 1 elasticsearch root 1M Jul  5 10:47 _u_Lucene80_0.dvm
-rw-rw-r-- 1 elasticsearch root 6M Jul  5 10:47 _u_Lucene84_0.doc
-rw-rw-r-- 1 elasticsearch root 3M Jul  5 10:47 _u_Lucene84_0.pos
-rw-rw-r-- 1 elasticsearch root 2M Jul  5 10:47 _u_Lucene84_0.tim
-rw-rw-r-- 1 elasticsearch root 1M Jul  5 10:47 _u_Lucene84_0.tip
-rw-rw-r-- 1 elasticsearch root 1M Jul  5 10:47 _u_Lucene84_0.tmd
-rw-rw-r-- 1 elasticsearch root 1M Jul  5 10:47 segments_5
-rw-rw-r-- 1 elasticsearch root 0M Jul  5 10:37 write.lock

Index with source stats on OpenSearch 2.0.1

_cat/indices/index_with_source_16?format=json&bytes=b

[
{
"health": "green",
"status": "open",
"index": "index_with_source_16",
"uuid": "9qxTja-gTg22GPuhutmSig",
"pri": "1",
"rep": "0",
"docs.count": "400000",
"docs.deleted": "0",
"store.size": "33945392",
"pri.store.size": "33945392"
}
]
/usr/share/opensearch/data/nodes/0/indices/9qxTja-gTg22GPuhutmSig/0/index
sh-4.2# ls -l --block-size=M
total 33M
-rw-rw-r-- 1 opensearch root  1M Jul  5 10:46 _g.fdm
-rw-rw-r-- 1 opensearch root 21M Jul  5 10:46 _g.fdt
-rw-rw-r-- 1 opensearch root  1M Jul  5 10:46 _g.fdx
-rw-rw-r-- 1 opensearch root  1M Jul  5 10:46 _g.fnm
-rw-rw-r-- 1 opensearch root  1M Jul  5 10:46 _g.kdd
-rw-rw-r-- 1 opensearch root  1M Jul  5 10:46 _g.kdi
-rw-rw-r-- 1 opensearch root  1M Jul  5 10:46 _g.kdm
-rw-rw-r-- 1 opensearch root  1M Jul  5 10:46 _g.nvd
-rw-rw-r-- 1 opensearch root  1M Jul  5 10:46 _g.nvm
-rw-rw-r-- 1 opensearch root  1M Jul  5 10:46 _g.si
-rw-rw-r-- 1 opensearch root  7M Jul  5 10:46 _g_Lucene90_0.doc
-rw-rw-r-- 1 opensearch root  1M Jul  5 10:46 _g_Lucene90_0.dvd
-rw-rw-r-- 1 opensearch root  1M Jul  5 10:46 _g_Lucene90_0.dvm
-rw-rw-r-- 1 opensearch root  3M Jul  5 10:46 _g_Lucene90_0.pos
-rw-rw-r-- 1 opensearch root  2M Jul  5 10:46 _g_Lucene90_0.tim
-rw-rw-r-- 1 opensearch root  1M Jul  5 10:46 _g_Lucene90_0.tip
-rw-rw-r-- 1 opensearch root  1M Jul  5 10:46 _g_Lucene90_0.tmd
-rw-rw-r-- 1 opensearch root  1M Jul  5 10:46 segments_5
-rw-rw-r-- 1 opensearch root  0M Jul  5 10:30 write.lock

Expected behavior
An index on OpenSearch should be roughly the same size as the same index on Elasticsearch.

Additional Files
indicesstats.zip

Host/Environment:
Ubuntu 20.04

oded-dd added the bug ("Something isn't working") and untriaged labels on Jul 5, 2022
andrross (Member) commented Jul 8, 2022

Can you confirm the index codec being used in both cases? I'm not sure whether it is related, but Lucene 8.10 introduced a reduction in block size that improved performance at the cost of a worse compression ratio. The upshot is that Elasticsearch 7.10/OpenSearch 1.0 would have a better compression ratio but slower queries compared to newer versions of OpenSearch when using the "best_speed" (or "default") index codec.
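
One way to confirm the codec is to read the index settings with defaults included. This is a sketch; the endpoint and index name are assumptions, and it would be run against both clusters:

```python
import requests

BASE = "http://localhost:9200"   # assumed cluster endpoint
INDEX = "index_with_source_16"

# index.codec is usually not set explicitly, so include defaults in the response
resp = requests.get(
    f"{BASE}/{INDEX}/_settings",
    params={
        "include_defaults": "true",
        "filter_path": "*.settings.index.codec,*.defaults.index.codec",
    },
)
print(resp.json())  # e.g. {"index_with_source_16": {"defaults": {"index": {"codec": "default"}}}}
```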

shamzeer commented

+1. We observed the same issue: there is a 10 to 15% increase in size when OpenSearch is used for the same dataset indexed into ES. I think this can easily be reproduced by indexing a common dataset into both.

We also noticed a comparable regression in time. When it comes to GBs of data, the time aspect is critical as well.

backslasht (Contributor) commented

Tagging @sarthakaggarwal97 to take a look.

sarthakaggarwal97 (Contributor) commented

Took a dive into Lucene to understand what has changed between ES 7.10 and OS 2.x for stored fields (.fdt files).
ES 7.10 uses Lucene v8.7.0, while OS 2.x is currently upgraded to 9.9.1.

The prominent difference in the .fdt files can be explained by a few changes Lucene has made to the block size since v8.7.0.

In the Lucene 8.7.0 release, the block size used for compression looks to be 60 KB. Since Lucene 8.10, the block size has been changed to 8 KB for the BEST_SPEED codec. As mentioned by @andrross, the change was made in the interest of performance, with trade-offs in storage.
This was the commit in Lucene where the block size was changed.

Some time back, we benchmarked the performance of different block sizes: #7475 (comment)

We observed some improvements in performance and store size with a 16K block size. I'll take a stab at it again with different block sizes to find the optimal configuration.

sarthakaggarwal97 (Contributor) commented

@oded-dd

I am able to reproduce roughly a 7-10% difference in the size of the stored fields between ES 7.10 and OS 2.x using the nyc_taxis dataset.
It would be helpful if you could share a sample document from your workload, so we can verify and reproduce the difference in stored field size.

sarthakaggarwal97 (Contributor) commented Feb 12, 2024

I ran some more benchmarks, and would like to share the results.

With the mappings shared by @oded-dd, I ran two types of experiments, and the results are quite interesting.

Experiment #A

Using the index mappings, I took a sample document and ingested duplicate documents into ES 7.10 (60K block size), OS 2.11 (8K block size), and OS 2.11 (60K block size). After ingestion, each index was force merged into a single segment for an accurate comparison (a sketch of this step follows the results below).

  1. fdt file size in ES 7.10 [60K Block Size (default)] ~ 2167910 bytes
  2. fdt file size in OS 2.11 [8K Block Size (default)] ~ 3356692 bytes [54% increase]
  3. fdt file size in OS 2.11 [16K Block Size] ~ 3395536 bytes [55% increase]
  4. fdt file size in OS 2.11 [60K Block Size] ~ 2234524 bytes [2% increase]

Since the field values in the documents were identical, a much higher compression ratio was expected. With this, we can root-cause the increase in the fdt files (using the mappings of @oded-dd) between ES 7.10 and OS 2.11 to the block size change in Lucene.
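
A sketch of the force-merge-and-compare step described above (the endpoint and index name are assumptions; it would be repeated for each cluster under test):

```python
import requests

BASE = "http://localhost:9200"   # assumed endpoint for the cluster under test
INDEX = "index_with_source_16"

# Merge down to a single segment so both clusters are compared on one set of Lucene files
requests.post(f"{BASE}/{INDEX}/_forcemerge",
              params={"max_num_segments": 1}).raise_for_status()

# Per-segment store sizes; with one segment this closely tracks the on-disk .fdt/.doc/... files
segments = requests.get(f"{BASE}/_cat/segments/{INDEX}",
                        params={"format": "json", "bytes": "b"}).json()
for seg in segments:
    print(seg["segment"], seg["size"], "bytes")
```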

Experiment #B

In this experiment, I took the same index mappings but ingested unique field values in the documents. This time, I ingested 30M unique documents with the same mappings into both ES 7.10 and OS 2.11.

  1. fdt file size in ES 7.10 [60K Block Size (default)] ~ 29221333648 bytes
  2. fdt file size in OS 2.11 [8K Block Size (default)] ~ 30227082822 bytes [3% increase]
  3. fdt file size in OS 2.11 [16K Block Size (default)] ~ 30226988072 bytes [3% increase]

With unique documents, the compression ratio does not seem to vary significantly with just the switch of block size from 60K to 8K.
Moreover, OS 2.11 seems to provide an improved overall store size for the index:

  1. ES 7.10 [60K Block Size (default)]: 72.2gb
  2. OS 2.11 [8K Block Size (default)]: 70.8gb

Summary

  1. For documents that are similar in nature, a larger block size significantly improves the compression ratio.
  2. Where documents vary, the difference in the size of the stored fields is not very significant. In fact, with the mappings provided in the issue, OS 2.11 seems to do better in terms of the overall store size of the index.

Why did Lucene change the block size from 60K to 8K?

I ran the experiments using a top_hits aggregation, which queries over stored fields, and found that 8K outperforms the other block sizes. Note: this is specific to queries over stored fields; in general, queries don't rely on stored fields much.

[Chart: top_hits query latency across block sizes]

This confirms that the block size change was driven by improved query performance.
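
A top_hits query of the kind described might look like the sketch below. This is an illustration, not the exact benchmark query; the endpoint, index, and field names assume the nyc_taxis benchmark workload:

```python
import requests

BASE = "http://localhost:9200"   # assumed endpoint
INDEX = "nyc_taxis"              # assumed index built from the nyc_taxis dataset

# A terms aggregation with a top_hits sub-aggregation: returning the hits' _source
# reads stored fields, so this query shape is sensitive to the .fdt block size.
query = {
    "size": 0,
    "aggs": {
        "by_vendor": {
            "terms": {"field": "vendor_id", "size": 10},
            "aggs": {
                "latest": {
                    "top_hits": {
                        "size": 100,
                        "sort": [{"pickup_datetime": {"order": "desc"}}],
                    }
                }
            },
        }
    },
}
resp = requests.post(f"{BASE}/{INDEX}/_search", json=query)
print(resp.json()["took"], "ms")
```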

sarthakaggarwal97 (Contributor) commented Feb 15, 2024

I re-ran the experiments with similar documents, using the mappings in the issue, on a larger dataset (7 GB in ES 7.10):

  1. fdt file size in ES 7.10 [60K Block Size (default)] ~ 1495742708 bytes
  2. fdt file size in OS 2.11 [16K Block Size] ~ 1907206375 bytes [27% increase]
  3. fdt file size in OS 2.11 [8K Block Size (default)] ~ 2464354679 bytes [64% increase]
  4. fdt file size in OS 2.11 [zstd - level 3] ~ 1165550677 bytes [24% decrease]
  5. fdt file size in OS 2.11 [zstd_no_dict - level 3] ~ 2295998203 bytes [53% increase]

In summary, Zstandard (level 3) saves about 24% of storage compared to ES 7.10. OpenSearch currently supports 6 compression levels; as the compression level increases, we tend to see a higher compression ratio at the cost of speed. A sketch of enabling the zstd codec is included below.
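
For completeness, a sketch of creating an index with the zstd codec and an explicit compression level. The endpoint and index name are assumptions, and depending on the OpenSearch version, zstd support may come from core or from the custom-codecs plugin:

```python
import requests

BASE = "http://localhost:9200"     # assumed OpenSearch endpoint
INDEX = "index_with_source_zstd"   # hypothetical index name

settings = {
    "settings": {
        "index.codec": "zstd",                 # or "zstd_no_dict"
        "index.codec.compression_level": 3,    # one of the 6 levels mentioned above; 3 used here
        "index.number_of_shards": 1,
        "index.number_of_replicas": 0,
    }
}
resp = requests.put(f"{BASE}/{INDEX}", json=settings)
resp.raise_for_status()
print(resp.json())
```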

mgodwan (Member) commented Feb 15, 2024

Thanks @sarthakaggarwal97 for sharing the storage numbers. What is the performance impact on throughput of zstd level 3 and zstd_no_dict level 3 in the benchmarks you executed recently? Is it in line with https://opensearch.org/docs/latest/im-plugin/index-codecs/#benchmarking for the workload you tested as well?

sarthakaggarwal97 (Contributor) commented Feb 15, 2024

Indexing Performance

During the indexing of these duplicate documents, the CPU of the data node hovered around 20-25% for all of the runs.

Average Indexing Rate during the runs:

  1. ZSTD NO DICT ~ 106,331
  2. ZSTD ~ 103,543
  3. LZ4 (8K Block Size) ~ 100,423
  4. LZ4 (ES 7.10 - 60K Block Size) ~ 98,364
  5. LZ4 (16K Block Size) ~ 95,238

Query Performance

Sharing numbers for the top_hits query over stored fields, for the NYC taxis dataset.

[Chart: query latency by codec and block size]

Zstandard compression provides good query performance alongside an improved compression ratio over LZ4 with a 60K block size (ES 7.10).
LZ4 with an 8K block size (the current configuration) provides the best query performance at the cost of more disk space.

backslasht (Contributor) commented

Thanks @sarthakaggarwal97 for the detailed analysis.

While Zstandard is a good fit for this use case (duplicate values), what are your thoughts on exposing block size as a configurable parameter?

sarthakaggarwal97 (Contributor) commented Feb 16, 2024

Currently, there is no easy way to expose block size as a configurable parameter. Supporting it would add maintenance overhead with every Lucene minor/major version release.
Apart from that, it would be another knob for the user to tune.

Given that we have the Zstandard compression codecs in the custom-codecs plugin as an alternative for such cases, we may not need to expose this configuration as a setting.

backslasht (Contributor) commented

Considering an alternative (Zstandard) is already available to meet this use case, I agree that introducing a new setting via a custom codec is additional maintenance overhead. We can revisit the decision if a pressing need arises.

sarthakaggarwal97 (Contributor) commented

In another experiment, I varied two fields (id and time) in order to replicate real-world scenarios:

  1. fdt file size in ES 7.10 ~ 5497374633 bytes
  2. fdt file size in OS 2.11 (default [8K]) ~ 6170919472 bytes [12% increase]
  3. fdt file size in OS 2.11 (zstd_no_dict) ~ 5393111098 bytes [2% decrease]
  4. fdt file size in OS 2.11 (zstd) ~ 4396883954 bytes [21% decrease]

getsaurabh02 moved this from 🆕 New to Later (6 months plus) in Search Project Board on Aug 15, 2024