diff --git a/docs/benchmark.md b/docs/benchmark.md
index c8f50edf0a..42aa1e180e 100644
--- a/docs/benchmark.md
+++ b/docs/benchmark.md
@@ -1,17 +1,17 @@
 # Benchmark
 
-**Infinity** provides Python script for sift1m and gist1m dataset benchmark.
+Infinity provides a Python script for benchmarking the SIFT1M and GIST1M datasets.
 
-## Get the Infinity binary file
+## Build and start Infinity
 
-```sh
-git clone https://github.com/infiniflow/infinity.git
-cd infinity
-```
+You have two options for building Infinity. Choose the option that best fits your needs:
 
-## Download the benchmark file
+- [Build Infinity using Docker](../README.md)
+- [Build from source](./build_from_source.md)
 
-Download via wget.
+## Download the Benchmark datasets
+
+To obtain the benchmark datasets, you have the option to download them using the wget command.
 
 ```sh
 #download sift benchmark
@@ -21,16 +21,16 @@ wget ftp://ftp.irisa.fr/local/texmex/corpus/gist.tar.gz
 
 ```
 
-or visit [http://corpus-texmex.irisa.fr/](http://corpus-texmex.irisa.fr/) to download manually.
+Alternatively, you can manually download the benchmark datasets by visiting [http://corpus-texmex.irisa.fr/](http://corpus-texmex.irisa.fr/).
 
 ```sh
-#uncompress and move benchmark file
+# Unzip and move the SIFT1M benchmark file.
 tar -zxvf sift.tar.gz
 mv sift/sift_base.fvecs test/data/benchmark/sift_1m/sift_base.fvecs
 mv sift/sift_query.fvecs test/data/benchmark/sift_1m/sift_query.fvecs
 mv sift/sift_groundtruth.ivecs test/data/benchmark/sift_1m/sift_groundtruth.ivecs
-
+# Unzip and move the GIST1M benchmark file.
 tar -zxvf gist.tar.gz
 mv gist/gist_base.fvecs test/data/benchmark/gist_1m/gist_base.fvecs
 mv gist/gist_query.fvecs test/data/benchmark/gist_1m/gist_query.fvecs
@@ -48,36 +48,39 @@ python setup.py bdist_wheel
 pip install dist/infinity_sdk-0.1.0.dev1-py3-none-any.whl
 ```
 
-## Start Infinity
-
-See the [README.md](https://github.com/infiniflow/infinity/blob/main/README.md) to start Infinity.
-
-## Import data
+## Import the Benchmark datasets
 
 ```sh
 cd benchmark
 
-options:
-  -h, --help  show this help message and exit
-  -d DATA_SET, --data DATA_SET
+# options:
+#   -h, --help  show this help message and exit
+#   -d DATA_SET, --data DATA_SET
 
 python remote_benchmark_import.py -d sift_1m
 python remote_benchmark_import.py -d gist_1m
 ```
 
-## Run benchmark
+## Run Benchmark
 
 ```sh
-options:
-  -h, --help  show this help message and exit
-  -t THREADS, --threads THREADS
-  -r ROUNDS, --rounds ROUNDS
-  -d DATA_SET, --data DATA_SET
+# options:
+#   -h, --help  show this help message and exit
+#   -t THREADS, --threads THREADS
+#   -r ROUNDS, --rounds ROUNDS
+#   -d DATA_SET, --data DATA_SET
 
-# ROUNDS refers to the number of times that Python runs the benchmark. The result is the average time for all runs.
+# ROUNDS indicates the number of times Python executes the benchmark, and the result represents the average duration for each run.
 
-# The following command means run benchmark with one thread, for one time using the sift dataset.
+# Perform a benchmark on the SIFT1M dataset using a single thread, running it only once.
 python remote_benchmark.py -t 1 -r 1 -d sift_1m
-
+# Perform a benchmark on the GIST1M dataset using a single thread, running it only once.
 python remote_benchmark.py -t 1 -r 1 -d gist_1m
 ```
+## A SIFT1M Benchmark report
+
+- **Hardware**: Intel i5-12500H, 16C, 16GB
+- **Operating system**: Ubuntu 22.04
+- **Dataset**: SIFT1M; **topk**: 100; **recall**: 97%+
+- **QPS**: 10,305
+- **P99 Latency**: < 0.4 ms
\ No newline at end of file
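
The ROUNDS comment and the SIFT1M report in this change refer to an average duration per run, QPS, and P99 latency. As a rough illustration of how such figures could be derived from per-query timings, here is a minimal sketch. It is not taken from `remote_benchmark.py`: the helper names `percentile`, `summarize`, and `run_rounds` are hypothetical, the nearest-rank percentile method is an assumption, and `query_fn` stands in for whatever call actually issues a query.

```python
import math
import time
from statistics import mean


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = math.ceil(pct * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def summarize(latencies_s, wall_time_s):
    """Return QPS and P99 latency (in ms) for one round of queries."""
    ordered = sorted(latencies_s)
    return {
        "qps": len(ordered) / wall_time_s,
        "p99_ms": percentile(ordered, 99) * 1000.0,
    }


def run_rounds(query_fn, queries, rounds):
    """Run every query `rounds` times (hypothetical driver loop).

    Returns the average wall-clock duration of a round and the
    per-round QPS / P99 summaries.
    """
    durations, stats = [], []
    for _ in range(rounds):
        latencies = []
        start = time.perf_counter()
        for query in queries:
            t0 = time.perf_counter()
            query_fn(query)
            latencies.append(time.perf_counter() - t0)
        elapsed = time.perf_counter() - start
        durations.append(elapsed)
        stats.append(summarize(latencies, elapsed))
    return mean(durations), stats
```

With a single thread, QPS here is simply the number of queries divided by the wall-clock time of the round; a multi-threaded run would need to aggregate latencies across workers before summarizing, which this sketch does not attempt.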