[doc] Added a benchmark report on sift (#374)
* Added a benchmark report on sift

* Added Build Infinity

* Removed history

* minor updates

* minor updates

* Editorial updates.

* Updated benchmark report
writinwaters authored Dec 26, 2023
1 parent dbf320b commit 06d6093
Showing 1 changed file with 31 additions and 28 deletions.
59 changes: 31 additions & 28 deletions docs/benchmark.md
@@ -1,17 +1,17 @@
# Benchmark

**Infinity** provides Python script for sift1m and gist1m dataset benchmark.
Infinity provides a Python script for benchmarking the SIFT1M and GIST1M datasets.

## Get the Infinity binary file
## Build and start Infinity

```sh
git clone https://github.com/infiniflow/infinity.git
cd infinity
```
You have two options for building Infinity. Choose the option that best fits your needs:

## Download the benchmark file
- [Build Infinity using Docker](../README.md)
- [Build from source](./build_from_source.md)

Download via wget.
## Download the Benchmark datasets

You can download the benchmark datasets with the wget command.

```sh
#download sift benchmark
@@ -21,16 +21,16 @@ wget ftp://ftp.irisa.fr/local/texmex/corpus/gist.tar.gz

```

or visit [http://corpus-texmex.irisa.fr/](http://corpus-texmex.irisa.fr/) to download manually.
Alternatively, you can manually download the benchmark datasets by visiting [http://corpus-texmex.irisa.fr/](http://corpus-texmex.irisa.fr/).

```sh
#uncompress and move benchmark file
# Extract and move the SIFT1M benchmark files.
tar -zxvf sift.tar.gz
mv sift/sift_base.fvecs test/data/benchmark/sift_1m/sift_base.fvecs
mv sift/sift_query.fvecs test/data/benchmark/sift_1m/sift_query.fvecs
mv sift/sift_groundtruth.ivecs test/data/benchmark/sift_1m/sift_groundtruth.ivecs


# Extract and move the GIST1M benchmark files.
tar -zxvf gist.tar.gz
mv gist/gist_base.fvecs test/data/benchmark/gist_1m/gist_base.fvecs
mv gist/gist_query.fvecs test/data/benchmark/gist_1m/gist_query.fvecs
@@ -48,36 +48,39 @@ python setup.py bdist_wheel
pip install dist/infinity_sdk-0.1.0.dev1-py3-none-any.whl
```
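
To sanity-check the extracted files before importing them, here is a minimal sketch of a reader for the TEXMEX vector format. In that format, each record is a little-endian 4-byte integer holding the dimension, followed by that many 4-byte components (`float32` in `.fvecs`, `int32` in `.ivecs`). The function name `read_vecs` and its `limit` parameter are illustrative and are not part of Infinity's SDK or benchmark scripts.

```python
import struct


def read_vecs(path, component="f", limit=None):
    """Read vectors from a TEXMEX .fvecs (component="f") or .ivecs (component="i") file.

    Hypothetical helper for illustration; not part of the Infinity SDK.
    """
    vectors = []
    with open(path, "rb") as f:
        while limit is None or len(vectors) < limit:
            # Each record starts with a little-endian int32 holding the dimension.
            header = f.read(4)
            if len(header) < 4:
                break  # End of file.
            (dim,) = struct.unpack("<i", header)
            # The dimension is followed by `dim` 4-byte components.
            vectors.append(struct.unpack(f"<{dim}{component}", f.read(4 * dim)))
    return vectors
```

For example, `read_vecs("test/data/benchmark/sift_1m/sift_query.fvecs", limit=1)` should return a single 128-component vector, since SIFT descriptors are 128-dimensional. Loading the full base file this way is slow and memory-hungry, so keep `limit` small when spot-checking.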

## Start Infinity

See the [README.md](https://github.com/infiniflow/infinity/blob/main/README.md) to start Infinity.

## Import data
## Import the Benchmark datasets

```sh
cd benchmark

options:
-h, --help show this help message and exit
-d DATA_SET, --data DATA_SET
# options:
# -h, --help show this help message and exit
# -d DATA_SET, --data DATA_SET

python remote_benchmark_import.py -d sift_1m
python remote_benchmark_import.py -d gist_1m
```

## Run benchmark
## Run Benchmark

```sh
options:
-h, --help show this help message and exit
-t THREADS, --threads THREADS
-r ROUNDS, --rounds ROUNDS
-d DATA_SET, --data DATA_SET
# options:
# -h, --help show this help message and exit
# -t THREADS, --threads THREADS
# -r ROUNDS, --rounds ROUNDS
# -d DATA_SET, --data DATA_SET

# ROUNDS refers to the number of times that Python runs the benchmark. The result is the average time for all runs.
# ROUNDS sets the number of times the benchmark is run; the reported result is the average duration across those runs.

# The following command means run benchmark with one thread, for one time using the sift dataset.
# Perform a benchmark on the SIFT1M dataset using a single thread, running it only once.
python remote_benchmark.py -t 1 -r 1 -d sift_1m

# Perform a benchmark on the GIST1M dataset using a single thread, running it only once.
python remote_benchmark.py -t 1 -r 1 -d gist_1m
```
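
To make the `-t` and `-r` options concrete, the sketch below shows one way a harness could fan queries out across worker threads and average the wall-clock time over several rounds. It is not the code in `remote_benchmark.py`; `run_benchmark`, `search_fn`, and the dummy workload are hypothetical names chosen for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_benchmark(search_fn, queries, threads=1, rounds=1):
    """Return the average wall-clock seconds per round.

    Hypothetical harness: `search_fn` stands in for whatever call issues one query.
    """
    durations = []
    for _ in range(rounds):
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=threads) as pool:
            # list() forces every query to finish before the timer stops.
            list(pool.map(search_fn, queries))
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations)


if __name__ == "__main__":
    # Dummy workload in place of a real vector search.
    dummy_queries = [[1.0, 2.0, 3.0]] * 1000
    avg = run_benchmark(sum, dummy_queries, threads=1, rounds=3)
    print(f"average seconds per round: {avg:.6f}")
```

Averaging over several rounds smooths out one-off effects such as cold caches, which is presumably why the script exposes `-r` alongside `-t`.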
## A SIFT1M Benchmark report

- **Hardware**: Intel i5-12500H, 16C, 16GB
- **Operating system**: Ubuntu 22.04
- **Dataset**: SIFT1M; **topk**: 100; **recall**: 97%+
- **QPS**: 10,305
- **P99 Latency**: < 0.4 ms
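
The report does not spell out how its metrics are defined, so the following is only a rough sketch of how such figures are commonly derived. It assumes recall is the fraction of ground-truth neighbors found in the top-k results, P99 is the nearest-rank 99th percentile of per-query latency, and QPS is the number of queries divided by total wall-clock time. All function names are illustrative.

```python
import math


def recall_at_k(retrieved, ground_truth, k):
    """Fraction of the true top-k neighbor IDs that appear in the retrieved top-k."""
    hits = len(set(retrieved[:k]) & set(ground_truth[:k]))
    return hits / k


def percentile(latencies, pct):
    """Nearest-rank percentile of a non-empty list of latencies."""
    ordered = sorted(latencies)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def qps(num_queries, total_seconds):
    """Queries per second over a measured wall-clock interval."""
    return num_queries / total_seconds
```

For example, `recall_at_k([1, 2, 3, 4], [1, 2, 5, 6], 4)` evaluates to `0.5`, because two of the four true neighbors were retrieved.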
