Remove gerunds in headings
Fix quotes
dyastremsky committed Jul 2, 2024
1 parent a6a396f commit a7ab59d
Showing 2 changed files with 11 additions and 10 deletions.
10 changes: 5 additions & 5 deletions src/c++/perf_analyzer/genai-perf/docs/embeddings.md
@@ -26,12 +26,12 @@ OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-->

-# Profiling Embeddings Models with GenAI-Perf
+# Profile Embeddings Models with GenAI-Perf

GenAI-Perf allows you to profile embedding models running on an
[OpenAI Embeddings API](https://platform.openai.com/docs/api-reference/embeddings)-compatible server.

-## Creating a Sample Embeddings Input File
+## Create a Sample Embeddings Input File

To create a sample embeddings input file, use the following command:

@@ -50,13 +50,13 @@ This will generate a file named embeddings.jsonl with the following content:
{"text": "In what state did they film Shrek 2?"}
```

-## Starting an OpenAI Embeddings-Compatible Server
+## Start an OpenAI Embeddings-Compatible Server
To start an OpenAI embeddings-compatible server, run the following command:
```bash
docker run -it --net=host --rm --gpus=all vllm/vllm-openai:latest --model intfloat/e5-mistral-7b-instruct --dtype float16 --max-model-len 1024
```
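
Once the server is up, a quick request can confirm that it answers before profiling. The following is a minimal sketch, not part of the original docs: it assumes the container listens on vLLM's default port 8000 and serves the OpenAI-style `/v1/embeddings` route, and the helper names `build_embeddings_payload` and `send_embeddings_request` are hypothetical.

```bash
#!/usr/bin/env bash
# Assumptions (not stated in the docs above): the server listens on
# port 8000 and exposes the OpenAI-style /v1/embeddings route.
EMBEDDINGS_URL="${EMBEDDINGS_URL:-http://localhost:8000/v1/embeddings}"
MODEL="intfloat/e5-mistral-7b-instruct"

# Build the JSON request body for a single input string.
# Note: this simple helper does not escape quotes inside the text.
build_embeddings_payload() {
  printf '{"model": "%s", "input": "%s"}' "$MODEL" "$1"
}

# Send one request and print the raw JSON response (needs a running server).
send_embeddings_request() {
  curl -sS "$EMBEDDINGS_URL" \
    -H "Content-Type: application/json" \
    -d "$(build_embeddings_payload "$1")"
}

# Print the payload that would be sent; this is safe to run without a server.
build_embeddings_payload "In what state did they film Shrek 2?"
echo
```

With the server running, `send_embeddings_request "In what state did they film Shrek 2?"` should return a JSON document containing an embedding vector. Set `EMBEDDINGS_URL` if the server uses a different host or port.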

-## Running GenAI-Perf
+## Run GenAI-Perf
To profile embeddings models using GenAI-Perf, use the following command:

```bash
@@ -90,4 +90,4 @@ Example output:
│ Request latency (ms) │ 42.21 │ 28.18 │ 318.61 │ 56.50 │ 49.21 │ 43.07 │
└──────────────────────┴───────┴───────┴────────┴───────┴───────┴───────┘
Request throughput (per sec): 23.63
-```
+```
11 changes: 6 additions & 5 deletions src/c++/perf_analyzer/genai-perf/docs/rankings.md
@@ -26,13 +26,13 @@ OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-->

-# Profiling Ranking Models with GenAI-Perf
+# Profile Ranking Models with GenAI-Perf


GenAI-Perf allows you to profile ranking models compatible with Hugging Face's
[Text Embeddings Interface's re-ranker API](https://huggingface.co/docs/text-embeddings-inference/en/quick_tour#re-rankers).

-## Creating a Sample Rankings Input Directory
+## Create a Sample Rankings Input Directory

To create a sample rankings input directory, follow these steps:

@@ -62,7 +62,7 @@ echo '{"text": "Eric Anderson (born January 18, 1968) is an American sociologist
{"text": "Daddys Home 2 Principal photography on the film began in Massachusetts in March 2017 and it was released in the United States by Paramount Pictures on November 10, 2017. Although the film received unfavorable reviews, it has grossed over $180 million worldwide on a $69 million budget."}' > rankings_jsonl/passages.jsonl
```

-## Starting a Hugging Face Re-Ranker-Compatible Server
+## Start a Hugging Face Re-Ranker-Compatible Server
To start a Hugging Face re-ranker-compatible server, run the following commands:

```bash
@@ -73,7 +73,7 @@ volume=$PWD/data
docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.3 --model-id $model --revision $revision
```
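
Before profiling, a single re-rank request can confirm that the server responds. The following is a minimal sketch, not part of the original docs: it assumes the host port 8080 from the `-p 8080:80` mapping above and the `/rerank` route of Text Embeddings Inference, and the helper names `build_rerank_payload` and `send_rerank_request` are hypothetical.

```bash
#!/usr/bin/env bash
# Assumptions: host port 8080 (from the -p 8080:80 mapping above) and the
# /rerank route, which takes a "query" string and a "texts" array.
RERANK_URL="${RERANK_URL:-http://localhost:8080/rerank}"

# Build the JSON body from a query followed by one or more passages.
# Note: this simple helper does not escape quotes inside the strings.
build_rerank_payload() {
  local query="$1"; shift
  local texts="" t
  for t in "$@"; do
    # Join the passages as a comma-separated list of JSON strings.
    texts="${texts:+$texts, }\"$t\""
  done
  printf '{"query": "%s", "texts": [%s]}' "$query" "$texts"
}

# Send one request and print the raw JSON response (needs a running server).
send_rerank_request() {
  curl -sS "$RERANK_URL" \
    -H "Content-Type: application/json" \
    -d "$(build_rerank_payload "$@")"
}

# Print the payload that would be sent; this is safe to run without a server.
build_rerank_payload "What is deep learning?" "Deep learning is a subfield of machine learning." "Cheese is made from milk."
echo
```

With the server running, calling `send_rerank_request` with the same arguments should return a JSON array of index and score pairs, one per passage.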

-## Running GenAI-Perf
+## Run GenAI-Perf
To profile ranking models using GenAI-Perf, use the following command:

```bash
@@ -92,11 +92,12 @@ This command specifies the use of Hugging Face's ranking API with `--endpoint re

Example output:

-Copy code
+```
Rankings Metrics
┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━┳━━━━━━┓
┃ Statistic ┃ avg ┃ min ┃ max ┃ p99 ┃ p90 ┃ p75 ┃
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━╇━━━━━━┩
│ Request latency (ms) │ 5.48 │ 2.50 │ 23.91 │ 10.27 │ 8.34 │ 6.07 │
└──────────────────────┴──────┴──────┴───────┴───────┴──────┴──────┘
Request throughput (per sec): 180.11
+```
