
Add continuous batch size benchmark to LLM guide #404

Merged — 3 commits merged into periodic-concurrency-mode from matthewkotila-llm-guide on Oct 6, 2023

Conversation

@matthewkotila (Contributor) commented on Sep 28, 2023:

Adds a third part to the guide covering how to benchmark the effect of continuous batch size on token-to-token latency.
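
For context, the kind of measurement the new section covers can be sketched with Perf Analyzer's periodic concurrency mode, which ramps the number of in-flight requests while the profile export captures per-response timestamps for token-to-token latency. The model name, flag values, and input file below are illustrative assumptions, not the guide's literal command:

```bash
# Illustrative sketch (values are assumptions, not the guide's exact command):
# ramp in-flight requests from 1 to 30, launching one new request every 32 responses,
# and export per-response timestamps for token-to-token latency analysis.
perf_analyzer -m vllm -i grpc --async --streaming \
    --input-data=prompts.json \
    --periodic-concurrency-range=1:30:1 \
    --request-period=32 \
    --profile-export-file=profile_export.json
```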

@matthewkotila (Contributor, Author) commented:

Do not merge until the infinite loop bug has been resolved.

@matthewkotila matthewkotila marked this pull request as ready for review October 3, 2023 23:23
src/c++/perf_analyzer/docs/llm.md — 3 review threads (outdated, resolved)
@matthewkotila (Contributor, Author) commented:

> Do not merge until the infinite loop bug has been resolved.

Infinite loop bug resolved by #410

@matthewkotila matthewkotila merged commit 1b304c5 into periodic-concurrency-mode Oct 6, 2023
@matthewkotila matthewkotila deleted the matthewkotila-llm-guide branch October 6, 2023 23:51
matthewkotila added a commit that referenced this pull request Oct 7, 2023
* Add continuous batch size benchmark to LLM guide

* Update llm.md

* Update llm.md
Labels: None yet
2 participants