Rate limiting #2

Open
sd2k opened this issue Jul 24, 2023 · 1 comment

Comments

@sd2k
Contributor

sd2k commented Jul 24, 2023

The reverse proxy should support rate limiting to enable Grafana admins to control costs due to expensive LLM calls. Things to consider:

  • limits should be per provider, once we have a concept of provider (at the time of writing this would correspond to just one limit, for the /openai path)
  • over what time periods should we allow rate limiting? e.g. requests per minute, per hour, per day, etc.
  • should we also limit API calls per client, for some definition of client? e.g. should user A have a separate limit to user B? should plugin X have a separate limit to plugin Y?
@sd2k
Contributor Author

sd2k commented Jul 26, 2023

This should be fairly easy to implement using https://pkg.go.dev/go.uber.org/ratelimit#section-readme.

SandersAaronD pushed a commit that referenced this issue Oct 27, 2023
Add example of indicating when a stream of responses has finished