Rate limiting #2

sd2k · 2023-07-24T08:53:46Z

The reverse proxy should support rate limiting to enable Grafana admins to control costs due to expensive LLM calls. Things to consider:

- Limits should be per provider, once we have a concept of provider (at the time of writing this would correspond to just one limit, for the /openai path).
- Over what time periods should we allow rate limiting? e.g. requests per minute, per hour, per day, etc.
- Should we also limit API calls per client, for some definition of client? e.g. should user A have a separate limit from user B? Should plugin X have a separate limit from plugin Y?
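To make the questions above concrete, here is a minimal sketch of one possible shape: a fixed-window counter keyed by provider and client, with the window length as a parameter. Everything named here (`limitKey`, `FixedWindowLimiter`, `demoDecisions`, the client IDs) is hypothetical and uses only the Go standard library; it is an illustration of the design space, not a proposal for the actual implementation.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// limitKey identifies one counter. Both fields are assumptions: the issue
// has not yet defined what a "provider" or a "client" is.
type limitKey struct {
	Provider string // e.g. "openai"
	Client   string // e.g. a user or plugin ID; empty for a provider-wide limit
}

// window tracks how many requests were admitted since start.
type window struct {
	start time.Time
	count int
}

// FixedWindowLimiter admits at most limit requests per key in each period.
type FixedWindowLimiter struct {
	mu      sync.Mutex
	limit   int
	period  time.Duration
	windows map[limitKey]window
}

func NewFixedWindowLimiter(limit int, period time.Duration) *FixedWindowLimiter {
	return &FixedWindowLimiter{limit: limit, period: period, windows: make(map[limitKey]window)}
}

// Allow reports whether a request for key arriving at now fits in the
// key's current window. A new window starts once the period has elapsed.
func (l *FixedWindowLimiter) Allow(key limitKey, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()

	w, ok := l.windows[key]
	if !ok || now.Sub(w.start) >= l.period {
		w = window{start: now}
	}
	allowed := w.count < l.limit
	if allowed {
		w.count++
	}
	l.windows[key] = w
	return allowed
}

// demoDecisions exercises a limit of 2 requests per minute per client.
func demoDecisions() []bool {
	lim := NewFixedWindowLimiter(2, time.Minute)
	t0 := time.Unix(0, 0)
	userA := limitKey{Provider: "openai", Client: "user-a"}
	userB := limitKey{Provider: "openai", Client: "user-b"}

	return []bool{
		lim.Allow(userA, t0),                  // first request for A
		lim.Allow(userA, t0),                  // second request for A
		lim.Allow(userA, t0),                  // third request for A exceeds the limit
		lim.Allow(userB, t0),                  // B has its own counter
		lim.Allow(userA, t0.Add(time.Minute)), // A's window has rolled over
	}
}

func main() {
	fmt.Println(demoDecisions()) // prints [true true false true true]
}
```

Leaving `Client` empty collapses this to a single per-provider limit, and running several limiters with different periods side by side would cover per-minute, per-hour and per-day limits at the cost of one counter map each.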


sd2k · 2023-07-26T11:53:44Z

Should be fairly easy using https://pkg.go.dev/go.uber.org/ratelimit#section-readme.
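One thing to weigh when choosing a library: as far as I know, the limiter linked above paces callers by blocking in `Take()` until the next request is due, rather than rejecting them. If the proxy should reject excess requests instead, a token bucket in front of the handler is one alternative. The sketch below uses only the standard library; `tokenBucket`, `limit`, `demoStatuses` and the request path under /openai are all hypothetical names chosen for illustration, not part of this repository.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// tokenBucket holds up to burst tokens and refills at rate tokens per second.
type tokenBucket struct {
	mu     sync.Mutex
	tokens float64
	burst  float64
	rate   float64
	last   time.Time
	now    func() time.Time // injectable clock, so the demo needs no sleeping
}

func newTokenBucket(rate float64, burst int, now func() time.Time) *tokenBucket {
	return &tokenBucket{tokens: float64(burst), burst: float64(burst), rate: rate, last: now(), now: now}
}

// allow refills the bucket for the elapsed time, then spends one token if available.
func (b *tokenBucket) allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()

	t := b.now()
	b.tokens += t.Sub(b.last).Seconds() * b.rate
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.last = t
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// limit wraps next and answers 429 Too Many Requests when the bucket is empty.
func limit(b *tokenBucket, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !b.allow() {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// demoStatuses sends three requests at the same instant, then one a second later,
// against a bucket with a burst of 2 that refills at 1 token per second.
func demoStatuses() []int {
	clock := time.Unix(0, 0)
	bucket := newTokenBucket(1, 2, func() time.Time { return clock })

	// Stand-in for the real upstream proxy handler.
	upstream := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	mux := http.NewServeMux()
	mux.Handle("/openai/", limit(bucket, upstream))

	var statuses []int
	send := func() {
		rec := httptest.NewRecorder()
		req := httptest.NewRequest(http.MethodPost, "/openai/v1/chat/completions", nil)
		mux.ServeHTTP(rec, req)
		statuses = append(statuses, rec.Code)
	}
	send()
	send()
	send()
	clock = clock.Add(time.Second)
	send()
	return statuses
}

func main() {
	fmt.Println(demoStatuses()) // prints [200 200 429 200]
}
```

Blocking is simpler and smooths bursts, but it holds connections open while callers wait; rejecting with 429 gives clients an explicit signal they can surface to the user. Which behaviour suits expensive LLM calls is still an open question here.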


SandersAaronD pushed a commit that referenced this issue Oct 27, 2023

Merge pull request #2 from grafana/sandersaarond/add-start-finish-state

2623938

Add example of indicating when a stream of responses has finished

SandersAaronD added a commit that referenced this issue Dec 12, 2023

Set yarn version in CI (attempt #2)

7e272cf

SandersAaronD added a commit that referenced this issue Dec 20, 2023

Set yarn version in CI (attempt #2)

077855f

SandersAaronD added a commit that referenced this issue Jan 11, 2024

Set yarn version in CI (attempt #2)

fcfed85

SandersAaronD added a commit that referenced this issue Jan 23, 2024

Set yarn version in CI (attempt #2)

1dac6ed

SandersAaronD added a commit that referenced this issue Feb 5, 2024

Set yarn version in CI (attempt #2)

a84d15c
