LLM request cancellations are not reliably propagated to Cody Gateway, so token inference continues until the maximum token limit is reached. This significantly increases load on our inference providers, raising both latency and spend. By the latest estimate, about 2/3 of inferred tokens are "overhead" tokens.
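A minimal sketch of the missing behavior: when a client aborts its request, the gateway should forward that cancellation to the in-flight upstream inference request so the provider stops generating tokens. The helper name `forwardAbort` and the wiring below are illustrative assumptions, not Cody Gateway's actual API.

```typescript
// Hypothetical sketch: wire a client request's abort signal to the
// controller of the corresponding upstream inference request, so that
// cancelling the client call also cancels token generation upstream.
function forwardAbort(clientSignal: AbortSignal, upstream: AbortController): void {
    if (clientSignal.aborted) {
        // Client already cancelled before we started the upstream call.
        upstream.abort()
        return
    }
    // Propagate a future client cancellation to the upstream request.
    clientSignal.addEventListener('abort', () => upstream.abort(), { once: true })
}

// Example wiring (endpoint URL is a placeholder):
// const upstream = new AbortController()
// forwardAbort(clientRequestSignal, upstream)
// await fetch('https://inference-provider.example/v1/complete', {
//     method: 'POST',
//     signal: upstream.signal,
// })
```

With this wiring in place, an aborted client request tears down the upstream stream immediately instead of letting inference run to the token limit.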
See the Google Sheet with the pricing breakdown and approximate potential savings.
This issue is marked as stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed automatically in 5 days.