📦 Release: v0.1.0-rc.1 #280

Merged
merged 51 commits into from
Jun 24, 2024

roma-glushko
Member

@roma-glushko roma-glushko commented Jun 24, 2024

The first major update, with breaking changes to the language chat schemas
and the beginning of work on instrumenting the gateway with OpenTelemetry.

Added

Changed

Breaking Changes

Fixed

Security

Miscellaneous

roma-glushko and others added 30 commits March 3, 2024 00:12
- Started a new type of workflow: a streaming (async) routing workflow, using the Streaming Chat API as an example
- Updated the Bruno collection
- Updated the LanguageModel API to include `ChatStream()` and `SupportChatStream()` methods
- Got the streaming router working
- Implemented SSE event parsing to be able to work with OpenAI streaming chat API
- Integrated OpenAI chat streaming into the Glide's streaming chat API
- Covered the happy path with tests
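
The SSE parsing mentioned above can be sketched roughly as follows. This is a minimal illustration, not Glide's actual parser: the `sseEvent` type and the `parseSSE` function are hypothetical names, and the sketch only handles the `data:` and `event:` fields. It relies on the fact that OpenAI's streaming chat API terminates a stream with a `data: [DONE]` event.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sseEvent is a hypothetical container for one parsed server-sent event.
type sseEvent struct {
	Name string // value of the "event:" field, if any
	Data string // all "data:" lines of the event joined with "\n"
}

// parseSSE splits a text/event-stream body into events.
// Events are separated by blank lines; lines starting with ":" are comments.
// Unlike a strict parser, it also emits a trailing event that is not
// followed by a blank line.
func parseSSE(body string) []sseEvent {
	var events []sseEvent
	var name string
	var data []string

	flush := func() {
		if name != "" || len(data) > 0 {
			events = append(events, sseEvent{Name: name, Data: strings.Join(data, "\n")})
		}
		name, data = "", nil
	}

	scanner := bufio.NewScanner(strings.NewReader(body))
	for scanner.Scan() {
		line := scanner.Text()
		switch {
		case line == "":
			flush() // a blank line ends the current event
		case strings.HasPrefix(line, ":"):
			// comment line, ignored
		case strings.HasPrefix(line, "data:"):
			data = append(data, strings.TrimPrefix(strings.TrimPrefix(line, "data:"), " "))
		case strings.HasPrefix(line, "event:"):
			name = strings.TrimPrefix(strings.TrimPrefix(line, "event:"), " ")
		}
	}
	flush()
	return events
}

func main() {
	stream := "data: {\"delta\":\"Hel\"}\n\ndata: {\"delta\":\"lo\"}\n\ndata: [DONE]\n\n"
	for _, ev := range parseSSE(stream) {
		if ev.Data == "[DONE]" {
			break // end-of-stream marker
		}
		fmt.Println(ev.Data)
	}
}
```

A production reader would work incrementally on the HTTP response body instead of a full string, so chunks can be forwarded as they arrive.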
* 🔒 Upgraded the crypto lib
* ⬆️ Upgrade Go to 1.22.1
* 🔒 Fiber to v2.52.2
…tStream (#166)

- Separated sync and streaming chat schemas
- Extracted assumptions on where to find latency from routing strategies to a separate `LatencyGetters` that can be different for different models/workflows
- Elaborated the client provider `chatStream()` interface. Clients now expose a response channel instead of being provided with one by the caller
- Connected the stream chat workflow to latency & health tracking
- Refined the `chatStream()` method of clients to return a stream struct
- Separated latency tracking of the streaming workflow from the sync chat workflow
- defined a new `HealthTracker` to incorporate all health tracking logic
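
The idea of a client returning a stream struct that exposes its own response channel can be sketched like this. All names here (`chunk`, `chatStream`, `openChatStream`, `collect`) are hypothetical and chosen for illustration; the real Glide types carry more fields and handle errors and cancellation.

```go
package main

import "fmt"

// chunk is a hypothetical stand-in for one piece of a streamed chat response.
type chunk struct{ Text string }

// chatStream is a hypothetical stream handle owned by the provider client.
// The client creates the channel, so the caller never has to supply one.
type chatStream struct {
	chunks chan chunk
}

// Chunks exposes the receive-only side of the channel to the caller.
func (s *chatStream) Chunks() <-chan chunk { return s.chunks }

// openChatStream starts producing chunks in the background and returns
// the stream handle. The channel is closed once all parts are sent.
func openChatStream(parts []string) *chatStream {
	s := &chatStream{chunks: make(chan chunk)}
	go func() {
		defer close(s.chunks)
		for _, p := range parts {
			s.chunks <- chunk{Text: p}
		}
	}()
	return s
}

// collect drains a stream and concatenates the chunk texts.
func collect(s *chatStream) string {
	out := ""
	for c := range s.Chunks() {
		out += c.Text
	}
	return out
}

func main() {
	fmt.Println(collect(openChatStream([]string{"Hel", "lo"})))
}
```

Letting the producer own and close the channel keeps the close responsibility in one place, which is the usual Go convention.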
Improve general coverage of the codebase:
- covered a few configs by tests
- file content expansion in configurations
- Separated chat & chat stream request schemas 
- introduced a new finish reason field 
- added metadata to stream chat response
- allowed attaching metadata to a chat stream request, which is then attached to each chat stream chunk
- adjusted error message schema to include request ID and metadata
Handle the wrong API key case by marking the model as permanently unavailable
…age (#184)

- Fixed the header where Anthropic API key is passed
- Started propagating token usage of Anthropic requests
- Corrected the TokenUsage interface by changing the count fields from floats to integers
* #173: add streaming

* #173: update header and test data

* #173: Update test and schema

* #173: lint

---------

Co-authored-by: Max <[email protected]>
* #171: support streaming

* #171: add tests & lint

* #171: update chat.go

* #171: lint

* #171: update test

---------

Co-authored-by: Max <[email protected]>
…penAI, Azure and Cohere (#194)

- text length bound passed in request params
- content moderation/toxicity
- The Cohere streaming workflow didn't seem to be working because errMapper was never initialized. This PR fixes that
- Cohere now ignores stream chunk types that Glide doesn't support like citation related stuff
- Cohere stream chunks are now set with the correct model name (previously a placeholder was used)
Rendering Durations as strings rather than nanosecond integers
…re chat streams correctly (#201)

- implementing a custom stream reader to correctly handle Cohere streams
- Start handling the stream-start event to propagate generationID to all following chunks
…n case of some errors (#203)

- Passed RouterID and ModelID information in the chat stream messages
- Introduced a new ChatStreamMessage type that joins both chunk and error messages. Removed unneeded context from provider chatStream structs
- defined a set of possible error codes during chat streaming
- started simplifying logging by using context-based loggers
- Introduced finish_reason on the error schema
- Fixed validation of nested arrays, so it can now reach all structures including provider params
- Removed ChatHistory & ConversationID fields from the params
- Added a bunch of other params like max_tokens, penalties, k, p, etc.
- Added validations to some params
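
A message type that joins chunk and error messages, as described for ChatStreamMessage above, could look roughly like the following. The field names and the `describe` helper are hypothetical; only the idea of a single message carrying either a chunk or an error, plus router and model identifiers, comes from the commit notes.

```go
package main

import "fmt"

// chatStreamChunk is a hypothetical payload for a successful stream chunk.
type chatStreamChunk struct {
	ModelID string
	Text    string
}

// chatStreamError is a hypothetical payload for a stream error.
type chatStreamError struct {
	Code    string
	Message string
}

// chatStreamMessage carries either a chunk or an error, never both,
// so a consumer can read everything from a single channel.
type chatStreamMessage struct {
	RouterID string
	Chunk    *chatStreamChunk
	Error    *chatStreamError
}

// describe formats a message depending on which payload is set.
func describe(m chatStreamMessage) string {
	switch {
	case m.Error != nil:
		return fmt.Sprintf("[%s] error %s: %s", m.RouterID, m.Error.Code, m.Error.Message)
	case m.Chunk != nil:
		return fmt.Sprintf("[%s] %s: %s", m.RouterID, m.Chunk.ModelID, m.Chunk.Text)
	default:
		return fmt.Sprintf("[%s] empty message", m.RouterID)
	}
}

func main() {
	fmt.Println(describe(chatStreamMessage{
		RouterID: "r1",
		Chunk:    &chatStreamChunk{ModelID: "m1", Text: "hi"},
	}))
	fmt.Println(describe(chatStreamMessage{
		RouterID: "r1",
		Error:    &chatStreamError{Code: "timeout", Message: "provider timed out"},
	}))
}
```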
…g swagger.yaml file (#211)

This change fixes panics like "./docs/swagger.yaml is not found"
# Conflicts:
#	README.md
#	docs/docs.go
#	docs/swagger.json
#	docs/swagger.yaml
#	go.mod
#	go.sum
#	pkg/api/http/handlers.go
#	pkg/api/http/server.go
#	pkg/api/schemas/chat_stream.go
#	pkg/gateway.go
#	pkg/providers/azureopenai/chat_stream.go
#	pkg/providers/azureopenai/client.go
#	pkg/providers/cohere/chat.go
#	pkg/providers/cohere/chat_stream.go
#	pkg/providers/cohere/chat_stream_test.go
#	pkg/providers/cohere/client.go
#	pkg/providers/cohere/config.go
#	pkg/providers/cohere/schemas.go
#	pkg/providers/cohere/testdata/chat_stream.success.txt
#	pkg/providers/lang.go
#	pkg/providers/openai/chat.go
#	pkg/providers/openai/chat_stream.go
#	pkg/providers/openai/client.go
#	pkg/providers/provider.go
#	pkg/providers/testing/lang.go
#	pkg/providers/testing/models.go
#	pkg/routers/config.go
#	pkg/routers/router.go
#	pkg/routers/router_test.go
roma-glushko and others added 19 commits May 5, 2024 18:32
We use go.opentelemetry.io/contrib/exporters/autoexport for standard loading exporter configurations via env variables.
…s to give clients more context around the error (#236)

- Introduced a new error type to hold useful context like HTTP response status, error name, message
- If all providers are unavailable, the gateway now returns a 503 error instead of a 500
- Started returning unknown_error with a 500 status on unexpected exceptions
- Predefined all static HTTP errors instead of creating them every time they occur
- Introduced the name field on the error schema
- Changed the req/response schema to snake_case (hopefully, to stick with it forever)
- Removed Bruno collections (it doesn't cover all our needs like websocket or gRPC protocol)
- Moved all schemas to `api/schema` package
- Made router list API opaque
- Changed the field name for overrides not to clash with defined statements in some languages
Removing omitempty fields from chat response per request
- Introduced a new concept/struct `ChatParams` that contains all param overrides for the specific modelName/modelID
- Adjusted the LangModel interface to rely on `ChatParams` rather than the original request schema for both sync and stream chat API
- Standardize on the chat message structure with two fields. Removed all duplicated structures
- Fixed Ollama's broken/half-baked tests
Basic implementation of connection pooling for chat functionality.
@roma-glushko roma-glushko changed the title from "📦 Release: v0.1.0" to "📦 Release: v0.1.0-rc.1" on Jun 24, 2024

codecov bot commented Jun 24, 2024

Codecov Report

Attention: Patch coverage is 67.04805% with 144 lines in your changes missing coverage. Please review.

Project coverage is 65.98%. Comparing base (754af34) to head (59aafab).

| Files | Patch % | Lines |
|---|---|---|
| pkg/api/http/handlers.go | 0.00% | 23 Missing ⚠️ |
| pkg/api/schemas/errors.go | 10.52% | 17 Missing ⚠️ |
| pkg/api/schemas/pool.go | 0.00% | 16 Missing ⚠️ |
| pkg/providers/bedrock/chat.go | 54.16% | 11 Missing ⚠️ |
| pkg/gateway.go | 0.00% | 9 Missing ⚠️ |
| pkg/telemetry/telemetry.go | 86.20% | 6 Missing and 2 partials ⚠️ |
| pkg/api/schemas/chat_stream.go | 0.00% | 6 Missing ⚠️ |
| pkg/providers/anthropic/chat.go | 75.00% | 3 Missing and 1 partial ⚠️ |
| pkg/providers/ollama/chat.go | 75.00% | 3 Missing and 1 partial ⚠️ |
| pkg/api/schemas/chat.go | 88.88% | 2 Missing and 1 partial ⚠️ |
| ... and 23 more | | |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #280      +/-   ##
==========================================
- Coverage   66.98%   65.98%   -1.00%     
==========================================
  Files          78       83       +5     
  Lines        3577     3634      +57     
==========================================
+ Hits         2396     2398       +2     
- Misses       1054     1114      +60     
+ Partials      127      122       -5     


@roma-glushko roma-glushko merged commit 1e2b16f into main Jun 24, 2024
15 of 16 checks passed
8 participants