Releases: solliancenet/foundationallm
Release 0.7.1
Improvements
- Fixes a downstream package dependency issue affecting the use of MS Presidio in the Gatekeeper Integration API.
- Fixes the logic for asynchronous vectorization processing while improving performance.
- Fixes the bring-your-own OpenAI deployment pipeline.
- Fixes a permission issue preventing the Gatekeeper API from accessing Azure Content Safety.
Release 0.7.0
Gateway API
The Gateway API is a load balancing and resiliency solution for embeddings. It sits in front of Azure OpenAI, serving vectorization embedding requests with the correct model and automatically handling rate limits.
- Vectorization Text Embedding Profiles can be configured to use GatewayTextEmbedding, complementing the existing SemanticKernelTextEmbedding
- Vectorization with the Gateway API supports only asynchronous requests
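The routing choice above can be sketched as a profile definition. This is an illustrative sketch only: the property names and profile shape below are assumptions, not the exact FoundationaLLM schema.

```python
# Hypothetical sketch of a Vectorization Text Embedding Profile that routes
# embedding calls through the Gateway API instead of calling Azure OpenAI
# directly. Property names are illustrative, not the exact schema.
gateway_profile = {
    "name": "gateway-embedding-profile",
    "type": "text-embedding-profile",
    "settings": {
        # GatewayTextEmbedding delegates model selection and rate-limit
        # handling to the Gateway API.
        "text_embedding": "GatewayTextEmbedding",
        "model_name": "text-embedding-ada-002",
    },
}

# The pre-existing option remains available for direct embedding calls:
semantic_kernel_profile = {
    "name": "sk-embedding-profile",
    "type": "text-embedding-profile",
    "settings": {"text_embedding": "SemanticKernelTextEmbedding"},
}
```

The only switch between the two behaviors is the text_embedding value; everything downstream of the profile stays the same.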
Agent RBAC
Agent-level RBAC enables FoundationaLLM administrators to manage access to individual agents, protecting organizations from data exfiltration. When a user creates an agent through the Management API, they will automatically be granted Owner access.
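Conceptually, agent-level access control reduces to matching a principal's role assignments against the scope of the agent being accessed. The sketch below is a loose illustration of that idea; the scope path format, field names, and matching rules are assumptions, not the actual authorization implementation.

```python
# Conceptual sketch of agent-level RBAC scope matching. The scope format and
# fields are hypothetical, not the real FoundationaLLM authorization API.
def is_authorized(assignments, principal_id, required_scope):
    """Return True if the principal holds any role assignment whose scope
    covers the required scope (exact match or a parent-scope prefix)."""
    for a in assignments:
        if a["principal_id"] != principal_id:
            continue
        scope = a["scope"]
        if required_scope == scope or required_scope.startswith(scope + "/"):
            return True
    return False

# When a user creates an agent via the Management API, an Owner assignment
# scoped to that agent is created automatically (illustrative scope path):
assignments = [{
    "principal_id": "user-123",
    "role": "Owner",
    "scope": "/providers/FoundationaLLM.Agent/agents/sales-agent",
}]
```

With this model, the creator can manage their own agent while requests against any other agent's scope are denied by default.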
Vectorization Request Management Through the Management API
Users can submit and trigger Vectorization requests through the Management API, rather than the separate Vectorization API, improving consistency across the platform. Creating and triggering Vectorization requests are handled as two separate HTTP requests.
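The two-step flow can be sketched as two separate HTTP requests against the Management API. The URL paths and payload fields below are illustrative assumptions; consult the Management API documentation for the real routes.

```python
# Hypothetical sketch of the two-step flow: one request creates the
# vectorization request, a second triggers its processing. Routes and
# field names are assumptions, not the shipped API surface.
BASE = "https://management-api.example.com"

def build_create_request(name, content_source_profile, indexing_profile):
    """First call: create (register) the vectorization request."""
    return {
        "method": "POST",
        "url": f"{BASE}/vectorizationRequests/{name}",
        "body": {
            "content_source_profile": content_source_profile,
            "indexing_profile": indexing_profile,
            "processing_type": "Asynchronous",
        },
    }

def build_trigger_request(name):
    """Second call: trigger processing of the previously created request."""
    return {"method": "POST", "url": f"{BASE}/vectorizationRequests/{name}/process"}
```

Keeping creation and triggering separate lets callers stage a request, inspect or amend it, and start processing later.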
Citations Available in the Chat UI
Responses from Knowledge Management agents that do not use Inline Context now include citations indicating which document from the vector store was used to answer the user's request.
Agent to Agent Conversations
Through the Semantic Kernel API, FoundationaLLM enables robust agent-to-agent interactions. Users can develop complex, multi-agent workflows that perform well across a variety of tasks.
End-to-End Testing Architecture
With the release of 0.7.0, FoundationaLLM has established a comprehensive architecture for end-to-end (E2E) testing.
Improvements
- User portal session linking and loading improvements
- Documentation updates for ACA and AKS deployments
- Added fix to ensure API keys are unique
- Restructured folders and moved files
- Added support for prompt injection detection
- Added support for authorizing multiple resources in a single request
- Vectorization pipeline execution and state management improvements
- Added the ability to invoke external orchestration services
- Added support for synchronous and asynchronous OneLake vectorization
- Added support for GPT-3.5 1106 and GPT-4o
Release 0.6.0
Changes to the 0.6.0 release
This document outlines the changes made to the FoundationaLLM project in the 0.6.0 release.
Zero trust - removing dependencies on API keys
The following components now support Entra ID managed identity-based authentication:
- Azure Cosmos DB service
- Azure OpenAI in LangChain
- AzureAIDirect orchestrator
- AzureOpenAIDirect orchestrator
Citations
Citations (also known as explainability) allow the agent to justify the responses it returns and identify the sources on which they are based. This release includes the API portion of this feature; the UI portion will follow in upcoming releases.
Release 0.5.0
Features
AzureAIDirect orchestrator
Allows pointing agents directly (no orchestrators involved) to any LLM deployed in an Azure Machine Learning workspace (e.g., Llama-2 or Mistral models).
AzureOpenAIDirect orchestrator
Allows pointing agents directly (no orchestrators involved) to any LLM deployed in an Azure OpenAI deployment.
Override LLM parameters in completion requests
A new section is available in completion requests that allows direct overrides of LLM parameters (e.g., top_p, temperature, and logprobs for GPT).
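A request carrying such overrides might look like the sketch below. The property names (settings, model_parameters, and so on) are illustrative assumptions rather than the exact shipped schema.

```python
# Illustrative completion request showing a per-request override section for
# LLM parameters. Field names are assumptions, not the exact schema.
completion_request = {
    "user_prompt": "Summarize the quarterly report.",
    "agent_name": "summarizer",
    "settings": {
        # Direct overrides applied to this completion only; anything not
        # listed falls back to the agent's configured defaults.
        "model_parameters": {
            "temperature": 0.2,
            "top_p": 0.9,
            "logprobs": True,  # GPT-specific
        },
    },
}
```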
RBAC Roles
RBAC roles (Reader, Contributor, and User Access Administrator) are now activated on the Management API, Core API, and Agent Factory API.
Vectorization
- Improved validation of vectorization requests (immediately rejecting requests for unsupported file types).
- Vectorization request processing now stops after N failed attempts at any given step.
- Dynamic pacing of processing in the vectorization worker.
- Support for adding custom metadata to a vectorization request.
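The custom metadata item above can be pictured as free-form key/value pairs attached to the request. The field names in this sketch are assumptions, not the exact vectorization request schema.

```python
# Illustrative vectorization request carrying custom metadata. Field names
# are hypothetical, not the shipped schema.
vectorization_request = {
    "content_identifier": {"canonical_id": "contracts/2024/msa.pdf"},
    "processing_type": "Asynchronous",
    # Free-form key/value metadata that travels with the request and can be
    # carried through to the index alongside the embedded chunks.
    "metadata": {"department": "legal", "review_cycle": "2024-Q1"},
}
```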
Zero trust - removing dependencies on API keys
The following components now have Entra ID managed identity-based authentication support:
- Vectorization content sources
- Resource providers
- Azure AI Search
- Authorization store and API
- Azure AI Content Safety
The following components are getting Entra ID managed identity-based authentication support in the next release:
- Azure Cosmos DB service
- Azure OpenAI in LangChain
- AzureAIDirect orchestrator
- AzureOpenAIDirect orchestrator
Management Portal & API Updates
Data Sources
- Data Sources consolidate Vectorization Content Source Profiles, Text Partitioning Profiles, and Text Embedding Profiles
- Users simply need to create a Data Source and select a target Azure AI Search Index to run end-to-end Vectorization from the Management Portal
- Content Source Profiles, Text Partitioning Profiles, and Text Embedding Profiles will remain available for more advanced use cases
Configuration Management
- Management Portal automatically configures Azure App Configuration keys and Azure Key Vault secrets for new Data Sources
- Management API enables management of all Azure App Configuration keys and Azure Key Vault secrets
API Changes
- Agents
- Core API
  - Session-less completion: the X-AGENT-HINT header has been removed; the agent name is now passed in the JSON body
- Vectorization path casing
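The session-less completion change can be sketched as follows: the agent name moves out of a request header and into the JSON body. The URL and body field names here are illustrative assumptions.

```python
# Hedged sketch of a session-less completion call to the Core API: no chat
# session is created first, and the agent name travels in the JSON body
# rather than in an X-AGENT-HINT header. URL and field names are assumptions.
import json

def build_sessionless_completion(agent_name, user_prompt, bearer_token):
    return {
        "method": "POST",
        "url": "https://core-api.example.com/completions",
        "headers": {
            "Authorization": f"Bearer {bearer_token}",  # Entra ID token
            "Content-Type": "application/json",
        },
        # Agent selection now lives in the body, not a custom header.
        "body": json.dumps({"agent_name": agent_name, "user_prompt": user_prompt}),
    }
```

Moving the agent name into the body keeps the request self-describing and avoids a custom header that intermediaries might strip.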
Release 0.4.2
Features
Fixes the issue with the prompt prefix not being added to the context for the Internal Context agent.
Release 0.4.1
Features
Fixes support for the vectorization of PPTX files.
Release 0.4.0
Features
Management User Interface (UI)
The Management UI enables FoundationaLLM (FLLM) administrators to configure Agents without having to directly call the Management API. With this release, the Management UI has been enhanced to:
- Support all aspects of creating and configuring Knowledge Management Public Agents in the Management Portal. This includes:
  - Selecting existing Content Source and Indexing Profiles
  - Selecting the Chunk and Overlap size settings
  - Configuring user-agent interactions, such as saving conversations and enabling Gatekeeper functionality
  - Allowing the user to create the System Prompt to influence the tone and functionality of the Agent.
Note
If you have not created vectorization profiles required by the agent, you will need to run the Postman collection found here first: Directly calling the Vectorization API | FoundationaLLM.
A future release of the Management UI will support the creation of vectorization profiles.
For more information on these changes, please review the document: Management UI | FoundationaLLM
Vectorization Profiles
The FLLM Vectorization pipelines require Content Source, Text Partitioning, Text Embedding, and Indexing profiles. With this release, you can now:
- Create all profile types without restarting the containers
- Use the Management API to create, read, update, and delete all profile types
For more information on these changes, please review the following documents: Managing vectorization profiles | FoundationaLLM and Directly calling the Vectorization API | FoundationaLLM.
Knowledge Management Agents (LangChain)
The Knowledge Management Agent is responsible for providing users with a customized copilot experience based on its configuration. With this release:
- Create Knowledge Management agents that enable Retrieval Augmented Generation (RAG) workflows targeting vectorized content generated by the Vectorization pipelines.
- Use the Management API to create, read, update, and delete Knowledge Management Agents
- LangChain can be used for Knowledge Management Agents
For more information on these changes, please review the document: Knowledge management agent | FoundationaLLM
Internal Context Agents (LangChain)
The Internal Context Agent provides a pass-through mechanism that sends the user prompt directly to the large language model (LLM) without any additional processing or context. With this release:
- The Internal Context Agent now has its own resource type.
- You can use the Management API to create, read, update, and delete Internal Context agents, where no system prompt or content source is required. This gives users the ability to create their own fully constructed prompt and send it directly to OpenAI.
For more information on these changes, please review the document: Internal context agent | FoundationaLLM
Direct Completion Requests (No need for sessions)
- You can now send a request to an Agent without first creating a chat session.
- Session-less completion is part of the Core API and uses Entra ID authentication
For more information on these changes, please review the document: Core API | FoundationaLLM
Note
Management Client and API App Registrations are a requirement to utilize this functionality.
General Improvements
Management API
Extended configuration validation
- Validation is now in place to ensure that required App Configuration settings, Key Vault secrets, and environment variables are in place before starting the Management API. Nonconforming configuration parameters will be written to the logs indicating what is missing, and the Management API will not start.
Improved data validation
- Added more thorough validation and reporting of data validation issues when creating Agents and Vectorization profiles.
- Optional fields are no longer required for validation.
For more information on these changes, please review the document: Azure App Configuration values | FoundationaLLM
Generalized Events Platform
- The new Events Platform was added to propagate changes across supported services. You no longer need to restart the services when configuration changes are made or new resources are created.
- There are two different types of events: Storage events for handling Azure Blob Storage events, and Vectorization events for handling Vectorization pipeline requests.
- The following services can now take advantage of the Generalized Events platform:
- Core API supports Storage Events
- Agent Factory API supports Storage Events
- Vectorization API supports Storage and Vectorization Events
- Vectorization Worker supports Storage and Vectorization Events
- Management API supports Storage and Vectorization Events
OpenTelemetry
- Telemetry determines how logs are collected from services. It was identified that Python services were not functioning well with the App Insights SDK, so all Python and .NET services now use OpenTelemetry. All logs are still written to Azure Application Insights for simplified analysis.
Note
The Gatekeeper Integration API does not use OpenTelemetry in this release.
Performance and Stability Improvements
- Added new management endpoints on each of the Python APIs to allow resource providers and App Configuration values to be refreshed. Those endpoints can be called from the Management UI to perform that action.
Maintenance & Bug Fixes
- Credentials for API Management calls to API backends are now explicitly set.
- Updated the Bicep template so that the App Gateway uses OWASP 3.2.
- Disabled the Key Vault setting: Allow trusted Microsoft services to bypass this firewall.
- Updated the script to generate a host file for standard AKS.
- Updated the versions of major Python libraries: FastAPI, LangChain, OpenAI, and gunicorn.
- Updated all resource providers to return "404 Not Found" instead of "500 Internal Error" when resources are not found.
- The Management API no longer requires the object_id field when creating new resources.
- The Vectorization pipelines can now process PowerPoint (.pptx) files.