TypeAgent TypeScript Code

Overview

TypeAgent is sample code that explores an architecture for building a personal agent with a natural language interface using TypeChat. The personal agent can work with application agents.

This directory contains the packages implemented in TypeScript and the main entry point for TypeAgent. For more details about the project, please review the TypeAgent ReadMe.

The main entry point to explore TypeAgent is the TypeAgent Shell example. Currently, we only support running from the repo (i.e. no published/installable builds). Follow the instructions below to build and run the TypeAgent Shell example.

Build

Setup

To build:

  • Install Node 18+
    • NOTE: HPC Tools conflict with node so be sure that the node.exe you are running is the correct one!
  • Install pnpm (npm i -g pnpm && pnpm setup)
  • (Linux/WSL Only) Read TypeAgent Shell's README.md for additional requirements

Steps

In this directory:

  • Run pnpm i
  • Run pnpm run build

Agent Specific Steps (Optional)

VSCode Agent

If you want to deploy the VS Code extension CODA locally, please run:

  • From the root, cd ./ts/packages/coda
  • pnpm run deploy:local

You should now be able to access the extension from VS Code.

Desktop Agent (Windows only)

To use the Desktop Agent (./packages/agents/desktop/) for Windows, follow the instructions in its README.md to build the AutoShell C# code necessary to interact with the OS.

Local Whisper Service (Optional)

If you want to use a local Whisper service for voice input in the TypeAgent Shell, please follow the instructions in the README.md in the Python whisperService directory.

Running Prerequisites

Service Keys

Multiple services are required to run the scenarios. Put the necessary keys in the .env file in this directory (the TypeAgent repo's ./ts directory).

Here is an example of the minimal .env file targeting Azure:

AZURE_OPENAI_API_KEY=<service key>
AZURE_OPENAI_ENDPOINT=<endpoint URL for LLM model, e.g. GPT-4o>
AZURE_OPENAI_RESPONSE_FORMAT=1

AZURE_OPENAI_API_KEY_EMBEDDING=<service key>
AZURE_OPENAI_ENDPOINT_EMBEDDING=<endpoint URL for text-embedding-ada-002 or equivalent>

Here is an example of the minimal .env file targeting OpenAI:

OPENAI_ORGANIZATION=<organization id>
OPENAI_API_KEY=<service key>
OPENAI_ENDPOINT=https://api.openai.com/v1/chat/completions
OPENAI_MODEL=gpt-4o
OPENAI_RESPONSE_FORMAT=1

OPENAI_ENDPOINT_EMBEDDING=https://api.openai.com/v1/embeddings
OPENAI_MODEL_EMBEDDING=text-embedding-ada-002
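
Before launching, it can help to confirm that one of the two minimal configurations above is complete. The following is a minimal sketch of such a check, not part of TypeAgent: the helper names `missingVariables` and `configuredProvider` are hypothetical, and only the variable names come from the examples above.

```typescript
// Variable names are copied from the two minimal .env examples above.
const AZURE_MINIMAL: readonly string[] = [
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_RESPONSE_FORMAT",
    "AZURE_OPENAI_API_KEY_EMBEDDING",
    "AZURE_OPENAI_ENDPOINT_EMBEDDING",
];

const OPENAI_MINIMAL: readonly string[] = [
    "OPENAI_ORGANIZATION",
    "OPENAI_API_KEY",
    "OPENAI_ENDPOINT",
    "OPENAI_MODEL",
    "OPENAI_RESPONSE_FORMAT",
    "OPENAI_ENDPOINT_EMBEDDING",
    "OPENAI_MODEL_EMBEDDING",
];

type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the names in `required` that are unset or blank.
function missingVariables(env: Env, required: readonly string[]): string[] {
    return required.filter((name) => (env[name] ?? "").trim() === "");
}

// Hypothetical helper: reports which minimal configuration, if any, is complete.
function configuredProvider(env: Env): "azure" | "openai" | undefined {
    if (missingVariables(env, AZURE_MINIMAL).length === 0) return "azure";
    if (missingVariables(env, OPENAI_MINIMAL).length === 0) return "openai";
    return undefined;
}
```

In a Node script, `process.env` could be passed as `env` once the .env file has been loaded; `missingVariables` then gives the list of names to report to the user.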

The following functionality requires service keys. Please read the links for details about the variables needed. It is possible to use a "keyless" configuration for some APIs. See Keyless API Access below.

Minimum requirements to try out the experience with the List TypeAgent:

| Requirements | Functionality | Variables | Instructions | Keyless Access Supported |
| --- | --- | --- | --- | --- |
| LLM (GPT-4 or equivalent) | Request translation | AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_RESPONSE_FORMAT; or OPENAI_API_KEY, OPENAI_ORGANIZATION, OPENAI_ENDPOINT, OPENAI_MODEL, OPENAI_RESPONSE_FORMAT | TypeChat instruction | Yes |
| Embeddings | Conversation Memory; Desktop App name Fuzzy match | AZURE_OPENAI_API_KEY_EMBEDDING, AZURE_OPENAI_ENDPOINT_EMBEDDING; or OPENAI_ENDPOINT_EMBEDDING, OPENAI_MODEL_EMBEDDING, OPENAI_API_KEY_EMBEDDING (optional if different from OPENAI_API_KEY) | | Yes |

Optional requirements

| Requirements | Functionality | Variables | Instructions | Keyless Access Supported |
| --- | --- | --- | --- | --- |
| Speech to Text service | Voice input (shell only) | SPEECH_SDK_ENDPOINT, SPEECH_SDK_KEY, SPEECH_SDK_REGION | Shell setup instruction | Yes |

Additional keys required for individual AppAgents (Optional if not using these AppAgents)

| Requirements | Functionality | Variables | Instructions | Keyless Access Supported |
| --- | --- | --- | --- | --- |
| Bing Search API | Chat Lookup | BING_API_KEY | | No |
| GPT-3.5 Turbo | Fast Chat Response; Email content generation | AZURE_OPENAI_API_KEY_GPT_35_TURBO, AZURE_OPENAI_ENDPOINT_GPT_35_TURBO | | Yes |
| Spotify Web API | Music player | SPOTIFY_APP_CLI, SPOTIFY_APP_CLISEC, SPOTIFY_APP_PORT | Music player setup | No |
| Graph Application | Calendar/Email | MSGRAPH_APP_CLIENTID, MSGRAPH_APP_CLIENTSECRET, MSGRAPH_APP_TENANTID | | No |
| GPT-4o | Browser - Crossword Page | AZURE_OPENAI_API_KEY_GPT_4_O, AZURE_OPENAI_ENDPOINT_GPT_4_O | | Yes |
| Bing Maps Location Rest API | Browser - PaleoBioDB set Lat/Longitude action | BING_MAPS_API_KEY | | No |

Other examples in the examples directory may have additional service key requirements. See the README in those examples for more detail.

Read the Debugging section for additional service keys that can be used for debugging.

Using Azure Key Vault to manage keys

The getKeys script is provided for developer convenience to manage service secrets using Azure Key Vault and to set up local development environments.

To set up:

  • Install the latest Azure CLI
  • Run az login to log in using the CLI.
  • Run az account set --subscription <Subscription Id> to set the subscription.
  • Create an Azure Key Vault with name <name>.

To update keys on the key vault:

  • Add or change the values in the .env file
  • Add any new key names to tools/scripts/getKeys.config.json
  • Run npm run getKeys -- push [--vault <name>]. (If the --vault option is omitted, the default vault name in tools/scripts/getKeys.config.json is used.)
  • Check in the changes to tools/scripts/getKeys.config.json

To get the required config and keys saved to the .env file under the ts folder:

  • Run npm run getKeys [--vault <name>] at the root to pull secrets from the key vault with <name>. (If the --vault option is omitted, the default vault name in tools/scripts/getKeys.config.json is used.)

Note: Shared keys don't include the Spotify integration; those keys can be created using the Spotify API keys instructions.

Keyless API Access

For additional security, it is possible to run a subset of the TypeAgent endpoints in a keyless environment. Instead of using keys, the provided examples can use Azure Entra user identities to authenticate against endpoints. To use this approach, modify the .env file and specify identity as the key value. You must also configure your services to use RBAC and assign users access to the correct roles for each endpoint. Please see the tables above to determine keyless endpoint support.
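
As an illustration of the key-versus-identity switch described above, here is a minimal sketch of how a client might choose its request headers. It is not TypeAgent's implementation: `usesIdentity` and `authHeaders` are hypothetical names, the case-insensitive comparison is an assumption, and `getToken` stands in for whatever obtains an Entra access token (for example a credential from @azure/identity, which in practice is asynchronous). Azure OpenAI accepts a key in the `api-key` header or an Entra token in a `Bearer` authorization header.

```typescript
type AuthHeaders = Record<string, string>;

// Assumption: the literal value "identity" in a key variable selects keyless auth.
// Ignoring case and surrounding whitespace is an illustrative choice.
function usesIdentity(keyValue: string): boolean {
    return keyValue.trim().toLowerCase() === "identity";
}

// Hypothetical helper: builds headers for either authentication mode.
// `getToken` is a placeholder for real (normally async) token acquisition.
function authHeaders(keyValue: string, getToken: () => string): AuthHeaders {
    if (usesIdentity(keyValue)) {
        // Keyless: authenticate with an Entra access token.
        return { Authorization: `Bearer ${getToken()}` };
    }
    // Key-based: send the service key in the api-key header.
    return { "api-key": keyValue };
}
```

With AZURE_OPENAI_API_KEY=identity in the .env file, a caller built this way would never send a service key; access is then governed entirely by the RBAC role assignments on the endpoint.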

Just-in-time Access

TypeAgent also supports a least-privilege security approach using Azure Entra Privileged Identity Management. Elevate.js is a script used to automate elevation. Default configuration options for elevation (duration, justification message, etc.) are stored in tools/scripts/elevate.config.json. A typical developer workflow is to run npm run elevate once at the beginning of each workday.

To learn more about JIT access: start here.

WSL

TypeAgents that operate on the Microsoft Graph (e.g. Calendar and Email) leverage @azure/identity for authentication and use @azure/identity-cache-persistence to cache on the local machine.

Install the following packages if you are in a WSL2 environment (please restart the shell after running the commands below):

  sudo apt-get update
  sudo apt install -y gnome-keyring

After the step above, you will need to enter a password to protect the secrets in the keyring. The popup normally appears when you restart the shell and run code that needs to persist secrets in the keyring.

Running

There are two main apps to start exploring TypeAgent: TypeAgent Shell and TypeAgent CLI. Both provide an interactive agent experience with natural language interfaces via a shared dispatcher package that implements the core TypeAgent functionality. Currently, we only support running from the repo (i.e. no published/installable builds).

Shell

TypeAgent Shell provides a lightweight GUI for an interactive agent experience with natural language interfaces.

  • Run pnpm run shell.

Alternatively, you can go to the shell directory ./packages/shell and start from there. Please see the instructions in TypeAgent Shell's README.md.

CLI

TypeAgent CLI provides a console-based interactive agent experience with natural language interfaces. Additional console commands are available to explore different parts of TypeAgent's functionality.

  • Run pnpm run cli to list the available commands
  • Run pnpm run cli -- interactive to start the interactive prompt

Alternatively, you can go to the CLI directory ./packages/cli and start from there. Please see the instructions in TypeAgent CLI's README.md for more options and detail.

Development

Main packages and directory structure

Apps:

Libraries:

Agents with natural language interfaces:

Other directories:

  • examples: various additional standalone explorations.
  • tools: tools for CI/CD and internal development environments.

Testing

Run npm run test at the root.

Schema Changes

If a translator or explainer is added, or any translator schema or explanation schema changes, the built-in construction cache and the test data need to be regenerated and evaluated for correctness.

Test data is located in the dispatcher's test/data directory. Each test data file is for a specific translator and explainer.

Use the agent-cli data add command to add new test cases.

To regenerate, run the following at the root or in the cli directory:

  • npm run regen:builtin - Regenerate builtin construction store.
  • npm run regen - Regenerate test data

To evaluate correctness for the test data:

  • agent-cli data diff <file> opens a diff of the test data file in VSCode.
  • Look at the translation to check that it's correct (can be skipped if the translator schema didn't change).
  • Run npm run test to make sure the generated test data can be round-tripped (this also runs in CI).
  • Check the stats from the regen before and after:
    • npm run regen -- -- --none at the root prints per-file stats and total stats.
    • Make sure that the number of explanation failures, per file and in total, stays roughly the same (or improves).
    • Make sure that the attempts (corrections) ratios stay roughly the same (or improve).
    • Examine whether the failures are caused by LLM instability:
      • Borderline failure: were there many corrections before, and does it fail now?
      • Run the explanation before and after with agent-cli explain --repeat 5 <RequestAction> to repeat it 5 times and compare the stats.
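
One way to make "roughly the same" concrete is to compare the before and after stats against a fixed tolerance. The sketch below is illustrative only: the `RegenStats` shape, its field names, and the default tolerance of 0.02 are assumptions, not the output format of npm run regen, so the numbers would need to be copied in by hand from the printed stats.

```typescript
// Hypothetical shape: these field names are illustrative, not the regen tool's format.
interface RegenStats {
    requests: number; // number of test requests processed
    explanationFailures: number; // requests whose explanation failed
    attempts: number; // total attempts, including corrections
}

// Fraction of requests whose explanation failed.
function failureRate(stats: RegenStats): number {
    return stats.requests === 0 ? 0 : stats.explanationFailures / stats.requests;
}

// Average number of attempts per request (1.0 means no corrections were needed).
function attemptsRatio(stats: RegenStats): number {
    return stats.requests === 0 ? 0 : stats.attempts / stats.requests;
}

// Returns the metrics that got worse by more than `tolerance` (an assumed threshold).
function regressions(before: RegenStats, after: RegenStats, tolerance = 0.02): string[] {
    const worse: string[] = [];
    if (failureRate(after) - failureRate(before) > tolerance) {
        worse.push("explanation failure rate");
    }
    if (attemptsRatio(after) - attemptsRatio(before) > tolerance) {
        worse.push("attempts ratio");
    }
    return worse;
}
```

An empty result means both metrics stayed within tolerance or improved. A non-empty result is a prompt to check for LLM instability with the repeated explain run before concluding that the schema change caused the regression.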

Linting

The repo is set up with prettier to help with consistent code style. Run npm run lint to check and npm run lint:fix to fix any issues.

Debugging

Starting Development version of TypeAgent CLI

Go to ./packages/cli and run ./bin/dev.js; you don't have to build first. It uses ts-node and builds the TypeScript as it goes.

Launching from VSCode

If you open this directory as a workspace in VSCode, multiple launch tasks are defined to quickly start debugging.

Common debug launch tasks:

  • CLI interactive - ./packages/cli/bin/run.js interactive
  • CLI (dev) interactive - ./packages/cli/bin/dev.js interactive with a new command prompt
  • CLI (dev) interactive [Integrated Terminal] - ./bin/dev.js interactive using VSCode terminal (needed for WSL)

Attaching to running sessions

To attach to an existing session of TypeAgent CLI's interactive mode or TypeAgent Shell, start the inspector by issuing the command @debug, then use the VSCode Attach debugger launch task to attach.

TypeAgent Shell Browser Process

In the TypeAgent Shell, pressing F12 brings up the devtools to debug the browser process.

Tracing

The project uses the debug package to enable tracing. There are two options to enable these traces:

Option 1: Set the namespace pattern in the DEBUG environment variable (e.g. DEBUG=typeagent:prompt) before starting the program.

For example (on Linux), to trace the GPT prompts that get sent when running the interactive CLI:

DEBUG=typeagent:prompt packages/cli/bin/run.js interactive

Option 2: In the shell or the CLI's interactive mode, issue the command @trace <pattern> to add to the list of namespaces. Use "-" or "-*" to disable all traces.

For example, inside the CLI's interactive mode, enter:

@trace *:prompt:*

Searching the code base for '"typeagent:' will list all the traces available.
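
To make the namespace patterns more concrete: the debug package treats `*` as a wildcard, separates patterns with commas or whitespace, and excludes any name matching a pattern prefixed with `-`. The sketch below is a simplified approximation of that matching, not the package's own code, and `isTraced` is a hypothetical name. It does not model the @trace shortcuts ("-" and "-*") described above.

```typescript
// Escape regular-expression metacharacters so only "*" acts as a wildcard.
function escapeRegExp(text: string): string {
    return text.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
}

// Turn one namespace pattern into an anchored regular expression.
function patternToRegExp(pattern: string): RegExp {
    const body = pattern.split("*").map(escapeRegExp).join(".*");
    return new RegExp(`^${body}$`);
}

// Hypothetical helper: is `namespace` enabled by a DEBUG-style pattern list?
function isTraced(namespace: string, patterns: string): boolean {
    const include: RegExp[] = [];
    const exclude: RegExp[] = [];
    for (const raw of patterns.split(/[\s,]+/)) {
        if (raw === "") continue;
        if (raw.charAt(0) === "-") {
            exclude.push(patternToRegExp(raw.slice(1)));
        } else {
            include.push(patternToRegExp(raw));
        }
    }
    // Exclusions win over inclusions.
    if (exclude.some((re) => re.test(namespace))) return false;
    return include.some((re) => re.test(namespace));
}
```

Under this sketch, `isTraced("typeagent:prompt", "typeagent:*")` is true, while adding an exclusion as in `"typeagent:*,-typeagent:prompt"` turns that one namespace off and leaves the rest of the `typeagent:` family enabled.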

Logging

TypeAgent does not collect telemetry by default. Developers can enable logging to a MongoDB database for internal debugging purposes by providing a MongoDB connection string in the MONGODB_CONNECTION_STRING variable in the .env file.

Alternate LLM

Other LLMs can be substituted for GPT-4 as long as they are REST API compatible. To use a local model, the following environment variables can be used:

OPENAI_API_KEY_LOCAL=None
OPENAI_ENDPOINT_LOCAL=
OPENAI_MODEL_LOCAL=
OPENAI_ORGANIZATION_LOCAL=
OPENAI_RESPONSE_FORMAT_LOCAL=
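
The variables above follow the same naming pattern as the base OPENAI_* settings, with a _LOCAL suffix appended. The sketch below shows one way such suffixed settings could be read. It is an illustration of the naming convention, not TypeAgent's actual loader: `readOpenAISettings` and the `OpenAISettings` shape are hypothetical, and treating the endpoint as the only mandatory value is an assumption.

```typescript
type EnvVars = Record<string, string | undefined>;

// Hypothetical settings shape for illustration.
interface OpenAISettings {
    apiKey: string | undefined;
    endpoint: string;
    model: string | undefined;
    organization: string | undefined;
    responseFormat: string | undefined;
}

// Hypothetical helper: reads OPENAI_* variables, optionally with a suffix such as "LOCAL".
function readOpenAISettings(env: EnvVars, suffix?: string): OpenAISettings {
    const name = (base: string): string => (suffix ? `${base}_${suffix}` : base);
    const endpoint = env[name("OPENAI_ENDPOINT")];
    if (!endpoint) {
        // Assumption: without an endpoint there is nothing to call.
        throw new Error(`${name("OPENAI_ENDPOINT")} is not set`);
    }
    return {
        apiKey: env[name("OPENAI_API_KEY")],
        endpoint,
        model: env[name("OPENAI_MODEL")],
        organization: env[name("OPENAI_ORGANIZATION")],
        responseFormat: env[name("OPENAI_RESPONSE_FORMAT")],
    };
}
```

Calling `readOpenAISettings(env, "LOCAL")` reads OPENAI_ENDPOINT_LOCAL, OPENAI_MODEL_LOCAL, and so on, while omitting the suffix reads the base variables used in the minimal OpenAI configuration.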

User data location

To share user data with other developers for debugging, please look for the folder .typeagent under %USERPROFILE% on Windows and the home directory ~/ on WSL/Linux/macOS.
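
For scripts that need to locate this folder, the rule above can be sketched as a small function. This is a hedged illustration rather than TypeAgent's own lookup: `userDataDir` is a hypothetical name, and reading the home directory from the HOME variable on non-Windows platforms is an assumption.

```typescript
// Hypothetical helper: resolves the .typeagent folder for a given platform.
// `platform` uses Node's naming ("win32", "linux", "darwin").
function userDataDir(
    platform: string,
    env: Record<string, string | undefined>,
): string | undefined {
    const isWindows = platform === "win32";
    // %USERPROFILE% on Windows; HOME (assumed to back "~") elsewhere.
    const home = isWindows ? env["USERPROFILE"] : env["HOME"];
    if (!home) return undefined;
    const separator = isWindows ? "\\" : "/";
    // Avoid a doubled separator when the home path already ends with one.
    const base = home.endsWith(separator) ? home.slice(0, -1) : home;
    return `${base}${separator}.typeagent`;
}
```

In Node, `userDataDir(process.platform, process.env)` would apply this rule to the current machine; `os.homedir()` is an alternative way to obtain the home directory on every platform.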

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.