Instrumenting Cohere (#202)
* Setting session to none if server does not return 200 for /sessions

* Removed wrong crew link

* Fixes

* WIP. chat working

* Working. Couple bugs

* WIP. Working

* Working

* Adding placeholders

* Done

* PR fixes

* Tested openai

* Adding Cohere to README

* Bumping version

* Final fixes
HowieG authored May 10, 2024
1 parent 4b16778 commit 9acda5a
Showing 7 changed files with 415 additions and 144 deletions.
141 changes: 99 additions & 42 deletions README.md
@@ -25,37 +25,42 @@

# AgentOps 🖇️

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
![PyPI - Version](https://img.shields.io/pypi/v/agentops)
<a href="https://pepy.tech/project/agentops">
<img src="https://static.pepy.tech/badge/agentops/month">
</a>
<a href="https://twitter.com/agentopsai">
<img src="https://img.shields.io/badge/follow-%40agentops-1DA1F2?logo=twitter&style=social" alt="AgentOps Twitter"/>
</a>
<a href="https://discord.gg/mKW3ZhN9p2">
<img src="https://img.shields.io/badge/chat-on%20Discord-blueviolet" alt="Discord community channel"/>
</a>
<a href="mailto:[email protected]">
<img src="https://img.shields.io/website?color=%23f26522&down_message=Y%20Combinator&label=Not%20Backed%20By&logo=ycombinator&style=flat-square&up_message=Y%20Combinator&url=https%3A%2F%2Fwww.ycombinator.com"/>
</a>
<a href="https://github.com/agentops-ai/agentops/issues">
<img src="https://img.shields.io/github/commit-activity/m/agentops-ai/agentops" alt="git commit activity"/>
</a>

AgentOps helps developers build, evaluate, and monitor AI agents, with tools that take agents from prototype to production.


| | |
| ------------------------------------- | ------------------------------------------------------------- |
| 📊 **Replay Analytics and Debugging** | Step-by-step agent execution graphs |
| 💸 **LLM Cost Management** | Track spend with LLM foundation model providers |
| 🧪 **Agent Benchmarking** | Test your agents against 1,000+ evals |
| 🔐 **Compliance and Security** | Detect common prompt injection and data exfiltration exploits |
| 🤝 **Framework Integrations** | Easily plugs in with frameworks like CrewAI and LangChain |

## Quick Start ⌨️

```bash
pip install agentops
```

### Session replays in 3 lines of code

Initialize the AgentOps client and automatically get analytics on every LLM call.

```python
@@ -77,7 +82,6 @@ agentops.end_session('Success')

All your sessions are available on the [AgentOps dashboard](https://app.agentops.ai?ref=gh). Refer to our [API documentation](http://docs.agentops.ai) for detailed instructions.


<details open>
<summary>Agent Dashboard</summary>
<a href="https://app.agentops.ai?ref=gh">
@@ -105,14 +109,19 @@ All your sessions are available on the [AgentOps dashboard](https://app.agentops

Build Crew agents with observability in only 2 lines of code. Simply set an `AGENTOPS_API_KEY` in your environment, and your crews will get automatic monitoring on the AgentOps dashboard.

AgentOps is officially supported on Crew's latest rc branch: `crewai==0.28.9rc1`
AgentOps is integrated with CrewAI on a pre-release fork. Install CrewAI with:

```bash
pip install git+https://github.com/AgentOps-AI/crewAI.git@main
```

- [AgentOps integration example](https://docs.agentops.ai/v1/integrations/crewai)
- [Official CrewAI documentation](https://docs.crewai.com/how-to/AgentOps-Observability)
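
Since the CrewAI integration relies on `AGENTOPS_API_KEY` being present in the environment, a script can check for it before building a crew. This is a minimal sketch using only the standard library; the helper name `require_agentops_key` is hypothetical and is not part of AgentOps or CrewAI.

```python
import os


def require_agentops_key(env=None):
    """Return the AgentOps API key from the environment, or raise if it is unset.

    `require_agentops_key` is a hypothetical helper for illustration only.
    """
    # Accept an explicit mapping so the check can be tested without touching os.environ.
    env = os.environ if env is None else env
    key = env.get("AGENTOPS_API_KEY", "").strip()
    if not key:
        raise RuntimeError("AGENTOPS_API_KEY is not set; export it before running your crew")
    return key
```

Failing early like this makes a missing key show up as a clear error at startup rather than as sessions that silently never reach the dashboard.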

### LangChain 🦜🔗

AgentOps works seamlessly with applications built using LangChain. To use the handler, install LangChain as an optional dependency:

```shell
pip install agentops[langchain]
```
@@ -142,41 +151,89 @@ agent = initialize_agent(tools,

Check out the [LangChain Examples Notebook](./examples/langchain_examples.ipynb) for more details, including async handlers.

### Cohere

First-class support for Cohere (>=5.4.0). This is a living integration; if you need any added functionality, please message us on Discord!

- [AgentOps integration example](https://docs.agentops.ai/v1/integrations/cohere)
- [Official Cohere documentation](https://docs.cohere.com/reference/about)

```bash
pip install cohere
```

```python python
import cohere
import agentops

# Beginning of program's code (e.g. main.py, __init__.py)
agentops.init("<INSERT YOUR API KEY HERE>")
co = cohere.Client()

chat = co.chat(
    message="Is it pronounced ceaux-hear or co-hehray?"
)

print(chat)

agentops.end_session('Success')
```

```python python
import cohere
import agentops

# Beginning of program's code (e.g. main.py, __init__.py)
agentops.init("<INSERT YOUR API KEY HERE>")

co = cohere.Client()

stream = co.chat_stream(
    message="Write me a haiku about the synergies between Cohere and AgentOps"
)

for event in stream:
    if event.event_type == "text-generation":
        print(event.text, end='')

agentops.end_session('Success')
```
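
The streaming loop above prints each chunk as it arrives. If the full reply is also needed afterwards, the same filter can collect the chunks into one string. The sketch below runs without network access by using stand-in event objects; `StubEvent` and `collect_text` are hypothetical names, and only the `event_type` and `text` attributes used in the example above are assumed.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class StubEvent:
    """Stand-in for a streamed chat event (hypothetical, for illustration)."""
    event_type: str
    text: str = ""


def collect_text(stream: Iterable) -> str:
    """Join the text of every "text-generation" event, skipping other event types."""
    return "".join(
        event.text for event in stream if event.event_type == "text-generation"
    )


if __name__ == "__main__":
    # Sample events; the non-text events are ignored by the filter.
    fake_stream = [
        StubEvent("stream-start"),
        StubEvent("text-generation", "Logs flow "),
        StubEvent("text-generation", "like water"),
        StubEvent("stream-end"),
    ]
    print(collect_text(fake_stream))
```

With a real client, the iterable returned by `co.chat_stream(...)` would take the place of `fake_stream`.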

### LlamaIndex 🦙

(Coming Soon)

## Time travel debugging 🔮

(coming soon!)

## Agent Arena 🥊

(coming soon!)

## Evaluations Roadmap 🧭

| Platform | Dashboard | Evals |
| ---------------------------------------------------------------------------- | ------------------------------------------ | -------------------------------------- |
| ✅ Python SDK | ✅ Multi-session and Cross-session metrics | ✅ Custom eval metrics |
| 🚧 Evaluation builder API | ✅ Custom event tag tracking | 🔜 Agent scorecards |
| [JavaScript/TypeScript SDK](https://github.com/AgentOps-AI/agentops-node) | ✅ Session replays | 🔜 Evaluation playground + leaderboard |

## Debugging Roadmap 🧭

| Performance testing | Environments | LLM Testing | Reasoning and execution testing |
| ----------------------------------------- | ----------------------------------------------------------------------------------- | ------------------------------------------- | ------------------------------------------------- |
| ✅ Event latency analysis | 🔜 Non-stationary environment testing | 🔜 LLM non-deterministic function detection | 🚧 Infinite loops and recursive thought detection |
| ✅ Agent workflow execution pricing | 🔜 Multi-modal environments | 🚧 Token limit overflow flags | 🔜 Faulty reasoning detection |
| 🚧 Success validators (external) | 🔜 Execution containers | 🔜 Context limit overflow flags | 🔜 Generative code validators |
| 🔜 Agent controllers/skill tests | ✅ Honeypot and prompt injection detection ([PromptArmor](https://promptarmor.com)) | 🔜 API bill tracking | 🔜 Error breakpoint analysis |
| 🔜 Information context constraint testing | 🔜 Anti-agent roadblocks (e.g. CAPTCHAs) | 🔜 CI/CD integration checks | |
| 🔜 Regression testing | 🔜 Multi-agent framework visualization | | |

### Why AgentOps? 🤔

Our mission is to bring your agent from prototype to production.

Agent developers often work with little to no visibility into agent testing performance. This means their agents never leave the lab. We're changing that.

AgentOps is the easiest way to evaluate, grade, and test agents. Is there a feature you'd like to see AgentOps cover? Just raise it in the issues tab, and we'll work on adding it to the roadmap.
8 changes: 4 additions & 4 deletions agentops/client.py
@@ -255,7 +255,7 @@ def start_session(self, tags: Optional[List[str]] = None, config: Optional[Confi
            self._session = None
            return logger.warning("🖇 AgentOps: Cannot start session")

        logger.info('View info on this session at https://app.agentops.ai/drilldown?session_id=%s',
                    self._session.session_id)

        return self._session.session_id
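
The fragment above is the tail of `start_session`: when the worker cannot open a session, the client clears `_session` and returns without an ID. The pattern can be sketched in isolation as below. `StubWorker`, `MiniClient`, and the boolean returned by the worker are assumptions made for illustration, not the actual AgentOps classes.

```python
import logging
import uuid
from typing import Optional

logger = logging.getLogger("sketch")


class StubWorker:
    """Hypothetical worker; `ok` simulates whether the server accepted the session."""

    def __init__(self, ok: bool):
        self.ok = ok

    def start_session(self, session_id: str) -> bool:
        return self.ok


class MiniClient:
    """Hypothetical client showing the reset-on-failure pattern."""

    def __init__(self, worker: StubWorker):
        self._worker = worker
        self._session: Optional[str] = None

    def start_session(self) -> Optional[str]:
        self._session = str(uuid.uuid4())
        if not self._worker.start_session(self._session):
            # Drop the half-started session so later calls see no active session.
            self._session = None
            logger.warning("Cannot start session")
            return None
        return self._session
```

Resetting the attribute on failure means a later `end_session` call can detect that there is nothing to end instead of acting on a session the server never accepted.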
@@ -277,19 +277,19 @@ def end_session(self,

        if not any(end_state == state.value for state in EndState):
            return logger.warning("🖇 AgentOps: Invalid end_state. Please use one of the EndState enums")

        if self._worker is None or self._worker._session is None:
            return logger.warning("🖇 AgentOps: Cannot end session - no current worker or session")

        self._session.video = video
        self._session.end_session(end_state, end_state_reason)
        token_cost = self._worker.end_session(self._session)

        if token_cost == 'unknown':
            print('🖇 AgentOps: Could not determine cost of run.')
        else:
            token_cost_d = Decimal(token_cost)
            print('\n🖇 AgentOps: This run cost ${}'.format('{:.2f}'.format(
                token_cost_d) if token_cost_d == 0 else '{:.6f}'.format(token_cost_d)))
        self._session = None
        self._worker = None
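
The end-state validation and cost-formatting steps in the fragment above can be tried on their own. In this sketch the `EndState` enum and its three values are assumptions (the README examples only show `'Success'` being passed), and `is_valid_end_state` and `format_run_cost` are hypothetical helper names.

```python
from decimal import Decimal
from enum import Enum


class EndState(Enum):
    # Assumed values for illustration; only 'Success' appears in the README examples.
    SUCCESS = "Success"
    FAIL = "Fail"
    INDETERMINATE = "Indeterminate"


def is_valid_end_state(end_state: str) -> bool:
    """Mirror the `any(...)` check: the string must equal one enum value exactly."""
    return any(end_state == state.value for state in EndState)


def format_run_cost(token_cost: str) -> str:
    """Format the cost: two decimals for exactly zero, six decimals otherwise."""
    if token_cost == "unknown":
        return "Could not determine cost of run."
    cost = Decimal(token_cost)
    amount = "{:.2f}".format(cost) if cost == 0 else "{:.6f}".format(cost)
    return "This run cost ${}".format(amount)
```

Six decimal places keep very small per-run costs visible, while a zero cost is shown in the shorter two-decimal form.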