Support streaming mode in AgentScope (#347)
---------

Co-authored-by: zhijianma <[email protected]>
DavdGao and zhijianma authored Jul 19, 2024
1 parent 4691a3d commit 47f570e
Showing 36 changed files with 1,652 additions and 392 deletions.
14 changes: 11 additions & 3 deletions README.md
@@ -40,6 +40,13 @@ Start building LLM-empowered multi-agent applications in an easier way.

## News

- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>**[2024-07-18]** AgentScope supports streaming mode now! Refer to our [tutorial](https://modelscope.github.io/agentscope/en/tutorial/203-stream.html) and example [conversation in stream mode](https://github.com/modelscope/agentscope/tree/main/examples/conversation_in_stream_mode) for more details.

<h5 align="left">
<img src="https://github.com/user-attachments/assets/b14d9b2f-ce02-4f40-8c1a-950f4022c0cc" width="30%" alt="agentscope-logo">
<img src="https://github.com/user-attachments/assets/dfffbd1e-1fe7-49ee-ac11-902415b2b0d6" width="30%" alt="agentscope-logo">
</h5>

- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>**[2024-07-15]** AgentScope has implemented the Mixture-of-Agents algorithm. Refer to our [MoA example](https://github.com/modelscope/agentscope/blob/main/examples/conversation_mixture_of_agents) for more details.

- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>**[2024-06-14]** A new prompt tuning module is available in AgentScope to help developers generate and optimize the agents' system prompts! Refer to our [tutorial](https://modelscope.github.io/agentscope/en/tutorial/209-prompt_opt.html) for more details!
@@ -145,7 +152,7 @@ the following libraries.
**Example Applications**

- Model
- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>[Using Llama3 in AgentScope](https://github.com/modelscope/agentscope/blob/main/examples/model_llama3)
- [Using Llama3 in AgentScope](https://github.com/modelscope/agentscope/blob/main/examples/model_llama3)

- Conversation
- [Basic Conversation](https://github.com/modelscope/agentscope/blob/main/examples/conversation_basic)
@@ -157,8 +164,9 @@ the following libraries.
- [Conversation with RAG Agent](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_RAG_agents)
- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>[Conversation with gpt-4o](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_gpt-4o)
- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>[Conversation with Software Engineering Agent](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_swe-agent/)
- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>[Conversation with Customized Services](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_customized_services/)
- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>[Conversation with Mixture of Agents](https://github.com/modelscope/agentscope/blob/main/examples/conversation_mixture_of_agents/)
- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>[Conversation with Customized Tools](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_customized_services/)
- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>[Mixture of Agents Algorithm](https://github.com/modelscope/agentscope/blob/main/examples/conversation_mixture_of_agents/)
- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>[Conversation in Stream Mode](https://github.com/modelscope/agentscope/blob/main/examples/conversation_in_stream_mode/)

- Game
- [Gomoku](https://github.com/modelscope/agentscope/blob/main/examples/game_gomoku)
14 changes: 11 additions & 3 deletions README_ZH.md
@@ -41,6 +41,13 @@

## News

- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>**[2024-07-18]** AgentScope now supports streaming output from models. Refer to our [**tutorial**](https://modelscope.github.io/agentscope/zh_CN/tutorial/203-stream.html) and the [**streaming conversation example**](https://github.com/modelscope/agentscope/tree/main/examples/conversation_in_stream_mode).

<h5 align="left">
<img src="https://github.com/user-attachments/assets/b14d9b2f-ce02-4f40-8c1a-950f4022c0cc" width="30%" alt="agentscope-logo">
<img src="https://github.com/user-attachments/assets/dfffbd1e-1fe7-49ee-ac11-902415b2b0d6" width="30%" alt="agentscope-logo">
</h5>

- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>**[2024-07-15]** AgentScope has added the Mixture of Agents algorithm. For a usage example, refer to the [MoA example](https://github.com/modelscope/agentscope/blob/main/examples/conversation_mixture_of_agents).

- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>**[2024-06-14]** A new prompt tuning module is now available in AgentScope to help developers generate and optimize agents' system prompts. For more details and examples, refer to the AgentScope [tutorial](https://modelscope.github.io/agentscope/en/tutorial/209-prompt_opt.html).
@@ -135,7 +142,7 @@ AgentScope supports quick deployment of local model services using the following libraries.
**Example Applications**

- Model
- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>[Using Llama3 in AgentScope](./examples/model_llama3)
- [Using Llama3 in AgentScope](./examples/model_llama3)

- Conversation
- [Basic Conversation](./examples/conversation_basic)
Expand All @@ -147,8 +154,9 @@ AgentScope支持使用以下库快速部署本地模型服务。
- [Conversation with RAG Agent](./examples/conversation_with_RAG_agents)
- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>[Conversation with gpt-4o](./examples/conversation_with_gpt-4o)
- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>[Conversation with Software Engineering Agent](./examples/conversation_with_swe-agent/)
- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>[Conversation with Customized Services](./examples/conversation_with_customized_services/)
- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>[Conversation with Mixture of Agents](https://github.com/modelscope/agentscope/blob/main/examples/conversation_mixture_of_agents/)
- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>[Conversation with Customized Tools](./examples/conversation_with_customized_services/)
- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>[Mixture of Agents Algorithm](https://github.com/modelscope/agentscope/blob/main/examples/conversation_mixture_of_agents/)
- <img src="https://img.alicdn.com/imgextra/i3/O1CN01SFL0Gu26nrQBFKXFR_!!6000000007707-2-tps-500-500.png" alt="new" width="30" height="30"/>[Conversation in Stream Mode](https://github.com/modelscope/agentscope/blob/main/examples/conversation_in_stream_mode/)

- Game
- [Gomoku](./examples/game_gomoku)
1 change: 1 addition & 0 deletions docs/sphinx_doc/en/source/index.rst
@@ -23,6 +23,7 @@ AgentScope Documentation
tutorial/103-example.md

tutorial/203-model.md
tutorial/203-stream.md
tutorial/206-prompt.md
tutorial/201-agent.md
tutorial/205-memory.md
123 changes: 123 additions & 0 deletions docs/sphinx_doc/en/source/tutorial/203-stream.md
@@ -0,0 +1,123 @@
(203-stream-en)=

# Streaming

AgentScope supports streaming mode for the following LLM APIs in both the **terminal** and **AgentScope Studio**.

| API | Model Wrapper | `model_type` field in model configuration |
|--------------------|---------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------|
| OpenAI Chat API | [`OpenAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | `"openai_chat"` |
| DashScope Chat API | [`DashScopeChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | `"dashscope_chat"` |
| Gemini Chat API | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | `"gemini_chat"` |
| ZhipuAI Chat API | [`ZhipuAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | `"zhipuai_chat"` |
| ollama Chat API | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | `"ollama_chat"` |
| LiteLLM Chat API | [`LiteLLMChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/litellm_model.py) | `"litellm_chat"` |


## Setup Streaming Mode

AgentScope allows users to set up streaming mode in both model configuration and model calling.

### In Model Configuration

To use streaming mode, set the `stream` field to `True` in the model configuration.

```python
model_config = {
"config_name": "xxx",
"model_type": "xxx",
"stream": True,
# ...
}
```
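
As a concrete sketch, streaming configurations for two of the supported wrappers might look like the following; the `config_name`, `model_name`, and API key values here are illustrative placeholders, not required values:

```python
# A sketch of streaming model configurations; the names and key below are
# illustrative placeholders.
openai_stream_config = {
    "config_name": "my-gpt4-stream",   # placeholder name
    "model_type": "openai_chat",
    "model_name": "gpt-4",             # placeholder model
    "api_key": "sk-xxx",               # placeholder, replace with a real key
    "stream": True,                    # enable streaming by default for this model
}

# The same field applies to the other supported wrappers, e.g. DashScope:
dashscope_stream_config = {
    "config_name": "my-qwen-stream",   # placeholder name
    "model_type": "dashscope_chat",
    "model_name": "qwen-max",          # placeholder model
    "stream": True,
}
```

Any call made through a model configured this way will stream unless overridden at call time.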

### In Model Calling

Within an agent, you can call the model with the `stream` parameter set to `True`.
Note that the `stream` parameter in the model call overrides the `stream` field in the model configuration.

```python
class MyAgent(AgentBase):
    # ...
    def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg:
        # ...
        response = self.model(
            prompt,
            stream=True,
        )
        # ...
```

## Printing in Streaming Mode

In streaming mode, the `stream` field of a model response is a generator, and the `text` field is `None`.
For compatibility with the non-streaming mode, once the `text` field is accessed, the generator in the `stream` field is iterated to produce the full text, which is then stored in the `text` field.
Thus, even in streaming mode, users can process the response text in the `text` field as usual.
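
The lazy behavior described above can be illustrated with a minimal, self-contained sketch. This is not AgentScope's actual implementation, just a toy `StreamResponse` class whose `text` property drains the generator on first access:

```python
# Toy sketch of a response whose text field is filled lazily from a stream.
# Not AgentScope's actual code; for illustration only.
class StreamResponse:
    def __init__(self, gen):
        self._gen = gen
        self._text = None

    @property
    def stream(self):
        return self._gen

    @property
    def text(self):
        # First access drains the generator; each chunk is the cumulative
        # text so far, so the last chunk is the full response.
        if self._text is None:
            for last, chunk in self._gen:
                self._text = chunk
        return self._text


def fake_chunks():
    # Yields (is_last, cumulative_text) tuples, mimicking the stream field.
    pieces = ["Hello", "Hello, ", "Hello, world!"]
    for i, piece in enumerate(pieces):
        yield (i == len(pieces) - 1, piece)


response = StreamResponse(fake_chunks())
print(response.text)  # prints "Hello, world!"
```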

However, if you want to print in streaming mode, simply pass the generator to `self.speak` to print the streaming text in the terminal and AgentScope Studio.

After printing the streaming response, the full text of the response will be available in the `response.text` field.

```python
def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg:
    # ...
    # Use stream=True here if you want to enable streaming for this call
    response = self.model(prompt)

    # At this point, response.text is None

    # Print the response in streaming mode in the terminal and AgentScope Studio (if available)
    self.speak(response.stream)

    # After printing, response.text holds the full text of the response, and you can handle it as usual
    msg = Msg(self.name, content=response.text, role="assistant")

    self.memory.add(msg)

    return msg
```

## Advanced Usage

Users who want to handle the streaming response themselves can iterate the generator and process the response text in their own way.

An example of how to handle the streaming response can be found in the `speak` function of `AgentBase`, shown below.
The `log_stream_msg` function prints the streaming response in the terminal and AgentScope Studio (if registered).

```python
# ...
elif isinstance(content, GeneratorType):
    # The streaming message must share the same id to be displayed
    # in AgentScope Studio.
    msg = Msg(name=self.name, content="", role="assistant")
    for last, text_chunk in content:
        msg.content = text_chunk
        log_stream_msg(msg, last=last)
else:
    # ...
```

When doing so, keep the following points in mind:

1. While iterating the generator, the `response.text` field automatically accumulates the text that has been iterated so far.
2. The generator in the `stream` field yields a tuple of a boolean and a string. The boolean indicates whether this is the last chunk of the response, and the string is the response text so far.
3. To print streaming text in AgentScope Studio, all messages belonging to one response must share the same id in the `log_stream_msg` function.


```python
def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg:
    # ...
    response = self.model(prompt)

    # At this point, response.text is None

    # Iterate the generator and handle the response text yourself
    for last_chunk, text in response.stream:
        # Handle the text in your own way
        # ...
```
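
Because each yielded string is the cumulative text so far, a consumer that wants incremental output can print only the new suffix of each chunk. The sketch below uses a fake generator in place of `response.stream`; `fake_stream` is not an AgentScope API:

```python
# Sketch: consume (is_last, cumulative_text) tuples and emit only the deltas.
# fake_stream stands in for response.stream; it is not an AgentScope API.
def fake_stream():
    text = ""
    for i, token in enumerate(["Stream", "ing ", "works."]):
        text += token
        yield (i == 2, text)  # (is_last, cumulative text so far)


printed = ""
for last, text in fake_stream():
    delta = text[len(printed):]   # only the part not yet printed
    print(delta, end="", flush=True)
    printed = text
print()  # final newline after the stream ends
```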

[[Return to the top]](#203-stream-en)
1 change: 1 addition & 0 deletions docs/sphinx_doc/en/source/tutorial/main.md
@@ -14,6 +14,7 @@ AgentScope is an innovative multi-agent platform designed to empower developers
- [Installation](102-installation.md)
- [Quick Start](103-example.md)
- [Model](203-model.md)
- [Streaming](203-stream.md)
- [Prompt Engineering](206-prompt.md)
- [Agent](201-agent.md)
- [Memory](205-memory.md)
1 change: 1 addition & 0 deletions docs/sphinx_doc/zh_CN/source/index.rst
@@ -23,6 +23,7 @@ AgentScope Documentation
tutorial/103-example.md

tutorial/203-model.md
tutorial/203-stream.md
tutorial/206-prompt.md
tutorial/201-agent.md
tutorial/205-memory.md
121 changes: 121 additions & 0 deletions docs/sphinx_doc/zh_CN/source/tutorial/203-stream.md
@@ -0,0 +1,121 @@
(203-stream-zh)=

# Streaming

AgentScope supports streaming output for the following LLM APIs in both the **terminal** and **AgentScope Studio**.

| API | Model Wrapper | `model_type` field in model configuration |
|--------------------|---------------------------------------------------------------------------------------------------------------------------------|--------------------|
| OpenAI Chat API | [`OpenAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | `"openai_chat"` |
| DashScope Chat API | [`DashScopeChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | `"dashscope_chat"` |
| Gemini Chat API | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | `"gemini_chat"` |
| ZhipuAI Chat API | [`ZhipuAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | `"zhipuai_chat"` |
| ollama Chat API | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | `"ollama_chat"` |
| LiteLLM Chat API | [`LiteLLMChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/litellm_model.py) | `"litellm_chat"` |


## Setup Streaming Mode

AgentScope allows users to set up streaming mode in both model configuration and model calling.

### In Model Configuration

To use streaming mode, set the `stream` field to `True` in the model configuration.

```python
model_config = {
"config_name": "xxx",
"model_type": "xxx",
"stream": True,
# ...
}
```

### In Model Calling

Within an agent, you can set the `stream` parameter to `True` when calling the model. Note that the `stream` parameter in the model call overrides the `stream` field in the model configuration.

```python
class MyAgent(AgentBase):
    # ...
    def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg:
        # ...
        response = self.model(
            prompt,
            stream=True,
        )
        # ...
```

## Printing in Streaming Mode

In streaming mode, the `stream` field of the model response is a generator, and the `text` field is `None`.
For compatibility with the non-streaming mode, once the `text` field is accessed before the generator is iterated, the generator in the `stream` field is iterated to produce the full text, which is then stored in the `text` field.
Therefore, even in streaming mode, users can process the response text in the `text` field as usual, without any changes.

However, if users want streaming output, they only need to pass the generator to the `self.speak` function to print the text in streaming mode in the terminal and AgentScope Studio.

```python
def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg:
    # ...
    # Use stream=True here if you want streaming output for this call
    response = self.model(prompt)

    # At this point, response.text is None

    # Print the text in streaming mode in the terminal and AgentScope Studio
    self.speak(response.stream)

    # As the generator is iterated, the produced text is automatically stored in
    # response.text, so you can process the response text there as usual
    msg = Msg(self.name, content=response.text, role="assistant")

    self.memory.add(msg)

    return msg

```

## Advanced Usage

If users want to handle the streaming output themselves, they can iterate the generator to obtain the streaming response text in real time.

An example of how to handle the streaming response can be found in the `speak` function of `AgentBase`, shown below.
The `log_stream_msg` function prints the streaming text in the terminal and AgentScope Studio (if registered) in real time.

```python
# ...
elif isinstance(content, GeneratorType):
    # Streaming messages must share the same id to be displayed in AgentScope
    # Studio, so the content field of one message is updated in place.
    msg = Msg(name=self.name, content="", role="assistant")
    for last, text_chunk in content:
        msg.content = text_chunk
        log_stream_msg(msg, last=last)
else:
    # ...
```

When handling the generator, users should keep the following points in mind:

1. While iterating the generator, the `response.text` field automatically accumulates the text that has been iterated so far.
2. The generator in the `stream` field yields a tuple of a boolean and a string. The boolean indicates whether this is the last chunk of the response, and the string is the response text so far.
3. AgentScope Studio uses the id of the `Msg` object passed to `log_stream_msg` to decide whether chunks belong to the same streaming response; chunks with different ids are treated as separate responses.


```python
def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg:
    # ...
    response = self.model(prompt)

    # At this point, response.text is None

    # Iterate the generator and handle the response text yourself
    for last_chunk, text in response.stream:
        # Handle the text in your own way
        # ...
```

[[Return to the top]](#203-stream-zh)