diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md index c1bf236b4..5de222fa8 100644 --- a/.github/ISSUE_TEMPLATE/bug_report.md +++ b/.github/ISSUE_TEMPLATE/bug_report.md @@ -7,6 +7,8 @@ assignees: '' --- +**AgentScope is an open-source project. To involve a broader community, we recommend asking your questions in English.** + **Describe the bug** A clear and concise description of what the bug is. diff --git a/.github/ISSUE_TEMPLATE/custom.md b/.github/ISSUE_TEMPLATE/custom.md index 48d5f81fa..8ff2e28ce 100644 --- a/.github/ISSUE_TEMPLATE/custom.md +++ b/.github/ISSUE_TEMPLATE/custom.md @@ -7,4 +7,7 @@ assignees: '' --- +**AgentScope is an open-source project. To involve a broader community, we recommend asking your questions in English.** + + diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md index 215ab80f1..6d6921243 100644 --- a/.github/ISSUE_TEMPLATE/feature_request.md +++ b/.github/ISSUE_TEMPLATE/feature_request.md @@ -7,6 +7,9 @@ assignees: '' --- +**AgentScope is an open-source project. To involve a broader community, we recommend asking your questions in English.** + + **Is your feature request related to a problem? Please describe.** A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] diff --git a/README.md b/README.md index e2a608b2b..993ffdcd6 100644 --- a/README.md +++ b/README.md @@ -2,11 +2,15 @@ English | [**中文**](README_ZH.md) # AgentScope +

+agentscope-logo +

+ Start building LLM-empowered multi-agent applications in an easier way. [![](https://img.shields.io/badge/cs.MA-2402.14034-B31C1C?logo=arxiv&logoColor=B31C1C)](https://arxiv.org/abs/2402.14034) [![](https://img.shields.io/badge/python-3.9+-blue)](https://pypi.org/project/agentscope/) -[![](https://img.shields.io/badge/pypi-v0.0.3-blue?logo=pypi)](https://pypi.org/project/agentscope/) +[![](https://img.shields.io/badge/pypi-v0.0.4-blue?logo=pypi)](https://pypi.org/project/agentscope/) [![](https://img.shields.io/badge/Docs-English%7C%E4%B8%AD%E6%96%87-blue?logo=markdown)](https://modelscope.github.io/agentscope/#welcome-to-agentscope-tutorial-hub) [![](https://img.shields.io/badge/Docs-API_Reference-blue?logo=markdown)](https://modelscope.github.io/agentscope/) [![](https://img.shields.io/badge/ModelScope-Demos-4e29ff.svg?logo=)](https://modelscope.cn/studios?name=agentscope&page=1&sort=latest) @@ -27,13 +31,21 @@ Welcome to join our community on ## News +- new**[2024-05-15]** A new **Parser Module** for **formatted response** is added in AgentScope! Refer to our [tutorial](https://modelscope.github.io/agentscope/en/tutorial/203-parser.html) for more details. The [`DictDialogAgent`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/agents/dict_dialog_agent.py) and [werewolf game](https://github.com/modelscope/agentscope/tree/main/examples/game_werewolf) example are updated simultaneously. + +- new**[2024-05-14]** Dear AgentScope users, we are conducting a survey on **AgentScope Workstation & Copilot** user experience. We currently need your valuable feedback to help us improve the experience of AgentScope's Drag & Drop multi-agent application development and Copilot. Your feedback is valuable and the survey will take about 3~5 minutes. Please click [URL](https://survey.aliyun.com/apps/zhiliao/vgpTppn22) to participate in questionnaire surveys. Thank you very much for your support and contribution! 
+ +- new**[2024-05-14]** AgentScope supports **gpt-4o** as well as other OpenAI vision models now! Try gpt-4o with its [model configuration](./examples/model_configs_template/openai_chat_template.json) and new example [Conversation with gpt-4o](./examples/conversation_with_gpt-4o)! + +- new**[2024-04-30]** We release **AgentScope** v0.0.4 now! + - new**[2024-04-27]** [AgentScope Workstation](https://agentscope.aliyun.com/) is now online! You are welcome to try building your multi-agent application simply with our *drag-and-drop platform* and ask our *copilot* questions about AgentScope! -- new**[2024-04-19]** AgentScope supports Llama3 now! We provide [scripts](./examples/model_llama3) and example [model configuration](./examples/model_llama3) for quick set-up. Feel free to try llama3 in our examples! +- **[2024-04-19]** AgentScope supports Llama3 now! We provide [scripts](./examples/model_llama3) and example [model configuration](./examples/model_llama3) for quick set-up. Feel free to try llama3 in our examples! -- new**[2024-04-06]** We release **AgentScope** v0.0.3 now! +- **[2024-04-06]** We release **AgentScope** v0.0.3 now! -- new**[2024-04-06]** New examples [Gomoku](./examples/game_gomoku), [Conversation with ReAct Agent](./examples/conversation_with_react_agent), [Conversation with RAG Agent](./examples/conversation_with_RAG_agents) and [Distributed Parallel Search](./examples/distributed_search) are available now! +- **[2024-04-06]** New examples [Gomoku](./examples/game_gomoku), [Conversation with ReAct Agent](./examples/conversation_with_react_agent), [Conversation with RAG Agent](./examples/conversation_with_RAG_agents) and [Distributed Parallel Search](./examples/distributed_search) are available now! - **[2024-03-19]** We release **AgentScope** v0.0.2 now! 
In this new version, AgentScope supports [ollama](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#supported-models)(A local CPU inference engine), [DashScope](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#supported-models) and Google [Gemini](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#supported-models) APIs. @@ -69,21 +81,24 @@ applications in a centralized programming manner for streamlined development. AgentScope provides a list of `ModelWrapper` to support both local model services and third-party model APIs. -| API | Task | Model Wrapper | Configuration | Some Supported Models | -|------------------------|-----------------|---------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------|-----------------------------------------------| -| OpenAI API | Chat | [`OpenAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |[guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#openai-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/openai_chat_template.json) | gpt-4, gpt-3.5-turbo, ... | -| | Embedding | [`OpenAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#openai-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/openai_embedding_template.json) | text-embedding-ada-002, ... | -| | DALL·E | [`OpenAIDALLEWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#openai-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/openai_dall_e_template.json) | dall-e-2, dall-e-3 | -| DashScope API | Chat | [`DashScopeChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_chat_template.json) | qwen-plus, qwen-max, ... | -| | Image Synthesis | [`DashScopeImageSynthesisWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_image_synthesis_template.json) | wanx-v1 | -| | Text Embedding | [`DashScopeTextEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_text_embedding_template.json) | text-embedding-v1, text-embedding-v2, ... | -| | Multimodal | [`DashScopeMultiModalWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_multimodal_template.json) | qwen-vl-max, qwen-vl-chat-v1, qwen-audio-chat | -| Gemini API | Chat | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#gemini-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/gemini_chat_template.json) | gemini-pro, ... | -| | Embedding | [`GeminiEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#gemini-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/gemini_embedding_template.json) | models/embedding-001, ... | -| ollama | Chat | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#ollama-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/ollama_chat_template.json) | llama3, llama2, Mistral, ... | -| | Embedding | [`OllamaEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#ollama-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/ollama_embedding_template.json) | llama2, Mistral, ... | -| | Generation | [`OllamaGenerationWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#ollama-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/ollama_generate_template.json) | llama2, Mistral, ... | -| Post Request based API | - | [`PostAPIModelWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#post-request-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/postapi_model_config_template.json) | - | +| API | Task | Model Wrapper | Configuration | Some Supported Models | +|------------------------|-----------------|---------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------|-----------------------------------------------------------------| +| OpenAI API | Chat | [`OpenAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |[guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#openai-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/openai_chat_template.json) | gpt-4o, gpt-4, gpt-3.5-turbo, ... | +| | Embedding | [`OpenAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#openai-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/openai_embedding_template.json) | text-embedding-ada-002, ... | +| | DALL·E | [`OpenAIDALLEWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#openai-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/openai_dall_e_template.json) | dall-e-2, dall-e-3 | +| DashScope API | Chat | [`DashScopeChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_chat_template.json) | qwen-plus, qwen-max, ... | +| | Image Synthesis | [`DashScopeImageSynthesisWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_image_synthesis_template.json) | wanx-v1 | +| | Text Embedding | [`DashScopeTextEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_text_embedding_template.json) | text-embedding-v1, text-embedding-v2, ... | +| | Multimodal | [`DashScopeMultiModalWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_multimodal_template.json) | qwen-vl-max, qwen-vl-chat-v1, qwen-audio-chat | +| Gemini API | Chat | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#gemini-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/gemini_chat_template.json) | gemini-pro, ... | +| | Embedding | [`GeminiEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#gemini-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/gemini_embedding_template.json) | models/embedding-001, ... | +| ZhipuAI API | Chat | [`ZhipuAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#zhipu-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/zhipu_chat_template.json) | glm-4, ... | +| | Embedding | [`ZhipuAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#zhipu-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/zhipu_embedding_template.json) | embedding-2, ... | +| ollama | Chat | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#ollama-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/ollama_chat_template.json) | llama3, llama2, Mistral, ... | +| | Embedding | [`OllamaEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#ollama-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/ollama_embedding_template.json) | llama2, Mistral, ... | +| | Generation | [`OllamaGenerationWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#ollama-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/ollama_generate_template.json) | llama2, Mistral, ... | +| LiteLLM API | Chat | [`LiteLLMChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/litellm_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#litellm-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/litellm_chat_template.json) | [models supported by litellm](https://docs.litellm.ai/docs/)... | +| Post Request based API | - | [`PostAPIModelWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#post-request-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/postapi_model_config_template.json) | - | **Supported Local Model Deployment** @@ -116,7 +131,12 @@ the following libraries. - [Self-Organizing Conversation](./examples/conversation_self_organizing) - [Basic Conversation with LangChain library](./examples/conversation_with_langchain) - [Conversation with ReAct Agent](./examples/conversation_with_react_agent) + - [Conversation in Natural Language to Query SQL](./examples/conversation_nl2sql/) - [Conversation with RAG Agent](./examples/conversation_with_RAG_agents) + - new[Conversation with gpt-4o](./examples/conversation_with_gpt-4o) + - new[Conversation with Software Engineering Agent](./examples/swe_agent/) + - new[Conversation with Customized Services](./examples/conversation_with_customized_services/) + - Game - [Gomoku](./examples/game_gomoku) @@ -126,6 +146,7 @@ the following libraries. - [Distributed Conversation](./examples/distributed_basic) - [Distributed Debate](./examples/distributed_debate) - [Distributed Parallel Search](./examples/distributed_search) + - [Distributed Large Scale Simulation](./examples/distributed_simulation) More models, services and examples are coming soon! @@ -133,8 +154,8 @@ More models, services and examples are coming soon! AgentScope requires **Python 3.9** or higher. -**_Note: This project is currently in active development, it's recommended to -install AgentScope from source._** +***Note: This project is currently in active development, it's recommended to +install AgentScope from source.*** ### From source diff --git a/README_ZH.md b/README_ZH.md index b3c33095a..2a3c9ced0 100644 --- a/README_ZH.md +++ b/README_ZH.md @@ -2,11 +2,15 @@ # AgentScope +

+agentscope-logo +

+ 更简单地构建基于LLM的多智能体应用。 [![](https://img.shields.io/badge/cs.MA-2402.14034-B31C1C?logo=arxiv&logoColor=B31C1C)](https://arxiv.org/abs/2402.14034) [![](https://img.shields.io/badge/python-3.9+-blue)](https://pypi.org/project/agentscope/) -[![](https://img.shields.io/badge/pypi-v0.0.3-blue?logo=pypi)](https://pypi.org/project/agentscope/) +[![](https://img.shields.io/badge/pypi-v0.0.4-blue?logo=pypi)](https://pypi.org/project/agentscope/) [![](https://img.shields.io/badge/Docs-English%7C%E4%B8%AD%E6%96%87-blue?logo=markdown)](https://modelscope.github.io/agentscope/#welcome-to-agentscope-tutorial-hub) [![](https://img.shields.io/badge/Docs-API_Reference-blue?logo=markdown)](https://modelscope.github.io/agentscope/) [![](https://img.shields.io/badge/ModelScope-Demos-4e29ff.svg?logo=)](https://modelscope.cn/studios?name=agentscope&page=1&sort=latest) @@ -24,13 +28,21 @@ ## 新闻 +- new**[2024-05-15]** 用于解析模型格式化输出的**解析器**模块已经上线 AgentScope!更轻松的构建多智能体应用,使用方法请参考[教程](https://modelscope.github.io/agentscope/en/tutorial/203-parser.html)。与此同时,[`DictDialogAgent`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/agents/dict_dialog_agent.py) 类和 [狼人杀游戏](https://github.com/modelscope/agentscope/tree/main/examples/game_werewolf) 样例也已经同步更新! + +- new**[2024-05-14]** 目前 AgentScope 正在进行 AgentScope Workstation & Copilot 用户体验反馈活动,需要您宝贵的意见来帮助我们改善 AgentScope 的拖拽式多智能体应用开发与 Copilot 体验。您的每一个反馈都十分宝贵,请点击 [链接](https://survey.aliyun.com/apps/zhiliao/vgpTppn22) 参与问卷,感谢您的支持! + +- new**[2024-05-14]** AgentScope 现已支持 **gpt-4o** 等 OpenAI Vision 模型! 模型配置请见[链接](./examples/model_configs_template/openai_chat_template.json)。同时,新的样例“[与gpt-4o模型对话](./examples/conversation_with_gpt-4o)”已上线! + +- new**[2024-04-30]** 我们现在发布了**AgentScope** v0.0.4版本! + - new**[2024-04-27]** [AgentScope Workstation](https://agentscope.aliyun.com/)上线了! 欢迎使用 Workstation 体验如何在*拖拉拽编程平台* 零代码搭建多智体应用,也欢迎大家通过*copilot*查询AgentScope各种小知识! 
-- new**[2024-04-19]** AgentScope现已经支持Llama3!我们提供了面向CPU推理和GPU推理的[脚本](./examples/model_llama3)和[模型配置](./examples/model_llama3),一键式开启Llama3的探索,在我们的样例中尝试Llama3吧! +- **[2024-04-19]** AgentScope现已经支持Llama3!我们提供了面向CPU推理和GPU推理的[脚本](./examples/model_llama3)和[模型配置](./examples/model_llama3),一键式开启Llama3的探索,在我们的样例中尝试Llama3吧! -- new**[2024-04-06]** 我们现在发布了**AgentScope** v0.0.3版本! +- **[2024-04-06]** 我们现在发布了**AgentScope** v0.0.3版本! -- new**[2024-04-06]** 新的样例“[五子棋](./examples/game_gomoku)”,“[与ReAct智能体对话](./examples/conversation_with_react_agent)”,“[与RAG智能体对话](./examples/conversation_with_RAG_agents)”,“[分布式并行搜索](./examples/distributed_search)”上线了! +- **[2024-04-06]** 新的样例“[五子棋](./examples/game_gomoku)”,“[与ReAct智能体对话](./examples/conversation_with_react_agent)”,“[与RAG智能体对话](./examples/conversation_with_RAG_agents)”,“[分布式并行搜索](./examples/distributed_search)”上线了! - **[2024-03-19]** 我们现在发布了**AgentScope** v0.0.2版本!在这个新版本中,AgentScope支持了[ollama](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#supported-models)(本地CPU推理引擎),[DashScope](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#supported-models)和[Gemini](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#supported-models) APIs。 @@ -60,7 +72,7 @@ AgentScope提供了一系列`ModelWrapper`来支持本地模型服务和第三 | API | Task | Model Wrapper | Configuration | Some Supported Models | |------------------------|-----------------|---------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------|-----------------------------------------------| -| OpenAI API | Chat | [`OpenAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |[guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#openai-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/openai_chat_template.json) | gpt-4, gpt-3.5-turbo, ... | +| OpenAI API | Chat | [`OpenAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |[guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#openai-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/openai_chat_template.json) | gpt-4o, gpt-4, gpt-3.5-turbo, ... | | | Embedding | [`OpenAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#openai-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/openai_embedding_template.json) | text-embedding-ada-002, ... | | | DALL·E | [`OpenAIDALLEWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#openai-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/openai_dall_e_template.json) | dall-e-2, dall-e-3 | | DashScope API | Chat | [`DashScopeChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_chat_template.json) | qwen-plus, qwen-max, ... | @@ -69,9 +81,12 @@ AgentScope提供了一系列`ModelWrapper`来支持本地模型服务和第三 | | Multimodal | [`DashScopeMultiModalWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_multimodal_template.json) | qwen-vl-max, qwen-vl-chat-v1, qwen-audio-chat | | Gemini API | Chat | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#gemini-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/gemini_chat_template.json) | gemini-pro, ... | | | Embedding | [`GeminiEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#gemini-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/gemini_embedding_template.json) | models/embedding-001, ... | +| ZhipuAI API | Chat | [`ZhipuAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#zhipu-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/zhipu_chat_template.json) | glm-4, ... | +| | Embedding | [`ZhipuAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#zhipu-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/zhipu_embedding_template.json) | embedding-2, ... | | ollama | Chat | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#ollama-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/ollama_chat_template.json) | llama3, llama2, Mistral, ... | | | Embedding | [`OllamaEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#ollama-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/ollama_embedding_template.json) | llama2, Mistral, ... | | | Generation | [`OllamaGenerationWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#ollama-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/ollama_generate_template.json) | llama2, Mistral, ... | +| LiteLLM API | Chat | [`LiteLLMChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/litellm_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#litellm-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/litellm_chat_template.json) | [models supported by litellm](https://docs.litellm.ai/docs/)... | | Post Request based API | - | [`PostAPIModelWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#post-request-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/postapi_model_config_template.json) | - | **支持的本地模型部署** @@ -106,6 +121,10 @@ AgentScope支持使用以下库快速部署本地模型服务。 - [与ReAct智能体对话](./examples/conversation_with_react_agent) - [通过对话查询SQL信息](./examples/conversation_nl2sql/) - [与RAG智能体对话](./examples/conversation_with_RAG_agents) + - new[与gpt-4o模型对话](./examples/conversation_with_gpt-4o) + - new[与SoftWare Engineering智能体对话](./examples/swe_agent/) + - - new[与自定义服务对话](./examples/conversation_with_customized_services/) + - 游戏 - [五子棋](./examples/game_gomoku) @@ -115,6 +134,7 @@ AgentScope支持使用以下库快速部署本地模型服务。 - [分布式对话](./examples/distributed_basic) - [分布式辩论](./examples/distributed_debate) - [分布式并行搜索](./examples/distributed_search) + - [分布式大规模仿真](./examples/distributed_simulation) 更多模型API、服务和示例即将推出! @@ -122,7 +142,7 @@ AgentScope支持使用以下库快速部署本地模型服务。 AgentScope需要Python 3.9或更高版本。 -**_注意:该项目目前正在积极开发中,建议从源码安装AgentScope。_** +***注意:该项目目前正在积极开发中,建议从源码安装AgentScope。*** ### 从源码安装 diff --git a/docs/sphinx_doc/en/source/conf.py b/docs/sphinx_doc/en/source/conf.py index 2025ced67..788bda020 100644 --- a/docs/sphinx_doc/en/source/conf.py +++ b/docs/sphinx_doc/en/source/conf.py @@ -49,6 +49,11 @@ autodoc_member_order = "bysource" +autodoc_default_options = { + "members": True, + "special-members": "__init__", +} + # Add any paths that contain templates here, relative to this directory. 
templates_path = ["_templates"] diff --git a/docs/sphinx_doc/en/source/index.rst b/docs/sphinx_doc/en/source/index.rst index fb81e2e64..1aad67356 100644 --- a/docs/sphinx_doc/en/source/index.rst +++ b/docs/sphinx_doc/en/source/index.rst @@ -38,6 +38,7 @@ AgentScope Documentation agentscope.pipelines agentscope.service agentscope.rpc + agentscope.server agentscope.web agentscope.prompt agentscope.utils diff --git a/docs/sphinx_doc/en/source/tutorial/201-agent.md b/docs/sphinx_doc/en/source/tutorial/201-agent.md index 7d583331d..dbe2f3e77 100644 --- a/docs/sphinx_doc/en/source/tutorial/201-agent.md +++ b/docs/sphinx_doc/en/source/tutorial/201-agent.md @@ -16,6 +16,8 @@ Each AgentBase derivative is composed of several key characteristics: * `sys_prompt` & `engine`: The system prompt acts as predefined instructions that guide the agent in its interactions; and the `engine` is used to dynamically generate a suitable prompt. For more details about them, we defer to [Prompt Engine](206-prompt). +* `to_dist`: Used to create a distributed version of the agent, to support efficient collaboration among multiple agents. Note that `to_dist` is a reserved field and will be automatically added to the initialization function of any subclass of `AgentBase`. For more details about `to_dist`, please refer to [Distribution](208-distribute). + In addition to these attributes, `AgentBase` endows agents with pivotal methods such as `observe` and `reply`: * `observe()`: Through this method, an agent can take note of *message* without immediately replying, allowing it to update its memory based on the observed *message*. 
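The `to_dist` note above says the field is reserved and added automatically to the initializer of every `AgentBase` subclass. One way such behaviour can be obtained in Python is a metaclass that intercepts the keyword before `__init__` runs. The sketch below only illustrates that general technique; it is not AgentScope's actual implementation, and the names `_ReservedKwargMeta`, `SketchAgentBase`, `EchoAgent` and `is_distributed` are hypothetical.

```python
class _ReservedKwargMeta(type):
    """Hypothetical metaclass that consumes a reserved `to_dist` keyword."""

    def __call__(cls, *args, to_dist=False, **kwargs):
        # `to_dist` is removed here, so subclass __init__ methods never
        # need to declare it in their own signatures.
        instance = super().__call__(*args, **kwargs)
        # A real framework would presumably return a distributed proxy here;
        # this sketch only records the flag on the instance.
        instance.is_distributed = bool(to_dist)
        return instance


class SketchAgentBase(metaclass=_ReservedKwargMeta):
    """Stand-in for an agent base class (not agentscope.agents.AgentBase)."""

    def __init__(self, name):
        self.name = name


class EchoAgent(SketchAgentBase):
    # No `to_dist` parameter is declared in this subclass.
    def __init__(self, name, prefix=""):
        super().__init__(name)
        self.prefix = prefix


local_agent = EchoAgent("a1")
dist_agent = EchoAgent("a2", prefix="> ", to_dist=True)
```

With this pattern, subclass authors write an ordinary `__init__`, and callers can still pass `to_dist=True` to any subclass.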
diff --git a/docs/sphinx_doc/en/source/tutorial/203-model.md b/docs/sphinx_doc/en/source/tutorial/203-model.md index 9feee50c5..d6e153d0f 100644 --- a/docs/sphinx_doc/en/source/tutorial/203-model.md +++ b/docs/sphinx_doc/en/source/tutorial/203-model.md @@ -14,7 +14,9 @@ Currently, AgentScope supports the following model service APIs: - OpenAI API, including chat, image generation (DALL-E), and Embedding. - DashScope API, including chat, image synthesis and text embedding. - Gemini API, including chat and embedding. +- ZhipuAI API, including chat and embedding. - Ollama API, including chat, embedding and generation. +- LiteLLM API, including chat, with various model APIs. - Post Request API, model inference services based on Post requests, including Huggingface/ModelScope Inference API and various post request based model APIs. @@ -81,9 +83,12 @@ In the current AgentScope, the supported `model_type` types, the corresponding | | Multimodal | [`DashScopeMultiModalWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | `"dashscope_multimodal"` | qwen-vl-plus, qwen-vl-max, qwen-audio-turbo, ... | | Gemini API | Chat | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | `"gemini_chat"` | gemini-pro, ... | | | Embedding | [`GeminiEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | `"gemini_embedding"` | models/embedding-001, ... | +| ZhipuAI API | Chat | [`ZhipuAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | `"zhipuai_chat"` | glm-4, ... | +| | Embedding | [`ZhipuAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | `"zhipuai_embedding"` | embedding-2, ... 
| | ollama | Chat | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | `"ollama_chat"` | llama2, ... | | | Embedding | [`OllamaEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | `"ollama_embedding"` | llama2, ... | | | Generation | [`OllamaGenerationWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | `"ollama_generate"` | llama2, ... | +| LiteLLM API | Chat | [`LiteLLMChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/litellm_model.py) | `"litellm_chat"` | - | | Post Request based API | - | [`PostAPIModelWrapperBase`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | `"post_api"` | - | | | Chat | [`PostAPIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | `"post_api_chat"` | meta-llama/Meta-Llama-3-8B-Instruct, ... | @@ -303,6 +308,48 @@ Here we provide example configurations for different model wrappers.
+ +#### ZhipuAI API + +
+ZhipuAI Chat API (agentscope.models.ZhipuAIChatWrapper) + +```python +{ + "config_name": "my_zhipuai_chat_config", + "model_type": "zhipuai_chat", + + # Required parameters + "model_name": "{model_name}", # The model name in ZhipuAI API, e.g. glm-4 + + # Optional parameters + "api_key": "{your_api_key}" +} +``` + +
+ +
+ZhipuAI Embedding API (agentscope.models.ZhipuAIEmbeddingWrapper) + +```python +{ + "config_name": "my_zhipuai_embedding_config", + "model_type": "zhipuai_embedding", + + # Required parameters + "model_name": "{model_name}", # The model name in ZhipuAI API, e.g. embedding-2 + + # Optional parameters + "api_key": "{your_api_key}", +} +``` + +
+ +
+ + #### Ollama API
@@ -395,6 +442,26 @@ Here we provide example configurations for different model wrappers.
+ +#### LiteLLM Chat API + +
+LiteLLM Chat API (agentscope.models.LiteLLMChatWrapper) + +```python +{ + "config_name": "lite_llm_openai_chat_gpt-3.5-turbo", + "model_type": "litellm_chat", + "model_name": "gpt-3.5-turbo" # Set the environment variables that the chosen model requires (e.g. OPENAI_API_KEY); see https://docs.litellm.ai/docs/ for details. +}, +``` + +
+ +
+ + #### Post Request Chat API
diff --git a/docs/sphinx_doc/en/source/tutorial/203-parser.md b/docs/sphinx_doc/en/source/tutorial/203-parser.md new file mode 100644 index 000000000..a4e0538c3 --- /dev/null +++ b/docs/sphinx_doc/en/source/tutorial/203-parser.md @@ -0,0 +1,460 @@ +(203-parser-en)= + +# Model Response Parser + +## Table of Contents + +- [Background](#background) +- [Parser Module](#parser-module) + - [Overview](#overview) + - [String Type](#string-type) + - [MarkdownCodeBlockParser](#markdowncodeblockparser) + - [Initialization](#initialization) + - [Format Instruction Template](#format-instruction-template) + - [Parse Function](#parse-function) + - [Dictionary Type](#dictionary-type) + - [MarkdownJsonDictParser](#markdownjsondictparser) + - [Initialization & Format Instruction Template](#initialization--format-instruction-template) + - [MultiTaggedContentParser](#multitaggedcontentparser) + - [Initialization & Format Instruction Template](#initialization--format-instruction-template-1) + - [Parse Function](#parse-function-1) + - [JSON / Python Object Type](#json--python-object-type) + - [MarkdownJsonObjectParser](#markdownjsonobjectparser) + - [Initialization & Format Instruction Template](#initialization--format-instruction-template-2) + - [Parse Function](#parse-function-2) +- [Typical Use Cases](#typical-use-cases) + - [WereWolf Game](#werewolf-game) + - [ReAct Agent and Tool Usage](#react-agent-and-tool-usage) +- [Customized Parser](#customized-parser) + +## Background + +In the process of building LLM-empowered application, parsing the LLM generated string into a specific format and extracting the required information is a very important step. +However, due to the following reasons, this process is also a very complex process: + +1. **Diversity**: The target format of parsing is diverse, and the information to be extracted may be a specific text, a JSON object, or a complex data structure. +2. 
**Complexity**: The result parsing is not only to convert the text generated by LLM into the target format, but also involves a series of issues such as prompt engineering (reminding LLM what format of output should be generated), error handling, etc. +3. **Flexibility**: Even in the same application, different stages may also require the agent to generate output in different formats. + +For the convenience of developers, AgentScope provides a parser module to help developers parse LLM response into a specific format. By using the parser module, developers can easily parse the response into the target format by simple configuration, and switch the target format flexibly. + +In AgentScope, the parser module features +1. **Flexibility**: Developers can flexibly set the required format, flexibly switch the parser without modifying the code of agent class. That is, the specific "target format" and the agent's `reply` function are decoupled. +2. **Freedom**: The format instruction, result parsing and prompt engineering are all explicitly finished in the `reply` function. Developers and users can freely choose to use the parser or parse LLM response by their own code. +3. **Transparency**: When using the parser, the process and results of prompt construction are completely visible and transparent to developers in the `reply` function, and developers can precisely debug their applications. + +## Parser Module + +### Overview + +The main functions of the parser module include: + +1. Provide "format instruction", that is, remind LLM where to generate what output, for example + +```` +You should generate python code in a fenced code block as follows +```python +{your_python_code} +``` +```` + +2. Provide a parse function, which directly parses the text generated by LLM into the target data format, + +3. Post-processing for dictionary format. After parsing the text into a dictionary, different fields may have different uses. 
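
The first two functions listed above can be sketched in a few lines of plain Python. This is a minimal illustration only, not AgentScope's implementation: the class name `FencedCodeBlockParser` is hypothetical, and it works on plain strings rather than on `ModelResponse` objects.

```python
import re


class FencedCodeBlockParser:
    """Illustrative stand-in for a string-type parser (hypothetical class)."""

    def __init__(self, language_name: str, content_hint: str) -> None:
        self.language_name = language_name
        self.content_hint = content_hint

    @property
    def format_instruction(self) -> str:
        """Text appended to the prompt to tell the LLM the expected format."""
        fence = "`" * 3  # built at runtime to avoid nesting fences in this doc
        return (
            f"You should generate {self.language_name} code in a "
            f"{self.language_name} fenced code block as follows:\n"
            f"{fence}{self.language_name}\n{self.content_hint}\n{fence}"
        )

    def parse(self, text: str) -> str:
        """Return the content of the first fenced block of the given language."""
        fence = "`" * 3
        pattern = (
            re.escape(fence + self.language_name)
            + r"[ \t]*\n(.*?)"
            + re.escape(fence)
        )
        match = re.search(pattern, text, flags=re.DOTALL)
        if match is None:
            # Error handling is part of the parser's job
            raise ValueError(f"No {self.language_name} fenced code block found.")
        return match.group(1).strip()


parser = FencedCodeBlockParser("python", "your python code")
print(parser.format_instruction)
```

Keeping the format instruction and the parse logic in one object is what lets the target format be swapped without touching the agent code.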
+ +AgentScope provides multiple built-in parsers, and developers can choose according to their needs. + +| Target Format | Parser Class | Description | +| --- | --- |------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| String | `MarkdownCodeBlockParser` | Requires LLM to generate specified text within a Markdown code block marked by ```. The result is a string. | +| Dictionary | `MarkdownJsonDictParser` | Requires LLM to produce a specified dictionary within the code block marked by \```json and \```. The result is a Python dictionary. | +| | `MultiTaggedContentParser` | Requires LLM to generate specified content within multiple tags. Contents from different tags will be parsed into a single Python dictionary with different key-value pairs. | +| JSON / Python Object Type | `MarkdownJsonObjectParser` | Requires LLM to produce specified content within the code block marked by \```json and \```. The result will be converted into a Python object via json.loads. | + +> **NOTE**: Compared to `MarkdownJsonDictParser`, `MultiTaggedContentParser` is more suitable for weak LLMs and when the required format is too complex. +> For example, when LLM is required to generate Python code, if the code is returned directly within a dictionary, LLM needs to be aware of escaping characters (\t, \n, ...), and the differences between double and single quotes when calling `json.loads` +> +> In contrast, `MultiTaggedContentParser` guides LLM to generate each key-value pair separately in individual tags and then combines them into a dictionary, thus reducing the difficulty. + + +In the following sections, we will introduce the usage of these parsers based on different target formats. 
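
To make the point of the note above concrete, here is a rough sketch of tag-based extraction. It is an assumption-laden illustration, not AgentScope's `MultiTaggedContentParser`: the `Tag` tuple and `parse_tagged` function are hypothetical names. Because each value is taken verbatim from between its tags, code containing quotes or backslashes needs no JSON escaping.

```python
import json
import re
from typing import Any, Dict, List, NamedTuple


class Tag(NamedTuple):
    """Hypothetical description of one tag pair."""

    name: str
    tag_begin: str
    tag_end: str
    parse_json: bool = False


def parse_tagged(text: str, tags: List[Tag]) -> Dict[str, Any]:
    """Extract the content of each tag pair into one dictionary."""
    result: Dict[str, Any] = {}
    for tag in tags:
        pattern = re.escape(tag.tag_begin) + r"(.*?)" + re.escape(tag.tag_end)
        match = re.search(pattern, text, flags=re.DOTALL)
        if match is None:
            raise ValueError(f"Missing tag pair for '{tag.name}'.")
        content = match.group(1).strip()
        # Only fields explicitly marked as JSON go through json.loads
        result[tag.name] = json.loads(content) if tag.parse_json else content
    return result


reply = '[CODE]print("a\\tb")[/CODE]\n[FINISH]true[/FINISH]'
parsed = parse_tagged(
    reply,
    [
        Tag("code", "[CODE]", "[/CODE]"),
        Tag("finish", "[FINISH]", "[/FINISH]", parse_json=True),
    ],
)
print(parsed)
```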
+ +### String Type + +#### MarkdownCodeBlockParser + +##### Initialization + +- `MarkdownCodeBlockParser` requires LLM to generate specific text within a specified code block in Markdown format. Different languages can be specified with the `language_name` parameter to utilize the large model's ability to produce corresponding outputs. For example, when asking the large model to produce Python code, initialize as follows: + + ```python + from agentscope.parsers import MarkdownCodeBlockParser + + parser = MarkdownCodeBlockParser(language_name="python", content_hint="your python code") + ``` + +##### Format Instruction Template + +- `MarkdownCodeBlockParser` provides the following format instruction template. When the user accesses the `format_instruction` attribute, `{language_name}` and `{content_hint}` will be replaced with the strings entered at initialization: + + ```` + You should generate {language_name} code in a {language_name} fenced code block as follows: + ```{language_name} + {content_hint} + ``` + ```` + +- For the above initialization with `language_name` as `"python"`, when the `format_instruction` attribute is accessed, the following string will be returned: + + ```python + print(parser.format_instruction) + ``` + + ```` + You should generate python code in a python fenced code block as follows: + ```python + your python code + ``` + ```` + +##### Parse Function + +- `MarkdownCodeBlockParser` provides a `parse` method to parse the text generated by LLM. Its input and output are both `ModelResponse` objects, and the parsing result will be stored in the `parsed` attribute of the output object. 
+ + ````python + from agentscope.models import ModelResponse + + res = parser.parse( + ModelResponse( + text="""The following is generated python code + ```python + print("Hello world!") + ``` + """ + ) + ) + + print(res.parsed) + ```` + + ``` + print("Hello world!") + ``` + +### Dictionary Type + +Unlike strings and general JSON/Python objects, dictionaries are an especially useful format in LLM applications, so AgentScope provides additional post-processing functions for the dictionary type. +When initializing the parser, you can set the `keys_to_content`, `keys_to_memory`, and `keys_to_metadata` parameters to filter key-value pairs when calling the parser's `to_content`, `to_memory`, and `to_metadata` methods. + +- `keys_to_content` specifies the key-value pairs that will be placed in the `content` field of the returned `Msg` object. The content field will be returned to other agents, participate in their prompt construction, and will also be called by the `self.speak` function for display. +- `keys_to_memory` specifies the key-value pairs that will be stored in the memory of the agent. +- `keys_to_metadata` specifies the key-value pairs that will be placed in the `metadata` field of the returned `Msg` object, which can be used for application control flow judgment, or to carry information that does not need to be returned to other agents. + +Each of the three parameters accepts a bool, a string, or a list of strings. The meaning of their values is as follows: +- `False`: The corresponding filter function will return `None`. +- `True`: The whole dictionary will be returned. +- `str`: The corresponding value will be directly returned. +- `List[str]`: A filtered dictionary will be returned according to the list of keys. + +By default, `keys_to_content` and `keys_to_memory` are `True`, that is, the whole dictionary will be returned. `keys_to_metadata` defaults to `False`, that is, the corresponding filter function will return `None`. 
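
The four cases above can be summarized in a small helper. This is a sketch under stated assumptions rather than AgentScope's code: `filter_by_keys` is a hypothetical name, and silently skipping listed keys that are absent from the dictionary is a choice made here for illustration.

```python
from typing import Any, Dict, List, Union

KeysSpec = Union[bool, str, List[str]]


def filter_by_keys(parsed: Dict[str, Any], keys: KeysSpec) -> Any:
    """Apply the bool / str / list-of-str rule to a parsed dictionary."""
    if isinstance(keys, bool):
        # True -> whole dictionary, False -> None
        return parsed if keys else None
    if isinstance(keys, str):
        # A single key -> the bare value, not a dictionary
        return parsed[keys]
    # A list of keys -> a filtered dictionary (missing keys skipped: assumption)
    return {key: parsed[key] for key in keys if key in parsed}


example = {"thought": "abc", "speak": "def", "finish_discussion": True}
print(filter_by_keys(example, "speak"))
print(filter_by_keys(example, ["thought", "speak"]))
print(filter_by_keys(example, False))
```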
+ +For example, consider the dictionary generated by a werewolf during the daytime discussion in a werewolf game. In this example, +- `"thought"` should not be returned to other agents, but should be stored in the agent's memory to ensure the continuity of the werewolf strategy; +- `"speak"` should be returned to other agents and stored in the agent's memory; +- `"finish_discussion"` is used in the application's control flow to determine whether the discussion has ended. To save tokens, this field should not be returned to other agents or stored in the agent's memory. + + ```python + { + "thought": "The others didn't realize I was a werewolf. I should end the discussion soon.", + "speak": "I agree with you.", + "finish_discussion": True + } + ``` + +In AgentScope, we achieve post-processing by calling the `to_content`, `to_memory`, and `to_metadata` methods, as shown in the following code: + +- In the application's control flow code, create the corresponding parser object and load it + + ```python + from agentscope.parsers import MarkdownJsonDictParser + + # ... + + agent = DictDialogAgent(...) + + # Take MarkdownJsonDictParser as example + parser = MarkdownJsonDictParser( + content_hint={ + "thought": "what you thought", + "speak": "what you speak", + "finish_discussion": "whether the discussion is finished" + }, + keys_to_content="speak", + keys_to_memory=["thought", "speak"], + keys_to_metadata=["finish_discussion"] + ) + + # Load parser, which is equivalent to specifying the required format + agent.set_parser(parser) + + # The discussion process + while True: + # ... + x = agent(x) + # Break the loop according to the finish_discussion field in metadata + if x.metadata["finish_discussion"]: + break + ``` + +- Filter the dictionary in the agent's `reply` function + + ```python + # ... + def reply(self, x: dict = None) -> dict: + + # ... 
+ res = self.model(prompt, parse_func=self.parser.parse) + + # Story the thought and speak fields into memory + self.memory.add( + Msg( + self.name, + content=self.parser.to_memory(res.parsed), + role="assistant", + ) + ) + + # Store in content and metadata fields in the returned Msg object + msg = Msg( + self.name, + content=self.parser.to_content(res.parsed), + role="assistant", + metadata=self.parser.to_metadata(res.parsed), + ) + self.speak(msg) + + return msg + ``` + +> **Note**: `keys_to_content`, `keys_to_memory`, and `keys_to_metadata` parameters can be a string, a list of strings, or a bool value. +> - For `True`, the `to_content`, `to_memory`, and `to_metadata` methods will directly return the whole dictionary. +> - For `False`, the `to_content`, `to_memory`, and `to_metadata` methods will directly return `None`. +> - For a string, the `to_content`, `to_memory`, and `to_metadata` methods will directly extract the corresponding value. For example, if `keys_to_content="speak"`, the `to_content` method will put `res.parsed["speak"]` into the `content` field of the `Msg` object, and the `content` field will be a string rather than a dictionary. +> - For a list of string, the `to_content`, `to_memory`, and `to_metadata` methods will filter the dictionary according to the list of keys. +> ```python +> parser = MarkdownJsonDictParser( +> content_hint={ +> "thought": "what you thought", +> "speak": "what you speak", +> }, +> keys_to_content="speak", +> keys_to_memory=["thought", "speak"], +> ) +> +> example_dict = {"thought": "abc", "speak": "def"} +> print(parser.to_content(example_dict)) # def +> print(parser.to_memory(example_dict)) # {"thought": "abc", "speak": "def"} +> print(parser.to_metadata(example_dict)) # None +> ``` +> ``` +> def +> {"thought": "abc", "speak": "def"} +> None +> ``` + + +Next we will introduce two parsers for dictionary type. 
+ +#### MarkdownJsonDictParser + +##### Initialization & Format Instruction Template + +- `MarkdownJsonDictParser` requires LLM to generate a dictionary within a code block fenced by \```json and \``` tags. + +- In addition to `keys_to_content`, `keys_to_memory` and `keys_to_metadata`, the `content_hint` parameter can be provided to give an example and explanation of the response result, that is, to remind LLM where and what kind of dictionary should be generated. +This parameter can be a string or a dictionary. A dictionary is automatically converted to a string when constructing the format instruction. + + ```python + from agentscope.parsers import MarkdownJsonDictParser + + # dictionary as content_hint + MarkdownJsonDictParser( + content_hint={ + "thought": "what you thought", + "speak": "what you speak", + } + ) + # or string as content_hint + MarkdownJsonDictParser( + content_hint="""{ + "thought": "what you thought", + "speak": "what you speak", + }""" + ) + ``` + + - The corresponding `format_instruction` attribute + + ```` + You should respond a json object in a json fenced code block as follows: + ```json + {content_hint} + ``` + ```` + +#### MultiTaggedContentParser + +`MultiTaggedContentParser` asks LLM to generate specific content within multiple tag pairs. The content from different tag pairs will be parsed into a single Python dictionary. Its usage is similar to `MarkdownJsonDictParser`, but the initialization method is different, and it is more suitable for weak LLMs or complex return content. + +##### Initialization & Format Instruction Template + +Within `MultiTaggedContentParser`, each tag pair is specified by a `TaggedContent` object, which contains +- Tag name (`name`), the key value in the returned dictionary +- Start tag (`tag_begin`) +- Hint for content (`content_hint`) +- End tag (`tag_end`) +- Content parsing indication (`parse_json`), defaults to `False`. 
When set to `True`, the parser will automatically add hint that requires JSON object between the tags, and its extracted content will be parsed into a Python object via `json.loads` + +```python +from agentscope.parsers import MultiTaggedContentParser, TaggedContent +parser = MultiTaggedContentParser( + TaggedContent( + name="thought", + tag_begin="[THOUGHT]", + content_hint="what you thought", + tag_end="[/THOUGHT]" + ), + TaggedContent( + name="speak", + tag_begin="[SPEAK]", + content_hint="what you speak", + tag_end="[/SPEAK]" + ), + TaggedContent( + name="finish_discussion", + tag_begin="[FINISH_DISCUSSION]", + content_hint="true/false, whether the discussion is finished", + tag_end="[/FINISH_DISCUSSION]", + parse_json=True, # we expect the content of this field to be parsed directly into a Python boolean value + ) +) + +print(parser.format_instruction) +``` + +``` +Respond with specific tags as outlined below, and the content between [FINISH_DISCUSSION] and [/FINISH_DISCUSSION] MUST be a JSON object: +[THOUGHT]what you thought[/THOUGHT] +[SPEAK]what you speak[/SPEAK] +[FINISH_DISCUSSION]true/false, whether the discussion is finished[/FINISH_DISCUSSION] +``` + +##### Parse Function + +- `MultiTaggedContentParser`'s parsing result is a dictionary, whose keys are the value of `name` in the `TaggedContent` objects. +The following is an example of parsing the LLM response in the werewolf game: + +```python +res_dict = parser.parse( + ModelResponse( + text="""As a werewolf, I should keep pretending to be a villager +[THOUGHT]The others didn't realize I was a werewolf. I should end the discussion soon.[/THOUGHT] +[SPEAK]I agree with you.[/SPEAK] +[FINISH_DISCUSSION]true[/FINISH_DISCUSSION]""" + ) +) + +print(res_dict) +``` + +``` +{ + "thought": "The others didn't realize I was a werewolf. 
I should end the discussion soon.", + "speak": "I agree with you.", + "finish_discussion": true +} +``` + +### JSON / Python Object Type + +#### MarkdownJsonObjectParser + +`MarkdownJsonObjectParser` also uses the \```json and \``` tags in Markdown, but does not limit the content type. It can be a list, dictionary, number, string, etc., which can be parsed into a Python object via `json.loads`. + +##### Initialization & Format Instruction Template + +```python +from agentscope.parsers import MarkdownJsonObjectParser + +parser = MarkdownJsonObjectParser( + content_hint="{A list of numbers.}" +) + +print(parser.format_instruction) +``` + +```` +You should respond a json object in a json fenced code block as follows: +```json +{a list of numbers} +``` +```` + +##### Parse Function + +````python +res = parser.parse( + ModelResponse( + text="""Yes, here is the generated list +```json +[1,2,3,4,5] +``` +""") +) + +print(type(res)) +print(res) +```` + +``` + +[1, 2, 3, 4, 5] +``` + +## Typical Use Cases + +### WereWolf Game + +Werewolf game is a classic use case of dictionary parser. In different stages of the game, the same agent needs to generate different identification fields in addition to `"thought"` and `"speak"`, such as whether the discussion is over, whether the seer uses its ability, whether the witch uses the antidote and poison, and voting. + +AgentScope has built-in examples of [werewolf game](https://github.com/modelscope/agentscope/tree/main/examples/game_werewolf), which uses `DictDialogAgent` class and different parsers to achieve flexible target format switching. By using the post-processing function of the parser, it separates "thought" and "speak", and controls the progress of the game successfully. +More details can be found in the werewolf game [source code](https://github.com/modelscope/agentscope/tree/main/examples/game_werewolf). 
+ +### ReAct Agent and Tool Usage + +`ReActAgent` is an agent class built for tool usage in AgentScope, based on the ReAct algorithm, and can be used with different tool functions. The tool call, format parsing, and implementation of `ReActAgent` are similar to the parser. For detailed implementation, please refer to the [source code](https://github.com/modelscope/agentscope/blob/main/src/agentscope/agents/react_agent.py). + + +## Customized Parser + +AgentScope provides a base class `ParserBase` for parsers. Developers can inherit this base class, and implement the `format_instruction` attribute and `parse` method to create their own parser. + +For dictionary type parsing, you can also inherit the `agentscope.parser.DictFilterMixin` class to implement post-processing for dictionary type. + +```python +from abc import ABC, abstractmethod + +from agentscope.models import ModelResponse + + +class ParserBase(ABC): + """The base class for model response parser.""" + + format_instruction: str + """The instruction for the response format.""" + + @abstractmethod + def parse(self, response: ModelResponse) -> ModelResponse: + """Parse the response text to a specific object, and stored in the + parsed field of the response object.""" + + # ... +``` diff --git a/docs/sphinx_doc/en/source/tutorial/204-service.md b/docs/sphinx_doc/en/source/tutorial/204-service.md index 30d82242d..88fccc4b4 100644 --- a/docs/sphinx_doc/en/source/tutorial/204-service.md +++ b/docs/sphinx_doc/en/source/tutorial/204-service.md @@ -26,7 +26,10 @@ The following table outlines the various Service functions by type. These functi | | `arxiv_search` | Perform arXiv search | | | `download_from_url` | Download file from given URL. | | | `load_web` | Load and parse the web page of the specified url (currently only supports HTML). | -| | `digest_webpage` | Digest the content of a already loaded web page (currently only supports HTML). 
| +| | `digest_webpage` | Digest the content of a already loaded web page (currently only supports HTML). +| | `dblp_search_publications` | Search publications in the DBLP database +| | `dblp_search_authors` | Search for author information in the DBLP database | +| | `dblp_search_venues` | Search for venue information in the DBLP database | | File | `create_file` | Create a new file at a specified path, optionally with initial content. | | | `delete_file` | Delete a file specified by a file path. | | | `move_file` | Move or rename a file from one path to another. | diff --git a/docs/sphinx_doc/en/source/tutorial/206-prompt.md b/docs/sphinx_doc/en/source/tutorial/206-prompt.md index 23467c0ef..28a785ed5 100644 --- a/docs/sphinx_doc/en/source/tutorial/206-prompt.md +++ b/docs/sphinx_doc/en/source/tutorial/206-prompt.md @@ -44,6 +44,7 @@ generation model APIs. - [OllamaChatWrapper](#ollamachatwrapper) - [OllamaGenerationWrapper](#ollamagenerationwrapper) - [GeminiChatWrapper](#geminichatwrapper) +- [ZhipuAIChatWrapper](#zhipuaichatwrapper) These strategies are implemented in the `format` functions of the model wrapper classes. @@ -63,6 +64,8 @@ dictionaries as input, where the dictionary must obey the following rules #### Prompt Strategy +##### Non-Vision Models + In OpenAI Chat API, the `name` field enables the model to distinguish different speakers in the conversation. Therefore, the strategy of `format` function in `OpenAIChatWrapper` is simple: @@ -99,6 +102,75 @@ print(prompt) ] ``` +##### Vision Models + +For vision models (gpt-4-turbo, gpt-4o, ...), if the input message contains image urls, the generated `content` field will be a list of dicts, which contains text and image urls. + +Specifically, the web image urls will be pass to OpenAI Chat API directly, while the local image urls will be converted to base64 format. More details please refer to the [official guidance](https://platform.openai.com/docs/guides/vision). + +Note the invalid image urls (e.g. 
`/Users/xxx/test.mp3`) will be ignored. + +```python +from agentscope.models import OpenAIChatWrapper +from agentscope.message import Msg + +model = OpenAIChatWrapper( + config_name="", # empty since we directly initialize the model wrapper + model_name="gpt-4o", +) + +prompt = model.format( + Msg("system", "You're a helpful assistant", role="system"), # Msg object + [ # a list of Msg objects + Msg(name="user", content="Describe this image", role="user", url="https://xxx.png"), + Msg(name="user", content="And these images", role="user", url=["/Users/xxx/test.png", "/Users/xxx/test.mp3"]), + ], +) +print(prompt) +``` + +```python +[ + { + "role": "system", + "name": "system", + "content": "You are a helpful assistant" + }, + { + "role": "user", + "name": "user", + "content": [ + { + "type": "text", + "text": "Describe this image" + }, + { + "type": "image_url", + "image_url": { + "url": "https://xxx.png" + } + }, + ] + }, + { + "role": "user", + "name": "user", + "content": [ + { + "type": "text", + "text": "And these images" + }, + { + "type": "image_url", + "image_url": { + "url": "..." # for /Users/xxx/test.png + } + }, + ] + }, +] +``` + ### DashScopeChatWrapper `DashScopeChatWrapper` encapsulates the DashScope chat API, which takes a list of messages as input. The message must obey the following rules (updated in 2024/03/22): @@ -228,6 +300,56 @@ print(prompt) ] ``` + +### LiteLLMChatWrapper + +`LiteLLMChatWrapper` encapsulates the litellm chat API, which takes a list of +messages as input. The litellm supports different types of models, and each model +might need to obey different formats. To simplify the usage, we provide a format +that could be compatible with most models. If more specific formats are needed, +you can refer to the specific model you use as well as the +[litellm](https://github.com/BerriAI/litellm) documentation to customize your +own format function for your model. 
+ + +- format all the messages in the chat history, into a single message with `"user"` as `role` + +#### Prompt Strategy + +- Messages will consist dialogue history in the `user` message prefixed by the system message and "## Dialogue History". + +```python +from agentscope.models import LiteLLMChatWrapper + +model = LiteLLMChatWrapper( + config_name="", # empty since we directly initialize the model wrapper + model_name="gpt-3.5-turbo", +) + +prompt = model.format( + Msg("system", "You are a helpful assistant", role="system"), + [ + Msg("user", "What is the weather today?", role="user"), + Msg("assistant", "It is sunny today", role="assistant"), + ], +) + +print(prompt) +``` + +```bash +[ + { + "role": "user", + "content": ( + "You are a helpful assistant\n\n" + "## Dialogue History\nuser: What is the weather today?\n" + "assistant: It is sunny today" + ), + }, +] +``` + ### OllamaChatWrapper `OllamaChatWrapper` encapsulates the Ollama chat API, which takes a list of @@ -240,11 +362,11 @@ messages as input. The message must obey the following rules (updated in #### Prompt Strategy -Given a list of messages, we will parse each message as follows: - -- `Msg`: Fill the `role` and `content` fields directly. If it has an `url` - field, which refers to an image, we will add it to the message. -- `List`: Parse each element in the list according to the above rules. +- If the role field of the first input message is `"system"`, +it will be treated as system prompt and the other messages will consist +dialogue history in the system message prefixed by "## Dialogue History". +- If the `url` attribute of messages is not `None`, we will gather all urls in +the `"images"` field in the returned dictionary. 
```python from agentscope.models import OllamaChatWrapper @@ -267,9 +389,11 @@ print(prompt) ```bash [ - {"role": "system", "content": "You are a helpful assistant"}, - {"role": "assistant", "content": "Hi."}, - {"role": "assistant", "content": "Nice to meet you!", "images": ["https://example.com/image.jpg"]}, + { + "role": "system", + "content": "You are a helpful assistant\n\n## Dialogue History\nBob: Hi.\nAlice: Nice to meet you!", + "images": ["https://example.com/image.jpg"] + }, ] ``` @@ -365,6 +489,47 @@ print(prompt) ] ``` +### `ZhipuAIChatWrapper` + +`ZhipuAIChatWrapper` encapsulates the ZhipuAI chat API, which takes a list of messages as input. The message must obey the following rules: + +- Require `role` and `content` fields, and `role` must be either `"user"` + `"system"` or `"assistant"`. +- There must be at least one `user` message. + +#### Prompt Strategy + +If the role field of the first message is `"system"`, it will be converted into a single message with the `role` field as `"system"` and the `content` field as the system message. The rest of the messages will be converted into a message with the `role` field as `"user"` and the `content` field as the dialogue history. 
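
The strategy described above can be sketched in plain Python as follows. This is only an illustration of the described behavior, not the wrapper's actual `format` implementation: `SimpleMsg` is a hypothetical stand-in for `agentscope.message.Msg`, and `format_with_history` is a hypothetical function name.

```python
from typing import Dict, List, NamedTuple


class SimpleMsg(NamedTuple):
    """Hypothetical stand-in for agentscope.message.Msg."""

    name: str
    content: str
    role: str


def format_with_history(msgs: List[SimpleMsg]) -> List[Dict[str, str]]:
    """Keep a leading system message; merge the rest into one user message."""
    prompt: List[Dict[str, str]] = []
    rest = msgs
    if msgs and msgs[0].role == "system":
        prompt.append({"role": "system", "content": msgs[0].content})
        rest = msgs[1:]
    if rest:
        # One "name: content" line per remaining message
        history = "\n".join(f"{m.name}: {m.content}" for m in rest)
        prompt.append(
            {"role": "user", "content": "## Dialogue History\n" + history},
        )
    return prompt


print(
    format_with_history(
        [
            SimpleMsg("system", "You're a helpful assistant", "system"),
            SimpleMsg("Bob", "Hi!", "assistant"),
            SimpleMsg("Alice", "Nice to meet you!", "assistant"),
        ],
    ),
)
```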
+ +An example is shown below: + +```python +from agentscope.models import ZhipuAIChatWrapper +from agentscope.message import Msg + +model = ZhipuAIChatWrapper( + config_name="", # empty since we directly initialize the model wrapper + model_name="glm-4", + api_key="your api key", +) + +prompt = model.format( + Msg("system", "You're a helpful assistant", role="system"), # Msg object + [ # a list of Msg objects + Msg(name="Bob", content="Hi!", role="assistant"), + Msg(name="Alice", content="Nice to meet you!", role="assistant"), + ], +) +print(prompt) +``` + +```bash +[ + {"role": "system", "content": "You are a helpful assistant"}, + {"role": "user", "content": "## Dialogue History\nBob: Hi!\nAlice: Nice to meet you!"}, +] +``` + ## Prompt Engine (Will be deprecated in the future) AgentScope provides the `PromptEngine` class to simplify the process of crafting diff --git a/docs/sphinx_doc/en/source/tutorial/207-monitor.md b/docs/sphinx_doc/en/source/tutorial/207-monitor.md index e43f67b4f..76c4d08b1 100644 --- a/docs/sphinx_doc/en/source/tutorial/207-monitor.md +++ b/docs/sphinx_doc/en/source/tutorial/207-monitor.md @@ -35,8 +35,10 @@ Get a monitor instance from `MonitorFactory` to begin monitoring, and note that monitor = MonitorFactory.get_monitor() ``` -> Currently the above code returns a `SqliteMonitor` instance, which is initialized in `agentscope.init`. -> The `SqliteMonitor` class is the default implementation of `MonitorBase` class, which is based on Sqlite3. +Currently the above code returns a `SqliteMonitor` instance, which is initialized in `agentscope.init`. +The `SqliteMonitor` class is the default implementation of `MonitorBase` class, which is based on Sqlite3. + +If you don't want to use monitor, you can set `use_monitor=False` in `agentscope.init` to disable the monitor. 
And in this case, the `MonitorFactory.get_monitor` method will return an instance of `DummyMonitor` which has the same interface as the `SqliteMonitor` class, but does nothing inside. ### Basic Usage diff --git a/docs/sphinx_doc/en/source/tutorial/208-distribute.md b/docs/sphinx_doc/en/source/tutorial/208-distribute.md index 34321f62c..0381a13f1 100644 --- a/docs/sphinx_doc/en/source/tutorial/208-distribute.md +++ b/docs/sphinx_doc/en/source/tutorial/208-distribute.md @@ -12,70 +12,173 @@ This tutorial will introduce the implementation and usage of AgentScope distribu ## Usage -In AgentScope, the process that runs the application flow is called the "main process", and all agents will run in separate processes. -According to the different relationships between the main process and the agent process, AgentScope supports two distributed modes: Master-Slave and Peer-to-Peer mode. -In the Master-Slave mode, developers can start all agent processes from the main process, while in the Peer-to-Peer mode, the agent process is independent of the main process and developers need to start the agent service on the corresponding machine. +In AgentScope, the process that runs the application flow is called the **main process**, and each agent can run in a separate process named **agent server process**. +According to the different relationships between the main process and the agent server process, AgentScope supports two modes for each agent: **Child Process** and **Independent Process** mode. -The above concepts may seem complex, but don't worry, for application developers, they only have minor differences when creating agents. Below we introduce how to create distributed agents. +- In the Child Process Mode, agent server processes will be automatically started as sub-processes from the main process. 
+- In the Independent Process Mode, the agent server process is independent of the main process, and developers need to start the agent server process on the corresponding machine. -The above concepts may seem complex, but don't worry, for application developers, they only have minor differences when creating agents. Below we introduce how to create distributed agents. +The above concepts may seem complex, but as an application developer you only need to convert your existing agent to its distributed version. -### Step 1: Create a Distributed Agent +### Step 1: Convert your agent to its distributed version -First, the developer's agent must inherit the `agentscope.agents.AgentBase` class. `AgentBase` provides the `to_dist` method to convert the agent into its distributed version. `to_dist` mainly relies on the following parameters to implement the distributed deployment of the agent: +Any agent in AgentScope can be converted to its distributed version by calling its {func}`to_dist` method. +Note that your agent must inherit from the {class}`agentscope.agents.AgentBase` class, because the `to_dist` method is provided by `AgentBase`. -- `host`: the hostname or IP address of the machine where the agent runs, defaults to `localhost`. -- `port`: the port of this agent's RPC server, defaults to `80`. -- `launch_server`: whether to launch an RPC server locally, defaults to `True`. Suppose there are two agent classes `AgentA` and `AgentB`, both of which inherit from `AgentBase`. -#### Master-Slave Mode +```python +a = AgentA( + name="A" + # ... +) +b = AgentB( + name="B" + # ... +) +``` -In the Master-Slave mode, since all agent processes depend on the main process, all processes actually run on the same machine. -We can start all agent processes from the main process, that is, the default parameters `launch_server=True` and `host="localhost"`, and we can omit the `port` parameter. 
AgentScope will automatically find an available local port for the agent process. +Next, we introduce the conversion details of both modes. 
+ +#### Child Process Mode + +To use this mode, you only need to call each agent's `to_dist()` method without any input parameter. AgentScope will automatically start all agent server processes from the main process. ```python +# Child Process mode a = AgentA( name="A" # ... ).to_dist() +b = AgentB( + name="B" + # ... +).to_dist() ``` -#### Peer-to-Peer Mode +#### Independent Process Mode -In the Peer-to-Peer mode, we need to start the service of the corresponding agent on the target machine first. For example, deploy an instance of `AgentA` on the machine with IP `a.b.c.d`, and its corresponding port is 12001. Run the following code on this target machine: +In the Independent Process Mode, we need to start the agent server process on the target machine first. +When starting the agent server process, you need to specify a model config file, which contains the models that can be used in the agent server, as well as the IP address and port of the agent server process. +For example, start two agent server processes on two different machines with IP `ip_a` and `ip_b` (called `Machine1` and `Machine2`, respectively). +You can run the following code on `Machine1`. Before running, make sure that the machine has access to all models that are used in your application. Specifically, you need to put your model config file in `model_config_path_a` and set environment variables such as your model API key correctly on `Machine1`. Example model config files are located under `examples/model_configs_template`. ```python -from agentscope.agents import RpcAgentServerLauncher +# import some packages +# register models which can be used in the server +agentscope.init( + model_configs=model_config_path_a, +) # Create an agent service process -server_a = RpcAgentServerLauncher( - agent_class=AgentA, - agent_kwargs={ - "name": "A" - ... 
- }, - host="a.b.c.d", - port=12001, +server = RpcAgentServerLauncher( + host="ip_a", + port=12001, # choose an available port +) + +# Start the service +server.launch() +server.wait_until_terminate() +``` + +> For simplicity, you can run the following command in your terminal instead of the above code: +> +> ```shell +> as_server --host ip_a --port 12001 --model-config-path model_config_path_a +> ``` + +Then put your model config file in `model_config_path_b`, set the environment variables, and run the following code on `Machine2`. + +```python +# import some packages + +# register models which can be used in the server +agentscope.init( + model_configs=model_config_path_b, +) +# Create an agent service process +server = RpcAgentServerLauncher( + host="ip_b", + port=12002, # choose an available port ) # Start the service -server_a.launch() -server_a.wait_until_terminate() +server.launch() +server.wait_until_terminate() ``` -Then, we can connect to the agent service in the main process with the following code. At this time, the object `a` created in the main process can be used as a local proxy for the agent, allowing developers to write the application flow in a centralized way in the main process. +> Similarly, you can run the following command in your terminal to set up the agent server: +> +> ```shell +> as_server --host ip_b --port 12002 --model-config-path model_config_path_b +> ``` + +Then, you can connect to the agent servers from the main process with the following code. ```python a = AgentA( name="A", # ... ).to_dist( - host="a.b.c.d", + host="ip_a", port=12001, - launch_server=False, +) +b = AgentB( + name="B", + # ... +).to_dist( + host="ip_b", + port=12002, +) +``` + +The above code will deploy `AgentA` on the agent server process of `Machine1` and `AgentB` on the agent server process of `Machine2`. +Developers then only need to write the application flow in a centralized way in the main process. 
+ +#### Advanced Usage of `to_dist` + +All examples described above convert initialized agents into their distributed version through the {func}`to_dist` method, which is equivalent to initializing the agent twice, once in the main process and once in the agent server process. +For agents whose initialization process is time-consuming, the `to_dist` method is inefficient. Therefore, AgentScope also provides a way to convert the Agent instance into its distributed version while initializing it, that is, passing the `to_dist` parameter to the Agent's initialization function. + +In Child Process Mode, just pass `to_dist=True` to the Agent's initialization function. + +```python +# Child Process mode +a = AgentA( + name="A", + # ... + to_dist=True +) +b = AgentB( + name="B", + # ... + to_dist=True +) +``` + +In Independent Process Mode, you need to encapsulate the parameters of the `to_dist()` method in a {class}`DistConf` instance and pass it into the `to_dist` field, for example: + +```python +a = AgentA( + name="A", + # ... + to_dist=DistConf( + host="ip_a", + port=12001, + ), +) +b = AgentB( + name="B", + # ... + to_dist=DistConf( + host="ip_b", + port=12002, + ), ) ``` +Compared with the original `to_dist()` function call, this method initializes the agent only once, in the agent server process. + ### Step 2: Orchestrate Distributed Application Flow In AgentScope, the orchestration of distributed application flow is exactly the same as non-distributed programs, and developers can write the entire application flow in a centralized way. @@ -83,7 +186,7 @@ At the same time, AgentScope allows the use of a mixture of locally and distribu The following is the complete code for two agents to communicate with each other in different modes. It can be seen that AgentScope supports zero-cost migration of distributed application flow from centralized to distributed. 
-- All agents are centralized: +- All agents are centralized ```python # Create agent objects @@ -104,7 +207,9 @@ while x is None or x.content == "exit": x = b(x) ``` -- Agents are deployed in a distributed manner (Master-Slave mode): +- Agents are deployed in a distributed manner + - `AgentA` in Child Process mode + - `AgentB` in Independent Process Mode ```python # Create agent objects @@ -116,7 +221,10 @@ a = AgentA( b = AgentB( name="B", # ... -).to_dist() +).to_dist( + host="ip_b", + port=12002, +) # Application flow orchestration x = None @@ -148,9 +256,23 @@ By implementing each Agent as an Actor, an Agent will automatically wait for its #### PlaceHolder -Meanwhile, to support centralized application orchestration, AgentScope introduces the concept of Placeholder. A Placeholder is a special message that contains the address and port number of the agent that generated the Placeholder, which is used to indicate that the input message of the Agent is not ready yet. -When the input message of the Agent is ready, the Placeholder will be replaced by the real message, and then the actual `reply` method will be executed. +Meanwhile, to support centralized application orchestration, AgentScope introduces the concept of {class}`Placeholder`. +A Placeholder is a special message that contains the address and port number of the agent that generated the placeholder, which is used to indicate that the output message of the Agent is not ready yet. +When calling the `reply` method of a distributed agent, a placeholder is returned immediately without blocking the main process. +The interface of placeholder is exactly the same as the message, so that the orchestration flow can be written in a centralized way. +When getting values from a placeholder, the placeholder will send a request to get the real values from the source agent. 
+A placeholder itself is also a message, so it can be sent to other agents, which then fetch the real values themselves; this avoids sending the real values multiple times. About more detailed technical implementation solutions, please refer to our [paper](https://arxiv.org/abs/2402.14034). +#### Agent Server + +In AgentScope, the agent server provides a running platform for various types of agents. +Multiple agents can run in the same agent server and hold independent memory and other local states, but they share the same computation resources. + +After installing the distributed version of AgentScope, you can use the `as_server` command to start the agent server, and the detailed startup arguments can be found in the documentation of the {func}`as_server` function. + +As long as the code is not modified, an agent server can provide services for multiple main processes. +This means that when running multiple applications, you only need to start the agent server once, and it can be reused subsequently. + [[Back to the top]](#208-distribute-en) diff --git a/docs/sphinx_doc/en/source/tutorial/209-rag.md b/docs/sphinx_doc/en/source/tutorial/209-rag.md index df11e9bcd..5aaa093de 100644 --- a/docs/sphinx_doc/en/source/tutorial/209-rag.md +++ b/docs/sphinx_doc/en/source/tutorial/209-rag.md @@ -15,10 +15,10 @@ When a `LlamaIndexKnowledge` object is initialized, the `LlamaIndexKnowledge.__i * generating index with the processed nodes. * If the indexing already exists, then `LlamaIndexKnowledge._load_index(...)` will be invoked to load the index and avoid repeating embedding calls. - A RAG module can be created with a JSON configuration to specify 1) data path, 2) data loader, 3) data preprocessing methods, and 4) embedding model (model config name). + A Knowledge object can be created with a JSON configuration to specify 1) data path, 2) data loader, 3) data preprocessing methods, and 4) embedding model (model config name). 
A detailed example can refer to the following:
- A detailed example of RAG module configuration + A detailed example of Knowledge object configuration ```json [ @@ -78,9 +78,9 @@ If users want to avoid the detailed configuration, we also provide a quick way i ### Knowledge Bank The knowledge bank maintains a collection of Knowledge objects (e.g., on different datasets) as a set of *knowledge*. Thus, -different agents can reuse the RAG modules without unnecessary "re-initialization". -Considering that configuring the RAG module may be too complicated for most users, the knowledge bank also provides an easy function call to create RAG modules. - * `KnowledgeBank.add_data_as_knowledge`: create RAG module. An easy way only requires to provide `knowledge_id`, `emb_model_name` and `data_dirs_and_types` +different agents can reuse the Knowledge object without unnecessary "re-initialization". +Considering that configuring the Knowledge object may be too complicated for most users, the knowledge bank also provides an easy function call to create Knowledge objects. + * `KnowledgeBank.add_data_as_knowledge`: create Knowledge object. 
The easy way only requires providing `knowledge_id`, `emb_model_name` and `data_dirs_and_types` ```python knowledge_bank.add_data_as_knowledge( knowledge_id="agentscope_tutorial_rag", diff --git a/docs/sphinx_doc/en/source/tutorial/advance.rst b/docs/sphinx_doc/en/source/tutorial/advance.rst index ff483b9b2..64bd86508 100644 --- a/docs/sphinx_doc/en/source/tutorial/advance.rst +++ b/docs/sphinx_doc/en/source/tutorial/advance.rst @@ -7,6 +7,7 @@ Advanced Exploration 201-agent.md 202-pipeline.md 203-model.md + 203-parser.md 204-service.md 205-memory.md 206-prompt.md diff --git a/docs/sphinx_doc/zh_CN/source/index.rst b/docs/sphinx_doc/zh_CN/source/index.rst index 7f6c48275..662fb267c 100644 --- a/docs/sphinx_doc/zh_CN/source/index.rst +++ b/docs/sphinx_doc/zh_CN/source/index.rst @@ -38,6 +38,7 @@ AgentScope Documentation agentscope.pipelines agentscope.service agentscope.rpc + agentscope.server agentscope.web agentscope.prompt agentscope.utils diff --git a/docs/sphinx_doc/zh_CN/source/tutorial/201-agent.md b/docs/sphinx_doc/zh_CN/source/tutorial/201-agent.md index a14ee55c8..6959bc929 100644 --- a/docs/sphinx_doc/zh_CN/source/tutorial/201-agent.md +++ b/docs/sphinx_doc/zh_CN/source/tutorial/201-agent.md @@ -17,6 +17,8 @@ * `sys_prompt` (system prompt) and `engine`: The system prompt acts as predefined instructions that guide the agent in its interactions, while the engine is used to dynamically generate suitable prompts. For more details about them, see the [Prompt Engine section](206-prompt). +* `to_dist` (distributed): Used to create a distributed version of the agent, to support efficient collaboration among multiple agents. Note that `to_dist` is a reserved field and is automatically added to the initialization function of every subclass of `AgentBase`. For more details about `to_dist`, please refer to the [Distribution section](208-distribute). 
+ In addition to these attributes, `AgentBase` provides agents with several key methods, such as `observe` and `reply`: * `observe()`: Through this method, an agent can take note of messages without replying immediately, allowing it to update its memory based on the observed messages. diff --git a/docs/sphinx_doc/zh_CN/source/tutorial/203-model.md b/docs/sphinx_doc/zh_CN/source/tutorial/203-model.md index 3c4705752..7b912cbf2 100644 --- a/docs/sphinx_doc/zh_CN/source/tutorial/203-model.md +++ b/docs/sphinx_doc/zh_CN/source/tutorial/203-model.md @@ -11,7 +11,9 @@ In AgentScope, model deployment and invocation are decoupled through `ModelWrapper` - OpenAI API, including Chat, image generation (DALL-E), and text Embedding. - 
DashScope API, including Chat, Image Synthesis, and Text Embedding. - Gemini API, including Chat and Embedding. +- ZhipuAI API, including Chat and Embedding. - Ollama API, including Chat, Embedding, and Generation. +- LiteLLM API, including Chat, with support for the APIs of various models. - Post Request API, model inference services based on Post requests, including the Huggingface/ModelScope Inference API and various APIs that follow the Post request format. @@ -101,9 +103,12 @@ The APIs are as follows: | | Multimodal | [`DashScopeMultiModalWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | `"dashscope_multimodal"` | qwen-vl-plus, qwen-vl-max, qwen-audio-turbo, ... | | Gemini API | Chat | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | `"gemini_chat"` | gemini-pro, ... | | | Embedding | [`GeminiEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | `"gemini_embedding"` | models/embedding-001, ... | +| ZhipuAI API | Chat | [`ZhipuAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | `"zhipuai_chat"` | glm-4, ... | +| | Embedding | [`ZhipuAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | `"zhipuai_embedding"` | embedding-2, ... | | ollama | Chat | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | `"ollama_chat"` | llama2, ... | | | Embedding | [`OllamaEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | `"ollama_embedding"` | llama2, ... | | | Generation | [`OllamaGenerationWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | `"ollama_generate"` | llama2, ... 
| +| LiteLLM API | Chat | [`LiteLLMChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/litellm_model.py) | `"litellm_chat"` | - | | Post Request based API | - | [`PostAPIModelWrapperBase`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | `"post_api"` | - | | | Chat | [`PostAPIChatModelWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | `"post_api_chat"` | meta-llama/Meta-Llama-3-8B-Instruct, ... | @@ -323,6 +328,48 @@ The APIs are as follows:
+ +#### ZhipuAI API + +
+ZhipuAI Chat API (agentscope.models.ZhipuAIChatWrapper) + +```python +{ + "config_name": "my_zhipuai_chat_config", + "model_type": "zhipuai_chat", + + # Required parameters + "model_name": "{model_name}", # The model name in ZhipuAI API, e.g. glm-4 + + # Optional parameters + "api_key": "{your_api_key}" +} +``` + +
+ +
+ZhipuAI Embedding API (agentscope.models.ZhipuAIEmbeddingWrapper) + +```python +{ + "config_name": "my_zhipuai_embedding_config", + "model_type": "zhipuai_embedding", + + # Required parameters + "model_name": "{model_name}", # The model name in ZhipuAI API, e.g. embedding-2 + + # Optional parameters + "api_key": "{your_api_key}", +} +``` + +
+ +
+ + #### Ollama API
@@ -390,6 +437,26 @@ The APIs are as follows:
+ +#### LiteLLM Chat API + +
+LiteLLM Chat API (agentscope.models.LiteLLMChatWrapper) + +```python +{ + "config_name": "lite_llm_openai_chat_gpt-3.5-turbo", + "model_type": "litellm_chat", + "model_name": "gpt-3.5-turbo" # Set the environment variables required by the chosen model, such as OPENAI_API_KEY. See https://docs.litellm.ai/docs/ for details. +} +``` + +
+ +
+ + #### Post Request API
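diff --git a/docs/sphinx_doc/zh_CN/source/tutorial/203-parser.md b/docs/sphinx_doc/zh_CN/source/tutorial/203-parser.md new file mode 100644 index 000000000..527f2960e --- /dev/null +++ b/docs/sphinx_doc/zh_CN/source/tutorial/203-parser.md @@ -0,0 +1,456 @@ +(203-parser-zh)= + +# Model Response Parser + +## Table of Contents + +- [Background](#背景) +- [Parser Module](#解析器模块) + - [Overview](#功能说明) + - [String Type](#字符串str类型) + - [MarkdownCodeBlockParser](#markdowncodeblockparser) + - [Initialization](#初始化) + - [Format Instruction Template](#响应格式模版) + - [Parse Function](#解析函数) + - [Dictionary Type](#字典dict类型) + - [MarkdownJsonDictParser](#markdownjsondictparser) + - [Initialization & Format Instruction Template](#初始化--响应格式模版) + - [MultiTaggedContentParser](#multitaggedcontentparser) + - [Initialization & Format Instruction Template](#初始化--响应格式模版-1) + - [Parse Function](#解析函数-1) + - [JSON / Python Object Type](#json--python-对象类型) + - [MarkdownJsonObjectParser](#markdownjsonobjectparser) + - [Initialization & Format Instruction Template](#初始化--响应格式模版-2) + - [Parse Function](#解析函数-2) +- [Typical Use Cases](#典型使用样例) + - [Werewolf Game](#狼人杀游戏) + - [ReAct Agent and Tool Usage](#react-智能体和工具使用) +- [Customized Parser](#自定义解析器) + + +## Background + +When building applications with LLMs, parsing the strings generated by an LLM into a specified format and extracting the required information is a crucial step. +It is also a complicated one, for the following reasons: + +1. **Diversity**: The target formats vary widely; the information to extract may be a specific piece of text, a JSON object, or a complex data structure. +2. **Complexity**: Result parsing is not only about converting the LLM-generated text into the target format; it also involves prompt engineering (reminding the LLM what format of output to produce), error handling, and a series of related issues. +3. 
**Flexibility**: Within the same application, different stages may require the agent to generate output in different formats. + +To let developers parse results conveniently and flexibly, AgentScope designs and provides the parser module (Parser). With this module, developers can parse a target format through simple configuration and switch between target formats flexibly. + +In AgentScope, the design principles of the parser module are: +1. **Flexibility**: Developers can flexibly set the required return format and switch parsers to parse different formats without modifying the code of the agent class. That is, the specific "target format" is decoupled from the processing logic of the `reply` function in the agent class +2. **Freedom**: Users are free to choose whether to use a parser. The format instruction, result parsing, and other functions provided by the parser are all invoked explicitly inside the `reply` function, so users can either use a parser or implement the parsing with their own code +3. **Transparency**: When a parser is used, the process and result of prompt construction are fully visible and transparent to developers inside the `reply` function, so they can debug their applications precisely. + +## Parser Module + +### Overview + +The main functions of the parser module (Parser) include: + +1. Providing a "format instruction", i.e., telling the LLM what output to generate and where, for example + +```` +You should generate python code in a fenced code block as follows +```python +{your_python_code} +``` +```` + + +2. Providing a parse function that parses the text generated by the LLM directly into the target data format + +3. 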
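Post-processing for the dictionary format. After the text is parsed into a dictionary, different fields may serve different purposes + +AgentScope provides several parsers, and developers can choose among them according to their needs. + +| Target Format | Parser | Description | +|-------------------|----------------------------|-----------------------------------------------------------------------------| +| String (`str`) | `MarkdownCodeBlockParser` | Requires the LLM to generate the specified text inside a Markdown code block marked by ```; the parsed result is a string. | +| Dictionary (`dict`) | `MarkdownJsonDictParser` | Requires the LLM to produce a dictionary with the specified content inside a code block marked by \```json and \```; the parsed result is a Python dictionary. | +| | `MultiTaggedContentParser` | Requires the LLM to produce the specified content inside multiple tags; the contents of the different tags are parsed together into one Python dictionary and filled into different key-value pairs. | +| JSON / Python object | `MarkdownJsonObjectParser` | Requires the LLM to produce the specified content inside a code block marked by \```json and \```; the parsed result is converted into a Python object via `json.loads`. | + +> **NOTE**: Compared with `MarkdownJsonDictParser`, `MultiTaggedContentParser` is better suited to weaker models and to cases where the content the LLM must return is too complex. For example, when the LLM returns Python code inside a dictionary, it has to take care of escaping special characters (\t, \n, ...) and of the distinction `json.loads` makes between double and single quotes. `MultiTaggedContentParser` instead lets the model return each value inside its own tag and then assembles them into a dictionary, which makes the response easier for the LLM to produce. + +Below we introduce the usage of these parsers by target format. 
+ +### String (`str`) Type + +#### MarkdownCodeBlockParser + +##### Initialization + +- `MarkdownCodeBlockParser` uses a Markdown code block and requires the LLM to generate the specified text inside the specified code block. Different languages can be specified through the `language_name` parameter, so that the coding ability of the LLM is used to produce the corresponding output. For example, to ask the model for Python code, initialize the parser as follows: + + ```python + from agentscope.parsers import MarkdownCodeBlockParser + + parser = MarkdownCodeBlockParser(language_name="python", content_hint="your python code") + ``` + +##### Format Instruction Template + +- The `MarkdownCodeBlockParser` class provides the following "format instruction" template. When the user accesses the `format_instruction` attribute, `{language_name}` is replaced with the string given at initialization: + + ```` + You should generate {language_name} code in a {language_name} fenced code block as follows: + ```{language_name} + {content_hint} + ``` + ```` + +- For example, with the above initialization where `language_name` is `"python"`, accessing the `format_instruction` attribute returns the following string: + + ```python + print(parser.format_instruction) + ``` + + ```` + You should generate python code in a python fenced code block as follows + ```python + your python code + ``` + ```` + +##### Parse Function + +- The `MarkdownCodeBlockParser` class provides the `parse` method to parse the text generated by the LLM; the parsed result is a string. + + ````python + res = 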
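parser.parse( + ModelResponse( + text="""The following is generated python code + ```python + print("Hello world!") + ``` + """ + ) + ) + + print(res.parsed) + ```` + + ``` + print("Hello world!") + ``` + +### Dictionary (`dict`) Type + +Unlike strings and general JSON / Python objects, the dictionary is a data format commonly used in LLM applications, so AgentScope provides additional post-processing for it. When initializing a parser, you can additionally set the `keys_to_content`, `keys_to_memory` and `keys_to_metadata` parameters to filter the key-value pairs of the dictionary when the parser's `to_content`, `to_memory` and `to_metadata` methods are called. +Specifically, + - The key-value pairs specified by `keys_to_content` are placed in the `content` field of the returned `Msg` object. This field is returned to other agents, takes part in their prompt construction, and is also used by the `self.speak` function for explicit output + - The key-value pairs specified by `keys_to_memory` are stored in the agent's memory + - The key-value pairs specified by `keys_to_metadata` are placed in the `metadata` field of the `Msg` object, which can be used for control-flow decisions in the application, or to carry information that does not need to be returned to other agents. + +The three parameters accept a boolean, a string, or a list of strings. Their values mean: +- `False`: The corresponding filter function returns `None`. +- `True`: The whole dictionary is returned. +- `str`: The value of the corresponding key is returned directly; note that the result is the value itself, not a dictionary. +- `List[str]`: A dictionary filtered by the given list of keys is returned. + +In AgentScope, `keys_to_content` and `keys_to_memory` default to `True`, i.e., the whole dictionary is returned. `keys_to_metadata` defaults to `False`, i.e., the corresponding filter function returns `None`. 
+ +Below is an example from the werewolf game: the dictionary generated by the LLM playing a werewolf during the daytime discussion. In this example, +- the `"thought"` field should not be returned to other agents, but should be stored in the agent's memory to keep the werewolf's strategy consistent; +- the `"speak"` field should be returned to other agents and stored in the agent's memory; +- the `"finish_discussion"` field is used in the application's control flow to determine whether the discussion has ended. To save tokens, this field should neither be returned to other agents nor stored in the agent's memory. + + ```python + { + "thought": "The others didn't realize I was a werewolf. I should end the discussion soon.", + "speak": "I agree with you.", + "finish_discussion": True + } + ``` + +In AgentScope, post-processing is done by calling the `to_content`, `to_memory` and `to_metadata` methods, as the following sketch shows: + +- In the application's control flow, create the corresponding parser object and attach it to the agent + + ```python + from agentscope.parsers import MarkdownJsonDictParser + + # ... + + agent = DictDialogAgent(...) 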
+ + # Take MarkdownJsonDictParser as an example + parser = MarkdownJsonDictParser( + content_hint={ + "thought": "what you thought", + "speak": "what you speak", + "finish_discussion": "whether the discussion is finished" + }, + keys_to_content="speak", + keys_to_memory=["thought", "speak"], + keys_to_metadata=["finish_discussion"] + ) + + # Attach the parser, which effectively specifies the required response format + agent.set_parser(parser) + + # Discussion loop + while True: + # ... + x = agent(x) + # Use the metadata field to get the LLM's decision on whether the discussion should end + if x.metadata["finish_discussion"]: + break + ``` + + +- The dictionary is filtered inside the agent's `reply` function + + ```python + # ... + def reply(self, x: dict = None) -> dict: + + # ... + res = self.model(prompt, parse_func=self.parser.parse) + + # Store the filtered dictionary, which holds the thought and speak fields, in the agent's memory + self.memory.add( + Msg( + self.name, + content=self.parser.to_memory(res.parsed), + role="assistant", + ) + ) + + # Fill the content field, and the metadata field as well + msg = Msg( + self.name, + content=self.parser.to_content(res.parsed), + role="assistant", + metadata=self.parser.to_metadata(res.parsed), + ) + self.speak(msg) + + return msg + ``` + + + + +> **Note**: The `keys_to_content`, `keys_to_memory` and `keys_to_metadata` parameters can be a list, a string, or a boolean. +> - If `True`, the whole dictionary is returned, i.e., no filtering is applied +> - If `False`, `None` is returned +> - If a string, the `to_content`, `to_memory` and `to_metadata` methods put the value of that key directly into the corresponding place. For example, with `keys_to_content="speak"`, the `to_content` method puts `res.parsed["speak"]` into the `content` field of the `Msg` object, so `content` is a string rather than a dictionary. 
+> - If a list, the `to_content`, `to_memory` and `to_metadata` methods act as filters, and the filtered result is a dictionary +> ```python +> parser = MarkdownJsonDictParser( +> content_hint={ +> "thought": "what you thought", +> "speak": "what you speak", +> }, +> keys_to_content="speak", +> keys_to_memory=["thought", "speak"], +> ) +> +> example_dict = {"thought": "abc", "speak": "def"} +> print(parser.to_content(example_dict)) # def +> print(parser.to_memory(example_dict)) # {"thought": "abc", "speak": "def"} +> print(parser.to_metadata(example_dict)) # None +> ``` +> ``` +> def +> {"thought": "abc", "speak": "def"} +> None +> ``` + 
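The filtering rules for `keys_to_content`, `keys_to_memory` and `keys_to_metadata` described above can be sketched in plain Python. This is only an illustration under stated assumptions: `filter_dict` is a hypothetical helper written for this note, not AgentScope's actual implementation, and it only mirrors the bool / str / list semantics listed in the text.

```python
from typing import Any, List, Optional, Union

# Hypothetical type alias for the three accepted kinds of "keys" argument.
KeysSpec = Union[bool, str, List[str]]


def filter_dict(parsed: dict, keys: KeysSpec) -> Optional[Any]:
    """Hypothetical helper mirroring the filtering rules described above."""
    if isinstance(keys, bool):
        # True keeps the whole dictionary, False drops everything.
        return parsed if keys else None
    if isinstance(keys, str):
        # A single key returns the bare value, not a dictionary.
        return parsed[keys]
    # A list of keys returns a filtered dictionary.
    return {k: parsed[k] for k in keys if k in parsed}


example = {"thought": "abc", "speak": "def", "finish_discussion": True}
print(filter_dict(example, "speak"))
print(filter_dict(example, ["thought", "speak"]))
print(filter_dict(example, False))
```

Using one helper for all three targets keeps the "what goes to content, memory, and metadata" decision in configuration rather than in the agent's `reply` logic, which is the decoupling the text argues for.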
+Below we introduce the two dictionary parsers in detail. + +#### MarkdownJsonDictParser + +##### Initialization & Format Instruction Template + +- `MarkdownJsonDictParser` requires the LLM to produce a dictionary with the specified content inside a code block marked by \```json and \```. +- Besides the `keys_to_content`, `keys_to_memory` and `keys_to_metadata` parameters, you can pass a `content_hint` parameter to give an example and description of the expected response, i.e., to tell the LLM what kind of dictionary to produce. This parameter can be a string or a dictionary; when the format instruction is built, it is automatically converted to a string and concatenated. + + ```python + from agentscope.parsers import MarkdownJsonDictParser + + # dictionary + MarkdownJsonDictParser( + content_hint={ + "thought": "what you thought", + "speak": "what you speak", + } + ) + # or string + MarkdownJsonDictParser( + content_hint="""{ + "thought": "what you thought", + "speak": "what you speak", + }""" + ) + ``` + - The corresponding `format_instruction` attribute + + ```` + You should respond a json object in a json fenced code block as follows: + ```json + {content_hint} + ``` + ```` + +#### MultiTaggedContentParser + +`MultiTaggedContentParser` requires the LLM to produce the specified content inside multiple specified tag pairs; the contents of the different tags are parsed together into one Python dictionary. Its usage is similar to `MarkdownJsonDictParser`, except that it is initialized differently, and it is better suited to weaker LLMs or to more complex return content. + +##### Initialization & Format Instruction Template + +In `MultiTaggedContentParser`, each pair of tags is passed in as a `TaggedContent` object, which contains +- the tag name (`name`), i.e., the key in the returned dictionary +- the start tag (`tag_begin`) +- the content hint (`content_hint`) +- the end tag (`tag_end`) +- the content parsing switch (`parse_json`), `False` by default. 
When set to `True`, a hint is automatically added to the format instruction, and the extracted content is parsed into a Python object via `json.loads` + +```python +from agentscope.parsers import MultiTaggedContentParser, TaggedContent +parser = MultiTaggedContentParser( + TaggedContent( + name="thought", + tag_begin="[THOUGHT]", + content_hint="what you thought", + tag_end="[/THOUGHT]" + ), + TaggedContent( + name="speak", + tag_begin="[SPEAK]", + content_hint="what you speak", + tag_end="[/SPEAK]" + ), + TaggedContent( + name="finish_discussion", + tag_begin="[FINISH_DISCUSSION]", + content_hint="true/false, whether the discussion is finished", + tag_end="[/FINISH_DISCUSSION]", + parse_json=True, # We want the content of this field to be parsed directly into a Python boolean, True or False + ) +) + +print(parser.format_instruction) +``` + +``` +Respond with specific tags as outlined below, and the content between 
[FINISH_DISCUSSION] and [/FINISH_DISCUSSION] MUST be a JSON object: +[THOUGHT]what you thought[/THOUGHT] +[SPEAK]what you speak[/SPEAK] +[FINISH_DISCUSSION]true/false, whether the discussion is finished[/FINISH_DISCUSSION] +``` + +##### Parse Function + +- The parsed result of `MultiTaggedContentParser` is a dictionary whose keys are the `name` values of the `TaggedContent` objects. The following example parses an LLM response from the werewolf game: + +```python +res_dict = parser.parse( + ModelResponse(text="""As a werewolf, I should keep pretending to be a villager +[THOUGHT]The others didn't realize I was a werewolf. I should end the discussion soon.[/THOUGHT] +[SPEAK]I agree with you.[/SPEAK] +[FINISH_DISCUSSION]true[/FINISH_DISCUSSION] +""" + ) +) + +print(res_dict) +``` + +``` +{ + "thought": "The others didn't realize I was a werewolf. I should end the discussion soon.", + "speak": "I agree with you.", + "finish_discussion": true +} +``` + +### JSON / Python Object Type + +#### MarkdownJsonObjectParser + +`MarkdownJsonObjectParser` also uses the Markdown \```json and \``` markers, but it does not restrict the type of the parsed content: it can be a list, a dictionary, a number, a string, or anything else that `json.loads` can parse. + +##### Initialization & Format Instruction Template + +```python +from agentscope.parsers import MarkdownJsonObjectParser + +parser = MarkdownJsonObjectParser( + content_hint="{A list of numbers.}" +) + +print(parser.format_instruction) +``` + +```` +You should respond a json object in a json fenced code block as follows: +```json +{A list of numbers.} +``` +```` + +##### Parse Function + +````python +res = parser.parse( + ModelResponse(text="""Yes, here is the generated list +```json +[1,2,3,4,5] +``` +""" + ) +) + +print(type(res)) +print(res) +```` + +``` +<class 'list'> +[1, 2, 3, 4, 5] +``` + +## Typical Use Cases + +### Werewolf Game + +Werewolf is a classic use case of the dictionary parser. 
In different stages of the game, the same agent needs to generate identifier fields other than `"thought"` and `"speak"`, such as whether to end the discussion, whether the seer uses its ability, whether the witch uses the antidote or the poison, and voting. + +AgentScope ships a built-in [werewolf game](https://github.com/modelscope/agentscope/tree/main/examples/game_werewolf) example, which uses the `DictDialogAgent` class together with different parsers to switch target formats flexibly. It also uses the post-processing functions of the parsers to separate "thinking" from "speaking" and to drive the game flow. +For the detailed implementation, please refer to the werewolf [source code](https://github.com/modelscope/agentscope/tree/main/examples/game_werewolf). + 
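The werewolf example relies on tag-based parsing. As a rough illustration of that idea (not the library's code), the sketch below extracts tagged content with regular expressions and optionally decodes it with `json.loads`. The `TAGS` tuples and the `parse_tagged` function are hypothetical stand-ins for `TaggedContent` and `MultiTaggedContentParser.parse`, introduced only for this example.

```python
import json
import re
from typing import Any, Dict, List, Tuple

# Hypothetical stand-in for TaggedContent: (name, tag_begin, tag_end, parse_json).
TAGS: List[Tuple[str, str, str, bool]] = [
    ("thought", "[THOUGHT]", "[/THOUGHT]", False),
    ("speak", "[SPEAK]", "[/SPEAK]", False),
    ("finish_discussion", "[FINISH_DISCUSSION]", "[/FINISH_DISCUSSION]", True),
]


def parse_tagged(text: str, tags: List[Tuple[str, str, str, bool]]) -> Dict[str, Any]:
    """Collect the content between each tag pair into one dictionary."""
    result: Dict[str, Any] = {}
    for name, begin, end, parse_json in tags:
        # Escape the tags because "[" and "]" are regex metacharacters.
        pattern = re.escape(begin) + r"(.*?)" + re.escape(end)
        match = re.search(pattern, text, re.DOTALL)
        if match is None:
            raise ValueError(f"Missing tagged content for {name!r}")
        content = match.group(1).strip()
        # Optionally decode the content, e.g. "true" becomes the boolean True.
        result[name] = json.loads(content) if parse_json else content
    return result


response = """As a werewolf, I should keep pretending to be a villager
[THOUGHT]The others didn't realize I was a werewolf.[/THOUGHT]
[SPEAK]I agree with you.[/SPEAK]
[FINISH_DISCUSSION]true[/FINISH_DISCUSSION]
"""
parsed = parse_tagged(response, TAGS)
print(parsed)
```

Because each value lives in its own tag, the model never has to emit a syntactically valid JSON dictionary as a whole, which is why this style is described as friendlier to weaker models.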
+### ReAct Agent and Tool Usage + +`ReActAgent` is an agent class built for tool usage in AgentScope. It is based on the ReAct algorithm and can be used with different tool functions. Tool invocation and format parsing follow the same approach as the parsers. For the detailed implementation, please refer to the [code](https://github.com/modelscope/agentscope/blob/main/src/agentscope/agents/react_agent.py). + + +## Customized Parser + +AgentScope provides `ParserBase`, the base class for parsers. Developers can implement their own parser by inheriting from it and implementing the `format_instruction` attribute and the `parse` method. + +If the target format is a dictionary, you can additionally inherit the `agentscope.parser.DictFilterMixin` class to post-process dictionaries. + +```python +from abc import ABC, abstractmethod + +from agentscope.models import ModelResponse + + +class ParserBase(ABC): + """The base class for model response parser.""" + + format_instruction: str + """The instruction for the response format.""" + + @abstractmethod + def parse(self, response: ModelResponse) -> ModelResponse: + """Parse the response text to a specific object, and stored in the + parsed field of the response object.""" + + # ... +``` diff --git a/docs/sphinx_doc/zh_CN/source/tutorial/204-service.md b/docs/sphinx_doc/zh_CN/source/tutorial/204-service.md index 68f165cba..6892b6339 100644 --- a/docs/sphinx_doc/zh_CN/source/tutorial/204-service.md +++ b/docs/sphinx_doc/zh_CN/source/tutorial/204-service.md @@ -23,7 +23,10 @@ | | `arxiv_search` | Search with arXiv. | | | `download_from_url` | Download a file from the specified URL. 
| | | `load_web` | Crawl and parse the specified web page link (currently only HTML pages are supported) | -| | `digest_webpage` | Generate a digest for a crawled web page (currently only HTML pages are supported) | +| | `digest_webpage` | Generate a digest for a crawled web page (currently only HTML pages are supported) | +| | `dblp_search_publications` | Search publications in the dblp database. | +| | `dblp_search_authors` | Search authors in the dblp database. | +| | `dblp_search_venues` | Search journals, conferences and workshops in the dblp database. | | File processing | `create_file` | Create a new file at the specified path, optionally with initial content. | | | `delete_file` | Delete the file specified by the file path. | | | `move_file` | Move or rename a file from one path to another. | diff --git a/docs/sphinx_doc/zh_CN/source/tutorial/206-prompt.md b/docs/sphinx_doc/zh_CN/source/tutorial/206-prompt.md index 1bd05ad4e..c2767d902 100644 --- a/docs/sphinx_doc/zh_CN/source/tutorial/206-prompt.md +++ b/docs/sphinx_doc/zh_CN/source/tutorial/206-prompt.md @@ -29,6 +29,7 @@ AgentScope provides built-in prompt construction strategies for the following model APIs. - 
[OllamaChatWrapper](#ollamachatwrapper) - [OllamaGenerationWrapper](#ollamagenerationwrapper) - [GeminiChatWrapper](#geminichatwrapper) +- [ZhipuAIChatWrapper](#zhipuaichatwrapper) These strategies are implemented in the `format` function of the corresponding model wrapper classes. It accepts `Msg` objects, lists of `Msg` objects, or a mixture of them as input. The `format` function first reorganizes the input into a list of `Msg` objects, so for ease of explanation we treat the input of `format` as a list of `Msg` objects in the following sections. @@ -41,6 +42,8 @@ AgentScope provides built-in prompt construction strategies for the following model APIs. #### Prompt Strategy +##### Non-Vision Models + In the OpenAI Chat API, the `name` field lets the model distinguish different speakers in a conversation. Therefore, the strategy of the `format` function in `OpenAIChatWrapper` is simple: - `Msg`: Pass a dictionary with the `role`, `content` and `name` fields directly to the API. @@ -75,6 +78,75 @@ print(prompt) ] ``` +##### Vision Models + +For vision-capable models, if the input message contains image urls, the generated `content` field is a list of dictionaries containing the text and the image urls. + +Specifically, web image urls are passed to the OpenAI Chat API directly, while local image urls are converted to base64 format. For more details, please refer to the [official guide](https://platform.openai.com/docs/guides/vision). + +Note that invalid image urls (e.g. `/Users/xxx/test.mp3`) are ignored. 
+ +```python +from agentscope.models import OpenAIChatWrapper +from agentscope.message import Msg + +model = OpenAIChatWrapper( + config_name="", # empty since we directly initialize the model wrapper + model_name="gpt-4o", +) + +prompt = model.format( + Msg("system", "You're a helpful assistant", role="system"), # Msg object + [ # a list of Msg objects + Msg(name="user", content="Describe this image", role="user", url="https://xxx.png"), + Msg(name="user", content="And these images", role="user", url=["/Users/xxx/test.png", "/Users/xxx/test.mp3"]), + ], +) +print(prompt) +``` + +```python +[ + { + "role": "system", + "name": "system", + "content": "You are a helpful assistant" + }, + { + "role": "user", + "name": "user", + "content": [ + { + "type": "text", + "text": "Describe this image" + }, + { + "type": "image_url", + "image_url": { + "url": "https://xxx.png" + } + }, + ] + }, + { + "role": "user", + "name": "user", + "content": [ + { + "type": "text", + "text": "And these images" + }, + { + "type": "image_url", + "image_url": { + "url": "..." 
# corresponds to /Users/xxx/test.png + } + }, + ] + }, +] +``` + ### `DashScopeChatWrapper` `DashScopeChatWrapper` wraps the DashScope chat API, which takes a list of messages as input. The messages must obey the following rules: @@ -199,6 +271,46 @@ print(prompt) ] ``` +### LiteLLMChatWrapper + +`LiteLLMChatWrapper` wraps the litellm chat API, which takes a list of messages as input. Litellm supports different types of models, and each model may require a different format. To simplify usage, we provide a format that is compatible with most models. If you need a more specific format, refer to the model you are using and the [litellm](https://github.com/BerriAI/litellm) documentation to customize a format function for your model. +- Format all messages in the chat history and merge them into a single message with `"user"` as its `role` +#### Prompt strategy +- The message contains the dialogue history: the `user` message consists of the system message followed by the history, prefixed with "## Dialogue History". + + +```python +from agentscope.models import LiteLLMChatWrapper + +model = LiteLLMChatWrapper( + config_name="", # empty since we directly initialize the model wrapper + model_name="gpt-3.5-turbo", +) + +prompt = model.format( + Msg("system", "You are a helpful assistant", role="system"), + [ + Msg("user", "What is the weather today?", role="user"), + Msg("assistant", "It is sunny today", role="assistant"), + ], +) + +print(prompt) +``` + +```bash +[ + { + "role": "user", + "content": ( + "You are a helpful assistant\n\n" + "## Dialogue History\nuser: What is the weather today?\n" + "assistant: It is sunny today" + ), + }, +] +``` + ### `OllamaChatWrapper` `OllamaChatWrapper` wraps the Ollama chat API, which takes a list of messages as input. The messages must obey the following rules (updated on 2024/03/22): @@ -210,8 +322,10 @@ print(prompt) Given a list of messages, we parse each message according to the following rules: -- `Msg`: fill in the `role` and `content` fields directly. If it has a `url` field that points to an image, we add it to the message. -- `List`: parse each element of the list according to the rules above. +- If the `role` field of the first input message is `"system"`, it is treated as the system + prompt, and the other messages together form the dialogue history. The dialogue history is prefixed with `"## Dialogue History"` and combined with the +system prompt into a single message whose `role` is `"system"`. +- If the `url` field of an input message is not `None`, these URLs are gathered under the `"images"` key. ```python from agentscope.models import OllamaChatWrapper @@ -234,9 +348,11 @@ print(prompt) ```bash [ - {"role": "system", "content": "You are a helpful assistant"}, - {"role": "assistant", "content": "Hi."}, - {"role": "assistant", "content": "Nice to meet you!", "images": ["https://example.com/image.jpg"]}, + { + "role": "system", + "content": "You are a helpful 
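assistant\n\n## Dialogue History\nBob: Hi.\nAlice: Nice to meet you!", + "images": ["https://example.com/image.jpg"] + }, ] ``` @@ -326,6 +442,49 @@ print(prompt) ] ``` + +### `ZhipuAIChatWrapper` + +`ZhipuAIChatWrapper` wraps the ZhipuAI chat API. It accepts a list of messages or a string prompt as input. As with the DashScope chat API, a list of messages must obey the following rules: + +- Each message must have role and content fields, and role must be one of "user", "system", or "assistant". +- There must be at least one user message. + +These requirements make it difficult to build multi-agent conversations in which an agent may play several different roles and speak consecutively. +We therefore decided to convert the list of messages into a string prompt in the built-in `format` function and wrap it in a single user message. + +#### Prompt strategy + +If the role field of the first message is "system", it is converted into a single message whose role field is "system" and whose content field is the system message. The remaining messages are converted into a message whose role field is "user" and whose content field is the dialogue history. +An example is shown below: + +```python +from agentscope.models import ZhipuAIChatWrapper +from agentscope.message import Msg + +model = ZhipuAIChatWrapper( + config_name="", # empty since we directly initialize the model wrapper + model_name="glm-4", + api_key="your api key", +) + +prompt = model.format( + Msg("system", "You're a helpful assistant", role="system"), # Msg object + [ # a list of Msg objects + Msg(name="Bob", content="Hi!", role="assistant"), + Msg(name="Alice", content="Nice to meet you!", role="assistant"), + ], +) +print(prompt) +``` + +```bash +[ + {"role": "system", "content": "You're a helpful assistant"}, + {"role": "user", "content": "## Dialogue History\nBob: Hi!\nAlice: Nice to meet you!"}, +] +``` + ## About the `PromptEngine` class (will be deprecated in a future version) The `PromptEngine` class provides a structured way to combine different prompt components, such as instructions, hints, dialogue history, and user input, into a format suited to the underlying language model. diff --git a/docs/sphinx_doc/zh_CN/source/tutorial/207-monitor.md b/docs/sphinx_doc/zh_CN/source/tutorial/207-monitor.md index 73e0daf4c..dc863b834 100644 --- a/docs/sphinx_doc/zh_CN/source/tutorial/207-monitor.md +++ b/docs/sphinx_doc/zh_CN/source/tutorial/207-monitor.md @@ -35,8 +35,10 @@ monitor = MonitorFactory.get_monitor() ``` -> Currently the above code returns a `SqliteMonitor` instance, which is initialized in `agentscope.init`. -> The `SqliteMonitor` class is the default implementation of the `MonitorBase` class, based on Sqlite3. +Currently the above code returns a `SqliteMonitor` instance, which is initialized in `agentscope.init`. +`SqliteMonitor` is a Sqlite3-based 
`MonitorBase` implementation and the current default monitor. + +If you do not need the monitor, you can disable it by passing `use_monitor=False` to `agentscope.init`. In that case, `MonitorFactory.get_monitor` returns a `DummyMonitor` instance, which exposes exactly the same interface as `SqliteMonitor` but performs no operations internally. ### Basic usage diff --git a/docs/sphinx_doc/zh_CN/source/tutorial/208-distribute.md b/docs/sphinx_doc/zh_CN/source/tutorial/208-distribute.md index d882b7690..a185bd5da 100644 --- a/docs/sphinx_doc/zh_CN/source/tutorial/208-distribute.md +++ b/docs/sphinx_doc/zh_CN/source/tutorial/208-distribute.md @@ -12,69 +12,170 @@ AgentScope implements Actor-based distributed deployment and parallel optimization of agents ## Usage -In AgentScope, the process that runs the application flow is called the "main process", and every agent runs in a separate process. -Depending on the relationship between the main process and the agent processes, AgentScope supports two distributed modes: Master-Slave and Peer-to-Peer (P2P). -In Master-Slave mode, developers can start all agent processes from the main process, whereas in Peer-to-Peer mode the agent processes are independent of the main process and the agent services must be started on the corresponding machines. +In AgentScope, the process that runs the application flow is called the **main process**, and all agents run in additional **agent server processes**. +Depending on the relationship between the main process and the agent server processes, AgentScope provides two launch modes for each agent: **child process mode (Child)** and **independent process mode (Independent)**. +In child process mode, developers can start all agent server processes from the main process, whereas in independent process mode the agent server processes are independent of the main process and must be started on the corresponding machines. -These concepts are somewhat complex, but do not worry: for application developers, they differ only slightly at the agent creation stage. Below we describe how to create distributed agents. +These concepts are somewhat complex, but do not worry: for application developers it is enough to convert existing agents into their distributed versions; everything else is exactly the same as in the ordinary single-machine version. -### Step 1: Create distributed agents +### Step 1: Convert to the distributed version -First, the developer's agent must inherit from the `agentscope.agents.AgentBase` class; `AgentBase` provides the `to_dist` method, which converts the agent into its distributed version. `to_dist` relies mainly on the following parameters to deploy agents in a distributed manner: - -- `host`: the IP address of the machine on which the agent is deployed; defaults to `localhost`. -- `port`: the port of the agent's RPC server; defaults to `80`. -- `launch_server`: whether to launch the RPC server locally; defaults to `True`. +All agents in AgentScope can be converted into their distributed versions with the {func}`to_dist` method. +Note, however, that your agent must inherit from the {class}`agentscope.agents.AgentBase` class, because it is `AgentBase` that provides the `to_dist` method. Suppose there are two agent classes, `AgentA` and `AgentB`, both of which inherit from `AgentBase`. -#### Master-Slave mode +```python +a = AgentA( + name="A" + # ... +) +b = AgentB( + name="B" + # ... 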
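+) +``` + +Next we describe how to convert agents into the two distributed modes. + +#### Child process mode -In Master-Slave mode, all agent processes depend on the main process, so all processes actually run on one machine. -We can start all agent processes from the main process, i.e. with the default parameters `launch_server=True` and `host="localhost"`, and we can omit the `port` parameter: AgentScope automatically finds a free local port for the agent process. +To use this mode, you only need to call each agent's `to_dist()` method without any arguments. +AgentScope automatically starts the agent server processes from the main process and deploys the agents to the corresponding child processes. ```python +# Subprocess mode a = AgentA( name="A" # ... ).to_dist() +b = AgentB( + name="B" + # ... +).to_dist() ``` -#### Peer-to-Peer mode +#### Independent process mode -In Peer-to-Peer mode, we first need to start the corresponding agent's service on the target machine. For example, to deploy an instance of `AgentA` on the machine with IP `a.b.c.d` and port 12001, run the following code on that machine: +In independent process mode, you first need to start the agent server process on the target machine, providing the configuration of the models that the server can use as well as the server's IP address and port. +For example, suppose you want to deploy two agent server processes on machines with IP addresses `ip_a` and `ip_b` (call these machines `Machine1` and `Machine2`). +You can run the following code on `Machine1`. Before running it, make sure that the machine can access all models used in the application. Specifically, put the configuration of all models used into the `model_config_path_a` file and check that environment variables such as API keys are set correctly. For a sample model configuration file, see `examples/model_configs_template`. ```python -from agentscope.agents import RpcAgentServerLauncher - -# Create the agent service process -server_a = RpcAgentServerLauncher( - agent_class=AgentA, - agent_kwargs={ - "name": "A" - ... 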
- }, - host="a.b.c.d", - port=12001, +# import some packages + +# register models which can be used in the server +agentscope.init( + model_configs=model_config_path_a, +) +# Create an agent service process +server = RpcAgentServerLauncher( + host="ip_a", + port=12001, # choose an available port ) -# Start the service -server_a.launch() -server_a.wait_until_terminate() + +# Start the service +server.launch() +server.wait_until_terminate() ``` -Then we can connect to the agent service from the main process with the following code. The object `a` created in the main process serves as a local proxy of the agent, which lets developers write the application flow in the main process in a centralized way. +> To simplify usage further, you can run the following command instead of the code above: +> +> ```shell +> as_server --host ip_a --port 12001 --model-config-path model_config_path_a +> ``` + +Run the following code on `Machine2`. Again, make sure that the model configuration file has been placed at `model_config_path_b` and that the environment variables are set, so that agents running on this machine can access the models. + +```python +# import some packages + +# register models which can be used in the server +agentscope.init( + model_configs=model_config_path_b, +) +# Create an agent service process +server = RpcAgentServerLauncher( + host="ip_b", + port=12002, # choose an available port +) + +# Start the service +server.launch() +server.wait_until_terminate() +``` + +> Here, too, you can use the following command instead of the code above. +> +> ```shell +> as_server --host ip_b --port 12002 --model-config-path model_config_path_b +> ``` + +Next, you can connect to these two agent server processes from the main process with the following code. ```python a = AgentA( name="A", - ... + # ... ).to_dist( - host="a.b.c.d", + host="ip_a", port=12001, - launch_server=False, +) +b = AgentB( + name="B", + # ... +).to_dist( + host="ip_b", + port=12002, ) ``` +The code above deploys `AgentA` to the agent server process on `Machine1` and `AgentB` to the agent server process on `Machine2`. +After that, developers only need to orchestrate the interaction logic of the agents in a centralized way. + +#### Advanced usage of `to_dist` + +All the cases above convert an already initialized agent into its distributed version with the {func}`to_dist` method, which means the agent is initialized twice: once in the main process and once in the agent server process. If initializing the agent takes a long time, calling `to_dist` directly seriously hurts efficiency. AgentScope therefore also lets you convert an agent into its distributed version while initializing it, by passing the `to_dist` parameter to the agent's initializer. + +In child process mode, simply pass `to_dist=True` to the agent's initializer: + +```python +# Child Process mode +a = AgentA( + name="A", + # ... 
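+ to_dist=True +) +b = AgentB( + name="B", + # ... + to_dist=True +) +``` + +In independent process mode, pass the arguments of the original `to_dist()` call as a {class}`DistConf` instance to the `to_dist` field of the agent's initializer: + +```python +a = AgentA( + name="A", + # ... + to_dist=DistConf( + host="ip_a", + port=12001, + ), +) +b = AgentB( + name="B", + # ... + to_dist=DistConf( + host="ip_b", + port=12002, + ), +) +``` + +Compared with the original `to_dist()` call, this approach initializes the agent only once, in the agent server process, and so avoids repeated initialization. + ### Step 2: Orchestrate the distributed application flow In AgentScope, orchestrating a distributed application flow is exactly the same as for a non-distributed program, and developers can write the entire application flow in a centralized way. @@ -103,7 +204,9 @@ while x is None or x.content == "exit": x = b(x) ``` -- Distributed deployment of agents (in Master-Slave mode): +- Distributed deployment of agents + - `AgentA` is deployed in child process mode + - `AgentB` is deployed in independent process mode ```python # Create agent objects @@ -115,7 +218,10 @@ a = AgentA( b = AgentB( name="B", # ... -).to_dist() +).to_dist( + host="ip_b", + port=12002, +) # Orchestrate the application flow x = None @@ -148,9 +254,21 @@ D-->F #### PlaceHolder -Meanwhile, to support centralized application orchestration, AgentScope introduces the concept of Placeholder. A Placeholder is a special message that contains the address and port of the agent that produced it, and it indicates that the agent's input message is not ready yet. -Once the agent's input message is ready, the Placeholder is replaced with the real message and the actual `reply` method is run +Meanwhile, to support centralized application orchestration, AgentScope introduces the concept of {class}`Placeholder`. +A Placeholder can be understood as a pointer to a message: it points to where the message is actually produced, and its interface is identical to that of a message in the traditional mode, so applications can be orchestrated in the usual centralized way. +Internally, a Placeholder holds the means of contacting the producer of the message, so the actual value of the referenced message can be fetched over the network. +Each distributed agent immediately returns a Placeholder when it receives a message from another agent, which avoids blocking the requester. +The requester can then use the returned Placeholder to query the original agent only when the message content is actually needed. It can even send the Placeholder to other agents and let them fetch the content on its behalf, which reduces unnecessary forwarding of the real message content. For a more detailed description of the technical implementation, please refer to our [paper](https://arxiv.org/abs/2402.14034). +#### Agent Server + +The Agent Server is the server that hosts agents. In AgentScope, an Agent Server provides a platform on which different agent instances run. Multiple agents of different types can run in the same Agent Server while keeping their own memory and other local state, but they share the same computing resources. + +After installing the distributed version of AgentScope, you can start an Agent Server with the `as_server` command; the launch parameters are documented in the {func}`as_server` function. + +As long as the code is not modified, an Agent Server that is already running can serve multiple main processes. +This means that when running several applications, you only need to start the Agent Server before the first run, and the same Agent Server processes can then be reused. + [[Back to the top]](#208-distribute-zh) diff --git a/docs/sphinx_doc/zh_CN/source/tutorial/209-rag.md b/docs/sphinx_doc/zh_CN/source/tutorial/209-rag.md 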
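index 565726c45..1d14dab9d 100644 --- a/docs/sphinx_doc/zh_CN/source/tutorial/209-rag.md +++ b/docs/sphinx_doc/zh_CN/source/tutorial/209-rag.md @@ -2,7 +2,7 @@ # A brief introduction to RAG in AgentScope -Here we introduce three RAG-related concepts in AgentScope: Knowledge, Knowledge Bank, and RAG agent. +Here we introduce three RAG-related concepts in AgentScope: Knowledge, Knowledge Bank, and RAG agent (智能体). ### Knowledge The Knowledge module (currently only "LlamaIndexKnowledge"; support for LangChain is coming) is responsible for handling all RAG-related operations. @@ -15,7 +15,7 @@ * Generate the index of the processed nodes. * If the index already exists, `LlamaIndexKnowledge._load_index(...)` is called to load it, avoiding repeated embedding calls. - Users can create a RAG module with a JSON configuration that specifies 1) the data path, 2) the data loader, 3) the data preprocessing method, and 4) the embedding model (model configuration name). + Users can create a Knowledge module with a JSON configuration that specifies 1) the data path, 2) the data loader, 3) the data preprocessing method, and 4) the embedding model (model configuration name). A detailed example is given below: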
 Detailed configuration example @@ -77,9 +77,9 @@
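### Knowledge Bank -The knowledge bank maintains a set of Knowledge modules (e.g., knowledge from different datasets) as a collection of knowledge. Different agents (代理) can therefore reuse Knowledge modules without unnecessary re-initialization. Since configuring a RAG module may be too complicated for most users, the knowledge bank also provides a simple function call to create a RAG module. +The knowledge bank maintains a set of Knowledge modules (e.g., knowledge from different datasets) as a collection of knowledge. Different agents (智能体) can therefore reuse Knowledge modules without unnecessary re-initialization. Since configuring a Knowledge module may be too complicated for most users, the knowledge bank also provides a simple function call to create a Knowledge module. -* `KnowledgeBank.add_data_as_knowledge`: create a RAG module. The simple way only requires knowledge_id, emb_model_name, and data_dirs_and_types. +* `KnowledgeBank.add_data_as_knowledge`: create a Knowledge module. The simple way only requires knowledge_id, emb_model_name, and data_dirs_and_types. ```python knowledge_bank.add_data_as_knowledge( knowledge_id="agentscope_tutorial_rag", @@ -101,18 +101,18 @@ * `KnowledgeBank.get_knowledge`: it takes two arguments, knowledge_id and duplicate. If duplicate is true, a deep copy of the knowledge object for the given knowledge_id is returned; otherwise the knowledge object itself is returned. * `KnowledgeBank.equip`: it takes two arguments, `agent` and `duplicate`. -The function first checks whether the agent (代理) has a rag_config; if so, it provides the corresponding knowledge according to the knowledge_id in rag_config and initializes a retriever for the agent (代理). +The function first checks whether the agent (智能体) has a rag_config; if so, it provides the corresponding knowledge according to the knowledge_id in rag_config and initializes a retriever for the agent (智能体). `duplicate` likewise decides whether a deep copy is made. -### RAG agent -A RAG agent is an agent that can generate answers based on retrieved knowledge. - * Make an Agent use RAG: a RAG agent needs ·`rag_config` in its configuration, which contains a list of `knowledge_id` +### RAG agent (智能体) +A RAG agent (智能体) is an agent (智能体) that can generate answers based on retrieved knowledge. + * Make an agent (智能体) use RAG: a RAG agent needs `rag_config` in its configuration, which contains a list of `knowledge_id` * An Agent can load specific knowledge from a `KnowledgeBank` by passing itself to the `KnowledgeBank.equip` function. - * An Agent (代理) can use the retriever in the `reply` function to retrieve from `Knowledge` and compose the result into the prompt for the LLM + * An Agent (智能体) can use the retriever in the `reply` function to retrieve from `Knowledge` and compose the result into the prompt for the LLM -**Building RAG agent yourself.** As long as your agent (代理) configuration has a `rag_config` attribute that is a dictionary containing a list of `knowledge_id`, you can pass it to `KnowledgeBank.equip`, -which configures the `knowledge_id` list with the corresponding knowledge and retriever, and your agent (代理) will be equipped with a set of knowledge. +**Build a RAG agent (智能体) yourself.** As long as your agent (智能体) configuration has a `rag_config` attribute that is a dictionary containing a list of `knowledge_id`, you can pass it to `KnowledgeBank.equip`, +which configures the `knowledge_id` list with the corresponding knowledge and retriever, and your agent (智能体) will be equipped with a set of knowledge. You can decide in the `reply` function how to use the retriever, and even update and refresh the index. [[Back to the top]](#209-rag-zh) diff --git a/docs/sphinx_doc/zh_CN/source/tutorial/advance.rst b/docs/sphinx_doc/zh_CN/source/tutorial/advance.rst index 9de74f5cd..17ab3d8c8 100644 --- 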
a/docs/sphinx_doc/zh_CN/source/tutorial/advance.rst +++ b/docs/sphinx_doc/zh_CN/source/tutorial/advance.rst @@ -7,6 +7,7 @@ 201-agent.md 202-pipeline.md 203-model.md + 203-parser.md 204-service.md 205-memory.md 206-prompt.md diff --git a/examples/conversation_basic/README.md b/examples/conversation_basic/README.md index 1bdd093a2..eb89720a0 100644 --- a/examples/conversation_basic/README.md +++ b/examples/conversation_basic/README.md @@ -1,5 +1,6 @@ # Multi-Agent Conversation in AgentScope -This is a demo of how to program a multi-agent conversation in AgentScope. + +This example will show how to program a multi-agent conversation in AgentScope. Complete code is in `conversation.py`, which set up a user agent and an assistant agent to have a conversation. When user input "exit", the conversation ends. @@ -8,5 +9,13 @@ You can modify the `sys_prompt` to change the role of assistant agent. # Note: Set your api_key in conversation.py first python conversation.py ``` +## Tested Models + +These models are tested in this example. For other models, some modifications may be needed. +- dashscope_chat (qwen-max) +- ollama_chat (ollama_llama3_8b) +- gemini_chat (models/gemini-pro) + +## Prerequisites To set up model serving with open-source LLMs, follow the guidance in -[scripts/REAMDE.md](../../scripts/README.md). \ No newline at end of file +[scripts/README.md](../../scripts/README.md). 
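The loop described in the `conversation_basic` README above (an assistant agent and a user agent take turns until the user types "exit") can be sketched roughly as follows. This is a minimal, self-contained illustration rather than the code in `conversation.py`: `Message`, `run_conversation`, and `echo_assistant` are hypothetical stand-ins, not AgentScope APIs, and the assumption that the assistant speaks first may not match the real script.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Message:
    """Hypothetical stand-in for an AgentScope message (speaker + text)."""

    name: str
    content: str


def run_conversation(
    user_inputs: Iterable[str],
    assistant_reply: Callable[[Optional[Message]], Message],
) -> List[Message]:
    """Alternate assistant and user turns until the user sends "exit"."""
    transcript: List[Message] = []
    x: Optional[Message] = None
    for text in user_inputs:
        # Assumption: the assistant speaks first, replying to the last message.
        x = assistant_reply(x)
        transcript.append(x)
        # The user then takes a turn; scripted input replaces interactive input.
        x = Message("user", text)
        transcript.append(x)
        if x.content == "exit":
            break
    return transcript


def echo_assistant(msg: Optional[Message]) -> Message:
    """Toy assistant: greets first, then echoes the user's last message."""
    if msg is None:
        return Message("assistant", "Hello! How can I help?")
    return Message("assistant", f"You said: {msg.content}")


transcript = run_conversation(["hi", "exit", "ignored"], echo_assistant)
for m in transcript:
    print(f"{m.name}: {m.content}")
```

Passing the reply function in as a callable keeps the loop independent of any particular model backend, so the same control flow can be exercised with a scripted stub before a real agent is plugged in.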
diff --git a/examples/conversation_self_organizing/README.md b/examples/conversation_self_organizing/README.md new file mode 100644 index 000000000..0c1fd6d45 --- /dev/null +++ b/examples/conversation_self_organizing/README.md @@ -0,0 +1,29 @@ +# Self-Organizing Conversation Example + +This example will show +- How to set up a self-organizing conversation using the `DialogAgent` and `agent_builder` +- How to extract the discussion scenario and participant agents from the `agent_builder`'s response +- How to conduct a multi-round discussion among the participant agents + + +## Background + +In this example, we demonstrate how to create a self-organizing conversation where the `agent_builder` automatically sets up the agents participating in the discussion based on a given question. The `agent_builder` provides the discussion scenario and the characteristics of the participant agents. The participant agents then engage in a multi-round discussion to solve the given question. + + +## Tested Models + +These models are tested in this example. For other models, some modifications may be needed. 
+- `dashscope_chat` with `qwen-turbo` +- `ollama_chat` with `llama3_8b` +- `gemini_chat` with `models/gemini-1.0-pro-latest` + + +## Prerequisites + +Fill the next cell to meet the following requirements +- Set up the `model_configs` with the appropriate API keys and endpoints +- Provide the path to the `agent_builder_instruct.txt` file in the `load_txt` function +- Set the desired `max_round` for the discussion +- Provide the `query` or question for the discussion +- [Optional] Adjust the `generate_args` such as `temperature` for the `openai_chat` model \ No newline at end of file diff --git a/examples/conversation_with_RAG_agents/configs/model_config.json b/examples/conversation_with_RAG_agents/configs/model_config.json index d1d5f8829..25ba628cd 100644 --- a/examples/conversation_with_RAG_agents/configs/model_config.json +++ b/examples/conversation_with_RAG_agents/configs/model_config.json @@ -1,17 +1,4 @@ [ - { - "model_type": "post_api_chat", - "config_name": "gpt_postapi_config", - "api_url": "http://47.88.8.18:8088/api/ask", - "headers": { - "Content-Type": "application/json", - "Authorization": "" - }, - "messages_key": "messages", - "json_args": { - "model": "gpt-4" - } - }, { "model_type": "dashscope_text_embedding", "config_name": "qwen_emb_config", diff --git a/examples/conversation_with_RAG_agents/rag_example.py b/examples/conversation_with_RAG_agents/rag_example.py index 7604b3af2..8d44280cf 100644 --- a/examples/conversation_with_RAG_agents/rag_example.py +++ b/examples/conversation_with_RAG_agents/rag_example.py @@ -53,15 +53,6 @@ def main() -> None: # prepare models with open("configs/model_config.json", "r", encoding="utf-8") as f: model_configs = json.load(f) - # for internal API - for config in model_configs: - if config.get("model_type", "") == "post_api_chat": - config["headers"]["Authorization"] = ( - "Bearer " + f"{os.environ.get('HTTP_LLM_API_KEY')}" - ) - else: - # for dashscope - config["api_key"] = 
f"{os.environ.get('DASHSCOPE_API_KEY')}" # load config of the agents with open("configs/agent_config.json", "r", encoding="utf-8") as f: @@ -75,13 +66,7 @@ def main() -> None: guide_agent = agent_list[4] # the knowledge bank can be configured by loading config file - with open( - "configs/knowledge_config.json", - "r", - encoding="utf-8", - ) as f: - knowledge_configs = json.load(f) - knowledge_bank = KnowledgeBank(configs=knowledge_configs) + knowledge_bank = KnowledgeBank(configs="configs/knowledge_config.json") # alternatively, we can easily input the configs to add data to RAG knowledge_bank.add_data_as_knowledge( diff --git a/examples/conversation_with_customized_services/generated_image_0.png b/examples/conversation_with_customized_services/generated_image_0.png new file mode 100644 index 000000000..41b50fc0d Binary files /dev/null and b/examples/conversation_with_customized_services/generated_image_0.png differ diff --git a/examples/conversation_with_customized_services/image.png b/examples/conversation_with_customized_services/image.png new file mode 100644 index 000000000..2a96d01c9 Binary files /dev/null and b/examples/conversation_with_customized_services/image.png differ diff --git a/examples/conversation_with_customized_services/main.ipynb b/examples/conversation_with_customized_services/main.ipynb new file mode 100644 index 000000000..5b5b096d1 --- /dev/null +++ b/examples/conversation_with_customized_services/main.ipynb @@ -0,0 +1,625 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Conversation with Customized Services\n", + "\n", + "This example will show\n", + "- how to create new service functions in AgentScope\n", + "- how to incorporate multimodal service functions in AgentScope\n", + "\n", + "We will take langchain tools and dashscope multimodal APIs as example, including:\n", + "- langchain tools\n", + "- dashscope text-to-image\n", + "- dashscope image-to-text\n", + "- dashscope text-to-audio\n", + "\n", 
+ "We are working on integrating multimodal service functions into AgentScope, the progress is as follows:\n", + "\n", + "- [🚧] dashscope_text_to_image\n", + "- [🚧] dashscope_image_to_text\n", + "- [🚧] dashscope_text_to_audio\n", + "\n", + "## Background\n", + "\n", + "AgentScope has built in service toolkit module ([tutorial](https://modelscope.github.io/agentscope/en/tutorial/204-service.html#how-to-use-service-functions), [source code](https://github.com/modelscope/agentscope/blob/cccecb23f56591d859403f86ff0360a5cb4dda1c/src/agentscope/service/service_toolkit.py)) for tools usage, helping developers to integrate service functions into their agents. It provides: \n", + "- format instruction for LLMs\n", + "- automatic function description generation in JSON schema format\n", + "- LLM response parsing \n", + "- function calling and error handling\n", + "\n", + "The service toolkit module supports all built-in service functions within AgentScope, and developers can also create their own service functions.\n", + "\n", + "\n", + "## Note\n", + "\n", + "The example is tested with the following models. For other models, some modifications may be needed. 
\n", + "- gpt-4\n", + "- gpt-3.5-turbo \n", + "\n", + "\n", + "## Prerequisites\n", + "\n", + "- Install the latest AgentScope from source:\n", + "\n", + "```bash\n", + "git clone https://github.com/modelscope/agentscope.git\n", + "cd agentscope\n", + "pip install -e .\n", + "```\n", + "- Install [LangChain](https://python.langchain.com/v0.1/docs/get_started/quickstart/) and [DashScope](https://dashscope.aliyun.com/) libraries:\n", + "\n", + "```bash\n", + "pip install langchain dashscope pyowm\n", + "```\n", + "\n", + "- Fill the next cell to meet the following requirements:\n", + " - OpenWeatherMap API key to experiment with the weather service\n", + " - OpenAI API key\n", + " - DashScope API key" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [], + "source": [ + "# OpenWeatherMap API key\n", + "YOUR_OPENWEATHER_API = \"YOUR OPENWEATHER API KEY\"\n", + "\n", + "# AgentScope model config\n", + "YOUR_MODEL_CONFIGURATION_NAME = \"gpt-4\"\n", + "YOUR_MODEL_CONFIGURATION = {\n", + " \"config_name\": \"gpt-4\",\n", + " \"model_type\": \"openai_chat\",\n", + " \"model_name\": \"gpt-4\",\n", + " \"api_key\": \"YOUR OPENAI API KEY\"\n", + "}\n", + "\n", + "# DashScope API key\n", + "YOUR_API_KEY = \"YOUR DASHSCOPE API KEY\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Implement New Service Functions\n", + "\n", + "Implementing a customized service is as simple as writing a custom function. Just meet the following requirements:\n", + "\n", + "1. Your new service function should return a `ServiceResponse` object, which wraps the execution status and results. It contains two fields: `status` and `content`. \n", + " - When the Service function runs to completion normally, `status` is `ServiceExecStatus.SUCCESS`, and `content` is the return value of the function. 
 \n", + " - When an error occurs during execution, `status` is `ServiceExecStatus.ERROR`, and `content` contains the error message.\n", + "\n", + "```python\n", + "from agentscope.service.service_response import ServiceResponse, ServiceExecStatus\n", + "\n", + "def your_customized_services(argument_1: str, argument_2: int) -> ServiceResponse:\n", + " pass\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "2. Your new service function should provide a well-formatted docstring (Google style is recommended), and the `ServiceToolkit` will automatically parse the docstring and generate a description in JSON schema format. \n", + "\n", + "```python\n", + "\"\"\"\n", + "This function is used to... (a brief description of the function)\n", + "\n", + "Args: \n", + " argument_1: (`str`): \n", + " the description of argument_1\n", + " argument_2: (`int`)\n", + " the description of argument_2\n", + "\"\"\"\n", + "```\n", + "\n", + "Then, you can register your function in the `ServiceToolkit` object as follows. Note that the LLM will be required to specify the arguments that are not provided (e.g. `argument_2` in the following example). \n", + "\n", + "```python\n", + "from agentscope.service import ServiceToolkit\n", + "\n", + "service_toolkit = ServiceToolkit()\n", + "service_toolkit.add(your_customized_services, argument_1=argument_1)\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Langchain tools\n", + "The package of Langchain also offers a variety of tools that may not be covered in **AgentScope**. You may find a full list of tools supported in Langchain here [Supported Toolkits in Langchain](https://python.langchain.com/v0.1/docs/integrations/tools/)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's take the tool `OpenWeatherMap` as an example. 
`Langchain` provides wrappers for all tool in `.utilities.Wrapper` " + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "In London,GB, the current weather is as follows:\n", + "Detailed status: mist\n", + "Wind speed: 1.03 m/s, direction: 0°\n", + "Humidity: 93%\n", + "Temperature: \n", + " - Current: 11.75°C\n", + " - High: 12.92°C\n", + " - Low: 10.12°C\n", + " - Feels like: 11.41°C\n", + "Rain: {}\n", + "Heat index: None\n", + "Cloud cover: 100%\n" + ] + } + ], + "source": [ + "from langchain_community.utilities import OpenWeatherMapAPIWrapper\n", + "import os\n", + "os.environ['OPENWEATHERMAP_API_KEY'] = YOUR_OPENWEATHER_API\n", + "weather = OpenWeatherMapAPIWrapper()\n", + "weather_data = weather.run(\"London,GB\")\n", + "\n", + "print(weather_data)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can wrap the call of `Wrapper` into a function to make it available as an `Agentscope` service. " + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": {}, + "outputs": [], + "source": [ + "from agentscope.service.service_response import ServiceResponse\n", + "from agentscope.service.service_status import ServiceExecStatus\n", + "\n", + "def get_weather_data(city: str, \n", + " country: str, \n", + " api_key: str) -> ServiceResponse:\n", + " \"\"\"Search the weather data of a city\n", + "\n", + " Args:\n", + " city (`str`): the name of the city.\n", + " country (`str`): the name of the country.\n", + " api_key (str): The api key for the openweathermap api.\n", + " Returns:\n", + " ServiceResponse: \n", + " A dictionary with two variables: `status` and`content`. 
\n", + " If `status` is ServiceExecStatus.SUCCESS, \n", + " the `content` contains the weather data of the queried city.\n", + " Example: \n", + " city = \"London\"\n", + " country = \"GB\"\n", + " print(get_weather_data(city, country, api_key)) gives:\n", + " In London,GB, the current weather is as follows:\n", + " Detailed status: scattered clouds\n", + " Wind speed: 3.6 m/s, direction: 210°\n", + " Humidity: 85%\n", + " Temperature: \n", + " - Current: 12.34°C\n", + " - High: 13.08°C\n", + " - Low: 11.39°C\n", + " - Feels like: 11.85°C\n", + " Rain: {}\n", + " Heat index: None\n", + " Cloud cover: 40%\n", + " \n", + " \"\"\"\n", + " os.environ['OPENWEATHERMAP_API_KEY'] = api_key\n", + " weather = OpenWeatherMapAPIWrapper()\n", + " try:\n", + " weather_data = weather.run(f\"{city},{country}\")\n", + " return ServiceResponse(ServiceExecStatus.SUCCESS, weather_data)\n", + " except Exception as e:\n", + " return ServiceResponse(ServiceExecStatus.FAILURE, str(e))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, let's add it to the `ServiceToolKit`. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": {}, + "outputs": [], + "source": [ + "from agentscope.service import service_toolkit\n", + "\n", + "toolkit = service_toolkit.ServiceToolkit()\n", + "toolkit.add(get_weather_data, api_key=YOUR_OPENWEATHER_API)" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + ">>> Executing function get_weather_data with arguments:\n", + ">>> \tcountry: UK\n", + ">>> \tcity: London\n", + ">>> END \n", + ">>> Executing function get_weather_data with arguments:\n", + ">>> \tcountry: France\n", + ">>> \tcity: Paris\n", + ">>> END \n" + ] + } + ], + "source": [ + "from agentscope.message import Msg\n", + "from agentscope.agents import ReActAgent\n", + "\n", + "import agentscope\n", + "\n", + "agentscope.init(model_configs=YOUR_MODEL_CONFIGURATION)\n", + "agent = ReActAgent(\n", + " name=\"assistant\",\n", + " model_config_name=YOUR_MODEL_CONFIGURATION_NAME,\n", + " service_toolkit=toolkit, \n", + " verbose=True # set verbose to True to show reasoning process\n", + ")\n", + "\n", + "msg_question = Msg(\n", + " name=\"user\", \n", + " content=\"Which city is better to visit today, London or Paris?\", \n", + " role=\"user\"\n", + ")\n", + "\n", + "res = agent(msg_question)" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Based on the current weather conditions, both London and Paris seem comfortable for a visit as there is no rain predicted in both cities. However, London might be slightly more pleasant today as it has lighter winds compared to Paris.\n" + ] + } + ], + "source": [ + "print(res.content)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Looks Good." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Dashscope" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Through flexible and user-friendly model API services, **Dashscope** enables the capabilities of various large AI models to be easily accessible to AI developers. [Click here for a full list of supported models and tasks (in zh-cn)](https://dashscope.console.aliyun.com/model?spm=5176.28630291.0.0.24bf7eb5lzPBes)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The procedure of wrapping **Dashscope** as services in **AgentScope** is the same. We need to define a function with clear documentation to call the api. In this section, we implement three services: text-to-image, image-to-text, text-to-audio." + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": {}, + "outputs": [], + "source": [ + "from http import HTTPStatus\n", + "import dashscope\n", + "import requests\n", + "\n", + "def text_to_image(prompt:str, \n", + " api_key:str,\n", + " number_of_images:int=1,\n", + " size:str='1024*1024',\n", + " model:str='wanx-v1') -> ServiceResponse:\n", + " \"\"\"Generate an image based on a text prompt.\n", + "\n", + " Args:\n", + " prompt (`str`): the text prompt.\n", + " api_key (`str`): The api key for the dashscope api.\n", + " number_of_images (`int`, defaults to `1`): the number of images to generate. \n", + " size (`str`, defaults to `1024*1024`): the size of the image.\n", + " model (`str`, defaults to 'wanx-v1'): the model to use.\n", + " Returns:\n", + " ServiceResponse: \n", + " A dictionary with two variables: `status` and `content`. 
\n", + " If `status` is ServiceExecStatus.SUCCESS, \n", + " the `content` is a dict with key urls and a list of the urls of the generated images.\n", + "\n", + " Example:\n", + " prompt = \"A beautiful sunset in the mountains\"\n", + " print(text_to_image(prompt)) gives:\n", + " {'status': 'SUCCESS', 'content': {'urls': ['URL1', 'URL2']}}\n", + " \"\"\"\n", + " dashscope.api_key = api_key\n", + " response = dashscope.ImageSynthesis.call(\n", + " model=model,\n", + " prompt=prompt,\n", + " n=number_of_images,\n", + " size=size)\n", + " if response.status_code == HTTPStatus.OK:\n", + " urls = []\n", + " for i, result in enumerate(response.output.results):\n", + " file_name = f\"generated_image_{i}.png\"\n", + " urls.append(file_name)\n", + " with open('./%s' % file_name, 'wb+') as f:\n", + " f.write(requests.get(result.url).content)\n", + " return ServiceResponse(ServiceExecStatus.SUCCESS, {\"urls\": urls})\n", + " else:\n", + " err_msg = f\"status_code: {response.status_code}, code: {response.code}, message: {response.message}\"\n", + " return ServiceResponse(ServiceExecStatus.FAILURE, err_msg)" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": {}, + "outputs": [], + "source": [ + "def image_to_text(image_url:str,\n", + " query_prompt:str, \n", + " api_key:str,\n", + " model:str=\"qwen-vl-plus\") -> ServiceResponse:\n", + " \"\"\"Generate text based on an image.\n", + " \n", + " Args:\n", + " image_url (`str`): the url of the image.\n", + " query_prompt (`str`): the text prompt.\n", + " api_key (`str`): The api key for the dashscope api.\n", + " model (`str`, defaults to 'qwen-vl-plus'): the model to use.\n", + " Returns:\n", + " ServiceResponse: \n", + " A dictionary with two variables: `status` and`content`. 
\n", + " If `status` is ServiceExecStatus.SUCCESS, \n", + " the `content` is the generated text.\n", + " Example:\n", + " image_url = \"image.jpg\"\n", + " query_prompt = \"Describe the image\"\n", + " print(image_to_text(image_url, query_prompt)) gives:\n", + " {'status': 'SUCCESS', 'content': 'A beautiful sunset in the mountains'}\n", + " \"\"\"\n", + " dashscope.api_key = api_key\n", + " # get absolute path of the image\n", + " image_path = os.path.abspath(image_url)\n", + " image_url = f\"file://{image_path}\"\n", + " message = [\n", + " {\"role\":\"user\", \n", + " \"content\":[\n", + " {\"image\": image_url},\n", + " {\"text\": query_prompt}\n", + " ]},\n", + " ]\n", + " response = dashscope.MultiModalConversation.call(model=model, \n", + " messages=message)\n", + " if response.status_code == HTTPStatus.OK:\n", + " description = response.output.choices[0].message.content[0]['text']\n", + " return ServiceResponse(ServiceExecStatus.SUCCESS, description)\n", + " else:\n", + " err_msg = f\"status_code: {response.status_code}, code: {response.code}, message: {response.message}\"\n", + " return ServiceResponse(ServiceExecStatus.FAILURE, err_msg) \n", + " " + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": {}, + "outputs": [], + "source": [ + "def text_to_audio(text:str, \n", + " api_key:str,\n", + " model:str='sambert-zhichu-v1',\n", + " sample_rate:int=48000) -> ServiceResponse:\n", + " \"\"\"Convert text to audio.\n", + " \n", + " Args:\n", + " text (`str`): the text to convert.\n", + " api_key (`str`): The api key for the dashscope api.\n", + " model (`str`, defaults to 'sambert-zhichu-v1'): the model to use.\n", + " sample_rate (`int`, defaults to 48000): the sample rate of the audio.\n", + " Returns:\n", + " ServiceResponse: \n", + " A dictionary with two variables: `status` and`content`. 
\n", + " If `status` is ServiceExecStatus.SUCCESS, \n", + " the `content` is the URL of the generated audio file.\n", + " Example:\n", + " text = \"How is the weather today?\"\n", + " print(text_to_audio(text)) gives:\n", + " {'status': 'SUCCESS', 'content': 'AUDIO_URL'}\n", + " \"\"\"\n", + " from dashscope.audio.tts import SpeechSynthesizer\n", + " import nest_asyncio\n", + " nest_asyncio.apply()\n", + " dashscope.api_key = api_key\n", + " result = SpeechSynthesizer.call(model=model,\n", + " text=text,\n", + " sample_rate=sample_rate)\n", + " if result.get_audio_data() is not None:\n", + " with open('output.wav', 'wb') as f:\n", + " f.write(result.get_audio_data())\n", + " return ServiceResponse(ServiceExecStatus.SUCCESS, 'output.wav')\n", + " else:\n", + " return ServiceResponse(ServiceExecStatus.FAILURE, \"Failed to generate audio file\")" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + ">>> Executing function image_to_text with arguments:\n", + ">>> \timage_url: image.png\n", + ">>> \tquery_prompt: What's the creature in the image?\n", + ">>> END \n", + ">>> Executing function text_to_image with arguments:\n", + ">>> \tprompt: rabbit's natural habitat\n", + ">>> \tnumber_of_images: 1\n", + ">>> END \n", + ">>> Executing function text_to_audio with arguments:\n", + ">>> \ttext: The rabbit's natural habitat includes meadows, woo...\n", + ">>> END \n" + ] + } + ], + "source": [ + "toolkit = service_toolkit.ServiceToolkit()\n", + "\n", + "\n", + "toolkit.add(text_to_image, api_key=YOUR_API_KEY, model=\"wanx-v1\", size=\"1024*1024\")\n", + "toolkit.add(image_to_text, api_key=YOUR_API_KEY, model=\"qwen-vl-plus\")\n", + "toolkit.add(text_to_audio, api_key=YOUR_API_KEY, model=\"sambert-zhichu-v1\", sample_rate=48000)\n", + "agentscope.init(model_configs=YOUR_MODEL_CONFIGURATION)\n", + "agent = ReActAgent(\n", + " name=\"assistant\",\n", + " 
model_config_name=YOUR_MODEL_CONFIGURATION_NAME,\n", + " service_toolkit=toolkit, \n", + " verbose=True, # set verbose to True to show reasoning process\n", + ")\n", + "msg_question = Msg(\n", + " name=\"user\", \n", + " content=(\"What's the creature inside the image: image.png \" \n", + " \"draw me a picture of the environment of this creature's habitat \"\n", + " \"and describe the environment in audio\"), \n", + " role=\"user\",\n", + " verbose=True\n", + ")\n", + "res = agent(msg_question)" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "I have identified the creature as a rabbit. I have generated an image of a typical rabbit's habitat and created an audio description of it.\n" + ] + } + ], + "source": [ + "print(res.content)" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import IPython\n", + "# display the generated image\n", + "IPython.display.Image(\"generated_image_0.png\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "IPython.display.Audio(\"output.wav\")" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "agentscope", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.19" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/examples/conversation_with_customized_services/output.wav b/examples/conversation_with_customized_services/output.wav new file mode 100644 index 000000000..d64062016 Binary 
files /dev/null and b/examples/conversation_with_customized_services/output.wav differ diff --git a/examples/conversation_with_gpt-4o/README.md b/examples/conversation_with_gpt-4o/README.md new file mode 100644 index 000000000..715d57f58 --- /dev/null +++ b/examples/conversation_with_gpt-4o/README.md @@ -0,0 +1,54 @@ +# Conversation with gpt-4o (OpenAI Vision Model) + +This example shows +- How to use gpt-4o and other OpenAI vision models in AgentScope + +In this example, +- you can have a conversation with OpenAI vision models. +- you can show gpt-4o your drawings or web UI designs and ask for its suggestions. +- you can share your pictures with gpt-4o and ask for its comments. + +Just input your image URL (both local and web URLs are supported) and talk with gpt-4o. + + +## Background + +On May 13, 2024, OpenAI released their new model, gpt-4o, which is a large multimodal model that can process both text and multimodal data. + + +## Tested Models + +The following models are tested in this example. For other models, some modifications may be needed. +- gpt-4o +- gpt-4-turbo +- gpt-4-vision + + +## Prerequisites + +You need to satisfy the following requirements to run this example. +- Install the latest version of AgentScope by + ```bash + git clone https://github.com/modelscope/agentscope.git + cd agentscope + pip install -e . + ``` +- Prepare an OpenAI API key + +## Running the Example + +First fill in your OpenAI API key in `conversation_with_gpt-4o.py`, then execute the following command to run the conversation with gpt-4o. + +```bash +python conversation_with_gpt-4o.py +``` + +## A Running Example + +- Conversation history with gpt-4o. 
+ +conversation history + +- My picture + +my picture diff --git a/examples/conversation_with_gpt-4o/conversation_with_gpt-4o.py b/examples/conversation_with_gpt-4o/conversation_with_gpt-4o.py new file mode 100644 index 000000000..470f1de32 --- /dev/null +++ b/examples/conversation_with_gpt-4o/conversation_with_gpt-4o.py @@ -0,0 +1,36 @@ +# -*- coding: utf-8 -*- +"""An example for conversation with OpenAI vision models, especially for +GPT-4o.""" +import agentscope +from agentscope.agents import UserAgent, DialogAgent + +# Fill in your OpenAI API key +YOUR_OPENAI_API_KEY = "xxx" + +model_config = { + "config_name": "gpt-4o_config", + "model_type": "openai_chat", + "model_name": "gpt-4o", + "api_key": YOUR_OPENAI_API_KEY, + "generate_args": { + "temperature": 0.7, + }, +} + +agentscope.init(model_configs=model_config) + +# Require user to input URL, and press enter to skip the URL input +user = UserAgent("user", require_url=True) + +agent = DialogAgent( + "Friday", + sys_prompt="You're a helpful assistant named Friday.", + model_config_name="gpt-4o_config", +) + +x = None +while True: + x = agent(x) + x = user(x) + if x.content == "exit": # type "exit" to break the loop + break diff --git a/examples/conversation_with_mentions/README.md b/examples/conversation_with_mentions/README.md index 6359b3413..858915710 100644 --- a/examples/conversation_with_mentions/README.md +++ b/examples/conversation_with_mentions/README.md @@ -1,73 +1,36 @@ +### # Multi-Agent Group Conversation in AgentScope -This example demonstrates a multi-agent group conversation facilitated by AgentScope. The script `main.py` sets up a virtual chat room where a user agent interacts with several NPC (non-player character) agents. The chat utilizes a special **"@"** mention functionality, which allows participants to address specific agents and have a more directed conversation. +This example demonstrates a multi-agent group conversation facilitated by AgentScope. 
The script sets up a virtual chat room where a user agent interacts with several NPC (non-player character) agents. Participants can utilize a special "@" mention functionality to address specific agents directly. -## Key Features +## Background -- **Real-time Group Conversation**: Engage in a chat with multiple agents responding in real time. -- **@ Mention Functionality**: Use the "@" symbol followed by an agent's name to specifically address that agent within the conversation. -- **Dynamic Flow**: User-driven conversation with agents responding based on the context and mentions. -- **Configurable Agent Roles**: Easily modify agent roles and behaviors by editing the `sys_prompt` in the configuration files. -- **User Timeout**: If the user does not respond within a specified time, the conversation continues with the next agent. +The conversation takes place in a simulated chat room environment with predefined roles for each participant. Topics are open-ended and evolve based on the user's input and agents' responses. -## How to Use - -To start the group conversation, follow these steps: - -1. Make sure to set your `api_key` in the `configs/model_configs.json` file. -2. Run the script using the following command: - -```bash -python main.py +## Tested Models -# or launch agentscope studio -as_studio main.py -``` +These models are tested in this example. For other models, some modifications may be needed. +- gemini_chat (models/gemini-pro, models/gemini-1.0-pro) +- dashscope_chat (qwen-max, qwen-turbo) +- ollama_chat (ollama_llama3_8b) -1. To address a specific agent in the chat, type "@" followed by the agent's name in your message. -2. To exit the chat, simply type "exit" when it's your turn to speak. 
+## Prerequisites -## Background and Conversation Flow +Fill the next cell to meet the following requirements: +- Set your `api_key` in the `configs/model_configs.json` file +- Optional: Launch agentscope studio with `as_studio main.py` -The conversation takes place in a simulated chat room environment with roles defined for each participant. The user acts as a regular chat member with the ability to speak freely and address any agent. NPC agents are pre-configured with specific roles that determine their responses and behavior in the chat. The topic of the conversation is open-ended and can evolve organically based on the user's input and agents' programmed personas. - -### Example Interaction +## How to Use -``` -User input: Hi, everyone! I'm excited to join this chat. -AgentA: Welcome! We're glad to have you here. -User input: @AgentB, what do you think about the new technology trends? -AgentB: It's an exciting time for tech! There are so many innovations on the horizon. -... -``` +1. Run the script using the command: `python main.py` +2. Address specific agents by typing "@" followed by the agent's name. +3. Type "exit" to leave the chat. ## Customization Options -The group conversation script provides several options for customization, allowing you to tailor the chat experience to your preferences. - -You can customize the conversation by editing the agent configurations and model parameters. The `agent_configs.json` file allows you to set specific behaviors for each NPC agent, while `model_configs.json` contains the parameters for the conversation model. +You can adjust the behavior and parameters of the NPC agents and conversation model by editing the `agent_configs.json` and `model_configs.json` files, respectively. ### Changing User Input Time Limit -The `USER_TIME_TO_SPEAK` variable sets the time limit (in seconds) for the user to input their message during each round. By default, this is set to 10 seconds. 
You can adjust this time limit by modifying the value of `USER_TIME_TO_SPEAK` in the `main.py` script. - -For example, to change the time limit to 20 seconds, update the line in `main.py` as follows: - -``` -USER_TIME_TO_SPEAK = 20 # User has 20 seconds to type their message -``` - -### Setting a Default Topic for the Chat Room - -The `DEFAULT_TOPIC` variable defines the initial message or topic of the chat room. It sets the stage for the conversation and is announced at the beginning of the chat session. You can change this message to prompt a specific discussion topic or to provide instructions to the agents. - -To customize this message, modify the `DEFAULT_TOPIC` variable in the `main.py` script. For instance, if you want to set the default topic to discuss "The Future of Artificial Intelligence," you would change the code as follows: - -```python -DEFAULT_TOPIC = """ -This is a chat room about the Future of Artificial Intelligence and you can -speak freely and briefly. -""" -``` - -With these customizations, the chat room can be tailored to fit specific themes or time constraints, enhancing the user's control over the chat experience. +Adjust the `USER_TIME_TO_SPEAK` variable in the `main.py` script to change the time limit for user input. +### diff --git a/examples/distributed_basic/README.md b/examples/distributed_basic/README.md index 62754f71a..1ee3e4d86 100644 --- a/examples/distributed_basic/README.md +++ b/examples/distributed_basic/README.md @@ -1,6 +1,6 @@ # Distributed Basic -This example run a assistant agent and a user agent as seperate processes and use rpc to communicate between them. +This example runs an assistant agent and a user agent as separate processes and uses RPC to communicate between them. Before running the example, please install the distributed version of Agentscope, fill in your model configuration correctly in `configs/model_configs.json`, and modify the `model_config_name` field in `distributed_dialog.py` accordingly. 
diff --git a/examples/distributed_basic/distributed_dialog.py b/examples/distributed_basic/distributed_dialog.py index d3c99cfa5..ab0de4235 100644 --- a/examples/distributed_basic/distributed_dialog.py +++ b/examples/distributed_basic/distributed_dialog.py @@ -7,7 +7,7 @@ import agentscope from agentscope.agents.user_agent import UserAgent from agentscope.agents.dialog_agent import DialogAgent -from agentscope.agents.rpc_agent import RpcAgentServerLauncher +from agentscope.server import RpcAgentServerLauncher def parse_args() -> argparse.Namespace: @@ -37,13 +37,6 @@ def setup_assistant_server(assistant_host: str, assistant_port: int) -> None: model_configs="configs/model_configs.json", ) assistant_server_launcher = RpcAgentServerLauncher( - agent_class=DialogAgent, - agent_kwargs={ - "name": "Assitant", - "sys_prompt": "You are a helpful assistant.", - "model_config_name": "qwen", - "use_memory": True, - }, host=assistant_host, port=assistant_port, ) @@ -64,7 +57,6 @@ def run_main_process(assistant_host: str, assistant_port: int) -> None: ).to_dist( host=assistant_host, port=assistant_port, - launch_server=False, ) user_agent = UserAgent( name="User", diff --git a/examples/distributed_debate/distributed_debate.py b/examples/distributed_debate/distributed_debate.py index f7bf35db6..a4e0a4287 100644 --- a/examples/distributed_debate/distributed_debate.py +++ b/examples/distributed_debate/distributed_debate.py @@ -2,14 +2,13 @@ """ An example of distributed debate """ import argparse -import json from user_proxy_agent import UserProxyAgent import agentscope +from agentscope.agents import DialogAgent from agentscope.msghub import msghub -from agentscope.agents.dialog_agent import DialogAgent -from agentscope.agents.rpc_agent import RpcAgentServerLauncher +from agentscope.server import RpcAgentServerLauncher from agentscope.message import Msg from agentscope.utils.logging_utils import logger @@ -76,29 +75,10 @@ def setup_server(parsed_args: argparse.Namespace) -> None: 
) host = getattr(parsed_args, f"{parsed_args.role}_host") port = getattr(parsed_args, f"{parsed_args.role}_port") - if parsed_args.is_human: - agent_class = UserProxyAgent - config = {"name": parsed_args.role} - else: - with open( - "configs/debate_agent_configs.json", - "r", - encoding="utf-8", - ) as f: - configs = json.load(f) - configs = { - "pro": configs[0]["args"], - "con": configs[1]["args"], - "judge": configs[2]["args"], - } - config = configs[parsed_args.role] - agent_class = DialogAgent - server_launcher = RpcAgentServerLauncher( - agent_class=agent_class, - agent_kwargs=config, host=host, port=port, + custom_agents=[UserProxyAgent, DialogAgent], ) server_launcher.launch(in_subprocess=False) server_launcher.wait_until_terminate() @@ -113,12 +93,10 @@ def run_main_process(parsed_args: argparse.Namespace) -> None: pro_agent = pro_agent.to_dist( host=parsed_args.pro_host, port=parsed_args.pro_port, - launch_server=False, ) con_agent = con_agent.to_dist( host=parsed_args.con_host, port=parsed_args.con_port, - launch_server=False, ) participants = [pro_agent, con_agent, judge_agent] announcements = [ diff --git a/examples/distributed_search/answerer_agent.py b/examples/distributed_search/answerer_agent.py index 5b441957b..56ba014e9 100644 --- a/examples/distributed_search/answerer_agent.py +++ b/examples/distributed_search/answerer_agent.py @@ -48,7 +48,7 @@ def reply(self, x: dict = None) -> dict: " the following web page:\n\n" f"{response['html_to_text']}" f"\n\nBased on the above web page," - " please answer my question\n{x.query}", + f" please answer my question\n{x.query}", ), ) # call llm and generate response diff --git a/examples/distributed_simulation/README.md b/examples/distributed_simulation/README.md new file mode 100644 index 000000000..50f20fe6c --- /dev/null +++ b/examples/distributed_simulation/README.md @@ -0,0 +1,103 @@ +# Distributed Large Scale Simulation + +> **WARNING:** +> **This example will consume a huge amount of tokens.** +> 
**Using a paid model API with this example can introduce a high cost.** +> **Users with powerful GPUs (A800 or better) can use local inference services (such as vLLM) to run this example,** +> **while CPU inference services such as ollama are not recommended.** + +This example is a large-scale simulation that demonstrates the scalability of AgentScope's distributed mode. From this example, you can learn: + +- How to run a large number of agent servers in a GPU cluster. +- How to connect to those agent servers and run a huge number of agents in them. + +> Based on this example, we deploy 64,000 agents evenly on 4 machines, and each machine has 64 CPU cores and 8 A100 GPUs. The running time is about 30s (excluding initialization time). + +## Background + +This example simulates the following scenario: + +A large number of people participate in a game in which the moderator asks each participant to provide a number between 0 and N. The moderator will calculate the average of all numbers and announce it. The person closest to the average will win. + +## Tested Models + +Only the vLLM local inference service is tested for this example. + +This example will consume a huge amount of tokens. Please do not use a model API that requires payment. + +## Prerequisites + +- The distributed version of AgentScope is installed +- Use macOS or Linux (Windows requires some modifications to the scripts) +- [Optional] Have multiple machines with powerful GPUs (A800 or better) and install [vLLM](https://github.com/vllm-project/vllm) + +## How to Run + +### Step 1: start local inference service + +> If you only have one machine and don't have a powerful GPU (A800 or better), you can ignore this step. + +You can use `start_vllm.sh` to start vLLM inference services on each of your machines. +Before running the script, please set `gpu_num`, `model_path` and `base_port` properly. + +- `gpu_num`: number of GPUs for this machine. +- `model_path`: the model checkpoint path. 
+- `base_port`: The first port number used by the local inference services. + +For example, if `base_port` is `8010` and `gpu_num` is `4`, 4 inference services will be started, and the port numbers are `8010`, `8011`, `8012` and `8013` respectively. + +vLLM inference services start slowly, so you need to wait for these servers to actually start before proceeding to the next step. + +> The above configuration requires that the model checkpoint can be loaded by a single GPU. +> If you need to use a model that must be loaded by multiple GPUs, you need to modify the script. + +### Step 2: start agent server + +> If you only have one machine and don't have a powerful GPU, you can just use the default settings of the scripts. + +You can use `start_all_server.sh` to start multiple agent servers on each of your machines. +Before running the script, please set `base_port`, `host_name` and `moderator_num` properly. + +- `base_port`: The first port number used by the agent servers. +- `host_name`: The hostname of this machine, which must be accessible to other machines in the cluster (the default value `localhost` only works for the single-machine scenario). +- `moderator_num`: Number of moderators. When the number of participants is large, this value should be increased to avoid bottlenecks. + +After setting the above values correctly, you can use the script to start multiple agent servers on your machine. The following command will start 10 agent servers on your machine with port numbers from `base_port` to `base_port + 9`, and will also start `moderator_num` agent servers for moderators with port numbers from `base_port + 10` to `base_port + moderator_num + 9`. + +```shell +#./start_all_server.sh +./start_all_server.sh 10 +``` + +If you have multiple machines, please make sure the `base_port` and `moderator_num` parameters are exactly the same on all machines, and start the same number of agent servers. 
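The port layout described in Step 2 can be sketched in a few lines of Python. This is only an illustration of the arithmetic stated above, not code taken from `start_all_server.sh`; the helper name `agent_server_ports` and its parameters are hypothetical, and `12010` is used because it is the default of `--base-port` in `main.py`.

```python
from typing import List, Tuple


def agent_server_ports(
    base_port: int,
    participant_server_num: int,
    moderator_num: int,
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: list the ports used on one machine.

    Participant servers occupy the first `participant_server_num` ports,
    and moderator servers occupy the ports immediately after them.
    """
    participant_ports = [
        base_port + i for i in range(participant_server_num)
    ]
    moderator_start = base_port + participant_server_num
    moderator_ports = [moderator_start + i for i in range(moderator_num)]
    return participant_ports, moderator_ports


# 10 participant servers and 4 moderator servers, starting at port 12010
participant_ports, moderator_ports = agent_server_ports(12010, 10, 4)
print(participant_ports[0], participant_ports[-1])  # 12010 12019
print(moderator_ports[0], moderator_ports[-1])  # 12020 12023
```

Because every machine must use the same `base_port` and `moderator_num`, the same layout applies on each host, which is why the main process can compute a server's address from the host name and an index alone.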
+ +### Step 3: run simulation + +You can use `run_simulation.sh` to start the simulation. +Before running the script, please set the following settings correctly: + +- `base_port`: the base port for agent servers, must be the same as used in Step 2. +- `hosts`: hostnames of all machines. If you only have one machine, use the default value `localhost`. +- `moderator_per_host`: Consistent with `moderator_num` in Step 2. +- `agent_type`: `random` or `llm`. Please use `random` if you don't have a local inference service. +- `max_value`: The upper bound of numbers generated in the game. + +The command below will run a simulation with 1000 participant agents and evenly distribute those agents across the 10 agent servers started in Step 2. + +```shell +#./run_simulation.sh +./run_simulation.sh 10 1000 +``` + +The following is sample output from a single-machine (16 CPU cores) simulation scenario: + +```log +2024-04-16 10:31:53.786 | INFO | agentscope.models:read_model_configs:178 - Load configs for model wrapper: model_1, model_2, model_3, model_4, model_5, model_6, model_7, model_8 +2024-04-16 10:31:53.822 | INFO | agentscope.utils.monitor:_create_monitor_table:343 - Init [monitor_metrics] as the monitor table +2024-04-16 10:31:53.822 | INFO | agentscope.utils.monitor:_create_monitor_table:344 - Init [monitor_metrics_quota_exceeded] as the monitor trigger +2024-04-16 10:31:53.822 | INFO | agentscope.utils.monitor:__init__:313 - SqliteMonitor initialization completed at [./runs/run_20240416-103153_h0xuo5/agentscope.db] +2024-04-16 10:31:53.829 | INFO | __main__:run_main_process_new:106 - init 1000 random participant agents... +2024-04-16 10:31:53.829 | INFO | __main__:run_main_process_new:139 - init 4 moderator agents... 
+2024-04-16 10:31:54.211 | INFO | __main__:run_main_process_new:163 - [init takes 0.38274645805358887 s] +Moderator: The average value is 49.561 [takes 4.197571277618408 s] +``` diff --git a/examples/distributed_simulation/configs/model_configs.json b/examples/distributed_simulation/configs/model_configs.json new file mode 100644 index 000000000..327bdd76b --- /dev/null +++ b/examples/distributed_simulation/configs/model_configs.json @@ -0,0 +1,98 @@ +[ + { + "model_type": "openai_chat", + "config_name": "model_1", + "model_name": "path-to-your-model-dir", + "api_key": "EMPTY", + "client_args": { + "base_url": "http://127.0.0.1:8010/v1/" + }, + "generate_args": { + "temperature": 1.0 + } + }, + { + "model_type": "openai_chat", + "config_name": "model_2", + "model_name": "path-to-your-model-dir", + "api_key": "EMPTY", + "client_args": { + "base_url": "http://127.0.0.1:8011/v1/" + }, + "generate_args": { + "temperature": 1.0 + } + }, + { + "model_type": "openai_chat", + "config_name": "model_3", + "model_name": "path-to-your-model-dir", + "api_key": "EMPTY", + "client_args": { + "base_url": "http://127.0.0.1:8012/v1/" + }, + "generate_args": { + "temperature": 1.0 + } + }, + { + "model_type": "openai_chat", + "config_name": "model_4", + "model_name": "path-to-your-model-dir", + "api_key": "EMPTY", + "client_args": { + "base_url": "http://127.0.0.1:8013/v1/" + }, + "generate_args": { + "temperature": 1.0 + } + }, + { + "model_type": "openai_chat", + "config_name": "model_5", + "model_name": "path-to-your-model-dir", + "api_key": "EMPTY", + "client_args": { + "base_url": "http://127.0.0.1:8014/v1/" + }, + "generate_args": { + "temperature": 1.0 + } + }, + { + "model_type": "openai_chat", + "config_name": "model_6", + "model_name": "path-to-your-model-dir", + "api_key": "EMPTY", + "client_args": { + "base_url": "http://127.0.0.1:8015/v1/" + }, + "generate_args": { + "temperature": 1.0 + } + }, + { + "model_type": "openai_chat", + "config_name": "model_7", + 
"model_name": "path-to-your-model-dir", + "api_key": "EMPTY", + "client_args": { + "base_url": "http://127.0.0.1:8016/v1/" + }, + "generate_args": { + "temperature": 1.0 + } + }, + { + "model_type": "openai_chat", + "config_name": "model_8", + "model_name": "path-to-your-model-dir", + "api_key": "EMPTY", + "client_args": { + "base_url": "http://127.0.0.1:8017/v1/" + }, + "generate_args": { + "temperature": 1.0 + } + } +] \ No newline at end of file diff --git a/examples/distributed_simulation/main.py b/examples/distributed_simulation/main.py new file mode 100644 index 000000000..bb26fe533 --- /dev/null +++ b/examples/distributed_simulation/main.py @@ -0,0 +1,216 @@ +# -*- coding: utf-8 -*- +""" A large-scale social simulation experiment """ + +import argparse +import time +from concurrent import futures +from concurrent.futures import as_completed +from loguru import logger + +from participant import Moderator, RandomParticipant, LLMParticipant + +import agentscope +from agentscope.agents import AgentBase +from agentscope.server import RpcAgentServerLauncher +from agentscope.message import Msg + + +def parse_args() -> argparse.Namespace: + """Parse arguments""" + parser = argparse.ArgumentParser() + parser.add_argument( + "--role", + choices=["participant", "main"], + default="main", + ) + parser.add_argument( + "--agent-type", + choices=["random", "llm"], + default="random", + ) + parser.add_argument("--max-value", type=int, default=100) + parser.add_argument("--sleep-time", type=float, default=1.0) + parser.add_argument( + "--hosts", + type=str, + nargs="+", + default=["localhost"], + ) + parser.add_argument("--participant-num", type=int, default=100) + parser.add_argument("--base-port", type=int, default=12010) + parser.add_argument( + "--server-per-host", + type=int, + ) + parser.add_argument("--model-per-host", type=int, default=1) + parser.add_argument("--moderator-per-host", type=int, default=1) + return parser.parse_args() + + +def 
setup_participant_agent_server(host: str, port: int) -> None: + """Set up agent server""" + agentscope.init( + project="simulation", + name="server", + runtime_id=str(port), + save_code=False, + save_api_invoke=False, + model_configs="configs/model_configs.json", + use_monitor=False, + ) + assistant_server_launcher = RpcAgentServerLauncher( + host=host, + port=port, + max_pool_size=16384, + custom_agents=[Moderator, RandomParticipant, LLMParticipant], + ) + assistant_server_launcher.launch(in_subprocess=False) + assistant_server_launcher.wait_until_terminate() + + +def init_moderator( + name: str, + configs: list[dict], + host: str, + port: int, + agent_type: str, + max_value: int, + sleep_time: float, +) -> AgentBase: + """Init moderator""" + return Moderator( # pylint: disable=E1123 + name=name, + part_configs=configs, + agent_type=agent_type, + max_value=max_value, + sleep_time=sleep_time, + to_dist={ + "host": host, + "port": port, + }, + ) + + +def run_main_process( + hosts: list[str], + base_port: int, + server_per_host: int, + model_per_host: int, + participant_num: int, + moderator_per_host: int = 10, + agent_type: str = "random", + max_value: int = 100, + sleep_time: float = 1.0, +) -> None: + """Run main process""" + agentscope.init( + project="simulation", + name="main", + save_code=False, + save_api_invoke=False, + model_configs="configs/model_configs.json", + use_monitor=False, + ) + host_num = len(hosts) + total_agent_server_num = server_per_host * host_num + participant_per_agent_server = participant_num // total_agent_server_num + ist = time.time() + configs = [] + logger.info(f"init {participant_num} {agent_type} participant agents...") + # build init configs of participants + for i in range(participant_num): + idx = i // participant_per_agent_server + host_id = idx // server_per_host + port_id = idx % server_per_host + model_id = i % model_per_host + host = hosts[host_id] + port = base_port + port_id + config_name = f"model_{model_id + 1}" + if 
agent_type == "random": + configs.append( + { + "name": f"P{i}", + "host": host, + "port": port, + }, + ) + else: + configs.append( + { + "name": f"P{i}", + "model_config_name": config_name, + "host": host, + "port": port, + }, + ) + + mods = [] + moderator_num = moderator_per_host * host_num + participant_per_moderator = participant_num // moderator_num + tasks = [] + + logger.info(f"init {moderator_num} moderator agents...") + # init moderators + with futures.ThreadPoolExecutor(max_workers=None) as executor: + for i in range(moderator_num): + tasks.append( + executor.submit( + init_moderator, + name=f"mod_{i}", + configs=configs[ + i + * participant_per_moderator : (i + 1) # noqa + * participant_per_moderator + ], + host=hosts[i // moderator_per_host], + port=base_port + server_per_host + i % moderator_per_host, + agent_type=agent_type, + max_value=max_value, + sleep_time=sleep_time, + ), + ) + for task in as_completed(tasks): + mods.append(task.result()) + + iet = time.time() + logger.info(f"[init takes {iet - ist} s]") + + # run te + st = time.time() + results = [] + for p in mods: + results.append(p()) + summ = 0 + cnt = 0 + for r in results: + try: + summ += int(r["content"]["sum"]) + cnt += int(r["content"]["cnt"]) + except Exception: + logger.error(r["content"]) + et = time.time() + logger.chat( + Msg( + name="Moderator", + role="assistant", + content=f"The average value is {summ/cnt} [takes {et-st} s]", + ), + ) + + +if __name__ == "__main__": + args = parse_args() + if args.role == "participant": + setup_participant_agent_server(args.hosts[0], args.base_port) + elif args.role == "main": + run_main_process( + hosts=args.hosts, + base_port=args.base_port, + participant_num=args.participant_num, + server_per_host=args.server_per_host, + model_per_host=args.model_per_host, + moderator_per_host=args.moderator_per_host, + agent_type=args.agent_type, + sleep_time=args.sleep_time, + max_value=args.max_value, + ) diff --git 
a/examples/distributed_simulation/participant.py b/examples/distributed_simulation/participant.py new file mode 100644 index 000000000..dac3d17bf --- /dev/null +++ b/examples/distributed_simulation/participant.py @@ -0,0 +1,156 @@ +# -*- coding: utf-8 -*- +"""A general dialog agent.""" +import random +import time +import re +from loguru import logger + +from agentscope.message import Msg +from agentscope.agents import AgentBase + + +class RandomParticipant(AgentBase): + """A fake participant who generates number randomly.""" + + def __init__( + self, + name: str, + max_value: int = 100, + sleep_time: float = 1.0, + ) -> None: + """Initialize the participant.""" + super().__init__( + name=name, + ) + self.max_value = max_value + self.sleep_time = sleep_time + + def generate_random_response(self) -> str: + """generate a random int""" + time.sleep(self.sleep_time) + return str(random.randint(0, self.max_value)) + + def reply(self, x: dict = None) -> dict: + """Generate a random value""" + # generate a response in content + response = self.generate_random_response() + msg = Msg(self.name, content=response) + return msg + + +class LLMParticipant(AgentBase): + """A participant agent who generates number using LLM.""" + + def __init__( + self, + name: str, + model_config_name: str, + max_value: int = 100, + ) -> None: + """Initialize the participant.""" + super().__init__( + name=name, + model_config_name=model_config_name, + use_memory=True, + ) + self.max_value = max_value + self.prompt = Msg( + name="system", + role="system", + content="You are participating in a game where everyone " + f"provides a number between 0 and {max_value}. 
The person " + "closest to the average will win.", + ) + + def parse_value(self, txt: str) -> str: + """Parse the number from the response.""" + numbers = re.findall(r"\d+", txt) + if len(numbers) == 0: + logger.warning( + f"Fail to parse value from [{txt}], use " + f"{self.max_value // 2} instead.", + ) + return str(self.max_value // 2) + else: + return numbers[-1] + + def reply(self, x: dict = None) -> dict: + """Generate a value by LLM""" + if self.memory: + self.memory.add(x) + + # prepare prompt + prompt = self.model.format(self.prompt, self.memory.get_memory()) + + # call llm and generate response + response = self.model(prompt).text + + response = self.parse_value(response) + + msg = Msg(self.name, response, role="assistant") + + # Record the message in memory + if self.memory: + self.memory.add(msg) + + return msg + + +class Moderator(AgentBase): + """A Moderator to collect values from participants.""" + + def __init__( + self, + name: str, + part_configs: list[dict], + agent_type: str = "random", + max_value: int = 100, + sleep_time: float = 1.0, + ) -> None: + super().__init__(name) + self.max_value = max_value + if agent_type == "llm": + self.participants = [ + LLMParticipant( + name=config["name"], + model_config_name=config["model_config_name"], + max_value=max_value, + ).to_dist( + host=config["host"], + port=config["port"], + ) + for config in part_configs + ] + else: + self.participants = [ + RandomParticipant( + name=config["name"], + max_value=max_value, + sleep_time=sleep_time, + ).to_dist( + host=config["host"], + port=config["port"], + ) + for config in part_configs + ] + + def reply(self, x: dict = None) -> dict: + results = [] + msg = Msg( + name="moderator", + role="user", + content=f"Now give a number between 0 and {self.max_value}.", + ) + for p in self.participants: + results.append(p(msg)) + summ = 0 + for r in results: + try: + summ += int(r["content"]) + except Exception as e: + print(e) + return Msg( + name=self.name, + 
role="assistant", + content={"sum": summ, "cnt": len(self.participants)}, + ) diff --git a/examples/distributed_simulation/run_simulation.sh b/examples/distributed_simulation/run_simulation.sh new file mode 100755 index 000000000..6fac7c4c4 --- /dev/null +++ b/examples/distributed_simulation/run_simulation.sh @@ -0,0 +1,25 @@ +#!/bin/bash + +# default values +base_port=12330 +hosts="localhost" # or "server1 server2 server3 ..." +moderator_per_host=4 +model_per_host=8 +agent_type="random" # or "llm" +max_value=100 + +# check server-per-host +if ! [[ "$1" =~ ^[0-9]+$ ]]; then + echo "Usage: $0 " + exit 1 +fi + +# check participant-num +if ! [[ "$2" =~ ^[0-9]+$ ]]; then + echo "Usage: $0 " + exit 1 +fi + +mkdir -p log + +python main.py --role main --hosts ${hosts} --base-port ${base_port} --participant-num $2 --server-per-host $1 --model-per-host ${model_per_host} --moderator-per-host ${moderator_per_host} --agent-type ${agent_type} --max-value ${max_value} diff --git a/examples/distributed_simulation/start_all_server.sh b/examples/distributed_simulation/start_all_server.sh new file mode 100755 index 000000000..1c1f56aea --- /dev/null +++ b/examples/distributed_simulation/start_all_server.sh @@ -0,0 +1,29 @@ +#!/bin/bash + +# default values +base_port=12330 +host_name="localhost" +moderator_num=4 + +# get number of server +if ! [[ "$1" =~ ^[0-9]+$ ]]; then + echo "Usage: $0 " + exit 1 +fi + +participant_server_num=$1 + +# create files for pid +> .pid +# create log dir +mkdir -p log + +# start all agent servers +for ((i=0; i<(participant_server_num + moderator_num); i++)); do + port=$((base_port + i)) + python main.py --role participant --hosts ${host_name} --base-port ${port} > log/${port}.log 2>&1 & + echo $! >> .pid + echo "Started agent server on ${host_name}:${port} with PID $!" 
+done + +echo "All servers started" \ No newline at end of file diff --git a/examples/distributed_simulation/start_vllm.sh b/examples/distributed_simulation/start_vllm.sh new file mode 100755 index 000000000..11b92498c --- /dev/null +++ b/examples/distributed_simulation/start_vllm.sh @@ -0,0 +1,19 @@ +#!/bin/bash + +# default values +gpu_num=8 +model_path="path-to-your-model-dir" +base_port=8010 + +> .vllm_pid +mkdir -p log + +for ((i=0; i<8; i++)); do + port=$((base_port + i)) + export CUDA_VISIBLE_DEVICES=$i + python -m vllm.entrypoints.openai.api_server --model "${model_path}" --port ${port} --enforce-eager > log/vllm-${port}.log 2>&1 & + echo $! >> .vllm_pid + echo "Started vllm server on port ${port} with PID $!" +done + +echo "All vllm server started" \ No newline at end of file diff --git a/examples/distributed_simulation/stop_all_server.sh b/examples/distributed_simulation/stop_all_server.sh new file mode 100755 index 000000000..a9b72f72f --- /dev/null +++ b/examples/distributed_simulation/stop_all_server.sh @@ -0,0 +1,19 @@ +#!/bin/bash + +if [ ! -f .pid ]; then + echo "PID file not found. Are the servers running?" + exit 1 +fi + +while read pid; do + kill -9 $pid + if [ $? -eq 0 ]; then + echo "Killed server with PID $pid" + else + echo "Failed to kill server with PID $pid" + fi +done < .pid + +rm .pid + +echo "All servers stopped." \ No newline at end of file diff --git a/examples/distributed_simulation/stop_vllm.sh b/examples/distributed_simulation/stop_vllm.sh new file mode 100755 index 000000000..eaefbcfe7 --- /dev/null +++ b/examples/distributed_simulation/stop_vllm.sh @@ -0,0 +1,19 @@ +#!/bin/bash + +if [ ! -f .vllm_pid ]; then + echo "PID file not found. Are the servers running?" + exit 1 +fi + +while read pid; do + kill -9 $pid + if [ $? -eq 0 ]; then + echo "Killed vllm server with PID $pid" + else + echo "Failed to kill vllm server with PID $pid" + fi +done < .vllm_pid + +rm .vllm_pid + +echo "All vllm servers stopped." 
\ No newline at end of file diff --git a/examples/game_gomoku/code/board_agent.py b/examples/game_gomoku/code/board_agent.py index 87247111b..6cbef4ced 100644 --- a/examples/game_gomoku/code/board_agent.py +++ b/examples/game_gomoku/code/board_agent.py @@ -39,7 +39,7 @@ def board2img(board: np.ndarray, save_path: str) -> str: for y in range(size): for x in range(size): - if board[y, x] == NAME_TO_PIECE[NAME_WHITE]: # white player + if board[y, x] == NAME_TO_PIECE[NAME_BLACK]: # black player circle = patches.Circle( (x, y), 0.45, @@ -48,7 +48,7 @@ def board2img(board: np.ndarray, save_path: str) -> str: zorder=10, ) ax.add_patch(circle) - elif board[y, x] == NAME_TO_PIECE[NAME_BLACK]: # black player + elif board[y, x] == NAME_TO_PIECE[NAME_WHITE]: # white player circle = patches.Circle( (x, y), 0.45, @@ -85,33 +85,33 @@ def reply(self, x: dict = None) -> dict: if x is None: # Beginning of the game content = ( - "Welcome to the Gomoku game! Black player goes " - "first. Please make your move." + "Welcome to the Gomoku game! Black player goes first. " + "Please make your move." ) else: row, col = x["content"] self.assert_valid_move(row, col) - if self.check_win(row, col, NAME_TO_PIECE[x["name"]]): - content = f"The game ends, {x['name']} wins!" + # change the board + self.board[row, col] = NAME_TO_PIECE[x["name"]] + + # check if the game ends + if self.check_draw(): + content = "The game ends in a draw!" self.game_end = True else: - # change the board - self.board[row, col] = NAME_TO_PIECE[x["name"]] + next_player_name = ( + NAME_BLACK if x["name"] == NAME_WHITE else NAME_WHITE + ) + content = CURRENT_BOARD_PROMPT_TEMPLATE.format( + board=self.board2text(), + player=next_player_name, + ) - # check if the game ends - if self.check_draw(): - content = "The game ends in a draw!" + if self.check_win(row, col, NAME_TO_PIECE[x["name"]]): + content = f"The game ends, {x['name']} wins!" 
self.game_end = True - else: - next_player_name = ( - NAME_BLACK if x["name"] == NAME_WHITE else NAME_WHITE - ) - content = CURRENT_BOARD_PROMPT_TEMPLATE.format( - board=self.board2text(), - player=next_player_name, - ) msg_host = Msg(self.name, content, role="assistant") self.speak(msg_host) diff --git a/examples/game_gomoku/main.ipynb b/examples/game_gomoku/main.ipynb index df3149637..04be0c07b 100644 --- a/examples/game_gomoku/main.ipynb +++ b/examples/game_gomoku/main.ipynb @@ -82,13 +82,13 @@ " \n", " for y in range(size):\n", " for x in range(size):\n", - " if board[y, x] == NAME_TO_PIECE[NAME_WHITE]: # white player\n", + " if board[y, x] == NAME_TO_PIECE[NAME_BLACK]: # black player\n", " circle = patches.Circle((x, y), 0.45, \n", " edgecolor='black', \n", " facecolor='black',\n", " zorder=10)\n", " ax.add_patch(circle)\n", - " elif board[y, x] == NAME_TO_PIECE[NAME_BLACK]: # black player\n", + " elif board[y, x] == NAME_TO_PIECE[NAME_WHITE]: # white player\n", " circle = patches.Circle((x, y), 0.45, \n", " edgecolor='black', \n", " facecolor='white',\n", @@ -156,30 +156,38 @@ " # Record the status of the game\n", " self.game_end = False\n", " \n", - " def reply(self, input_: dict = None) -> dict:\n", - " if input_ is None:\n", + " def reply(self, x: dict = None) -> dict:\n", + " if x is None:\n", " # Beginning of the game\n", - " content = \"Welcome to the Gomoku game! Black player goes first. Please make your move.\" \n", + " content = (\n", + " \"Welcome to the Gomoku game! Black player goes first. 
\"\n", + " \"Please make your move.\"\n", + " )\n", " else:\n", - " x, y = input_[\"content\"]\n", - " \n", - " self.assert_valid_move(x, y)\n", - " \n", - " if self.check_win(x, y, NAME_TO_PIECE[input_[\"name\"]]):\n", - " content = f\"The game ends, {input_['name']} wins!\"\n", + " row, col = x[\"content\"]\n", + "\n", + " self.assert_valid_move(row, col)\n", + "\n", + " # change the board\n", + " self.board[row, col] = NAME_TO_PIECE[x[\"name\"]]\n", + "\n", + " # check if the game ends\n", + " if self.check_draw():\n", + " content = \"The game ends in a draw!\"\n", " self.game_end = True\n", " else:\n", - " # change the board\n", - " self.board[x, y] = NAME_TO_PIECE[input_[\"name\"]]\n", - " \n", - " # check if the game ends\n", - " if self.check_draw():\n", - " content = \"The game ends in a draw!\"\n", + " next_player_name = (\n", + " NAME_BLACK if x[\"name\"] == NAME_WHITE else NAME_WHITE\n", + " )\n", + " content = CURRENT_BOARD_PROMPT_TEMPLATE.format(\n", + " board=self.board2text(),\n", + " player=next_player_name,\n", + " )\n", + "\n", + " if self.check_win(row, col, NAME_TO_PIECE[x[\"name\"]]):\n", + " content = f\"The game ends, {x['name']} wins!\"\n", " self.game_end = True\n", - " else:\n", - " next_player_name = NAME_BLACK if input_[\"name\"] == NAME_WHITE else NAME_WHITE\n", - " content = CURRENT_BOARD_PROMPT_TEMPLATE.format(board=self.board2text(), player=next_player_name)\n", - " \n", + "\n", " msg_host = Msg(self.name, content, role=\"assistant\")\n", " self.speak(msg_host)\n", " \n", diff --git a/examples/game_werewolf/prompt.py b/examples/game_werewolf/prompt.py index c36291973..6f2c476e5 100644 --- a/examples/game_werewolf/prompt.py +++ b/examples/game_werewolf/prompt.py @@ -1,5 +1,6 @@ # -*- coding: utf-8 -*- """Used to record prompts, will be replaced by configuration""" +from agentscope.parsers.json_object_parser import MarkdownJsonDictParser class Prompts: @@ -7,56 +8,83 @@ class Prompts: to_wolves = ( "{}, if you are the only werewolf, 
eliminate a player. Otherwise, " - "discuss with your teammates and reach an agreement. Respond in the " - "following format which can be loaded by python json.loads()\n" - "{{\n" - ' "thought": "thought",\n' - ' "speak": "thoughts summary to say to others",\n' - ' "agreement": "whether the discussion reached an agreement or ' - 'not(true/false)"\n' - "}}" + "discuss with your teammates and reach an agreement." ) - to_wolves_vote = ( - "Which player do you vote to kill? Respond in the following format " - "which can be loaded by python json.loads()\n" - "{{\n" - ' "thought": "thought" ,\n' - ' "speak": "player_name"\n' - "}}" + wolves_discuss_parser = MarkdownJsonDictParser( + content_hint={ + "thought": "what you thought", + "speak": "what you speak", + "finish_discussion": "whether the discussion reached an " + "agreement or not (true/false)", + }, + required_keys=["thought", "speak", "finish_discussion"], + keys_to_memory="speak", + keys_to_content="speak", + keys_to_metadata=["finish_discussion"], + ) + + to_wolves_vote = "Which player do you vote to kill?" + + wolves_vote_parser = MarkdownJsonDictParser( + content_hint={ + "thought": "what you thought", + "speak": "player_name", + }, + required_keys=["thought", "speak"], + keys_to_memory="speak", + keys_to_content="speak", ) to_wolves_res = "The player with the most votes is {}." to_witch_resurrect = ( "{witch_name}, you're the witch. Tonight {dead_name} is eliminated. " - "Would you like to resurrect {dead_name}? Respond in the following " - "format which can be loaded by python json.loads()\n" - "{{\n" - ' "thought": "thought",\n' - ' "speak": "thoughts summary to say",\n' - ' "resurrect": true/false\n' - "}}" + "Would you like to resurrect {dead_name}?" ) - to_witch_poison = ( - "Would you like to eliminate one player? 
Respond in the following " - "json format which can be loaded by python json.loads()\n" - "{{\n" - ' "thought": "thought", \n' - ' "speak": "thoughts summary to say",\n' - ' "eliminate": ture/false\n' - "}}" + to_witch_resurrect_no = "The witch has chosen not to resurrect the player." + to_witch_resurrect_yes = "The witch has chosen to resurrect the player." + + witch_resurrect_parser = MarkdownJsonDictParser( + content_hint={ + "thought": "what you thought", + "speak": "whether to resurrect the player and the reason", + "resurrect": "whether to resurrect the player or not (true/false)", + }, + required_keys=["thought", "speak", "resurrect"], + keys_to_memory="speak", + keys_to_content="speak", + keys_to_metadata=["resurrect"], + ) + + to_witch_poison = "Would you like to eliminate one player?" + + witch_poison_parser = MarkdownJsonDictParser( + content_hint={ + "thought": "what you thought", + "speak": "what you speak", + "eliminate": "whether to eliminate a player or not (true/false)", + }, + required_keys=["thought", "speak", "eliminate"], + keys_to_memory="speak", + keys_to_content="speak", + keys_to_metadata=["eliminate"], ) to_seer = ( "{}, you're the seer. Which player in {} would you like to check " - "tonight? Respond in the following json format which can be loaded " - "by python json.loads()\n" - "{{\n" - ' "thought": "thought" ,\n' - ' "speak": "player_name"\n' - "}}" + "tonight?" + ) + + seer_parser = MarkdownJsonDictParser( + content_hint={ + "thought": "what you thought", + "speak": "player_name", + }, + required_keys=["thought", "speak"], + keys_to_memory="speak", + keys_to_content="speak", ) to_seer_result = "Okay, the role of {} is a {}." @@ -76,26 +104,34 @@ class Prompts: "based on the " "situation and the information you gain, to vote a player eliminated " "among alive players and to win the game, what do you want to say " - "to others? You can decide whether to reveal your role. 
Respond in " - "the following JSON format which can be loaded by python json.loads(" - ")\n" - "{{\n" - ' "thought": "thought" ,\n' - ' "speak": "thought summary to say to others"\n' - "}}" + "to others? You can decide whether to reveal your role. " + ) + + survivors_discuss_parser = MarkdownJsonDictParser( + content_hint={ + "thought": "what you thought", + "speak": "what you speak", + }, + required_keys=["thought", "speak"], + keys_to_memory="speak", + keys_to_content="speak", + ) + + survivors_vote_parser = MarkdownJsonDictParser( + content_hint={ + "thought": "what you thought", + "speak": "player_name", + }, + required_keys=["thought", "speak"], + keys_to_memory="speak", + keys_to_content="speak", ) to_all_vote = ( - "Now the alive players are {}. Given the game rules and your role, " - "based on the situation and the information you gain, to win the " - "game, it's time to vote one player eliminated among the alive " - "players, please cast your vote on who you believe is a werewolf. " - "Respond in the following format which can be loaded by python " - "json.loads()\n" - "{{\n" - ' "thought": "thought",\n' - ' "speak": "player_name"\n' - "}}" + "Given the game rules and your role, based on the situation and the" + " information you gain, to win the game, it's time to vote one player" + " eliminated among the alive players. Which player do you vote to " + "kill?" ) to_all_res = "{} has been voted out." 
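The parsers added to `prompt.py` take the JSON format instructions out of the prompt strings and route selected keys to the message content and metadata. The snippet below is a minimal, self-contained sketch of that idea and is not AgentScope's `MarkdownJsonDictParser`. The names `parse_fenced_dict` and `split_fields` are hypothetical, and the sketch assumes the model wraps its JSON object in a fenced block tagged `json`.

```python
import json
import re

# Hypothetical stand-in for illustration only; NOT the AgentScope parser.
TICKS = "`" * 3  # build the fence marker without writing it literally
FENCE = re.compile(TICKS + r"json\s*(.*?)\s*" + TICKS, re.DOTALL)


def parse_fenced_dict(text, required_keys):
    """Load the first json-tagged fenced block as a dict and check its keys."""
    match = FENCE.search(text)
    if match is None:
        raise ValueError("no json fenced block found")
    data = json.loads(match.group(1))
    if not isinstance(data, dict):
        raise ValueError("fenced block is not a JSON object")
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return data


def split_fields(data, content_key, metadata_keys):
    """Return (content, metadata): one key as content, the rest as flags."""
    return data[content_key], {key: data[key] for key in metadata_keys}


if __name__ == "__main__":
    reply = (
        TICKS
        + 'json\n{"thought": "t", "speak": "Player1", "resurrect": true}\n'
        + TICKS
    )
    parsed = parse_fenced_dict(reply, ["thought", "speak", "resurrect"])
    print(split_fields(parsed, "speak", ["resurrect"]))
```

In `werewolf.py`, flags such as `finish_discussion` and `eliminate` are read from `x.metadata`; the second return value of `split_fields` plays that role in this sketch.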
diff --git a/examples/game_werewolf/werewolf.py b/examples/game_werewolf/werewolf.py index aaf360fd8..e88217906 100644 --- a/examples/game_werewolf/werewolf.py +++ b/examples/game_werewolf/werewolf.py @@ -9,6 +9,7 @@ majority_vote, extract_name_and_id, n2s, + set_parsers, ) from agentscope.message import Msg from agentscope.msghub import msghub @@ -29,6 +30,7 @@ def main() -> None: model_configs="./configs/model_configs.json", agent_configs="./configs/agent_configs.json", ) + roles = ["werewolf", "werewolf", "villager", "villager", "seer", "witch"] wolves, witch, seer = survivors[:2], survivors[-1], survivors[-2] @@ -37,11 +39,13 @@ def main() -> None: # night phase, werewolves discuss hint = HostMsg(content=Prompts.to_wolves.format(n2s(wolves))) with msghub(wolves, announcement=hint) as hub: + set_parsers(wolves, Prompts.wolves_discuss_parser) for _ in range(MAX_WEREWOLF_DISCUSSION_ROUND): x = sequentialpipeline(wolves) - if x.get("agreement", False): + if x.metadata.get("finish_discussion", False): break + set_parsers(wolves, Prompts.wolves_vote_parser) # werewolves vote hint = HostMsg(content=Prompts.to_wolves_vote) votes = [ @@ -65,14 +69,19 @@ def main() -> None: }, ), ) - if witch(hint).get("resurrect", False): + set_parsers(witch, Prompts.witch_resurrect_parser) + if witch(hint).metadata.get("resurrect", False): healing_used_tonight = True dead_player.pop() healing = False + HostMsg(content=Prompts.to_witch_resurrect_yes) + else: + HostMsg(content=Prompts.to_witch_resurrect_no) if poison and not healing_used_tonight: + set_parsers(witch, Prompts.witch_poison_parser) x = witch(HostMsg(content=Prompts.to_witch_poison)) - if x.get("eliminate", False): + if x.metadata.get("eliminate", False): dead_player.append(extract_name_and_id(x.content)[0]) poison = False @@ -81,6 +90,7 @@ def main() -> None: hint = HostMsg( content=Prompts.to_seer.format(seer.name, n2s(survivors)), ) + set_parsers(seer, Prompts.seer_parser) x = seer(hint) player, idx = 
extract_name_and_id(x.content) @@ -108,8 +118,10 @@ def main() -> None: ] with msghub(survivors, announcement=hints) as hub: # discuss + set_parsers(survivors, Prompts.survivors_discuss_parser) x = sequentialpipeline(survivors) + set_parsers(survivors, Prompts.survivors_vote_parser) # vote hint = HostMsg(content=Prompts.to_all_vote.format(n2s(survivors))) votes = [ diff --git a/examples/game_werewolf/werewolf_utils.py b/examples/game_werewolf/werewolf_utils.py index f4301bf44..c0e199ca6 100644 --- a/examples/game_werewolf/werewolf_utils.py +++ b/examples/game_werewolf/werewolf_utils.py @@ -65,3 +65,14 @@ def _get_name(agent_: Union[AgentBase, str]) -> str: + " and " + _get_name(agents[-1]) ) + + +def set_parsers( + agents: Union[AgentBase, list[AgentBase]], + parser_name: str, +) -> None: + """Add parser to agents""" + if not isinstance(agents, list): + agents = [agents] + for agent in agents: + agent.set_parser(parser_name) diff --git a/examples/model_configs_template/litellm_chat_template.json b/examples/model_configs_template/litellm_chat_template.json new file mode 100644 index 000000000..f1711dca9 --- /dev/null +++ b/examples/model_configs_template/litellm_chat_template.json @@ -0,0 +1,11 @@ +[{ + "config_name": "lite_llm_openai_chat_gpt-3.5-turbo", + "model_type": "litellm_chat", + "model_name": "gpt-3.5-turbo" +}, +{ + "config_name": "lite_llm_claude3", + "model_type": "litellm_chat", + "model_name": "claude-3-opus-20240229" +} +] diff --git a/examples/model_configs_template/openai_chat_template.json b/examples/model_configs_template/openai_chat_template.json index 8d3f78087..f5abccf00 100644 --- a/examples/model_configs_template/openai_chat_template.json +++ b/examples/model_configs_template/openai_chat_template.json @@ -1,25 +1,38 @@ -[{ - "config_name": "openai_chat_gpt-4", - "model_type": "openai_chat", - "model_name": "gpt-4", - "api_key": "{your_api_key}", - "client_args": { - "max_retries": 3 +[ + { + "config_name": "openai_chat_gpt-4", + 
"model_type": "openai_chat", + "model_name": "gpt-4", + "api_key": "{your_api_key}", + "client_args": { + "max_retries": 3 + }, + "generate_args": { + "temperature": 0.7 + } }, - "generate_args": { - "temperature": 0.7 - } -}, -{ - "config_name": "openai_chat_gpt-3.5-turbo", - "model_type": "openai_chat", - "model_name": "gpt-3.5-turbo", - "api_key": "{your_api_key}", - "client_args": { - "max_retries": 3 + { + "config_name": "openai_chat_gpt-3.5-turbo", + "model_type": "openai_chat", + "model_name": "gpt-3.5-turbo", + "api_key": "{your_api_key}", + "client_args": { + "max_retries": 3 + }, + "generate_args": { + "temperature": 0.7 + } }, - "generate_args": { - "temperature": 0.7 + { + "config_name": "openai_chat_gpt-4o", + "model_type": "openai_chat", + "model_name": "gpt-4o", + "api_key": "{your_api_key}", + "client_args": { + "max_retries": 3 + }, + "generate_args": { + "temperature": 0.7 + } } -} ] \ No newline at end of file diff --git a/examples/model_configs_template/zhipu_chat_template.json b/examples/model_configs_template/zhipu_chat_template.json new file mode 100644 index 000000000..b21f17307 --- /dev/null +++ b/examples/model_configs_template/zhipu_chat_template.json @@ -0,0 +1,7 @@ +[{ + "config_name": "zhipuai_chat-glm", + "model_type": "zhipuai_chat", + "model_name": "glm-4", + "api_key": "{your_api_key}" +} +] \ No newline at end of file diff --git a/examples/model_configs_template/zhipu_embedding_template.json b/examples/model_configs_template/zhipu_embedding_template.json new file mode 100644 index 000000000..6a544974e --- /dev/null +++ b/examples/model_configs_template/zhipu_embedding_template.json @@ -0,0 +1,7 @@ +[{ + "config_name": "zhipu-embedding", + "model_type": "zhipuai_embedding", + "model_name": "embedding-2", + "api_key": "{your_api_key}" +} +] \ No newline at end of file diff --git a/examples/swe_agent/main.ipynb b/examples/swe_agent/main.ipynb new file mode 100644 index 000000000..59d15bcf3 --- /dev/null +++ 
b/examples/swe_agent/main.ipynb @@ -0,0 +1,285 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Conversation with Software Engineering Agent\n", + "\n", + "SWE-agent (SoftWare Engineering Agent) is an agent designed for solving real-world software engineering problems, such as fixing GitHub issues.\n", + "More details can be found in the project's [homepage](https://swe-agent.com/) and related [github repo](https://github.com/princeton-nlp/SWE-agent).\n", + "\n", + "In the example here, we partially implement the SWE-agent, and provide a simple example of how to use the implemented SWE-agent to fix a bug in a python file.\n", + "You should note that currently how to enable agents with stronger programming capabilities remains an open challenge, and the performance of the partially implemented SWE-agent is not guaranteed.\n", + "\n", + "## Prerequisites\n", + "\n", + "- Follow [README.md](https://github.com/modelscope/agentscope) to install AgentScope. We require the latest version, so you should build from source by running `pip install -e .` instead of installing from PyPI. \n", + "- Prepare a model configuration. AgentScope supports both locally deployed model services (CPU or GPU) and third-party services. For more details and example model configurations, please refer to our [tutorial](https://modelscope.github.io/agentscope/en/tutorial/203-model.html).\n", + "- Understand the ServiceToolkit module and how to use it to pre-process the tool functions for LLMs. You can refer to the [ReAct agent example](../conversation_with_react_agent/main.ipynb) and you should also refer to the [tutorial](https://modelscope.github.io/agentscope/en/tutorial/204-service.html) for service functions.\n", + "\n", + "\n", + "## Note\n", + "\n", + "- The example is tested with the following models. 
For other models, you may need to adjust the prompt.\n", + " - gpt-4\n", + "- How to enable agents with stronger programming capabilities remains an open challenge, and the current implementations are not perfect. Please feel free to explore it yourself." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "YOUR_MODEL_CONFIGURATION_NAME = \"{YOUR_MODEL_CONFIGURATION_NAME}\"\n", + "\n", + "YOUR_MODEL_CONFIGURATION = {\n", + " \"model_type\": \"xxx\", \n", + " \"config_name\": YOUR_MODEL_CONFIGURATION_NAME\n", + " \n", + " # ...\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 1: Initialize the AgentScope environment and SWE-agent\n", + "\n", + "Here we initialize the AgentScope environment and load the SWE-agent.\n", + "\n", + "The code of SWE-agent is in `swe_agent.py`, and the related prompts are in `swe_agent_prompts.py`.\n", + "\n", + "If you are interested in the details, please refer to the code and the original SWE-agent repo [here](https://github.com/princeton-nlp/SWE-agent)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from swe_agent import SWEAgent\n", + "\n", + "import agentscope\n", + "\n", + "agentscope.init(model_configs=YOUR_MODEL_CONFIGURATION)\n", + "\n", + "agent = SWEAgent(\n", + " name=\"assistant\",\n", + " model_config_name=YOUR_MODEL_CONFIGURATION_NAME,\n", + ")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 2: Create the code to be processed by the SWE-agent\n", + "\n", + "Here we use the `write_file` function to write the following code into `gcd.py`.\n", + "The code here is an incorrect implementation of the [Greatest Common Divisor (GCD) algorithm](https://en.wikipedia.org/wiki/Euclidean_algorithm).\n", + "We will ask the SWE-agent to correct it in our next step." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'status': ,\n", + " 'content': 'WRITE OPERATION:\\nYou have written to \"gcd.py\" on these lines: 0:-1.\\ndef gcd(a, b):\\n if a == 0:\\n return b\\n while a != 0:\\n a, b = b, a\\n return b\\n\\ndef lcm(a, b):\\n return (a * b) // gcd(a, b)\\n\\n# testing on GCD and LCM functions\\nprint(\"GCD of 12 and 18 is:\", gcd(12, 18))\\nprint(\"LCM of 12 and 18 is:\", lcm(12, 18))\\n\\n'}" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from swe_agent_service_func import write_file\n", + "\n", + "# here the gcd code is written in a wrong way\n", + "wrong_gcd_text = \"\"\"\n", + "def gcd(a, b):\n", + " if a == 0:\n", + " return b\n", + " while a != 0:\n", + " a, b = b, a\n", + " return b\n", + "\n", + "def lcm(a, b):\n", + " return (a * b) // gcd(a, b)\n", + "\n", + "# testing on GCD and LCM functions\n", + "print(\"GCD of 12 and 18 is:\", gcd(12, 18))\n", + "print(\"LCM of 12 and 18 is:\", lcm(12, 18))\n", + "\"\"\"\n", + "\n", + "write_file(\"gcd.py\", wrong_gcd_text)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 3: Ask the SWE-agent to correct the code\n", + "\n", + "Here we simply ask the SWE-agent to correct the code in `gcd.py`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "user: The file in gcd.py is not returning the correct answer, can you fix the file for me?\n", + "assistant: {\n", + " \"thought\": \"First thing I'll do is read the content of gcd.py file to understand the existing logic.\",\n", + " \"action\": {\n", + " \"name\": \"read_file\",\n", + " \"arguments\": {\n", + " \"file_path\": \"gcd.py\"\n", + " }\n", + " }\n", + "}\n", + "assistant: \n", + "====Observation====\n", + "\n", + "def gcd(a, b):\n", + " if a == 0:\n", + " return b\n", + " while a != 0:\n", + " a, b = b, a\n", + " return b\n", + "\n", + "def lcm(a, b):\n", + " return (a * b) // gcd(a, b)\n", + "\n", + "# testing on GCD and LCM functions\n", + "print(\"GCD of 12 and 18 is:\", gcd(12, 18))\n", + "print(\"LCM of 12 and 18 is:\", lcm(12, 18))\n", + "\n", + "\n", + "assistant: {\n", + " \"thought\": \"I can see the error in the gcd calculation. The gcd calculation in the file is incorrect. The correct way to find the gcd of two numbers using the Euclidean algorithm is to continually reduce a and b by finding the remainder of a % b until b is 0, at which point a will be the gcd of the original a and b values. In the given code, instead of finding the remainder, it is incorrectly swapping the values of a and b. 
I'll correct this error by modifying the gcd function using the write_file command, reflecting the correct Euclidean algorithm.\",\n", + " \"action\": {\n", + " \"name\": \"write_file\",\n", + " \"arguments\": {\n", + " \"start_line\": 0,\n", + " \"end_line\": 15,\n", + " \"content\": \"def gcd(a, b):\\n while b != 0:\\n a, b = b, a % b\\n return a\\n\\n\\ndef lcm(a, b):\\n return (a * b) // gcd(a, b)\\n\\n\\n# testing on GCD and LCM functions\\nprint(\\\"GCD of 12 and 18 is:\\\", gcd(12, 18))\\nprint(\\\"LCM of 12 and 18 is:\\\", lcm(12, 18))\\n\",\n", + " \"file_path\": \"gcd.py\"\n", + " }\n", + " }\n", + "}\n", + "assistant: \n", + "====Observation====\n", + "WRITE OPERATION:\n", + "You have written to \"gcd.py\" on these lines: 0:15.def gcd(a, b):\n", + " while b != 0:\n", + " a, b = b, a % b\n", + " return a\n", + "\n", + "\n", + "def lcm(a, b):\n", + " return (a * b) // gcd(a, b)\n", + "\n", + "\n", + "# testing on GCD and LCM functions\n", + "print(\"GCD of 12 and 18 is:\", gcd(12, 18))\n", + "print(\"LCM of 12 and 18 is:\", lcm(12, 18))\n", + "\n", + "\n", + "assistant: {\n", + " \"thought\": \"I should run the python file to see if the changes I made have fixed the gcd function.\",\n", + " \"action\": {\n", + " \"name\": \"execute_shell_command\",\n", + " \"arguments\": {\n", + " \"command\": \"python3 gcd.py\"\n", + " }\n", + " }\n", + "}\n", + "assistant: \n", + "====Observation====\n", + "GCD of 12 and 18 is: 6\n", + "LCM of 12 and 18 is: 36\n", + "assistant: {\n", + " \"thought\": \"The gcd.py file has been fixed and tested successfully. 
Therefore, no further actions are needed.\",\n", + " \"action\": {\n", + " \"name\": \"exit\",\n", + " \"arguments\": {}\n", + " }\n", + "}\n", + "assistant: \n", + "====Observation====\n", + "Current task finished, exitting.\n" + ] + } + ], + "source": [ + "from loguru import logger\n", + "from agentscope.message import Msg\n", + "\n", + "mss = Msg(\n", + " name=\"user\", \n", + " content=\"The file in gcd.py is not returning the correct answer, can you fix the file for me?\", \n", + " role=\"user\"\n", + ")\n", + "logger.chat(mss)\n", + "answer_mss = agent(mss)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Conclusion\n", + "\n", + "After a few iterations, the SWE-agent assistant finishes the job successfully, and the code now works as expected." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Above we have shown an example of how to use the SWE-agent to fix code errors.\n", + "Although the SWE-agent is primarily designed to address GitHub issues, with modifications it can also be used for more general programming tasks.\n", + "\n", + "Currently, how to equip agents with general programming abilities remains a challenging open question; the efficacy of agent programming may be influenced by factors such as prompt construction, model capabilities, and the complexity of the task at hand. Here we just provide an interesting toy example. \n", + "\n", + "We encourage users to experiment by altering the prompts within this example or by assigning different tasks to the agent. The AgentScope team will continue to provide updates, enhancing the capabilities of the Programming Agents in the future!" 
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "datajuicer", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.18" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/examples/swe_agent/swe_agent.py b/examples/swe_agent/swe_agent.py new file mode 100644 index 000000000..3b55431d5 --- /dev/null +++ b/examples/swe_agent/swe_agent.py @@ -0,0 +1,274 @@ +# -*- coding: utf-8 -*- +"""An agent class that partially implements the SWE-agent. +SWE-agent is an agent designed for solving GitHub issues. +More details can be found at https://swe-agent.com/. + +Here we partially implement and modify the SWE-agent, +trying to make it work with a wider range of tasks than just fixing GitHub issues. +""" + +from agentscope.agents import AgentBase +from agentscope.message import Msg +from agentscope.exception import ResponseParsingError +from agentscope.parsers import MarkdownJsonDictParser +from typing import List, Callable +import json +from agentscope.service import ( + ServiceFactory, + execute_shell_command, +) + +from swe_agent_service_func import ( + exec_py_linting, + write_file, + read_file, +) + +from swe_agent_prompts import ( + get_system_prompt, + get_context_prompt, + get_step_prompt, +) + + +def prepare_func_prompt(function: Callable) -> str: + func, desc = ServiceFactory.get(function) + func_name = desc["function"]["name"] + func_desc = desc["function"]["description"] + args_desc = desc["function"]["parameters"]["properties"] + + args_list = [f"{func_name}: {func_desc}"] + for args_name, args_info in args_desc.items(): + if "type" in args_info: + args_line = ( + f'\t{args_name} ({args_info["type"]}): ' + f'{args_info.get("description", "")}' + ) + else: + args_line = f'\t{args_name}: 
{args_info.get("description", "")}' + args_list.append(args_line) + + func_prompt = "\n".join(args_list) + return func_prompt + + +COMMANDS_DISCRIPTION_DICT = { + "exit": "exit: Executed when the current task is complete, takes no arguments", # noqa + "scroll_up": "scroll_up: Scrolls up the current open file, will scroll up and show you the 100 lines above your current lines, takes no arguments", # noqa + "scroll_down": "scroll_down: Scrolls down the current open file, will scroll down and show you the 100 lines below your current lines, takes no arguments", # noqa + "goto": "goto: This will take you directly to the line and show you the 100 lines below it. \n line_num (int): The line number to go to.", # noqa +} + +COMMANDS_DISCRIPTION_DICT["write_file"] = prepare_func_prompt(write_file) +COMMANDS_DISCRIPTION_DICT["read_file"] = prepare_func_prompt(read_file) +COMMANDS_DISCRIPTION_DICT["execute_shell_command"] = prepare_func_prompt( + execute_shell_command, +) +COMMANDS_DISCRIPTION_DICT["exec_py_linting"] = prepare_func_prompt( + exec_py_linting, +) + + +ERROR_INFO_PROMPT = """Your response is not a JSON object, and cannot be parsed by `json.loads` in parse function: +## Your Response: +[YOUR RESPONSE BEGIN] +{response} +[YOUR RESPONSE END] + +## Error Information: +{error_info} + +Analyze the reason, and re-correct your response in the correct format.""" # pylint: disable=all # noqa + + +def count_file_lines(file_path: str) -> int: + with open(file_path, "r") as file: + lines = file.readlines() + return len(lines) + + +class SWEAgent(AgentBase): + """ + The SWE-agent + """ + + def __init__( + self, + name: str, + model_config_name: str, + ) -> None: + """ """ + super().__init__( + name=name, + model_config_name=model_config_name, + ) + + self.memory_window = 6 + self.max_retries = 2 + self.running_memory: List[str] = [] + self.cur_file: str = "" + self.cur_line: int = 0 + self.cur_file_content: str = "" + + self.main_goal = "" + self.commands_prompt = "" + 
self.parser = MarkdownJsonDictParser() + self.get_commands_prompt() + + def get_current_file_content(self) -> None: + """ + Get the current file content. + """ + if self.cur_file == "": + return + start_line = self.cur_line - 50 + if start_line < 0: + start_line = 0 + end_line = self.cur_line + 50 + if end_line > count_file_lines(self.cur_file): + end_line = -1 + read_res = read_file(self.cur_file, start_line, end_line) + self.cur_file_content = read_res.content + + def step(self) -> Msg: + """ + Step the SWE-agent. + """ + message_list = [] + + # construct system prompt + system_prompt = get_system_prompt(self.commands_prompt) + message_list.append(Msg("user", system_prompt, role="system")) + + # construct context prompt, i.e. previous actions + context_prompt = get_context_prompt( + self.running_memory, + self.memory_window, + ) + message_list.append(Msg("user", context_prompt, role="user")) + + # construct step prompt for this instance + self.get_current_file_content() + step_prompt = get_step_prompt( + self.main_goal, + self.cur_file, + self.cur_line, + self.cur_file_content, + ) + message_list.append(Msg("user", step_prompt, role="user")) + + # get response from agent + try: + in_prompt = self.model.format(message_list) + res = self.model( + in_prompt, + parse_func=self.parser.parse, + max_retries=1, + ) + + except ResponseParsingError as e: + response_msg = Msg(self.name, e.raw_response, "assistant") + self.speak(response_msg) + + # Re-correct by model itself + error_msg = Msg( + name="system", + content={ + "action": {"name": "error"}, + "error_msg": ERROR_INFO_PROMPT.format( + parse_func=self.parser.parse, + error_info=e.message, + response=e.raw_response, + ), + }, + role="system", + ) + self.speak(error_msg) + # continue + self.running_memory.append(error_msg) + return error_msg + + msg_res = Msg(self.name, res.parsed, role="assistant") + + self.speak( + Msg(self.name, json.dumps(res.parsed, indent=4), role="assistant"), + ) + + # parse and execute action 
+ action = res.parsed.get("action") + + obs = self.prase_command(res.parsed["action"]) + self.speak( + Msg(self.name, "\n====Observation====\n" + obs, role="assistant"), + ) + + # add msg to context windows + self.running_memory.append(str(action) + str(obs)) + return msg_res + + def reply(self, x: dict = None) -> dict: + action_name = None + self.main_goal = x.content + while not action_name == "exit": + msg = self.step() + action_name = msg.content["action"]["name"] + return msg + + def prase_command(self, command_call: dict) -> str: + command_name = command_call["name"] + command_args = command_call["arguments"] + if command_name == "exit": + return "Current task finished, exitting." + if command_name in ["goto", "scroll_up", "scroll_down"]: + if command_name == "goto": + line = command_call["arguments"]["line_num"] + command_str = f"Going to {self.cur_file} line \ + {command_args['line_num']}." + command_failed_str = f"Failed to go to {self.cur_file} \ + line {command_args['line_num']}" + if command_name == "scroll_up": + line = self.cur_line - 100 + if line < 0: + line = 0 + command_str = ( + f"Scrolling up from file {self.cur_file} to line {line}." + ) + command_failed_str = ( + f"Failed to scroll up {self.cur_file} to line {line}" + ) + if command_name == "scroll_down": + line = self.cur_line + 100 + if line > count_file_lines(self.cur_file): + line = count_file_lines(self.cur_file) + command_str = ( + f"Scrolling down from file {self.cur_file} to line {line}." + ) + command_failed_str = ( + f"Failed to scroll down {self.cur_file} to line {line}" + ) + read_status = read_file(self.cur_file, line, line + 100) + if read_status.status == "success": + self.cur_line = line + obs = read_status.content + return f"{command_str}. 
Observe file content: {obs}" + else: + return command_failed_str + if command_name == "execute_shell_command": + return execute_shell_command(**command_args).content + if command_name == "write_file": + self.cur_file = command_args["file_path"] + self.cur_line = command_args.get("start_line", 0) + write_status = write_file(**command_args) + return write_status.content + if command_name == "read_file": + self.cur_file = command_args["file_path"] + self.cur_line = command_args.get("start_line", 0) + read_status = read_file(**command_args) + return read_status.content + if command_name == "exec_py_linting": + return exec_py_linting(**command_args).content + return "No such command" + + def get_commands_prompt(self) -> None: + for name, desc in COMMANDS_DISCRIPTION_DICT.items(): + self.commands_prompt += f"{name}: {desc}\n" diff --git a/examples/swe_agent/swe_agent_prompts.py b/examples/swe_agent/swe_agent_prompts.py new file mode 100644 index 000000000..4c30e48af --- /dev/null +++ b/examples/swe_agent/swe_agent_prompts.py @@ -0,0 +1,123 @@ +# -*- coding: utf-8 -*- +# pylint: disable=C0301 +"""The SWE-agent relies heavily on its prompts. +This file contains the necessary prompts for the SWE-agent. +Some prompts are taken and modified from the original SWE-agent repo +or the SWE-agent implementation from Open-Devin. +""" + +WINDOW = 100 + + +def get_system_prompt(command_prompt: str) -> str: + """ + Get the system prompt for SWE-agent. + """ + return f""" + SETTING: + You are an autonomous coding agent, here to perform coding tasks given the instruction. + You have been designed to handle a wide range of programming tasks, from code editing and debugging to testing and deployment. + You have access to a variety of tools and commands that you can use to help you solve problems efficiently. + + You're working directly in the command line with a special interface. + + The special interface consists of a file editor that shows you {WINDOW} lines of a file at a time. 
+ In addition to typical bash commands, you can also use the following commands to help you navigate and edit files. + + COMMANDS: + {command_prompt} + + Please note that THE WRITE COMMAND REQUIRES PROPER INDENTATION. + If you'd like to add the line ' print(x)' you must fully write that out, with all those spaces before the code! + Indentation is important and code that is not indented correctly will fail and require fixing before it can be run. + + If you'd like to issue two commands at once, PLEASE DO NOT DO THAT! Please instead first submit just the first command, and then after receiving a response you'll be able to issue the second command. + You're free to use any other bash commands you want (e.g. find, grep, cat, ls) in addition to the special commands listed above. + + However, the environment does NOT support interactive session commands (e.g. vim, python), so please do not invoke them. + + {RESPONSE_FORMAT_PROMPT} + + """ # noqa + + +RESPONSE_FORMAT_PROMPT = """ +## Response Format: +You should respond with a JSON object in the following format. +```json +{ + "thought": "what you thought", + "action": {"name": "{command name}", "arguments": {"{argument1 name}": xxx, "{argument2 name}": xxx}} +} +``` + +For Example: +```json +{ + "thought": "First I'll start by using ls to see what files are in the current directory. Then maybe we can look at some relevant files to see what they look like.", + "action": {"name": "execute_shell_command", "arguments": {"command": "ls -a"}} +} +``` +OUTPUT the JSON format and ONLY OUTPUT the JSON format. +Your Response should always be a valid JSON string that can be parsed. +""" # noqa + + +def get_step_prompt( + task: str, + file: str, + line: int, + current_file_content: str, +) -> str: + """ + Get the step prompt for SWE-agent. + """ + return f""" + We're currently perform the following coding task. Here's the original task description from the user. 
+ {task} + + CURRENT + Open File: {file} on line {line} + + Current File Content: + {current_file_content} + + You can use these commands with the current file: + Navigation: `scroll_up`, `scroll_down`, and `goto ` + + + INSTRUCTIONS: + + 1. If you run a command and it doesn't work, try running a different command. A command that did not work once will not work the second time unless you modify it! + + 2. If you open a file and need to get to an area around a specific line that is not in the first 100 lines, say line 583, don't just use the scroll_down command multiple times. Instead, use the goto 583 command. It's much quicker. + + 3. Always make sure to look at the currently open file and the current working directory (which appears right after the currently open file). The currently open file might be in a different directory! Note that some commands, such as 'write_file' and 'read_file', open files, so they might change the current open file. + + 4. When editing files, it is easy to accidentally specify a wrong line number or to write code with incorrect indentation. Always check the code after you issue an edit to make sure that it reflects what you wanted to accomplish. If it didn't, issue another command to fix it. + + 5. After modifying python files, you can run `exec_py_linting` to check for errors. If there are errors, fix them and repeat the previous step. + + NOTE THAT THIS ENVIRONMENT DOES NOT SUPPORT INTERACTIVE SESSION COMMANDS, such as "vim" or "python", or "python3". So DONOT execute them by running `execute_shell_command` with `python` command or `python3` command if the code need additional inputs. + If you want to check whether a python file is valid, you can use `exec_py_linting` to check for errors. + + You should always notice your response format and respond with a JSON object in the following format. 
+ {RESPONSE_FORMAT_PROMPT} +""" # noqa + + +def get_context_prompt(memory: list, window: int) -> str: + """ + Get the context prompt for the given memory and window. + """ + res = f"These are your past {window} actions:\n" + window_size = window if len(memory) > window else len(memory) + cur_mems = memory[-window_size:] + res += "===== Previous Actions =====\n" + for idx, mem in enumerate(cur_mems): + res += f"\nMemory {idx}:\n{mem}\n" + res += "======= End Actions =======\n" + res += "Use these memories to provide additional context to \ + the problem you are solving.\nRemember that you have already \ + completed these steps so you do not need to perform them again." + return res diff --git a/examples/swe_agent/swe_agent_service_func.py b/examples/swe_agent/swe_agent_service_func.py new file mode 100644 index 000000000..76bdaecce --- /dev/null +++ b/examples/swe_agent/swe_agent_service_func.py @@ -0,0 +1,147 @@ +# -*- coding: utf-8 -*- +# pylint: disable=C0301 +""" +Tools for swe-agent, such as checking files with linting and formatting, +writing and reading files by lines, etc. +""" +import subprocess +import os + +from agentscope.service.service_response import ServiceResponse +from agentscope.service.service_status import ServiceExecStatus + + +def exec_py_linting(file_path: str) -> ServiceResponse: + """ + Executes flake8 linting on the given .py file with specified checks and + returns the linting result. + + Args: + file_path (`str`): The path to the Python file to lint. + + Returns: + ServiceResponse: Contains either the output from the flake8 command as + a string if successful, or an error message including the error type. 
+ """ + command = "flake8 --isolated --select=F821,F822,F831," \ + f"E111,E112,E113,E999,E902 {file_path}" + + try: + result = subprocess.run( + command, + shell=True, + check=True, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + text=True, + ) + return ServiceResponse( + status=ServiceExecStatus.SUCCESS, + content=result.stdout.strip() + if result.stdout + else "No lint errors found.", + ) + except subprocess.CalledProcessError as e: + error_message = ( + e.stderr.strip() + if e.stderr + else "An error occurred while linting the file." + ) + return ServiceResponse( + status=ServiceExecStatus.ERROR, + content=error_message, + ) + except Exception as e: + return ServiceResponse( + status=ServiceExecStatus.ERROR, + content=str(e), + ) + + +def write_file( + file_path: str, + content: str, + start_line: int = 0, + end_line: int = -1, +) -> ServiceResponse: + """ + Write content to a file by replacing the current lines between `start_line` and `end_line` with `content`. Default start_line = 0 and end_line = -1. Calling this with no args will replace the whole file, so be sure to use this with caution when writing to a file that already exists. + + Args: + file_path (`str`): The path to the file to write to. + content (`str`): The content to write to the file. + start_line (`Optional[int]`, defaults to `0`): The start line of the file to be replaced with `content`. + end_line (`Optional[int]`, defaults to `-1`): The end line of the file to be replaced with `content`. end_line = -1 means the end of the file, otherwise it should be a positive integer indicating the line number. 
+ """ # noqa + try: + mode = "w" if not os.path.exists(file_path) else "r+" + insert = content.split("\n") + with open(file_path, mode, encoding="utf-8") as file: + if mode != "w": + all_lines = file.readlines() + new_file = [""] if start_line == 0 else all_lines[:start_line] + new_file += [i + "\n" for i in insert] + last_line = end_line + 1 + new_file += [""] if end_line == -1 else all_lines[last_line:] + else: + new_file = [i + "\n" for i in insert] + + file.seek(0) + file.writelines(new_file) + file.truncate() + obs = f'WRITE OPERATION:\nYou have written to "{file_path}" \ + on these lines: {start_line}:{end_line}.' + return ServiceResponse( + status=ServiceExecStatus.SUCCESS, + content=obs + "".join(new_file), + ) + except Exception as e: + error_message = f"{e.__class__.__name__}: {e}" + return ServiceResponse( + status=ServiceExecStatus.ERROR, + content=error_message, + ) + + +def read_file( + file_path: str, + start_line: int = 0, + end_line: int = -1, +) -> ServiceResponse: + """ + Shows a given file's contents starting from `start_line` up to `end_line`. Default: start_line = 0, end_line = -1. By default the whole file will be read. + + Args: + file_path (`str`): The path to the file to read. + start_line (`Optional[int]`, defaults to `0`): The start line of the file to be read. + end_line (`Optional[int]`, defaults to `-1`): The end line of the file to be read. 
+ """ # noqa + start_line = max(start_line, 0) + try: + with open(file_path, "r", encoding="utf-8") as file: + if end_line == -1: + if start_line == 0: + code_view = file.read() + else: + all_lines = file.readlines() + code_slice = all_lines[start_line:] + code_view = "".join(code_slice) + else: + all_lines = file.readlines() + num_lines = len(all_lines) + begin = max(0, min(start_line, num_lines - 2)) + end_line = ( + -1 if end_line > num_lines else max(begin + 1, end_line) + ) + code_slice = all_lines[begin:end_line] + code_view = "".join(code_slice) + return ServiceResponse( + status=ServiceExecStatus.SUCCESS, + content=f"{code_view}", + ) + except Exception as e: + error_message = f"{e.__class__.__name__}: {e}" + return ServiceResponse( + status=ServiceExecStatus.ERROR, + content=error_message, + ) diff --git a/setup.py b/setup.py index 0af62f5fd..2949a488c 100644 --- a/setup.py +++ b/setup.py @@ -22,6 +22,7 @@ "grpcio-tools==1.60.0", "protobuf==4.25.0", "expiringdict", + "dill", ] service_requires = [ @@ -67,11 +68,14 @@ "Flask==3.0.0", "Flask-Cors==4.0.0", "Flask-SocketIO==5.3.6", + "flake8", # TODO: move into other requires "dashscope==1.14.1", "openai>=1.3.0", "ollama>=0.1.7", "google-generativeai>=0.4.0", + "zhipuai", + "litellm", ] distribute_requires = minimal_requires + rpc_requires @@ -124,6 +128,7 @@ "console_scripts": [ "as_studio=agentscope.web.studio.studio:run_app", "as_workflow=agentscope.web.workstation.workflow:main", + "as_server=agentscope.server.launcher:as_server", ], }, ) diff --git a/src/agentscope/_init.py b/src/agentscope/_init.py index eb249ec8e..dff68e585 100644 --- a/src/agentscope/_init.py +++ b/src/agentscope/_init.py @@ -25,8 +25,10 @@ def init( save_dir: str = _DEFAULT_DIR, save_log: bool = True, save_code: bool = True, - save_api_invoke: bool = True, + save_api_invoke: bool = False, + use_monitor: bool = True, logger_level: LOG_LEVEL = _DEFAULT_LOG_LEVEL, + runtime_id: Optional[str] = None, agent_configs: Optional[Union[str, 
list, dict]] = None, ) -> Sequence[AgentBase]: """A unified entry to initialize the package, including model configs, @@ -40,6 +42,9 @@ def init( The project name, which is used to identify the project. name (`Optional[str]`, defaults to `None`): The name for runtime, which is used to identify this runtime. + runtime_id (`Optional[str]`, defaults to `None`): + The id for runtime, which is used to identify this runtime. Use + `None` will generate a random id. save_dir (`str`, defaults to `./runs`): The directory to save logs, files, codes, and api invocations. If `dir` is `None`, when saving logs, files, codes, and api @@ -51,6 +56,8 @@ def init( save_api_invoke (`bool`, defaults to `False`): Whether to save api invocations locally, including model and web search invocation. + use_monitor (`bool`, defaults to `True`): + Whether to activate the monitor. logger_level (`LOG_LEVEL`, defaults to `"INFO"`): The logging level of logger. agent_configs (`Optional[Union[str, list, dict]]`, defaults to `None`): @@ -63,9 +70,11 @@ def init( model_configs=model_configs, project=project, name=name, + runtime_id=runtime_id, save_dir=save_dir, save_api_invoke=save_api_invoke, save_log=save_log, + use_monitor=use_monitor, logger_level=logger_level, ) @@ -117,6 +126,7 @@ def init_process( save_dir: str = _DEFAULT_DIR, save_api_invoke: bool = False, save_log: bool = False, + use_monitor: bool = True, logger_level: LOG_LEVEL = _DEFAULT_LOG_LEVEL, ) -> None: """An entry to initialize the package in a process. @@ -139,17 +149,11 @@ def init_process( A sequence of pre-init model configs. save_log (`bool`, defaults to `False`): Whether to save logs locally. + use_monitor (`bool`, defaults to `True`): + Whether to activate the monitor. logger_level (`LOG_LEVEL`, defaults to `"INFO"`): The logging level of logger. 
""" - # Init logger - dir_log = str(file_manager.dir_log) if save_log else None - setup_logger(dir_log, logger_level) - - # Load model configs if needed - if model_configs is not None: - read_model_configs(model_configs) - # Init the runtime if project is not None: _runtime.project = project @@ -158,8 +162,19 @@ def init_process( if runtime_id is not None: _runtime.runtime_id = runtime_id + # Init logger + dir_log = str(file_manager.dir_log) if save_log else None + setup_logger(dir_log, logger_level) + + # Load model configs if needed + if model_configs is not None: + read_model_configs(model_configs) + # Init file manager and save configs by default file_manager.init(save_dir, save_api_invoke) # Init monitor - _ = MonitorFactory.get_monitor(db_path=file_manager.path_db) + _ = MonitorFactory.get_monitor( + db_path=file_manager.path_db, + impl_type="sqlite" if use_monitor else "dummy", + ) diff --git a/src/agentscope/_version.py b/src/agentscope/_version.py index aea9da892..512ade252 100644 --- a/src/agentscope/_version.py +++ b/src/agentscope/_version.py @@ -1,4 +1,4 @@ # -*- coding: utf-8 -*- """ Version of AgentScope.""" -__version__ = "0.0.4-alpha" +__version__ = "0.0.4" diff --git a/src/agentscope/agents/__init__.py b/src/agentscope/agents/__init__.py index 7ddd84b5f..b4d6a6927 100644 --- a/src/agentscope/agents/__init__.py +++ b/src/agentscope/agents/__init__.py @@ -1,12 +1,12 @@ # -*- coding: utf-8 -*- """ Import all agent related modules in the package. 
""" -from .agent import AgentBase +from .agent import AgentBase, DistConf from .operator import Operator from .dialog_agent import DialogAgent from .dict_dialog_agent import DictDialogAgent from .user_agent import UserAgent from .text_to_image_agent import TextToImageAgent -from .rpc_agent import RpcAgent, RpcAgentServerLauncher +from .rpc_agent import RpcAgent from .react_agent import ReActAgent from .rag_agents import LlamaIndexAgent @@ -19,7 +19,7 @@ "TextToImageAgent", "UserAgent", "ReActAgent", + "DistConf", "RpcAgent", - "RpcAgentServerLauncher", "LlamaIndexAgent", ] diff --git a/src/agentscope/agents/agent.py b/src/agentscope/agents/agent.py index dc32dcd5c..4341952d9 100644 --- a/src/agentscope/agents/agent.py +++ b/src/agentscope/agents/agent.py @@ -7,6 +7,7 @@ from typing import Sequence from typing import Union from typing import Any +from typing import Type import uuid from loguru import logger @@ -15,16 +16,116 @@ from agentscope.memory import TemporaryMemory -class _RecordInitSettingMeta(ABCMeta): - """A wrapper to record the init args into `_init_settings` field.""" +class _AgentMeta(ABCMeta): + """The meta-class for agent. + + 1. record the init args into `_init_settings` field. + 2. register class name into `registry` field. 
+ """ + + def __init__(cls, name: Any, bases: Any, attrs: Any) -> None: + if not hasattr(cls, "_registry"): + cls._registry = {} + else: + if name in cls._registry: + logger.warning( + f"Agent class with name [{name}] already exists.", + ) + else: + cls._registry[name] = cls + super().__init__(name, bases, attrs) def __call__(cls, *args: tuple, **kwargs: dict) -> Any: + to_dist = kwargs.pop("to_dist", False) + if to_dist is True: + to_dist = DistConf() + if to_dist is not False and to_dist is not None: + from .rpc_agent import RpcAgent + + if cls is not RpcAgent and not issubclass(cls, RpcAgent): + return RpcAgent( + name=( + args[0] + if len(args) > 0 + else kwargs["name"] # type: ignore[arg-type] + ), + host=to_dist.pop( # type: ignore[arg-type] + "host", + "localhost", + ), + port=to_dist.pop("port", None), # type: ignore[arg-type] + max_pool_size=to_dist.pop( # type: ignore[arg-type] + "max_pool_size", + 8192, + ), + max_timeout_seconds=to_dist.pop( # type: ignore[arg-type] + "max_timeout_seconds", + 1800, + ), + local_mode=to_dist.pop( # type: ignore[arg-type] + "local_mode", + True, + ), + lazy_launch=to_dist.pop( # type: ignore[arg-type] + "lazy_launch", + True, + ), + agent_id=cls.generate_agent_id(), + connect_existing=False, + agent_class=cls, + agent_configs={ + "args": args, + "kwargs": kwargs, + "class_name": cls.__name__, + }, + ) instance = super().__call__(*args, **kwargs) - instance._init_settings = {"args": args, "kwargs": kwargs} + instance._init_settings = { + "args": args, + "kwargs": kwargs, + "class_name": cls.__name__, + } return instance -class AgentBase(Operator, metaclass=_RecordInitSettingMeta): +class DistConf(dict): + """Distribution configuration for agents.""" + + def __init__( + self, + host: str = "localhost", + port: int = None, + max_pool_size: int = 8192, + max_timeout_seconds: int = 1800, + local_mode: bool = True, + lazy_launch: bool = True, + ): + """Init the distributed configuration. 
+ + Args: + host (`str`, defaults to `"localhost"`): + Hostname of the rpc agent server. + port (`int`, defaults to `None`): + Port of the rpc agent server. + max_pool_size (`int`, defaults to `8192`): + Max number of task results that the server can accommodate. + max_timeout_seconds (`int`, defaults to `1800`): + Timeout for task results. + local_mode (`bool`, defaults to `True`): + Whether the started rpc server only listens to local + requests. + lazy_launch (`bool`, defaults to `True`): + Only launch the server when the agent is called. + """ + self["host"] = host + self["port"] = port + self["max_pool_size"] = max_pool_size + self["max_timeout_seconds"] = max_timeout_seconds + self["local_mode"] = local_mode + self["lazy_launch"] = lazy_launch + + +class AgentBase(Operator, metaclass=_AgentMeta): """Base class for all agents. All agents should inherit from this class and implement the `reply` @@ -40,6 +141,7 @@ def __init__( model_config_name: str = None, use_memory: bool = True, memory_config: Optional[dict] = None, + to_dist: Optional[Union[DistConf, bool]] = False, ) -> None: r"""Initialize an agent from the given arguments. @@ -56,6 +158,31 @@ def __init__( Whether the agent has memory. memory_config (`Optional[dict]`): The config of memory. + to_dist (`Optional[Union[DistConf, bool]]`, default to `False`): + The configurations passed to :py:meth:`to_dist` method. Used in + :py:class:`_AgentMeta`, when this parameter is provided, + the agent will automatically be converted into its distributed + version. Below are some examples: + + .. code-block:: python + + # run as a sub process + agent = XXXAgent( + # ... other parameters + to_dist=True, + ) + + # connect to an existing agent server + agent = XXXAgent( + # ... other parameters + to_dist=DistConf( + host="", + port=, + # other parameters + ), + ) + + See :doc:`Tutorial` for detail. 
""" self.name = name self.memory_config = memory_config @@ -78,6 +205,12 @@ def __init__( # The audience of this agent, which means if this agent generates a # response, it will be passed to all agents in the audience. self._audience = None + # convert to distributed agent, conversion is in `_AgentMeta` + if to_dist is not False and to_dist is not None: + logger.info( + f"Convert {self.__class__.__name__}[{self.name}] into" + " a distributed agent.", + ) @classmethod def generate_agent_id(cls) -> str: @@ -85,6 +218,39 @@ def generate_agent_id(cls) -> str: # TODO: change cls.__name__ into a global unique agent_type return f"{cls.__name__}_{uuid.uuid4().hex}" + # todo: add a unique agent_type field to distinguish different agent class + @classmethod + def get_agent_class(cls, agent_class_name: str) -> Type[AgentBase]: + """Get the agent class based on the specific agent class name. + + Args: + agent_class_name (`str`): the name of the agent class. + + Raises: + ValueError: Agent class name not exits. + + Returns: + Type[AgentBase]: the AgentBase sub-class. + """ + if agent_class_name not in cls._registry: + raise ValueError(f"Agent [{agent_class_name}] not found.") + return cls._registry[agent_class_name] # type: ignore[return-value] + + @classmethod + def register_agent_class(cls, agent_class: Type[AgentBase]) -> None: + """Register the agent class into the registry. + + Args: + agent_class (Type[AgentBase]): the agent class to be registered. + """ + agent_class_name = agent_class.__name__ + if agent_class_name in cls._registry: + logger.info( + f"Agent class with name [{agent_class_name}] already exists.", + ) + else: + cls._registry[agent_class_name] = agent_class + def reply(self, x: dict = None) -> dict: """Define the actions taken by this agent. 
@@ -206,9 +372,9 @@ def to_dist( port: int = None, max_pool_size: int = 8192, max_timeout_seconds: int = 1800, - launch_server: bool = True, local_mode: bool = True, lazy_launch: bool = True, + launch_server: bool = None, ) -> AgentBase: """Convert current agent instance into a distributed version. @@ -218,14 +384,25 @@ def to_dist( port (`int`, defaults to `None`): Port of the rpc agent server. max_pool_size (`int`, defaults to `8192`): - Max number of task results that the server can accommodate. + Only takes effect when `host` and `port` are not filled in. + The max number of agent reply messages that the started agent + server can accommodate. Note that the oldest message will be + deleted after exceeding the pool size. max_timeout_seconds (`int`, defaults to `1800`): - Timeout for task results. + Only takes effect when `host` and `port` are not filled in. + Maximum time for reply messages to be cached in the launched + agent server. Note that expired messages will be deleted. local_mode (`bool`, defaults to `True`): - Whether the started rpc server only listens to local + Only takes effect when `host` and `port` are not filled in. + Whether the started agent server only listens to local requests. lazy_launch (`bool`, defaults to `True`): - Only launch the server when the agent is called. + Only takes effect when `host` and `port` are not filled in. + If `True`, launch the agent server when the agent is called, + otherwise, launch the agent server immediately. + launch_server(`bool`, defaults to `None`): + This field has been deprecated and will be removed in + future releases. Returns: `AgentBase`: the wrapped agent instance with distributed @@ -235,15 +412,20 @@ def to_dist( if issubclass(self.__class__, RpcAgent): return self + if launch_server is not None: + logger.warning( + "`launch_server` has been deprecated and will be removed in " + "future releases. 
When `host` and `port` are not provided, the " + "agent server will be launched automatically.", + ) return RpcAgent( + name=self.name, agent_class=self.__class__, agent_configs=self._init_settings, - name=self.name, host=host, port=port, max_pool_size=max_pool_size, max_timeout_seconds=max_timeout_seconds, - launch_server=launch_server, local_mode=local_mode, lazy_launch=lazy_launch, agent_id=self.agent_id, diff --git a/src/agentscope/agents/dict_dialog_agent.py b/src/agentscope/agents/dict_dialog_agent.py index 0ee8061b3..eb16690e0 100644 --- a/src/agentscope/agents/dict_dialog_agent.py +++ b/src/agentscope/agents/dict_dialog_agent.py @@ -1,59 +1,18 @@ # -*- coding: utf-8 -*- -"""A dict dialog agent that using `parse_func` and `fault_handler` to -parse the model response.""" -import json -from typing import Any, Optional, Callable -from loguru import logger +"""An agent that replies in a dictionary format.""" +from typing import Optional from ..message import Msg from .agent import AgentBase -from ..models import ModelResponse -from ..prompt import PromptType -from ..utils.tools import _convert_to_str - - -def parse_dict(response: ModelResponse) -> ModelResponse: - """Parse function for DictDialogAgent""" - try: - if response.text is not None: - response_dict = json.loads(response.text) - else: - raise ValueError( - f"The text field of the response s None: {response}", - ) - except json.decoder.JSONDecodeError: - # Sometimes LLM may return a response with single quotes, which is not - # a valid JSON format. We replace single quotes with double quotes and - # try to load it again. 
- # TODO: maybe using a more robust json library to handle this case - response_dict = json.loads(response.text.replace("'", '"')) - - return ModelResponse(raw=response_dict) - - -def default_response(response: ModelResponse) -> ModelResponse: - """The default response of fault_handler""" - return ModelResponse(raw={"speak": response.text}) +from ..parsers import ParserBase class DictDialogAgent(AgentBase): """An agent that generates response in a dict format, where user can - specify the required fields in the response via prompt, e.g. - - .. code-block:: python + specify the required fields in the response via specifying the parser - prompt = "... Response in the following format that can be loaded by - python json.loads() - { - "thought": "thought", - "speak": "thoughts summary to say to others", - # ... - }" - - This agent class is an example for using `parse_func` and `fault_handler` - to parse the output from the model, and handling the fault when parsing - fails. We take "speak" as a required field in the response, and print - the speak field as the output response. + About parser, please refer to our + [tutorial](https://modelscope.github.io/agentscope/en/tutorial/203-parser.html) For usage example, please refer to the example of werewolf in `examples/game_werewolf`""" @@ -65,10 +24,7 @@ def __init__( model_config_name: str, use_memory: bool = True, memory_config: Optional[dict] = None, - parse_func: Optional[Callable[..., Any]] = parse_dict, - fault_handler: Optional[Callable[..., Any]] = default_response, max_retries: Optional[int] = 3, - prompt_type: Optional[PromptType] = None, ) -> None: """Initialize the dict dialog agent. @@ -85,19 +41,9 @@ def __init__( Whether the agent has memory. memory_config (`Optional[dict]`, defaults to `None`): The config of memory. - parse_func (`Optional[Callable[..., Any]]`, defaults to `parse_dict`): - The function used to parse the model output, - e.g. `json.loads`, which is used to extract json from the - output. 
- fault_handler (`Optional[Callable[..., Any]]`, defaults to `default_response`): - The function used to handle the fault when parse_func fails - to parse the model output. max_retries (`Optional[int]`, defaults to `None`): The maximum number of retries when failed to parse the model output. - prompt_type (`Optional[PromptType]`, defaults to `PromptType.LIST`): - The type of the prompt organization, chosen from - `PromptType.LIST` or `PromptType.STRING`. """ # noqa super().__init__( name=name, @@ -107,18 +53,17 @@ def __init__( memory_config=memory_config, ) - # record the func and handler for parsing and handling faults - self.parse_func = parse_func - self.fault_handler = fault_handler + self.parser = None self.max_retries = max_retries - if prompt_type is not None: - logger.warning( - "The argument `prompt_type` is deprecated and " - "will be removed in the future.", - ) + def set_parser(self, parser: ParserBase) -> None: + """Set response parser, which will provide 1) format instruction; 2) + response parsing; 3) filtering fields when returning message, storing + message in memory. So developers only need to change the + parser, and the agent will work as expected. + """ + self.parser = parser - # TODO change typing from dict to MSG def reply(self, x: dict = None) -> dict: """Reply function of the agent. 
Processes the input data, generates a prompt using the current @@ -151,42 +96,29 @@ def reply(self, x: dict = None) -> dict: self.memory and self.memory.get_memory() or x, # type: ignore[arg-type] + Msg("system", self.parser.format_instruction, "system"), ) # call llm - response = self.model( + res = self.model( prompt, - parse_func=self.parse_func, - fault_handler=self.fault_handler, + parse_func=self.parser.parse, max_retries=self.max_retries, - ).raw - - # logging raw messages in debug mode - logger.debug(json.dumps(response, indent=4, ensure_ascii=False)) - - # In this agent, if the response is a dict, we treat "speak" as a - # special key, which represents the text to be spoken - if isinstance(response, dict) and "speak" in response: - msg = Msg( - self.name, - response["speak"], - role="assistant", - **response, - ) - else: - msg = Msg(self.name, response, role="assistant") - - # Print/speak the message in this agent's voice - self.speak(msg) + ) - # record to memory - if self.memory: - # Convert the response dict into a string to store in memory - msg_memory = Msg( - name=self.name, - content=_convert_to_str(response), - role="assistant", - ) - self.memory.add(msg_memory) + # Filter the parsed response by keys for storing in memory, returning + # in the reply function, and feeding into the metadata field in the + # returned message object. + self.memory.add( + Msg(self.name, self.parser.to_memory(res.parsed), "assistant"), + ) + + msg = Msg( + self.name, + content=self.parser.to_content(res.parsed), + role="assistant", + metadata=self.parser.to_metadata(res.parsed), + ) + self.speak(msg) return msg diff --git a/src/agentscope/agents/rag_agents.py b/src/agentscope/agents/rag_agents.py index 181eb48ee..99902ff07 100644 --- a/src/agentscope/agents/rag_agents.py +++ b/src/agentscope/agents/rag_agents.py @@ -15,7 +15,7 @@ CHECKING_PROMPT = """ - Does the retrieved content is relevant to the query? + Is the retrieved content relevant to the query? 
Retrieved content: {} Query: {} Only answer YES or NO. @@ -143,7 +143,6 @@ def reply(self, x: dict = None) -> dict: query, ), ) - print(msg) checking = self.model([msg]) logger.info(checking) checking = checking.text.lower() diff --git a/src/agentscope/agents/react_agent.py b/src/agentscope/agents/react_agent.py index 39b6a5d00..cdc81788b 100644 --- a/src/agentscope/agents/react_agent.py +++ b/src/agentscope/agents/react_agent.py @@ -136,6 +136,8 @@ def __init__( "function": service_toolkit.tools_calling_format, }, required_keys=["thought", "speak", "function"], + # Only print the speak field when verbose is False + keys_to_content=True if self.verbose else "speak", ) def reply(self, x: dict = None) -> dict: @@ -155,9 +157,8 @@ def reply(self, x: dict = None) -> dict: "system", self.parser.format_instruction, role="system", + echo=self.verbose, ) - if self.verbose: - self.speak(hint_msg) # Prepare prompt for the model prompt = self.model.format(self.memory.get_memory(), hint_msg) @@ -171,16 +172,21 @@ def reply(self, x: dict = None) -> dict: ) # Record the response in memory - msg_response = Msg(self.name, res.text, "assistant") - self.memory.add(msg_response) + self.memory.add( + Msg( + self.name, + self.parser.to_memory(res.parsed), + "assistant", + ), + ) # Print out the response - if self.verbose: - self.speak(msg_response) - else: - self.speak( - Msg(self.name, res.parsed["speak"], "assistant"), - ) + msg_returned = Msg( + self.name, + self.parser.to_content(res.parsed), + "assistant", + ) + self.speak(msg_returned) # Skip the next steps if no need to call tools # The parsed field is a dictionary @@ -192,7 +198,7 @@ def reply(self, x: dict = None) -> dict: and len(arg_function) == 0 ): # Only the speak field is exposed to users or other agents - return Msg(self.name, res.parsed["speak"], "assistant") + return msg_returned # Only catch the response parsing error and expose runtime # errors to developers for debugging @@ -244,9 +250,8 @@ def reply(self, x: dict 
= None) -> dict: "iterations. Now generate a reply by summarizing the current " "situation.", role="system", + echo=self.verbose, ) - if self.verbose: - self.speak(hint_msg) # Generate a reply by summarizing the current situation prompt = self.model.format(self.memory.get_memory(), hint_msg) diff --git a/src/agentscope/agents/rpc_agent.py b/src/agentscope/agents/rpc_agent.py index 14b55d7bb..306dfa900 100644 --- a/src/agentscope/agents/rpc_agent.py +++ b/src/agentscope/agents/rpc_agent.py @@ -1,60 +1,14 @@ # -*- coding: utf-8 -*- """ Base class for Rpc Agent """ +from typing import Type, Optional, Union, Sequence -from multiprocessing import Process, Event, Pipe, cpu_count -from multiprocessing.synchronize import Event as EventClass -import socket -import threading -import json -import traceback -from typing import Any, Optional, Union, Type, Sequence -from concurrent import futures -from loguru import logger - -try: - import grpc - from grpc import ServicerContext -except ImportError: - grpc = None - ServicerContext = Any - -try: - from expiringdict import ExpiringDict -except ImportError: - ExpiringDict = None - -from agentscope._init import init_process, _INIT_SETTINGS from agentscope.agents.agent import AgentBase from agentscope.message import ( - Msg, PlaceholderMessage, - deserialize, serialize, ) -from agentscope.rpc import ( - RpcAgentClient, - RpcMsg, - RpcAgentServicer, - add_RpcAgentServicer_to_server, -) - - -def rpc_servicer_method( # type: ignore[no-untyped-def] - func, -): - """A decorator used to identify that the specific method is an rpc agent - servicer method, which can only be run in the rpc server process. 
- """ - - def inner(rpc_agent, msg): # type: ignore[no-untyped-def] - if not rpc_agent.is_servicer: - error_msg = f"Detect main process try to use rpc servicer method \ - [{func.__name__}]" - logger.error(error_msg) - raise RuntimeError(error_msg) - return func(rpc_agent, msg) - - return inner +from agentscope.rpc import RpcAgentClient +from agentscope.server.launcher import RpcAgentServerLauncher class RpcAgent(AgentBase): @@ -63,32 +17,29 @@ class RpcAgent(AgentBase): def __init__( self, name: str, - agent_class: Type[AgentBase], - agent_configs: Optional[dict] = None, host: str = "localhost", port: int = None, - launch_server: bool = True, + agent_class: Type[AgentBase] = None, + agent_configs: Optional[dict] = None, max_pool_size: int = 8192, max_timeout_seconds: int = 1800, local_mode: bool = True, lazy_launch: bool = True, agent_id: str = None, - create_with_agent_configs: bool = True, + connect_existing: bool = False, ) -> None: """Initialize a RpcAgent instance. Args: - name (`str`): Name of the agent. - agent_class (`Type[AgentBase]`): - The AgentBase subclass encapsulated by this wrapper. - agent_configs (`dict`, defaults to `None`): The args used to - initialize the agent_class. - host (`str`, defaults to `"localhost"`): + name (`str`): the name of the agent. + host (`str`, defaults to `localhost`): Hostname of the rpc agent server. port (`int`, defaults to `None`): Port of the rpc agent server. - launch_server (`bool`, defaults to `True`): - Whether to launch the gRPC agent server. + agent_class (`Type[AgentBase]`): + the AgentBase subclass of the source agent. + agent_configs (`dict`): The args used to + initialize the agent, generated by `_AgentMeta`. max_pool_size (`int`, defaults to `8192`): Max number of task results that the server can accommodate. max_timeout_seconds (`int`, defaults to `1800`): @@ -101,34 +52,31 @@ def __init__( agent_id (`str`, defaults to `None`): The agent id of this instance. If `None`, it will be generated randomly. 
- create_with_agent_configs (`bool`, defaults to `True`): - Only takes effect when `agent_configs` is provided. - If true, create the agent instance for the agent with - provided `agent_configs`, otherwise uses the agent server's - default parameters. + connect_existing (`bool`, defaults to `False`): + Set to `True`, if the agent is already running on the agent + server. """ super().__init__(name=name) + self.agent_class = agent_class + self.agent_configs = agent_configs self.host = host self.port = port self.server_launcher = None self.client = None + self.connect_existing = connect_existing if agent_id is not None: self._agent_id = agent_id - else: - self._agent_id = agent_class.generate_agent_id() - self.agent_class = agent_class + # if host and port are not provided, launch server locally + launch_server = port is None if launch_server: + self.host = "localhost" self.server_launcher = RpcAgentServerLauncher( - agent_class=agent_class, - agent_args=agent_configs["args"] if agent_configs else None, - agent_kwargs=( - agent_configs["kwargs"] if agent_configs else None - ), - host=host, + host=self.host, port=port, max_pool_size=max_pool_size, max_timeout_seconds=max_timeout_seconds, local_mode=local_mode, + custom_agents=[agent_class], ) if not lazy_launch: self._launch_server() @@ -138,9 +86,8 @@ def __init__( port=self.port, agent_id=self.agent_id, ) - self.client.create_agent( - agent_configs if create_with_agent_configs else None, - ) + if not self.connect_existing: + self.client.create_agent(agent_configs) def _launch_server(self) -> None: """Launch a rpc server and update the port and the client""" @@ -151,6 +98,7 @@ def _launch_server(self) -> None: port=self.port, agent_id=self.agent_id, ) + self.client.create_agent(self.agent_configs) def reply(self, x: dict = None) -> dict: if self.client is None: @@ -203,14 +151,14 @@ def clone_instances( # clone instances without agent server for _ in range(generated_instance_number): + new_agent_id = 
self.client.call_func("_clone_agent") generated_instances.append( RpcAgent( name=self.name, - agent_class=self.agent_class, host=self.host, port=self.port, - launch_server=False, - create_with_agent_configs=False, + agent_id=new_agent_id, + connect_existing=True, ), ) return generated_instances @@ -222,505 +170,3 @@ def stop(self) -> None: def __del__(self) -> None: self.stop() - - -def setup_rpc_agent_server( - agent_class: Type[AgentBase], - agent_args: tuple, - agent_kwargs: dict, - host: str, - port: int, - init_settings: dict = None, - start_event: EventClass = None, - stop_event: EventClass = None, - pipe: int = None, - local_mode: bool = True, - max_pool_size: int = 8192, - max_timeout_seconds: int = 1800, -) -> None: - """Setup gRPC server rpc agent. - - Args: - agent_class (`Type[AgentBase]`): - A subclass of AgentBase. - agent_args (`tuple`): The args tuple used to initialize the - agent_class. - agent_kwargs (`dict`): The args dict used to initialize the - agent_class. - host (`str`, defaults to `"localhost"`): - Hostname of the rpc agent server. - port (`int`): - The socket port monitored by grpc server. - init_settings (`dict`, defaults to `None`): - Init settings for agentscope.init. - start_event (`EventClass`, defaults to `None`): - An Event instance used to determine whether the child process - has been started. - stop_event (`EventClass`, defaults to `None`): - The stop Event instance used to determine whether the child - process has been stopped. - pipe (`int`, defaults to `None`): - A pipe instance used to pass the actual port of the server. - local_mode (`bool`, defaults to `None`): - Only listen to local requests. - max_pool_size (`int`, defaults to `8192`): - Max number of task results that the server can accommodate. - max_timeout_seconds (`int`, defaults to `1800`): - Timeout for task results. 
- """ - - if init_settings is not None: - init_process(**init_settings) - servicer = RpcServerSideWrapper( - agent_class, - agent_args, - agent_kwargs, - host=host, - port=port, - max_pool_size=max_pool_size, - max_timeout_seconds=max_timeout_seconds, - ) - while True: - try: - port = check_port(port) - servicer.port = port - logger.info( - f"Starting rpc server [{agent_class.__name__}] at port" - f" [{port}]...", - ) - server = grpc.server( - futures.ThreadPoolExecutor(max_workers=cpu_count()), - ) - add_RpcAgentServicer_to_server(servicer, server) - if local_mode: - server.add_insecure_port(f"localhost:{port}") - else: - server.add_insecure_port(f"0.0.0.0:{port}") - server.start() - break - except OSError: - logger.warning( - f"Failed to start rpc server at port [{port}]" - f"try another port", - ) - logger.info( - f"rpc server [{agent_class.__name__}] at port [{port}] started " - "successfully", - ) - if start_event is not None: - pipe.send(port) - start_event.set() - stop_event.wait() - logger.info( - f"Stopping rpc server [{agent_class.__name__}] at port [{port}]", - ) - server.stop(1.0).wait() - else: - server.wait_for_termination() - logger.info( - f"rpc server [{agent_class.__name__}] at port [{port}] stopped " - "successfully", - ) - - -def find_available_port() -> int: - """Get an unoccupied socket port number.""" - with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s: - s.bind(("", 0)) - return s.getsockname()[1] - - -def check_port(port: Optional[int] = None) -> int: - """Check if the port is available. - - Args: - port (`int`): - the port number being checked. - - Returns: - `int`: the port number that passed the check. If the port is found - to be occupied, an available port number will be automatically - returned. 
- """ - if port is None: - new_port = find_available_port() - logger.warning( - "gRpc server port is not provided, automatically select " - f"[{new_port}] as the port number.", - ) - return new_port - with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s: - if s.connect_ex(("localhost", port)) == 0: - new_port = find_available_port() - logger.warning( - f"Port [{port}] is occupied, use [{new_port}] instead", - ) - return new_port - return port - - -class RpcAgentServerLauncher: - """Launcher of rpc agent server.""" - - def __init__( - self, - agent_class: Type[AgentBase] = None, - agent_args: tuple = (), - agent_kwargs: dict = None, - host: str = "localhost", - port: int = None, - max_pool_size: int = 8192, - max_timeout_seconds: int = 1800, - local_mode: bool = False, - ) -> None: - """Init a rpc agent server launcher. - - Args: - agent_class (`Type[AgentBase]`, defaults to `None`): - The AgentBase subclass encapsulated by this wrapper. - agent_args (`tuple`): The args tuple used to initialize the - agent_class. - agent_kwargs (`dict`): The args dict used to initialize the - agent_class. - host (`str`, defaults to `"localhost"`): - Hostname of the rpc agent server. - port (`int`, defaults to `None`): - Port of the rpc agent server. - max_pool_size (`int`, defaults to `8192`): - Max number of task results that the server can accommodate. - max_timeout_seconds (`int`, defaults to `1800`): - Timeout for task results. - local_mode (`bool`, defaults to `False`): - Whether the started rpc server only listens to local - requests. 
- """ - self.agent_class = agent_class - self.agent_args = agent_args - self.agent_kwargs = agent_kwargs - self.host = host - self.port = check_port(port) - self.max_pool_size = max_pool_size - self.max_timeout_seconds = max_timeout_seconds - self.local_mode = local_mode - self.server = None - self.stop_event = None - self.parent_con = None - - def _launch_in_main(self) -> None: - """Launch gRPC server in main-process""" - server_thread = threading.Thread( - target=setup_rpc_agent_server, - kwargs={ - "agent_class": self.agent_class, - "agent_args": self.agent_args, - "agent_kwargs": self.agent_kwargs, - "host": self.host, - "port": self.port, - "max_pool_size": self.max_pool_size, - "max_timeout_seconds": self.max_timeout_seconds, - "local_mode": self.local_mode, - }, - ) - server_thread.start() - logger.info( - f"Launch [{self.agent_class.__name__}] server at " - f"[{self.host}:{self.port}] success", - ) - server_thread.join() - - def _launch_in_sub(self) -> None: - """Launch gRPC server in sub-process.""" - self.stop_event = Event() - self.parent_con, child_con = Pipe() - start_event = Event() - server_process = Process( - target=setup_rpc_agent_server, - kwargs={ - "agent_class": self.agent_class, - "agent_args": self.agent_args, - "agent_kwargs": self.agent_kwargs, - "host": self.host, - "port": self.port, - "init_settings": _INIT_SETTINGS, - "start_event": start_event, - "stop_event": self.stop_event, - "pipe": child_con, - "max_pool_size": self.max_pool_size, - "max_timeout_seconds": self.max_timeout_seconds, - "local_mode": self.local_mode, - }, - ) - server_process.start() - self.port = self.parent_con.recv() - start_event.wait() - self.server = server_process - logger.info( - f"Launch [{self.agent_class.__name__}] server at " - f"[{self.host}:{self.port}] success", - ) - - def launch(self, in_subprocess: bool = True) -> None: - """launch a rpc agent server. - - Args: - in_subprocess (bool, optional): launch the server in subprocess. - Defaults to True. 
For agents that need to obtain command line - input, such as UserAgent, please set this value to False. - """ - if in_subprocess: - self._launch_in_sub() - else: - self._launch_in_main() - - def wait_until_terminate(self) -> None: - """Wait for server process""" - if self.server is not None: - self.server.join() - - def shutdown(self) -> None: - """Shutdown the rpc agent server.""" - if self.server is not None: - if self.stop_event is not None: - self.stop_event.set() - self.stop_event = None - self.server.join() - if self.server.is_alive(): - self.server.kill() - logger.info( - f"Rpc server [{self.agent_class.__name__}] at port" - f" [{self.port}] is killed.", - ) - self.server = None - - -class RpcServerSideWrapper(RpcAgentServicer): - """A wrapper to extend an AgentBase into a gRPC Servicer.""" - - def __init__( - self, - agent_class: Type[AgentBase], - agent_args: tuple, - agent_kwargs: dict, - host: str = "localhost", - port: int = None, - max_pool_size: int = 8192, - max_timeout_seconds: int = 1800, - ): - """Init the service side wrapper. - - Args: - agent_class (`Type[AgentBase]`): The AgentBase subclass - encapsulated by this wrapper. - agent_args (`tuple`): The args tuple used to initialize the - agent_class. - agent_kwargs (`dict`): The args dict used to initialize the - agent_class. - host (`str`, defaults to "localhost"): - Hostname of the rpc agent server. - port (`int`, defaults to `None`): - Port of the rpc agent server. - max_pool_size (`int`, defaults to `8192`): - The max number of task results that the server can - accommodate. Note that the oldest result will be deleted - after exceeding the pool size. - max_timeout_seconds (`int`, defaults to `1800`): - Timeout for task results. Note that expired results will be - deleted. 
- """ - self.agent_class = agent_class - self.agent_args = agent_args - self.agent_kwargs = agent_kwargs - self.host = host - self.port = port - self.result_pool = ExpiringDict( - max_len=max_pool_size, - max_age_seconds=max_timeout_seconds, - ) - self.executor = futures.ThreadPoolExecutor(max_workers=cpu_count()) - self.task_id_lock = threading.Lock() - self.agent_id_lock = threading.Lock() - self.task_id_counter = 0 - self.agent_pool: dict[str, AgentBase] = {} - - def get_task_id(self) -> int: - """Get the auto-increment task id.""" - with self.task_id_lock: - self.task_id_counter += 1 - return self.task_id_counter - - def check_and_generate_agent( - self, - agent_id: str, - agent_configs: dict = None, - ) -> None: - """ - Check whether the agent exists, and create new agent instance - for new agent. - - Args: - agent_id (`str`): the agent id. - """ - with self.agent_id_lock: - if agent_id not in self.agent_pool: - if agent_configs is not None: - agent_instance = self.agent_class( - *agent_configs["args"], - **agent_configs["kwargs"], - ) - else: - agent_instance = self.agent_class( - *self.agent_args, - **self.agent_kwargs, - ) - agent_instance._agent_id = agent_id # pylint: disable=W0212 - self.agent_pool[agent_id] = agent_instance - logger.info(f"create agent instance [{agent_id}]") - - def check_and_delete_agent(self, agent_id: str) -> None: - """ - Check whether the agent exists, and delete the agent instance - for the agent_id. - - Args: - agent_id (`str`): the agent id. 
- """ - with self.agent_id_lock: - if agent_id in self.agent_pool: - self.agent_pool.pop(agent_id) - logger.info(f"delete agent instance [{agent_id}]") - - def call_func(self, request: RpcMsg, _: ServicerContext) -> RpcMsg: - """Call the specific servicer function.""" - if hasattr(self, request.target_func): - if request.target_func not in ["_create_agent", "_get"]: - self.check_and_generate_agent(request.agent_id) - return getattr(self, request.target_func)(request) - else: - # TODO: support other user defined method - logger.error(f"Unsupported method {request.target_func}") - return RpcMsg( - value=Msg( - name=self.agent_pool[request.agent_id].name, - content=f"Unsupported method {request.target_func}", - role="assistant", - ).serialize(), - ) - - def _reply(self, request: RpcMsg) -> RpcMsg: - """Call function of RpcAgentService - - Args: - request (`RpcMsg`): - Message containing input parameters or input parameter - placeholders. - - Returns: - `RpcMsg`: A serialized Msg instance with attributes name, host, - port and task_id - """ - if request.value: - msg = deserialize(request.value) - else: - msg = None - task_id = self.get_task_id() - self.result_pool[task_id] = threading.Condition() - self.executor.submit( - self.process_messages, - task_id, - request.agent_id, - msg, # type: ignore[arg-type] - ) - return RpcMsg( - value=Msg( - name=self.agent_pool[request.agent_id].name, - content=None, - task_id=task_id, - ).serialize(), - ) - - def _get(self, request: RpcMsg) -> RpcMsg: - """Get function of RpcAgentService - - Args: - request (`RpcMsg`): - Identifier of message, with json format:: - - { - 'task_id': int - } - - Returns: - `RpcMsg`: Concrete values of the specific message (or part of it). 
- """ - msg = json.loads(request.value) - while True: - result = self.result_pool.get(msg["task_id"]) - if isinstance(result, threading.Condition): - with result: - result.wait(timeout=1) - else: - break - return RpcMsg(value=result.serialize()) - - def _observe(self, request: RpcMsg) -> RpcMsg: - """Observe function of RpcAgentService - - Args: - request (`RpcMsg`): - The serialized input to be observed. - - Returns: - `RpcMsg`: Empty RpcMsg. - """ - msgs = deserialize(request.value) - for msg in msgs: - if isinstance(msg, PlaceholderMessage): - msg.update_value() - self.agent_pool[request.agent_id].observe(msgs) - return RpcMsg() - - def _create_agent(self, request: RpcMsg) -> RpcMsg: - """Create a new agent instance for the agent_id. - - Args: - request (RpcMsg): request message with a `agent_id` field. - """ - self.check_and_generate_agent( - request.agent_id, - agent_configs=json.loads(request.value) if request.value else None, - ) - return RpcMsg() - - def _delete_agent(self, request: RpcMsg) -> RpcMsg: - """Delete the agent instance of the specific sesssion_id. - - Args: - request (RpcMsg): request message with a `agent_id` field. 
- """ - self.check_and_delete_agent(request.agent_id) - return RpcMsg() - - def process_messages( - self, - task_id: int, - agent_id: str, - task_msg: dict = None, - ) -> None: - """Task processing.""" - if isinstance(task_msg, PlaceholderMessage): - task_msg.update_value() - cond = self.result_pool[task_id] - try: - result = self.agent_pool[agent_id].reply(task_msg) - self.result_pool[task_id] = result - except Exception: - error_msg = traceback.format_exc() - logger.error(f"Error in agent [{agent_id}]:\n{error_msg}") - self.result_pool[task_id] = Msg( - name="ERROR", - role="assistant", - __status="ERROR", - content=f"Error in agent [{agent_id}]:\n{error_msg}", - ) - with cond: - cond.notify_all() diff --git a/src/agentscope/agents/user_agent.py b/src/agentscope/agents/user_agent.py index 38bd46de1..ee97a935f 100644 --- a/src/agentscope/agents/user_agent.py +++ b/src/agentscope/agents/user_agent.py @@ -81,7 +81,9 @@ def reply( # Input url of file, image, video, audio or website url = None if self.require_url: - url = input("URL: ") + url = input("URL (or Enter to skip): ") + if url == "": + url = None # Add additional keys msg = Msg( diff --git a/src/agentscope/memory/memory.py b/src/agentscope/memory/memory.py index e7671a12c..14b82ee25 100644 --- a/src/agentscope/memory/memory.py +++ b/src/agentscope/memory/memory.py @@ -7,11 +7,13 @@ """ from abc import ABC, abstractmethod -from typing import Iterable +from typing import Iterable, Sequence from typing import Optional from typing import Union from typing import Callable +from ..message import MessageBase + class MemoryBase(ABC): """Base class for memory.""" @@ -33,6 +35,8 @@ def __init__( def update_config(self, config: dict) -> None: """ Configure memory as specified in config + Args: + config (`dict`): Configuration of resetting this memory """ self.config = config @@ -43,14 +47,29 @@ def get_memory( filter_func: Optional[Callable[[int, dict], bool]] = None, ) -> list: """ - Return a certain range (`recent_n` 
or all) of memory, filtered by - `filter_func` + Return a certain range (`recent_n` or all) of memory, + filtered by `filter_func` + Args: + recent_n (int, optional): + indicate the most recent N memory pieces to be returned. + filter_func (Optional[Callable[[int, dict], bool]]): + filter function to decide which pieces of memory should + be returned, taking the index and a piece of memory as + input and return True (return this memory) or False + (does not return) """ @abstractmethod - def add(self, memories: Union[list[dict], dict, None]) -> None: + def add( + self, + memories: Union[Sequence[dict], dict, None], + ) -> None: """ Adding new memory fragment, depending on how the memory are stored + Args: + memories (Union[Sequence[dict], dict, None]): + Memories to be added. If the memory is not in MessageBase, + it will first be converted into a message type. """ @abstractmethod @@ -58,26 +77,48 @@ def delete(self, index: Union[Iterable, int]) -> None: """ Delete memory fragment, depending on how the memory are stored and matched + Args: + index (Union[Iterable, int]): + indices of the memory fragments to delete """ @abstractmethod def load( self, - memories: Union[str, dict, list], + memories: Union[str, list[MessageBase], MessageBase], overwrite: bool = False, ) -> None: """ Load memory, depending on how the memory are passed, design to load from both file or dict + Args: + memories (Union[str, list[MessageBase], MessageBase]): + memories to be loaded. + If it is in str type, it will be first checked if it is a + file; otherwise it will be deserialized as messages. + Otherwise, memories must be either in message type or list + of messages. + overwrite (bool): + if True, clear the current memory before loading the new ones; + if False, memories will be appended to the old one at the end. 
""" @abstractmethod def export( self, - to_mem: bool = False, file_path: Optional[str] = None, + to_mem: bool = False, ) -> Optional[list]: - """Export memory, depending on how the memory are stored""" + """ + Export memory, depending on how the memory are stored + Args: + file_path (Optional[str]): + file path to save the memory to. + to_mem (Optional[str]): + if True, just return the list of messages in memory + Notice: this method prevents file_path is None when to_mem + is False. + """ @abstractmethod def clear(self) -> None: diff --git a/src/agentscope/memory/temporary_memory.py b/src/agentscope/memory/temporary_memory.py index b06de508a..356fa4d96 100644 --- a/src/agentscope/memory/temporary_memory.py +++ b/src/agentscope/memory/temporary_memory.py @@ -16,6 +16,14 @@ from ..models import load_model_by_config_name from ..service.retrieval.retrieval_from_list import retrieve_from_list from ..service.retrieval.similarity import Embedding +from ..message import ( + deserialize, + serialize, + MessageBase, + Msg, + Tht, + PlaceholderMessage, +) class TemporaryMemory(MemoryBase): @@ -28,6 +36,16 @@ def __init__( config: Optional[dict] = None, embedding_model: Union[str, Callable] = None, ) -> None: + """ + Temporary memory module for conversation. + Args: + config (dict): + configuration of the memory + embedding_model (Union[str, Callable]) + if the temporary memory needs to be embedded, + then either pass the name of embedding model or + the embedding model itself. + """ super().__init__(config) self._content = [] @@ -43,10 +61,20 @@ def add( memories: Union[Sequence[dict], dict, None], embed: bool = False, ) -> None: + # pylint: disable=too-many-branches + """ + Adding new memory fragment, depending on how the memory are stored + Args: + memories (Union[Sequence[dict], dict, None]): + memories to be added. If the memory is not in MessageBase, + it will first be converted into a message type. 
+ embed (bool): + whether to generate embedding for the new added memories + """ if memories is None: return - if not isinstance(memories, list): + if not isinstance(memories, Sequence): record_memories = [memories] else: record_memories = memories @@ -54,6 +82,27 @@ def add( # if memory doesn't have id attribute, we skip the checking memories_idx = set(_.id for _ in self._content if hasattr(_, "id")) for memory_unit in record_memories: + if not issubclass(type(memory_unit), MessageBase): + try: + if ( + "name" in memory_unit + and memory_unit["name"] == "thought" + ): + memory_unit = Tht(**memory_unit) + else: + memory_unit = Msg(**memory_unit) + except Exception as exc: + raise ValueError( + f"Cannot add {memory_unit} to memory, " + f"must be with subclass of MessageBase", + ) from exc + + # in case this is a PlaceholderMessage, try to update + # the values first + if isinstance(memory_unit, PlaceholderMessage): + memory_unit.update_value() + memory_unit = Msg(**memory_unit) + # add to memory if it's new if ( not hasattr(memory_unit, "id") @@ -71,6 +120,13 @@ def add( self._content.append(memory_unit) def delete(self, index: Union[Iterable, int]) -> None: + """ + Delete memory fragment, depending on how the memory are stored + and matched + Args: + index (Union[Iterable, int]): + indices of the memory fragments to delete + """ if self.size() == 0: logger.warning( "The memory is empty, and the delete operation is " @@ -101,16 +157,26 @@ def delete(self, index: Union[Iterable, int]) -> None: def export( self, - to_mem: bool = False, file_path: Optional[str] = None, + to_mem: bool = False, ) -> Optional[list]: - """Export memory to json file""" + """ + Export memory, depending on how the memory are stored + Args: + file_path (Optional[str]): + file path to save the memory to. The messages will + be serialized and written to the file. 
+ to_mem (Optional[str]): + if True, just return the list of messages in memory + Notice: this method prevents file_path is None when to_mem + is False. + """ if to_mem: return self._content if to_mem is False and file_path is not None: with open(file_path, "w", encoding="utf-8") as f: - json.dump(self._content, f, indent=4) + f.write(serialize(self._content)) else: raise NotImplementedError( "file type only supports " @@ -120,16 +186,30 @@ def export( def load( self, - memories: Union[str, dict, list], + memories: Union[str, list[MessageBase], MessageBase], overwrite: bool = False, ) -> None: + """ + Load memory, depending on how the memory are passed, design to load + from both file or dict + Args: + memories (Union[str, list[MessageBase], MessageBase]): + memories to be loaded. + If it is in str type, it will be first checked if it is a + file; otherwise it will be deserialized as messages. + Otherwise, memories must be either in message type or list + of messages. + overwrite (bool): + if True, clear the current memory before loading the new ones; + if False, memories will be appended to the old one at the end. 
+ """ if isinstance(memories, str): if os.path.isfile(memories): with open(memories, "r", encoding="utf-8") as f: - self.add(json.load(f)) + load_memories = deserialize(f.read()) else: try: - load_memories = json.loads(memories) + load_memories = deserialize(memories) if not isinstance(load_memories, dict) and not isinstance( load_memories, list, diff --git a/src/agentscope/message.py b/src/agentscope/message.py index 9327ca476..372d6e624 100644 --- a/src/agentscope/message.py +++ b/src/agentscope/message.py @@ -91,6 +91,28 @@ def serialize(self) -> str: class Msg(MessageBase): """The Message class.""" + id: str + """The id of the message.""" + + name: str + """The name of who send the message.""" + + content: Any + """The content of the message.""" + + role: Literal["system", "user", "assistant"] + """The role of the message sender.""" + + metadata: Optional[dict] + """Save the information for application's control flow, or other + purposes.""" + + url: Optional[Union[Sequence[str], str]] + """A url to file, image, video, audio or website.""" + + timestamp: str + """The timestamp of the message.""" + def __init__( self, name: str, @@ -99,6 +121,7 @@ def __init__( url: Optional[Union[Sequence[str], str]] = None, timestamp: Optional[str] = None, echo: bool = False, + metadata: Optional[Union[dict, str]] = None, **kwargs: Any, ) -> None: """Initialize the message object @@ -117,6 +140,11 @@ def __init__( timestamp (`Optional[str]`, defaults to `None`): The timestamp of the message, if None, it will be set to current time. + echo (`bool`, defaults to `False`): + Whether to print the message to the console. + metadata (`Optional[Union[dict, str]]`, defaults to `None`): + Save the information for application's control flow, or other + purposes. **kwargs (`Any`): Other attributes of the message. 
""" @@ -134,6 +162,7 @@ def __init__( role=role or "assistant", url=url, timestamp=timestamp, + metadata=metadata, **kwargs, ) if echo: @@ -192,12 +221,18 @@ def __init__( self, content: Any, timestamp: Optional[str] = None, + **kwargs: Any, ) -> None: + if "name" in kwargs: + kwargs.pop("name") + if "role" in kwargs: + kwargs.pop("role") super().__init__( name="thought", content=content, role="assistant", timestamp=timestamp, + **kwargs, ) def to_str(self) -> str: @@ -286,7 +321,11 @@ def __init__( self._port: int = port self._task_id: int = task_id else: - self._stub = call_in_thread(client, x, "_reply") + self._stub = call_in_thread( + client, + x.serialize() if x is not None else "", + "_reply", + ) self._host = client.host self._port = client.port self._task_id = None @@ -344,7 +383,15 @@ def update_value(self) -> MessageBase: def __update_task_id(self) -> None: if self._stub is not None: - resp = deserialize(self._stub.get_response()) + try: + resp = deserialize(self._stub.get_response()) + except Exception as e: + logger.error( + f"Failed to get task_id: {self._stub.get_response()}", + ) + raise ValueError( + f"Failed to get task_id: {self._stub.get_response()}", + ) from e self._task_id = resp["task_id"] # type: ignore[call-overload] self._stub = None @@ -379,7 +426,7 @@ def serialize(self) -> str: } -def deserialize(s: str) -> Union[MessageBase, Sequence]: +def deserialize(s: Union[str, bytes]) -> Union[MessageBase, Sequence]: """Deserialize json string into MessageBase""" js_msg = json.loads(s) msg_type = js_msg.pop("__type") @@ -387,7 +434,7 @@ def deserialize(s: str) -> Union[MessageBase, Sequence]: return [deserialize(s) for s in js_msg["__value"]] elif msg_type not in _MSGS: raise NotImplementedError( - "Deserialization of {msg_type} is not supported.", + f"Deserialization of {msg_type} is not supported.", ) return _MSGS[msg_type](**js_msg) diff --git a/src/agentscope/models/__init__.py b/src/agentscope/models/__init__.py index 5ecc4b317..832829993 
100644 --- a/src/agentscope/models/__init__.py +++ b/src/agentscope/models/__init__.py @@ -33,6 +33,13 @@ GeminiChatWrapper, GeminiEmbeddingWrapper, ) +from .zhipu_model import ( + ZhipuAIChatWrapper, + ZhipuAIEmbeddingWrapper, +) +from .litellm_model import ( + LiteLLMChatWrapper, +) __all__ = [ @@ -53,6 +60,9 @@ "OllamaGenerationWrapper", "GeminiChatWrapper", "GeminiEmbeddingWrapper", + "ZhipuAIChatWrapper", + "ZhipuAIEmbeddingWrapper", + "LiteLLMChatWrapper", "load_model_by_config_name", "read_model_configs", "clear_model_configs", diff --git a/src/agentscope/models/dashscope_model.py b/src/agentscope/models/dashscope_model.py index 4fd380de3..c4183aa85 100644 --- a/src/agentscope/models/dashscope_model.py +++ b/src/agentscope/models/dashscope_model.py @@ -11,7 +11,7 @@ try: import dashscope -except ModuleNotFoundError: +except ImportError: dashscope = None from .model import ModelWrapperBase, ModelResponse diff --git a/src/agentscope/models/litellm_model.py b/src/agentscope/models/litellm_model.py new file mode 100644 index 000000000..7a9309c07 --- /dev/null +++ b/src/agentscope/models/litellm_model.py @@ -0,0 +1,254 @@ +# -*- coding: utf-8 -*- +"""Model wrapper based on litellm https://docs.litellm.ai/docs/""" +from abc import ABC +from typing import Union, Any, List, Sequence + +from loguru import logger + +from .model import ModelWrapperBase, ModelResponse +from ..message import MessageBase +from ..utils.tools import _convert_to_str + +try: + import litellm +except ImportError: + litellm = None + + +class LiteLLMWrapperBase(ModelWrapperBase, ABC): + """The model wrapper based on LiteLLM API.""" + + def __init__( + self, + config_name: str, + model_name: str = None, + generate_args: dict = None, + **kwargs: Any, + ) -> None: + """ + To use the LiteLLM wrapper, environment variables must be set. + Different model_name could be using different environment variables. 
+ For example: + - for model_name: "gpt-3.5-turbo", you need to set "OPENAI_API_KEY" + ``` + os.environ["OPENAI_API_KEY"] = "your-api-key" + ``` + - for model_name: "claude-2", you need to set "ANTHROPIC_API_KEY" + - for Azure OpenAI, you need to set "AZURE_API_KEY", + "AZURE_API_BASE", "AZURE_API_VERSION" + You should refer to the docs in https://docs.litellm.ai/docs/ . + Args: + config_name (`str`): + The name of the model config. + model_name (`str`, default `None`): + The name of the model to use in OpenAI API. + generate_args (`dict`, default `None`): + The extra keyword arguments used in litellm api generation, + e.g. `temperature`, `seed`. + For generate_args, please refer to + https://docs.litellm.ai/docs/completion/input + for more details. + + """ + + if model_name is None: + model_name = config_name + logger.warning("model_name is not set, use config_name instead.") + + super().__init__(config_name=config_name) + + if litellm is None: + raise ImportError( + "Cannot import litellm package in current python environment. " + "You should try: " + "1. Install litellm by `pip install litellm`. " + "2. If you still have import error, you should try to " + "update the openai to higher version, e.g. " + "by running `pip install openai==1.25.1`", + ) + + self.model_name = model_name + self.generate_args = generate_args or {} + self._register_default_metrics() + + def format( + self, + *args: Union[MessageBase, Sequence[MessageBase]], + ) -> Union[List[dict], str]: + raise RuntimeError( + f"Model Wrapper [{type(self).__name__}] doesn't " + f"need to format the input. Please try to use the " + f"model wrapper directly.", + ) + + +class LiteLLMChatWrapper(LiteLLMWrapperBase): + """The model wrapper based on litellm chat API. + To use the LiteLLM wrapper, environment variables must be set. + Different model_name could be using different environment variables. 
+ For example: + - for model_name: "gpt-3.5-turbo", you need to set "OPENAI_API_KEY" + ``` + os.environ["OPENAI_API_KEY"] = "your-api-key" + ``` + - for model_name: "claude-2", you need to set "ANTHROPIC_API_KEY" + - for Azure OpenAI, you need to set "AZURE_API_KEY", + "AZURE_API_BASE", "AZURE_API_VERSION" + You should refer to the docs in https://docs.litellm.ai/docs/ . + """ + + model_type: str = "litellm_chat" + + def _register_default_metrics(self) -> None: + # Set monitor accordingly + # TODO: set quota to the following metrics + self.monitor.register( + self._metric("call_counter"), + metric_unit="times", + ) + self.monitor.register( + self._metric("prompt_tokens"), + metric_unit="token", + ) + self.monitor.register( + self._metric("completion_tokens"), + metric_unit="token", + ) + self.monitor.register( + self._metric("total_tokens"), + metric_unit="token", + ) + + def __call__( + self, + messages: list, + **kwargs: Any, + ) -> ModelResponse: + """ + Args: + messages (`list`): + A list of messages to process. + **kwargs (`Any`): + The keyword arguments to litellm chat completions API, + e.g. `temperature`, `max_tokens`, `top_p`, etc. Please refer to + https://docs.litellm.ai/docs/completion/input + for more detailed arguments. + + Returns: + `ModelResponse`: + The response text in text field, and the raw response in + raw field. 
+ """ + + # step1: prepare keyword arguments + kwargs = {**self.generate_args, **kwargs} + + # step2: checking messages + if not isinstance(messages, list): + raise ValueError( + "LiteLLM `messages` field expected type `list`, " + f"got `{type(messages)}` instead.", + ) + if not all("role" in msg and "content" in msg for msg in messages): + raise ValueError( + "Each message in the 'messages' list must contain a 'role' " + "and 'content' key for LiteLLM API.", + ) + + # step3: forward to generate response + response = litellm.completion( + model=self.model_name, + messages=messages, + **kwargs, + ) + + # step4: record the api invocation if needed + self._save_model_invocation( + arguments={ + "model": self.model_name, + "messages": messages, + **kwargs, + }, + response=response.model_dump(), + ) + + # step5: update monitor accordingly + self.update_monitor(call_counter=1, **response.usage.model_dump()) + + # step6: return response + return ModelResponse( + text=response.choices[0].message.content, + raw=response.model_dump(), + ) + + def format( + self, + *args: Union[MessageBase, Sequence[MessageBase]], + ) -> List[dict]: + """Format the input string and dictionary into the unified format. + Note that the format function might not be the optimal way to construct + prompt for every model, but a common way to do so. + Developers are encouraged to implement their own prompt + engineering strategies if they have strong performance concerns. + + Args: + args (`Union[MessageBase, Sequence[MessageBase]]`): + The input arguments to be formatted, where each argument + should be a `Msg` object, or a list of `Msg` objects. + In distribution, placeholder is also allowed. + Returns: + `List[dict]`: + The formatted messages in the format that LiteLLM chat API + requires. 
+ """ + + # Parse all information into a list of messages + input_msgs = [] + for _ in args: + if _ is None: + continue + if isinstance(_, MessageBase): + input_msgs.append(_) + elif isinstance(_, list) and all( + isinstance(__, MessageBase) for __ in _ + ): + input_msgs.extend(_) + else: + raise TypeError( + f"The input should be a Msg object or a list " + f"of Msg objects, got {type(_)}.", + ) + + # record dialog history as a list of strings + system_content_template = [] + dialogue = [] + for i, unit in enumerate(input_msgs): + if i == 0 and unit.role == "system": + # system prompt + system_prompt = _convert_to_str(unit.content) + if not system_prompt.endswith("\n"): + system_prompt += "\n" + system_content_template.append(system_prompt) + else: + # Merge all messages into a dialogue history prompt + dialogue.append( + f"{unit.name}: {_convert_to_str(unit.content)}", + ) + + if len(dialogue) != 0: + dialogue_history = "\n".join(dialogue) + + system_content_template.extend( + ["## Dialogue History", dialogue_history], + ) + + system_content = "\n".join(system_content_template) + + messages = [ + { + "role": "user", + "content": system_content, + }, + ] + + return messages diff --git a/src/agentscope/models/ollama_model.py b/src/agentscope/models/ollama_model.py index 31b136dcb..b4157c586 100644 --- a/src/agentscope/models/ollama_model.py +++ b/src/agentscope/models/ollama_model.py @@ -3,8 +3,6 @@ from abc import ABC from typing import Sequence, Any, Optional, List, Union -from loguru import logger - from agentscope.message import MessageBase from agentscope.models import ModelWrapperBase, ModelResponse from agentscope.utils.tools import _convert_to_str @@ -170,10 +168,43 @@ def format( self, *args: Union[MessageBase, Sequence[MessageBase]], ) -> List[dict]: - """A basic strategy to format the input into the required format of - Ollama Chat API. + """Format the messages for ollama Chat API. 
+ + All messages will be formatted into a single system message with + system prompt and dialogue history. + + Note: + 1. This strategy maybe not suitable for all scenarios, + and developers are encouraged to implement their own prompt + engineering strategies. + 2. For ollama chat api, the content field shouldn't be empty string. + + Example: + + .. code-block:: python + + prompt = model.format( + Msg("system", "You're a helpful assistant", role="system"), + Msg("Bob", "Hi, how can I help you?", role="assistant"), + Msg("user", "What's the date today?", role="user") + ) + + The prompt will be as follows: + + .. code-block:: python + + [ + { + "role": "user", + "content": ( + "You're a helpful assistant\\n\\n" + "## Dialogue History\\n" + "Bob: Hi, how can I help you?\\n" + "user: What's the date today?" + ) + } + ] - Note for ollama chat api, the content field shouldn't be empty string. Args: args (`Union[MessageBase, Sequence[MessageBase]]`): @@ -185,39 +216,63 @@ def format( `List[dict]`: The formatted messages. """ - ollama_msgs = [] - for msg in args: - if msg is None: - continue - if isinstance(msg, MessageBase): - # content shouldn't be empty string - if msg.content == "": - logger.warning( - "In ollama chat API, the content field cannot be " - "empty string. 
To avoid error, the empty string is " - "replaced by a blank space automatically, but the " - "model may not work as expected.", - ) - msg.content = " " - - ollama_msg = { - "role": msg.role, - "content": _convert_to_str(msg.content), - } - - # image url - if msg.url is not None: - ollama_msg["images"] = [msg.url] - ollama_msgs.append(ollama_msg) - elif isinstance(msg, list): - ollama_msgs.extend(self.format(*msg)) + # Parse all information into a list of messages + input_msgs = [] + for _ in args: + if _ is None: + continue + if isinstance(_, MessageBase): + input_msgs.append(_) + elif isinstance(_, list) and all( + isinstance(__, MessageBase) for __ in _ + ): + input_msgs.extend(_) else: raise TypeError( - f"Invalid message type: {type(msg)}, `Msg` is expected.", + f"The input should be a Msg object or a list " + f"of Msg objects, got {type(_)}.", ) - return ollama_msgs + # record dialog history as a list of strings + system_content_template = [] + dialogue = [] + # TODO: here we default the url links to images + images = [] + for i, unit in enumerate(input_msgs): + if i == 0 and unit.role == "system": + # system prompt + system_prompt = _convert_to_str(unit.content) + if not system_prompt.endswith("\n"): + system_prompt += "\n" + system_content_template.append(system_prompt) + else: + # Merge all messages into a dialogue history prompt + dialogue.append( + f"{unit.name}: {_convert_to_str(unit.content)}", + ) + + if unit.url is not None: + images.append(unit.url) + + if len(dialogue) != 0: + dialogue_history = "\n".join(dialogue) + + system_content_template.extend( + ["## Dialogue History", dialogue_history], + ) + + system_content = "\n".join(system_content_template) + + system_message = { + "role": "system", + "content": system_content, + } + + if len(images) != 0: + system_message["images"] = images + + return [system_message] class OllamaEmbeddingWrapper(OllamaWrapperBase): diff --git a/src/agentscope/models/openai_model.py 
b/src/agentscope/models/openai_model.py index 2f74e101d..99542582b 100644 --- a/src/agentscope/models/openai_model.py +++ b/src/agentscope/models/openai_model.py @@ -1,14 +1,14 @@ # -*- coding: utf-8 -*- """Model wrapper for OpenAI models""" from abc import ABC -from typing import Union, Any, List, Sequence +from typing import Union, Any, List, Sequence, Dict from loguru import logger from .model import ModelWrapperBase, ModelResponse from ..file_manager import file_manager from ..message import MessageBase -from ..utils.tools import _convert_to_str +from ..utils.tools import _convert_to_str, _to_openai_image_url try: import openai @@ -107,6 +107,9 @@ class OpenAIChatWrapper(OpenAIWrapperBase): deprecated_model_type: str = "openai" + substrings_in_vision_models_names = ["gpt-4-turbo", "vision", "gpt-4o"] + """The substrings in the model names of vision models.""" + def _register_default_metrics(self) -> None: # Set monitor accordingly # TODO: set quota to the following metrics @@ -212,6 +215,77 @@ def __call__( raw=response.model_dump(), ) + def _format_msg_with_url( + self, + msg: MessageBase, + ) -> Dict: + """Format a message with image urls into openai chat format. + This format method is used for gpt-4o, gpt-4-turbo, gpt-4-vision and + other vision models. + """ + # Check if the model is a vision model + if not any( + _ in self.model_name + for _ in self.substrings_in_vision_models_names + ): + logger.warning( + f"The model {self.model_name} is not a vision model. 
" + f"Skip the url in the message.", + ) + return { + "role": msg.role, + "name": msg.name, + "content": _convert_to_str(msg.content), + } + + # Put all urls into a list + urls = [msg.url] if isinstance(msg.url, str) else msg.url + + # Check if the url refers to an image + checked_urls = [] + for url in urls: + try: + checked_urls.append(_to_openai_image_url(url)) + except TypeError: + logger.warning( + f"The url {url} is not a valid image url for " + f"OpenAI Chat API, skipped.", + ) + + if len(checked_urls) == 0: + # If no valid image url is provided, return the normal message dict + return { + "role": msg.role, + "name": msg.name, + "content": _convert_to_str(msg.content), + } + else: + # otherwise, use the vision format message + returned_msg = { + "role": msg.role, + "name": msg.name, + "content": [ + { + "type": "text", + "text": _convert_to_str(msg.content), + }, + ], + } + + image_dicts = [ + { + "type": "image_url", + "image_url": { + "url": _, + }, + } + for _ in checked_urls + ] + + returned_msg["content"].extend(image_dicts) + + return returned_msg + def format( self, *args: Union[MessageBase, Sequence[MessageBase]], @@ -230,19 +304,22 @@ def format( The formatted messages in the format that OpenAI Chat API required. 
""" - messages = [] for arg in args: if arg is None: continue if isinstance(arg, MessageBase): - messages.append( - { - "role": arg.role, - "name": arg.name, - "content": _convert_to_str(arg.content), - }, - ) + if arg.url is not None: + messages.append(self._format_msg_with_url(arg)) + else: + messages.append( + { + "role": arg.role, + "name": arg.name, + "content": _convert_to_str(arg.content), + }, + ) + elif isinstance(arg, list): messages.extend(self.format(*arg)) else: diff --git a/src/agentscope/models/zhipu_model.py b/src/agentscope/models/zhipu_model.py new file mode 100644 index 000000000..5c33e2b45 --- /dev/null +++ b/src/agentscope/models/zhipu_model.py @@ -0,0 +1,350 @@ +# -*- coding: utf-8 -*- +"""Model wrapper for ZhipuAI models""" +from abc import ABC +from typing import Union, Any, List, Sequence + +from loguru import logger + +from .model import ModelWrapperBase, ModelResponse +from ..message import MessageBase +from ..utils.tools import _convert_to_str + +try: + import zhipuai +except ImportError: + zhipuai = None + + +class ZhipuAIWrapperBase(ModelWrapperBase, ABC): + """The model wrapper for ZhipuAI API.""" + + def __init__( + self, + config_name: str, + model_name: str = None, + api_key: str = None, + client_args: dict = None, + generate_args: dict = None, + **kwargs: Any, + ) -> None: + """Initialize the zhipuai client. + To init the ZhipuAi client, the api_key is required. + Other client args include base_url and timeout. + The base_url is set to https://open.bigmodel.cn/api/paas/v4 + if not specified. The timeout arg is set for http request timeout. + + Args: + config_name (`str`): + The name of the model config. + model_name (`str`, default `None`): + The name of the model to use in ZhipuAI API. + api_key (`str`, default `None`): + The API key for ZhipuAI API. If not specified, it will + be read from the environment variable. + client_args (`dict`, default `None`): + The extra keyword arguments to initialize the ZhipuAI client. 
+ generate_args (`dict`, default `None`): + The extra keyword arguments used in zhipuai api generation, + e.g. `temperature`, `seed`. + """ + + if model_name is None: + model_name = config_name + logger.warning("model_name is not set, use config_name instead.") + + super().__init__(config_name=config_name) + + if zhipuai is None: + raise ImportError( + "Cannot find zhipuai package in current python environment.", + ) + + self.model_name = model_name + self.generate_args = generate_args or {} + + self.client = zhipuai.ZhipuAI( + api_key=api_key, + **(client_args or {}), + ) + + self._register_default_metrics() + + def format( + self, + *args: Union[MessageBase, Sequence[MessageBase]], + ) -> Union[List[dict], str]: + raise RuntimeError( + f"Model Wrapper [{type(self).__name__}] doesn't " + f"need to format the input. Please try to use the " + f"model wrapper directly.", + ) + + +class ZhipuAIChatWrapper(ZhipuAIWrapperBase): + """The model wrapper for ZhipuAI's chat API.""" + + model_type: str = "zhipuai_chat" + + def _register_default_metrics(self) -> None: + # Set monitor accordingly + # TODO: set quota to the following metrics + self.monitor.register( + self._metric("call_counter"), + metric_unit="times", + ) + self.monitor.register( + self._metric("prompt_tokens"), + metric_unit="token", + ) + self.monitor.register( + self._metric("completion_tokens"), + metric_unit="token", + ) + self.monitor.register( + self._metric("total_tokens"), + metric_unit="token", + ) + + def __call__( + self, + messages: list, + **kwargs: Any, + ) -> ModelResponse: + """Processes a list of messages to construct a payload for the ZhipuAI + API call. It then makes a request to the ZhipuAI API and returns the + response. This method also updates monitoring metrics based on the + API response. + + Args: + messages (`list`): + A list of messages to process. + **kwargs (`Any`): + The keyword arguments to ZhipuAI chat completions API, + e.g. `temperature`, `max_tokens`, `top_p`, etc. 
Please refer to + https://open.bigmodel.cn/dev/api + for more detailed arguments. + + Returns: + `ModelResponse`: + The response text in text field, and the raw response in + raw field. + + Note: + `parse_func`, `fault_handler` and `max_retries` are reserved for + `_response_parse_decorator` to parse and check the response + generated by model wrapper. Their usages are listed as follows: + - `parse_func` is a callable function used to parse and check + the response generated by the model, which takes the response + as input. + - `max_retries` is the maximum number of retries when the + `parse_func` raise an exception. + - `fault_handler` is a callable function which is called + when the response generated by the model is invalid after + `max_retries` retries. + """ + + # step1: prepare keyword arguments + kwargs = {**self.generate_args, **kwargs} + + # step2: checking messages + if not isinstance(messages, list): + raise ValueError( + "ZhipuAI `messages` field expected type `list`, " + f"got `{type(messages)}` instead.", + ) + if not all("role" in msg and "content" in msg for msg in messages): + raise ValueError( + "Each message in the 'messages' list must contain a 'role' " + "and 'content' key for ZhipuAI API.", + ) + + # step3: forward to generate response + response = self.client.chat.completions.create( + model=self.model_name, + messages=messages, + **kwargs, + ) + + # step4: record the api invocation if needed + self._save_model_invocation( + arguments={ + "model": self.model_name, + "messages": messages, + **kwargs, + }, + response=response.model_dump(), + ) + + # step5: update monitor accordingly + self.update_monitor(call_counter=1, **response.usage.model_dump()) + + # step6: return response + return ModelResponse( + text=response.choices[0].message.content, + raw=response.model_dump(), + ) + + def format( + self, + *args: Union[MessageBase, Sequence[MessageBase]], + ) -> List[dict]: + """Format the input string and dictionary into the format that + 
ZhipuAI Chat API required. + + In this format function, the input messages are formatted into a + single system messages with format "{name}: {content}" for each + message. Note this strategy maybe not suitable for all scenarios, + and developers are encouraged to implement their own prompt + engineering strategies. + + Args: + args (`Union[MessageBase, Sequence[MessageBase]]`): + The input arguments to be formatted, where each argument + should be a `Msg` object, or a list of `Msg` objects. + In distribution, placeholder is also allowed. + + Returns: + `List[dict]`: + The formatted messages in the format that ZhipuAI Chat API + required. + """ + + # Parse all information into a list of messages + input_msgs = [] + for _ in args: + if _ is None: + continue + if isinstance(_, MessageBase): + input_msgs.append(_) + elif isinstance(_, list) and all( + isinstance(__, MessageBase) for __ in _ + ): + input_msgs.extend(_) + else: + raise TypeError( + f"The input should be a Msg object or a list " + f"of Msg objects, got {type(_)}.", + ) + + messages = [] + + # record dialog history as a list of strings + dialogue = [] + for i, unit in enumerate(input_msgs): + if i == 0 and unit.role == "system": + # system prompt + messages.append( + { + "role": unit.role, + "content": _convert_to_str(unit.content), + }, + ) + else: + # Merge all messages into a dialogue history prompt + dialogue.append( + f"{unit.name}: {_convert_to_str(unit.content)}", + ) + + dialogue_history = "\n".join(dialogue) + + user_content_template = "## Dialogue History\n{dialogue_history}" + + messages.append( + { + "role": "user", + "content": user_content_template.format( + dialogue_history=dialogue_history, + ), + }, + ) + + return messages + + +class ZhipuAIEmbeddingWrapper(ZhipuAIWrapperBase): + """The model wrapper for ZhipuAI embedding API.""" + + model_type: str = "zhipuai_embedding" + + def __call__( + self, + texts: str, + **kwargs: Any, + ) -> ModelResponse: + """Embed the messages with ZhipuAI 
embedding API. + + Args: + texts (`str`): + The messages used to embed. + **kwargs (`Any`): + The keyword arguments to ZhipuAI embedding API, + e.g. `encoding_format`, `user`. Please refer to + https://open.bigmodel.cn/dev/api#text_embedding + for more detailed arguments. + + Returns: + `ModelResponse`: + A list of embeddings in embedding field and the + raw response in raw field. + + Note: + `parse_func`, `fault_handler` and `max_retries` are reserved for + `_response_parse_decorator` to parse and check the response + generated by model wrapper. Their usages are listed as follows: + - `parse_func` is a callable function used to parse and check + the response generated by the model, which takes the response + as input. + - `max_retries` is the maximum number of retries when the + `parse_func` raise an exception. + - `fault_handler` is a callable function which is called + when the response generated by the model is invalid after + `max_retries` retries. + """ + # step1: prepare keyword arguments + kwargs = {**self.generate_args, **kwargs} + + # step2: forward to generate response + response = self.client.embeddings.create( + input=texts, + model=self.model_name, + ) + + # step3: record the model api invocation if needed + self._save_model_invocation( + arguments={ + "model": self.model_name, + "input": texts, + **kwargs, + }, + response=response.model_dump(), + ) + + # step4: update monitor accordingly + self.update_monitor(call_counter=1, **response.usage.model_dump()) + + # step5: return response + response_json = response.model_dump() + return ModelResponse( + embedding=[_["embedding"] for _ in response_json["data"]], + raw=response_json, + ) + + def _register_default_metrics(self) -> None: + # Set monitor accordingly + # TODO: set quota to the following metrics + self.monitor.register( + self._metric("call_counter"), + metric_unit="times", + ) + self.monitor.register( + self._metric("prompt_tokens"), + metric_unit="token", + ) + self.monitor.register( + 
self._metric("completion_tokens"), + metric_unit="token", + ) + self.monitor.register( + self._metric("total_tokens"), + metric_unit="token", + ) diff --git a/src/agentscope/parsers/code_block_parser.py b/src/agentscope/parsers/code_block_parser.py index 2621406cf..627d89a1f 100644 --- a/src/agentscope/parsers/code_block_parser.py +++ b/src/agentscope/parsers/code_block_parser.py @@ -1,5 +1,7 @@ # -*- coding: utf-8 -*- """Model response parser class for Markdown code block.""" +from typing import Optional + from agentscope.models import ModelResponse from agentscope.parsers import ParserBase @@ -13,7 +15,7 @@ class MarkdownCodeBlockParser(ParserBase): tag_begin: str = "```{language_name}" """The beginning tag.""" - content_hint: str = "${your_{language_name}_CODE}" + content_hint: str = "${{your_{language_name}_code}}" """The hint of the content.""" tag_end: str = "```" @@ -22,15 +24,38 @@ class MarkdownCodeBlockParser(ParserBase): format_instruction: str = ( "You should generate {language_name} code in a {language_name} fenced " "code block as follows: \n```{language_name}\n" - "${your_{language_name}_CODE}\n```" + "{content_hint}\n```" ) """The instruction for the format of the code block.""" - def __init__(self, language_name: str) -> None: + def __init__( + self, + language_name: str, + content_hint: Optional[str] = None, + ) -> None: + """Initialize the parser with the language name and the optional + content hint. + + Args: + language_name (`str`): + The name of the language, which will be used + in ```{language_name} + content_hint (`Optional[str]`, defaults to `None`): + The hint used to remind LLM what should be filled between the + tags. If not provided, the default content hint + "${{your_{language_name}_code}}" will be used. 
+ """ self.name = self.name.format(language_name=language_name) self.tag_begin = self.tag_begin.format(language_name=language_name) + + if content_hint is None: + self.content_hint = f"${{your_{language_name}_code}}" + else: + self.content_hint = content_hint + self.format_instruction = self.format_instruction.format( language_name=language_name, + content_hint=self.content_hint, ).strip() def parse(self, response: ModelResponse) -> ModelResponse: diff --git a/src/agentscope/parsers/json_object_parser.py b/src/agentscope/parsers/json_object_parser.py index 14bd4d5fb..74b82b51d 100644 --- a/src/agentscope/parsers/json_object_parser.py +++ b/src/agentscope/parsers/json_object_parser.py @@ -2,7 +2,7 @@ """The parser for JSON object in the model response.""" import json from copy import deepcopy -from typing import Optional, Any, List +from typing import Optional, Any, List, Sequence, Union from loguru import logger @@ -14,6 +14,7 @@ ) from agentscope.models import ModelResponse from agentscope.parsers import ParserBase +from agentscope.parsers.parser_base import DictFilterMixin from agentscope.utils.tools import _join_str_with_comma_and @@ -121,7 +122,7 @@ def format_instruction(self) -> str: ) -class MarkdownJsonDictParser(MarkdownJsonObjectParser): +class MarkdownJsonDictParser(MarkdownJsonObjectParser, DictFilterMixin): """A class used to parse a JSON dictionary object in a markdown fenced code""" @@ -152,6 +153,9 @@ def __init__( self, content_hint: Optional[Any] = None, required_keys: List[str] = None, + keys_to_memory: Optional[Union[str, bool, Sequence[str]]] = True, + keys_to_content: Optional[Union[str, bool, Sequence[str]]] = True, + keys_to_metadata: Optional[Union[str, bool, Sequence[str]]] = False, ) -> None: """Initialize the parser with the content hint. @@ -165,8 +169,42 @@ def __init__( A list of required keys in the JSON dictionary object. If the response misses any of the required keys, it will raise a RequiredFieldNotFoundError. 
+ keys_to_memory (`Optional[Union[str, bool, Sequence[str]]]`, + defaults to `True`): + The key or keys to be filtered in `to_memory` method. If + it's + - `False`, `None` will be returned in the `to_memory` method + - `str`, the corresponding value will be returned + - `List[str]`, a filtered dictionary will be returned + - `True`, the whole dictionary will be returned + keys_to_content (`Optional[Union[str, bool, Sequence[str]]`, + defaults to `True`): + The key or keys to be filtered in `to_content` method. If + it's + - `False`, `None` will be returned in the `to_content` method + - `str`, the corresponding value will be returned + - `List[str]`, a filtered dictionary will be returned + - `True`, the whole dictionary will be returned + keys_to_metadata (`Optional[Union[str, bool, Sequence[str]]`, + defaults to `False`): + The key or keys to be filtered in `to_metadata` method. If + it's + - `False`, `None` will be returned in the `to_metadata` method + - `str`, the corresponding value will be returned + - `List[str]`, a filtered dictionary will be returned + - `True`, the whole dictionary will be returned + """ - super().__init__(content_hint) + # Initialize the markdown json object parser + MarkdownJsonObjectParser.__init__(self, content_hint) + + # Initialize the mixin class to allow filtering the parsed response + DictFilterMixin.__init__( + self, + keys_to_memory=keys_to_memory, + keys_to_content=keys_to_content, + keys_to_metadata=keys_to_metadata, + ) self.required_keys = required_keys or [] diff --git a/src/agentscope/parsers/parser_base.py b/src/agentscope/parsers/parser_base.py index 3f4d4d7f4..dd56df762 100644 --- a/src/agentscope/parsers/parser_base.py +++ b/src/agentscope/parsers/parser_base.py @@ -1,10 +1,17 @@ # -*- coding: utf-8 -*- """The base class for model response parser.""" from abc import ABC, abstractmethod +from typing import Union, Sequence + +from loguru import logger from agentscope.exception import TagNotFoundError from 
agentscope.models import ModelResponse +# TODO: Support one-time warning in logger rather than setting global variable +_FIRST_TIME_TO_REPORT_CONTENT = True +_FIRST_TIME_TO_REPORT_MEMORY = True + class ParserBase(ABC): """The base class for model response parser.""" @@ -54,7 +61,7 @@ def _extract_first_content_by_tag( raise TagNotFoundError( f"Missing " f"tag{'' if len(missing_tags)==1 else 's'} " - f"{' and '.join(missing_tags)} in response.", + f"{' and '.join(missing_tags)} in response: {text}", raw_response=text, missing_begin_tag=index_start == -1, missing_end_tag=index_end == -1, @@ -65,3 +72,137 @@ def _extract_first_content_by_tag( ] return extract_text + + +class DictFilterMixin: + """A mixin class to filter the parsed response by keys. It allows users + to set keys to be filtered during speaking, storing in memory, and + returning in the agent reply function. + """ + + def __init__( + self, + keys_to_memory: Union[str, bool, Sequence[str]], + keys_to_content: Union[str, bool, Sequence[str]], + keys_to_metadata: Union[str, bool, Sequence[str]], + ) -> None: + """Initialize the mixin class with the keys to be filtered during + speaking, storing in memory, and returning in the agent reply function. + + Args: + keys_to_memory (`Optional[Union[str, bool, Sequence[str]]]`): + The key or keys to be filtered in `to_memory` method. If + it's + - `False`, `None` will be returned in the `to_memory` method + - `str`, the corresponding value will be returned + - `List[str]`, a filtered dictionary will be returned + - `True`, the whole dictionary will be returned + keys_to_content (`Optional[Union[str, bool, Sequence[str]]`): + The key or keys to be filtered in `to_content` method. 
If + it's + - `False`, `None` will be returned in the `to_content` method + - `str`, the corresponding value will be returned + - `List[str]`, a filtered dictionary will be returned + - `True`, the whole dictionary will be returned + keys_to_metadata (`Optional[Union[str, bool, Sequence[str]]]`): + The key or keys to be filtered in `to_metadata` method. If + it's + - `False`, `None` will be returned in the `to_metadata` method + - `str`, the corresponding value will be returned + - `List[str]`, a filtered dictionary will be returned + - `True`, the whole dictionary will be returned + """ + self.keys_to_memory = keys_to_memory + self.keys_to_content = keys_to_content + self.keys_to_metadata = keys_to_metadata + + def to_memory( + self, + parsed_response: dict, + allow_missing: bool = False, + ) -> Union[str, dict, None]: + """Filter the fields that will be stored in memory.""" + return self._filter_content_by_names( + parsed_response, + self.keys_to_memory, + allow_missing=allow_missing, + ) + + def to_content( + self, + parsed_response: dict, + allow_missing: bool = False, + ) -> Union[str, dict, None]: + """Filter the fields that will be fed into the content field in the + returned message, which will be exposed to other agents. + """ + return self._filter_content_by_names( + parsed_response, + self.keys_to_content, + allow_missing=allow_missing, + ) + + def to_metadata( + self, + parsed_response: dict, + allow_missing: bool = False, + ) -> Union[str, dict, None]: + """Filter the fields that will be fed into the returned message + directly to control the application workflow.""" + return self._filter_content_by_names( + parsed_response, + self.keys_to_metadata, + allow_missing=allow_missing, + ) + + def _filter_content_by_names( + self, + parsed_response: dict, + keys: Union[str, bool, Sequence[str]], + allow_missing: bool = False, + ) -> Union[str, dict, None]: + """Filter the parsed response by keys. 
If only one key is provided, the + returned content will be a single corresponding value. Otherwise, + the returned content will be a dictionary with the filtered keys and + their corresponding values. + + Args: + keys (`Union[str, bool, Sequence[str]]`): + The key or keys to be filtered. If it's + - `False`, `None` will be returned in the `to_content` method + - `str`, the corresponding value will be returned + - `List[str]`, a filtered dictionary will be returned + - `True`, the whole dictionary will be returned + allow_missing (`bool`, defaults to `False`): + Whether to allow missing keys in the response. If set to + `True`, the method will skip the missing keys in the response. + Otherwise, it will raise a `ValueError` when a key is missing. + + Returns: + `Union[str, dict]`: The filtered content. + """ + + if isinstance(keys, bool): + if keys: + return parsed_response + else: + return None + + if isinstance(keys, str): + return parsed_response[keys] + + # check if the required names are in the response + for name in keys: + if name not in parsed_response: + if allow_missing: + logger.warning( + f"Content name {name} not found in the response. 
Skip " + f"it.", + ) + else: + raise ValueError(f"Name {name} not found in the response.") + return { + name: parsed_response[name] + for name in keys + if name in parsed_response + } diff --git a/src/agentscope/parsers/tagged_content_parser.py b/src/agentscope/parsers/tagged_content_parser.py index 464617a25..9f7a17d36 100644 --- a/src/agentscope/parsers/tagged_content_parser.py +++ b/src/agentscope/parsers/tagged_content_parser.py @@ -1,10 +1,12 @@ # -*- coding: utf-8 -*- """The parser for tagged content in the model response.""" import json +from typing import Union, Sequence, Optional, List -from agentscope.exception import JsonParsingError +from agentscope.exception import JsonParsingError, TagNotFoundError from agentscope.models import ModelResponse from agentscope.parsers import ParserBase +from agentscope.parsers.parser_base import DictFilterMixin class TaggedContent: @@ -12,7 +14,8 @@ class TaggedContent: and tag end.""" name: str - """The name of the tagged content.""" + """The name of the tagged content, which will be used as the key in + extracted dictionary.""" tag_begin: str """The beginning tag.""" @@ -60,7 +63,7 @@ def __str__(self) -> str: return f"{self.tag_begin}{self.content_hint}{self.tag_end}" -class MultiTaggedContentParser(ParserBase): +class MultiTaggedContentParser(ParserBase, DictFilterMixin): """Parse response text by multiple tags, and return a dict of their content. Asking llm to generate JSON dictionary object directly maybe not a good idea due to involving escape characters and other issues. 
So we can @@ -79,14 +82,60 @@ class MultiTaggedContentParser(ParserBase): equals to `True`, this instruction will be used to remind the model to generate JSON object.""" - def __init__(self, *tagged_contents: TaggedContent) -> None: + def __init__( + self, + *tagged_contents: TaggedContent, + keys_to_memory: Optional[Union[str, bool, Sequence[str]]] = True, + keys_to_content: Optional[Union[str, bool, Sequence[str]]] = True, + keys_to_metadata: Optional[Union[str, bool, Sequence[str]]] = False, + keys_allow_missing: Optional[List[str]] = None, + ) -> None: """Initialize the parser with tags. Args: - tags (`dict[str, Tuple[str, str]]`): - A dictionary of tags, the key is the tag name, and the value is - a tuple of starting tag and end tag. + *tagged_contents (`TaggedContent`): + Multiple TaggedContent objects, each object contains the tag + name, tag begin, content hint and tag end. The name will be + used as the key in the extracted dictionary. + required_keys (`Optional[List[str]]`, defaults to `None`): + A list of required + keys_to_memory (`Optional[Union[str, bool, Sequence[str]]]`, + defaults to `True`): + The key or keys to be filtered in `to_memory` method. If + it's + - `False`, `None` will be returned in the `to_memory` method + - `str`, the corresponding value will be returned + - `List[str]`, a filtered dictionary will be returned + - `True`, the whole dictionary will be returned + keys_to_content (`Optional[Union[str, bool, Sequence[str]]]`, + defaults to `True`): + The key or keys to be filtered in `to_content` method. If + it's + - `False`, `None` will be returned in the `to_content` method + - `str`, the corresponding value will be returned + - `List[str]`, a filtered dictionary will be returned + - `True`, the whole dictionary will be returned + keys_to_metadata (`Optional[Union[str, bool, Sequence[str]]]`, + defaults to `False`): + The key or keys to be filtered in `to_metadata` method. 
If + it's + - `False`, `None` will be returned in the `to_metadata` method + - `str`, the corresponding value will be returned + - `List[str]`, a filtered dictionary will be returned + - `True`, the whole dictionary will be returned + keys_allow_missing (`Optional[List[str]]`, defaults to `None`): + A list of keys that are allowed to be missing in the response. """ + # Initialize the mixin class + DictFilterMixin.__init__( + self, + keys_to_memory=keys_to_memory, + keys_to_content=keys_to_content, + keys_to_metadata=keys_to_metadata, + ) + + self.keys_allow_missing = keys_allow_missing + self.tagged_contents = list(tagged_contents) # Prepare the format instruction according to the tagged contents @@ -123,26 +172,38 @@ def parse(self, response: ModelResponse) -> ModelResponse: tag_begin = tagged_content.tag_begin tag_end = tagged_content.tag_end - extract_content = self._extract_first_content_by_tag( - response, - tag_begin, - tag_end, - ) - - if tagged_content.parse_json: - try: - extract_content = json.loads(extract_content) - except json.decoder.JSONDecodeError as e: - raw_response = f"{tag_begin}{extract_content}{tag_end}" - raise JsonParsingError( - f"The content between {tagged_content.tag_begin} and " - f"{tagged_content.tag_end} should be a JSON object." - f'When parsing "{raw_response}", an error occurred: ' - f"{e}", - raw_response=raw_response, - ) from None - - tag_to_content[tagged_content.name] = extract_content + try: + extract_content = self._extract_first_content_by_tag( + response, + tag_begin, + tag_end, + ) + + if tagged_content.parse_json: + try: + extract_content = json.loads(extract_content) + except json.decoder.JSONDecodeError as e: + raw_response = f"{tag_begin}{extract_content}{tag_end}" + raise JsonParsingError( + f"The content between " + f"{tagged_content.tag_begin} and " + f"{tagged_content.tag_end} should be a JSON " + f'object. 
An error "{e}" occurred when parsing: ' + f"{raw_response}", + raw_response=raw_response, + ) from None + + tag_to_content[tagged_content.name] = extract_content + + except TagNotFoundError as e: + # if the key is allowed to be missing, skip the error + if ( + self.keys_allow_missing is not None + and tagged_content.name in self.keys_allow_missing + ): + continue + + raise e from None response.parsed = tag_to_content return response diff --git a/src/agentscope/rag/knowledge_bank.py b/src/agentscope/rag/knowledge_bank.py index 3a5a30ff0..dead7dbd4 100644 --- a/src/agentscope/rag/knowledge_bank.py +++ b/src/agentscope/rag/knowledge_bank.py @@ -1,9 +1,10 @@ # -*- coding: utf-8 -*- """ -Knowledge bank for making RAG module easier to use +Knowledge bank for making Knowledge objects easier to use """ import copy -from typing import Optional +import json +from typing import Optional, Union from loguru import logger from agentscope.models import load_model_by_config_name from agentscope.agents import AgentBase @@ -34,16 +35,21 @@ class KnowledgeBank: """ KnowledgeBank enables - 1) provide an easy and fast way to initialize the RAG model; - 2) make RAG model reusable and sharable for multiple agents. + 1) provide an easy and fast way to initialize the Knowledge object; + 2) make Knowledge object reusable and sharable for multiple agents. """ def __init__( self, - configs: dict, + configs: Union[dict, str], ) -> None: """initialize the knowledge bank""" - self.configs = configs + if isinstance(configs, str): + logger.info(f"Loading configs from {configs}") + with open(configs, "r", encoding="utf-8") as fp: + self.configs = json.loads(fp.read()) + else: + self.configs = configs self.stored_knowledge: dict[str, LlamaIndexKnowledge] = {} self._init_knowledge() @@ -70,7 +76,7 @@ def add_data_as_knowledge( Transform data in a directory to be ready to work with RAG. 
Args: knowledge_id (str): - user-defined unique id for the knowledge with RAG + user-defined unique id for the knowledge emb_model_name (str): name of the embedding model model_name (Optional[str]): @@ -87,7 +93,7 @@ def add_data_as_knowledge( - ... Examples can refer to../examples/conversation_with_RAG_agents/ - a simple example of importing data to RAG: + a simple example of importing data to Knowledge object: '' knowledge_bank.add_data_as_knowledge( knowledge_id="agentscope_tutorial_rag", @@ -130,25 +136,25 @@ def get_knowledge( duplicate: bool = False, ) -> LlamaIndexKnowledge: """ - Get a RAG from the knowledge bank. + Get a Knowledge object from the knowledge bank. Args: knowledge_id (str): - unique id for the RAG + unique id for the Knowledge object duplicate (bool): - whether return a copy of the RAG. + whether return a copy of the Knowledge object. Returns: - LlamaIndexRAG: - the RAG object defined with Llama-index + LlamaIndexKnowledge: + the Knowledge object defined with Llama-index """ if knowledge_id not in self.stored_knowledge: raise ValueError( - f"{knowledge_id} does not exist in the " f"knowledge bank.", + f"{knowledge_id} does not exist in the knowledge bank.", ) - rag = self.stored_knowledge[knowledge_id] + knowledge = self.stored_knowledge[knowledge_id] if duplicate: - rag = copy.deepcopy(rag) + knowledge = copy.deepcopy(knowledge) logger.info(f"knowledge bank loaded: {knowledge_id}.") - return rag + return knowledge def equip(self, agent: AgentBase, duplicate: bool = False) -> None: """ diff --git a/src/agentscope/rag/llama_index_knowledge.py b/src/agentscope/rag/llama_index_knowledge.py index 49439321f..0654eec89 100644 --- a/src/agentscope/rag/llama_index_knowledge.py +++ b/src/agentscope/rag/llama_index_knowledge.py @@ -208,7 +208,6 @@ def __init__( f"Embedding model does not support {type(self.emb_model)}.", ) # then we can initialize the RAG - print("init", self.knowledge_config) self._init_rag() def _init_rag(self, **kwargs: Any) -> 
None: diff --git a/src/agentscope/rpc/__init__.py b/src/agentscope/rpc/__init__.py index 03cf58169..42d3b5fe5 100644 --- a/src/agentscope/rpc/__init__.py +++ b/src/agentscope/rpc/__init__.py @@ -1,20 +1,22 @@ # -*- coding: utf-8 -*- """Import all rpc related modules in the package.""" -from typing import Any from .rpc_agent_client import RpcAgentClient, ResponseStub, call_in_thread try: from .rpc_agent_pb2 import RpcMsg # pylint: disable=E0611 -except ModuleNotFoundError: - RpcMsg = Any # type: ignore[misc] -try: from .rpc_agent_pb2_grpc import RpcAgentServicer from .rpc_agent_pb2_grpc import RpcAgentStub from .rpc_agent_pb2_grpc import add_RpcAgentServicer_to_server -except ImportError: - RpcAgentServicer = object - RpcAgentStub = Any - add_RpcAgentServicer_to_server = Any +except ImportError as import_error: + from agentscope.utils.tools import ImportErrorReporter + + RpcMsg = ImportErrorReporter(import_error, "distribute") # type: ignore[misc] + RpcAgentServicer = ImportErrorReporter(import_error, "distribute") + RpcAgentStub = ImportErrorReporter(import_error, "distribute") + add_RpcAgentServicer_to_server = ImportErrorReporter( + import_error, + "distribute", + ) __all__ = [ diff --git a/src/agentscope/rpc/rpc_agent_client.py b/src/agentscope/rpc/rpc_agent_client.py index 98b82a6d5..189e0895f 100644 --- a/src/agentscope/rpc/rpc_agent_client.py +++ b/src/agentscope/rpc/rpc_agent_client.py @@ -1,22 +1,25 @@ # -*- coding: utf-8 -*- """ Client of rpc agent server """ -import json import threading -from typing import Any, Optional +import base64 +from typing import Optional from loguru import logger try: + import dill import grpc -except ImportError: - grpc = None - -try: + from grpc import RpcError from agentscope.rpc.rpc_agent_pb2 import RpcMsg # pylint: disable=E0611 from agentscope.rpc.rpc_agent_pb2_grpc import RpcAgentStub -except ModuleNotFoundError: - RpcMsg = Any # type: ignore[misc] - RpcAgentStub = Any +except ImportError as import_error: + from 
agentscope.utils.tools import ImportErrorReporter + + dill = ImportErrorReporter(import_error, "distribute") + grpc = ImportErrorReporter(import_error, "distribute") + RpcMsg = ImportErrorReporter(import_error, "distribute") + RpcAgentStub = ImportErrorReporter(import_error, "distribute") + RpcError = ImportError class RpcAgentClient: @@ -63,18 +66,14 @@ def call_func( ) return result_msg.value - def create_agent(self, agent_configs: Optional[dict]) -> None: + def create_agent(self, agent_configs: dict) -> None: """Create a new agent for this client.""" try: if self.agent_id is None or len(self.agent_id) == 0: return self.call_func( - func_name="_create_agent", - value=( - None - if agent_configs is None - else json.dumps(agent_configs) - ), + "_create_agent", + base64.b64encode(dill.dumps(agent_configs)).decode("utf-8"), ) except Exception as e: logger.error( @@ -117,14 +116,14 @@ def get_response(self) -> str: def call_in_thread( client: RpcAgentClient, - x: dict, + value: str, func_name: str, ) -> ResponseStub: """Call rpc function in a sub-thread. Args: client (`RpcAgentClient`): the rpc client. - x (`dict`): the value of the reqeust. + value (`str`): the value of the request. func_name (`str`): the name of the function being called. 
Returns: @@ -133,11 +132,15 @@ def call_in_thread( stub = ResponseStub() def wrapper() -> None: - resp = client.call_func( - func_name=func_name, - value=x.serialize() if x is not None else "", - ) - stub.set_response(resp) # type: ignore[arg-type] + try: + resp = client.call_func( + func_name=func_name, + value=value, + ) + stub.set_response(resp) # type: ignore[arg-type] + except RpcError as e: + logger.error(f"Fail to call {func_name} in thread: {e}") + stub.set_response(str(e)) thread = threading.Thread(target=wrapper) thread.start() diff --git a/src/agentscope/rpc/rpc_agent_pb2_grpc.py b/src/agentscope/rpc/rpc_agent_pb2_grpc.py index 93ee27369..4099c7027 100644 --- a/src/agentscope/rpc/rpc_agent_pb2_grpc.py +++ b/src/agentscope/rpc/rpc_agent_pb2_grpc.py @@ -3,8 +3,10 @@ """Client and server classes corresponding to protobuf-defined services.""" try: import grpc -except ImportError: - grpc = None +except ImportError as import_error: + from agentscope.utils.tools import ImportErrorReporter + + grpc = ImportErrorReporter(import_error, "distribute") import agentscope.rpc.rpc_agent_pb2 as rpc__agent__pb2 diff --git a/src/agentscope/server/__init__.py b/src/agentscope/server/__init__.py new file mode 100644 index 000000000..8b69a542a --- /dev/null +++ b/src/agentscope/server/__init__.py @@ -0,0 +1,10 @@ +# -*- coding: utf-8 -*- +"""Import all server related modules in the package.""" +from .launcher import RpcAgentServerLauncher, as_server +from .servicer import AgentServerServicer + +__all__ = [ + "RpcAgentServerLauncher", + "AgentServerServicer", + "as_server", +] diff --git a/src/agentscope/server/launcher.py b/src/agentscope/server/launcher.py new file mode 100644 index 000000000..ed5ed7f67 --- /dev/null +++ b/src/agentscope/server/launcher.py @@ -0,0 +1,449 @@ +# -*- coding: utf-8 -*- +""" Server of distributed agent""" +import os +from multiprocessing import Process, Event, Pipe +from multiprocessing.synchronize import Event as EventClass +import asyncio 
+import signal +import argparse +from typing import Type +from concurrent import futures +from loguru import logger + +try: + import grpc + from agentscope.rpc.rpc_agent_pb2_grpc import ( + add_RpcAgentServicer_to_server, + ) +except ImportError as import_error: + from agentscope.utils.tools import ImportErrorReporter + + grpc = ImportErrorReporter(import_error, "distribute") + add_RpcAgentServicer_to_server = ImportErrorReporter( + import_error, + "distribute", + ) + +import agentscope +from agentscope.server.servicer import AgentServerServicer +from agentscope.agents.agent import AgentBase +from agentscope.utils.tools import ( + _get_timestamp, + check_port, +) + + +def _setup_agent_server( + host: str, + port: int, + server_id: str, + init_settings: dict = None, + start_event: EventClass = None, + stop_event: EventClass = None, + pipe: int = None, + local_mode: bool = True, + max_pool_size: int = 8192, + max_timeout_seconds: int = 1800, + custom_agents: list = None, +) -> None: + """Setup agent server. + + Args: + host (`str`): + Hostname of the agent server. + port (`int`): + The socket port monitored by the agent server. + server_id (`str`): + The id of the server. + init_settings (`dict`, defaults to `None`): + Init settings for agentscope.init. + start_event (`EventClass`, defaults to `None`): + An Event instance used to determine whether the child process + has been started. + stop_event (`EventClass`, defaults to `None`): + The stop Event instance used to determine whether the child + process has been stopped. + pipe (`int`, defaults to `None`): + A pipe instance used to pass the actual port of the server. + local_mode (`bool`, defaults to `True`): + Only listen to local requests. + max_pool_size (`int`, defaults to `8192`): + Max number of agent replies that the server can accommodate. + max_timeout_seconds (`int`, defaults to `1800`): + Timeout for agent replies. 
+ custom_agents (`list`, defaults to `None`): + A list of custom agent classes that are not in `agentscope.agents`. + """ + asyncio.run( + _setup_agent_server_async( + host=host, + port=port, + server_id=server_id, + init_settings=init_settings, + start_event=start_event, + stop_event=stop_event, + pipe=pipe, + local_mode=local_mode, + max_pool_size=max_pool_size, + max_timeout_seconds=max_timeout_seconds, + custom_agents=custom_agents, + ), + ) + + +async def _setup_agent_server_async( + host: str, + port: int, + server_id: str, + init_settings: dict = None, + start_event: EventClass = None, + stop_event: EventClass = None, + pipe: int = None, + local_mode: bool = True, + max_pool_size: int = 8192, + max_timeout_seconds: int = 1800, + custom_agents: list = None, +) -> None: + """Setup agent server in an async way. + + Args: + host (`str`): + Hostname of the agent server. + port (`int`): + The socket port monitored by the agent server. + server_id (`str`): + The id of the server. + init_settings (`dict`, defaults to `None`): + Init settings for agentscope.init. + start_event (`EventClass`, defaults to `None`): + An Event instance used to determine whether the child process + has been started. + stop_event (`EventClass`, defaults to `None`): + The stop Event instance used to determine whether the child + process has been stopped. + pipe (`int`, defaults to `None`): + A pipe instance used to pass the actual port of the server. + local_mode (`bool`, defaults to `True`): + If `True`, only listen to requests from "localhost", otherwise, + listen to requests from all hosts. + max_pool_size (`int`, defaults to `8192`): + The max number of agent reply messages that the server can + accommodate. Note that the oldest message will be deleted + after exceeding the pool size. + max_timeout_seconds (`int`, defaults to `1800`): + Maximum time for reply messages to be cached in the server. + Note that expired messages will be deleted. 
+ custom_agents (`list`, defaults to `None`): + A list of custom agent classes that are not in `agentscope.agents`. + """ + from agentscope._init import init_process + + if init_settings is not None: + init_process(**init_settings) + servicer = AgentServerServicer( + host=host, + port=port, + max_pool_size=max_pool_size, + max_timeout_seconds=max_timeout_seconds, + ) + # update agent registry + if custom_agents is not None: + for agent_class in custom_agents: + AgentBase.register_agent_class(agent_class=agent_class) + + async def shutdown_signal_handler() -> None: + logger.info( + f"Received shutdown signal. Gracefully stopping the server at " + f"[{host}:{port}].", + ) + await server.stop(grace=5) + + loop = asyncio.get_running_loop() + if os.name != "nt": + # Windows does not support add_signal_handler + for sig in (signal.SIGINT, signal.SIGTERM): + loop.add_signal_handler( + sig, + lambda: asyncio.create_task(shutdown_signal_handler()), + ) + while True: + try: + port = check_port(port) + servicer.port = port + server = grpc.aio.server( + futures.ThreadPoolExecutor(max_workers=None), + ) + add_RpcAgentServicer_to_server(servicer, server) + if local_mode: + server.add_insecure_port(f"localhost:{port}") + else: + server.add_insecure_port(f"0.0.0.0:{port}") + await server.start() + break + except OSError: + logger.warning( + f"Failed to start agent server at port [{port}], " + f"trying another port.", + ) + logger.info( + f"agent server [{server_id}] at {host}:{port} started successfully", + ) + if start_event is not None: + pipe.send(port) + start_event.set() + while not stop_event.is_set(): + await asyncio.sleep(1) + logger.info( + f"Stopping agent server at [{host}:{port}]", + ) + await server.stop(grace=10.0) + else: + await server.wait_for_termination() + logger.info( + f"agent server [{server_id}] at {host}:{port} stopped successfully", + ) + + +class RpcAgentServerLauncher: + """The launcher of AgentServer.""" + + def __init__( + self, + host: str = "localhost", + 
port: int = None, + max_pool_size: int = 8192, + max_timeout_seconds: int = 1800, + local_mode: bool = False, + custom_agents: list = None, + server_id: str = None, + agent_class: Type[AgentBase] = None, + agent_args: tuple = (), + agent_kwargs: dict = None, + ) -> None: + """Init a launcher of agent server. + + Args: + host (`str`, defaults to `"localhost"`): + Hostname of the agent server. + port (`int`, defaults to `None`): + Socket port of the agent server. + max_pool_size (`int`, defaults to `8192`): + The max number of agent reply messages that the server can + accommodate. Note that the oldest message will be deleted + after exceeding the pool size. + max_timeout_seconds (`int`, defaults to `1800`): + Maximum time for reply messages to be cached in the server. + Note that expired messages will be deleted. + local_mode (`bool`, defaults to `False`): + If `True`, only listen to requests from "localhost", otherwise, + listen to requests from all hosts. + custom_agents (`list`, defaults to `None`): + A list of custom agent classes that are not in + `agentscope.agents`. + server_id (`str`, defaults to `None`): + The id of the agent server. If not specified, a random id + will be generated. + agent_class (`Type[AgentBase]`, deprecated): + The AgentBase subclass encapsulated by this wrapper. + agent_args (`tuple`, deprecated): The args tuple used to + initialize the agent_class. + agent_kwargs (`dict`, deprecated): The args dict used to + initialize the agent_class. 
+ """ + self.host = host + self.port = check_port(port) + self.max_pool_size = max_pool_size + self.max_timeout_seconds = max_timeout_seconds + self.local_mode = local_mode + self.server = None + self.stop_event = None + self.parent_con = None + self.custom_agents = custom_agents + self.server_id = ( + self.generate_server_id() if server_id is None else server_id + ) + if ( + agent_class is not None + or len(agent_args) > 0 + or agent_kwargs is not None + ): + logger.warning( + "`agent_class`, `agent_args` and `agent_kwargs` is deprecated" + " in `RpcAgentServerLauncher`", + ) + + def generate_server_id(self) -> str: + """Generate server id""" + return f"{self.host}:{self.port}-{_get_timestamp('%y%m%d-%H:%M:%S')}" + + def _launch_in_main(self) -> None: + """Launch agent server in main-process""" + logger.info( + f"Launching agent server at [{self.host}:{self.port}]...", + ) + asyncio.run( + _setup_agent_server_async( + host=self.host, + port=self.port, + server_id=self.server_id, + max_pool_size=self.max_pool_size, + max_timeout_seconds=self.max_timeout_seconds, + local_mode=self.local_mode, + custom_agents=self.custom_agents, + ), + ) + + def _launch_in_sub(self) -> None: + """Launch an agent server in sub-process.""" + from agentscope._init import _INIT_SETTINGS + + self.stop_event = Event() + self.parent_con, child_con = Pipe() + start_event = Event() + server_process = Process( + target=_setup_agent_server, + kwargs={ + "host": self.host, + "port": self.port, + "server_id": self.server_id, + "init_settings": _INIT_SETTINGS, + "start_event": start_event, + "stop_event": self.stop_event, + "pipe": child_con, + "max_pool_size": self.max_pool_size, + "max_timeout_seconds": self.max_timeout_seconds, + "local_mode": self.local_mode, + "custom_agents": self.custom_agents, + }, + ) + server_process.start() + self.port = self.parent_con.recv() + start_event.wait() + self.server = server_process + logger.info( + f"Launch agent server at [{self.host}:{self.port}] 
success", + ) + + def launch(self, in_subprocess: bool = True) -> None: + """launch an agent server. + + Args: + in_subprocess (bool, optional): launch the server in subprocess. + Defaults to True. For agents that need to obtain command line + input, such as UserAgent, please set this value to False. + """ + if in_subprocess: + self._launch_in_sub() + else: + self._launch_in_main() + + def wait_until_terminate(self) -> None: + """Wait for server process""" + if self.server is not None: + self.server.join() + + def shutdown(self) -> None: + """Shutdown the agent server.""" + if self.server is not None: + if self.stop_event is not None: + self.stop_event.set() + self.stop_event = None + self.server.join() + if self.server.is_alive(): + self.server.kill() + logger.info( + f"Agent server at port [{self.port}] is killed.", + ) + self.server = None + + +def as_server() -> None: + """Launch an agent server with terminal command. + + Note: + + The arguments of `as_server` are listed as follows: + + * `--host`: the hostname of the server. + * `--port`: the socket port of the server. + * `--max-pool-size`: max number of agent reply messages that the server + can accommodate. Note that the oldest message will be deleted + after exceeding the pool size. + * `--max-timeout-seconds`: max time for reply messages to be cached + in the server. Note that expired messages will be deleted. + * `--local-mode`: whether the started agent server only listens to + local requests. + * `--model-config-path`: the path to the model config json file + + In most cases, you only need to specify the `--host`, `--port` and + `--model-config-path`. + + .. 
code-block:: shell + + as_server --host localhost --port 12345 --model-config-path config.json + + """ # noqa + parser = argparse.ArgumentParser() + parser.add_argument( + "--host", + type=str, + default="localhost", + help="hostname of the server", + ) + parser.add_argument( + "--port", + type=int, + default=12310, + help="socket port of the server", + ) + parser.add_argument( + "--max-pool-size", + type=int, + default=8192, + help=( + "max number of agent reply messages that the server " + "can accommodate. Note that the oldest message will be deleted " + "after exceeding the pool size." + ), + ) + parser.add_argument( + "--max-timeout-seconds", + type=int, + default=1800, + help=( + "max time for agent reply messages to be cached " + "in the server. Note that expired messages will be deleted." + ), + ) + parser.add_argument( + "--local-mode", + type=bool, + default=False, + help=( + "If `True`, only listen to requests from 'localhost', otherwise, " + "listen to requests from all hosts." 
+ ), + ) + parser.add_argument( + "--model-config-path", + type=str, + help="path to the model config json file", + ) + args = parser.parse_args() + agentscope.init( + project="agent_server", + name=f"server_{args.host}:{args.port}", + runtime_id=_get_timestamp( + "server_{}_{}_%y%m%d-%H%M%S", + ).format(args.host, args.port), + model_configs=args.model_config_path, + ) + launcher = RpcAgentServerLauncher( + host=args.host, + port=args.port, + max_pool_size=args.max_pool_size, + max_timeout_seconds=args.max_timeout_seconds, + local_mode=args.local_mode, + ) + launcher.launch(in_subprocess=False) + launcher.wait_until_terminate() diff --git a/src/agentscope/server/servicer.py b/src/agentscope/server/servicer.py new file mode 100644 index 000000000..53c63425f --- /dev/null +++ b/src/agentscope/server/servicer.py @@ -0,0 +1,313 @@ +# -*- coding: utf-8 -*- +""" Server of distributed agent""" +import threading +import base64 +import json +import traceback +from concurrent import futures +from loguru import logger + +try: + import dill + import grpc + from grpc import ServicerContext + from expiringdict import ExpiringDict + from ..rpc.rpc_agent_pb2 import RpcMsg # pylint: disable=E0611 + from ..rpc.rpc_agent_pb2_grpc import RpcAgentServicer +except ImportError as import_error: + from agentscope.utils.tools import ImportErrorReporter + + dill = ImportErrorReporter(import_error, "distribute") + grpc = ImportErrorReporter(import_error, "distribute") + ServicerContext = ImportErrorReporter(import_error, "distribute") + ExpiringDict = ImportErrorReporter(import_error, "distribute") + RpcMsg = ImportErrorReporter( # type: ignore[misc] + import_error, + "distribute", + ) + RpcAgentServicer = ImportErrorReporter(import_error, "distribute") + +from ..agents.agent import AgentBase +from ..message import ( + Msg, + PlaceholderMessage, + deserialize, +) + + +class AgentServerServicer(RpcAgentServicer): + """A Servicer for RPC Agent Server (formerly RpcServerSideWrapper)""" + + def 
__init__( + self, + host: str = "localhost", + port: int = None, + max_pool_size: int = 8192, + max_timeout_seconds: int = 1800, + ): + """Init the AgentServerServicer. + + Args: + host (`str`, defaults to "localhost"): + Hostname of the rpc agent server. + port (`int`, defaults to `None`): + Port of the rpc agent server. + max_pool_size (`int`, defaults to `8192`): + The max number of agent reply messages that the server can + accommodate. Note that the oldest message will be deleted + after exceeding the pool size. + max_timeout_seconds (`int`, defaults to `1800`): + Maximum time for reply messages to be cached in the server. + Note that expired messages will be deleted. + """ + self.host = host + self.port = port + self.result_pool = ExpiringDict( + max_len=max_pool_size, + max_age_seconds=max_timeout_seconds, + ) + self.executor = futures.ThreadPoolExecutor(max_workers=None) + self.task_id_lock = threading.Lock() + self.agent_id_lock = threading.Lock() + self.task_id_counter = 0 + self.agent_pool: dict[str, AgentBase] = {} + + def get_task_id(self) -> int: + """Get the auto-increment task id. + Each reply call will get a unique task id.""" + with self.task_id_lock: + self.task_id_counter += 1 + return self.task_id_counter + + def agent_exists(self, agent_id: str) -> bool: + """Check whether the agent exists. + + Args: + agent_id (`str`): the agent id. + + Returns: + bool: whether the agent exists. + """ + return agent_id in self.agent_pool + + def check_and_generate_agent( + self, + agent_id: str, + agent_configs: dict, + ) -> None: + """ + Check whether the agent exists, and create new agent instance + for new agent. + + Args: + agent_id (`str`): the agent id. + agent_configs (`dict`): configuration used to initialize the agent, + with three fields (generated in `_AgentMeta`): + + .. 
code-block:: python + + { + "class_name": {name of the agent} + "args": {args in tuple type to init the agent} + "kwargs": {args in dict type to init the agent} + } + + """ + with self.agent_id_lock: + if agent_id not in self.agent_pool: + agent_class_name = agent_configs["class_name"] + agent_instance = AgentBase.get_agent_class(agent_class_name)( + *agent_configs["args"], + **agent_configs["kwargs"], + ) + agent_instance._agent_id = agent_id # pylint: disable=W0212 + self.agent_pool[agent_id] = agent_instance + logger.info(f"create agent instance [{agent_id}]") + + def check_and_delete_agent(self, agent_id: str) -> None: + """ + Check whether the agent exists, and delete the agent instance + for the agent_id. + + Args: + agent_id (`str`): the agent id. + """ + with self.agent_id_lock: + if agent_id in self.agent_pool: + self.agent_pool.pop(agent_id) + logger.info(f"delete agent instance [{agent_id}]") + + def call_func( # pylint: disable=W0236 + self, + request: RpcMsg, + context: ServicerContext, + ) -> RpcMsg: + """Call the specific servicer function.""" + if hasattr(self, request.target_func): + if request.target_func not in ["_create_agent", "_get"]: + if not self.agent_exists(request.agent_id): + return context.abort( + grpc.StatusCode.INVALID_ARGUMENT, + f"Agent [{request.agent_id}] not exists.", + ) + return getattr(self, request.target_func)(request) + else: + # TODO: support other user defined method + logger.error(f"Unsupported method {request.target_func}") + return context.abort( + grpc.StatusCode.INVALID_ARGUMENT, + f"Unsupported method {request.target_func}", + ) + + def _reply(self, request: RpcMsg) -> RpcMsg: + """Call function of RpcAgentService + + Args: + request (`RpcMsg`): + Message containing input parameters or input parameter + placeholders. 
+ + Returns: + `RpcMsg`: A serialized Msg instance with attributes name, host, + port and task_id + """ + if request.value: + msg = deserialize(request.value) + else: + msg = None + task_id = self.get_task_id() + self.result_pool[task_id] = threading.Condition() + self.executor.submit( + self.process_messages, + task_id, + request.agent_id, + msg, # type: ignore[arg-type] + ) + return RpcMsg( + value=Msg( # type: ignore[arg-type] + name=self.agent_pool[request.agent_id].name, + content=None, + task_id=task_id, + ).serialize(), + ) + + def _get(self, request: RpcMsg) -> RpcMsg: + """Get a reply message with specific task_id. + + Args: + request (`RpcMsg`): + The task id that generated this message, with json format:: + + { + 'task_id': int + } + + Returns: + `RpcMsg`: Concrete values of the specific message (or part of it). + """ + msg = json.loads(request.value) + while True: + result = self.result_pool.get(msg["task_id"]) + if isinstance(result, threading.Condition): + with result: + result.wait(timeout=1) + else: + break + return RpcMsg(value=result.serialize()) + + def _observe(self, request: RpcMsg) -> RpcMsg: + """Observe function of the original agent. + + Args: + request (`RpcMsg`): + The serialized input to be observed. + + Returns: + `RpcMsg`: Empty RpcMsg. + """ + msgs = deserialize(request.value) + for msg in msgs: + if isinstance(msg, PlaceholderMessage): + msg.update_value() + self.agent_pool[request.agent_id].observe(msgs) + return RpcMsg() + + def _create_agent(self, request: RpcMsg) -> RpcMsg: + """Create a new agent instance with the given agent_id. + + Args: + request (RpcMsg): request message with a `agent_id` field. + """ + self.check_and_generate_agent( + request.agent_id, + agent_configs=( + dill.loads(base64.b64decode(request.value)) + if request.value + else None + ), + ) + return RpcMsg() + + def _clone_agent(self, request: RpcMsg) -> RpcMsg: + """Clone a new agent instance from the origin instance. 
+ + Args: + request (RpcMsg): The `agent_id` field is the agent_id of the + agent to be cloned. + + Returns: + `RpcMsg`: The `value` field contains the agent_id of generated + agent. + """ + agent_id = request.agent_id + with self.agent_id_lock: + if agent_id not in self.agent_pool: + raise ValueError(f"Agent [{agent_id}] not exists") + ori_agent = self.agent_pool[agent_id] + new_agent = ori_agent.__class__( + *ori_agent._init_settings["args"], # pylint: disable=W0212 + **ori_agent._init_settings["kwargs"], # pylint: disable=W0212 + ) + with self.agent_id_lock: + self.agent_pool[new_agent.agent_id] = new_agent + return RpcMsg(value=new_agent.agent_id) # type: ignore[arg-type] + + def _delete_agent(self, request: RpcMsg) -> RpcMsg: + """Delete the agent instance of the specific agent_id. + + Args: + request (RpcMsg): request message with a `agent_id` field. + """ + self.check_and_delete_agent(request.agent_id) + return RpcMsg() + + def process_messages( + self, + task_id: int, + agent_id: str, + task_msg: dict = None, + ) -> None: + """Processing an input message and generate its reply message. + + Args: + task_id (`int`): task id of the input message, . + agent_id (`str`): the id of the agent that accepted the message. + task_msg (`dict`): the input message. 
+ """ + if isinstance(task_msg, PlaceholderMessage): + task_msg.update_value() + cond = self.result_pool[task_id] + try: + result = self.agent_pool[agent_id].reply(task_msg) + self.result_pool[task_id] = result + except Exception: + error_msg = traceback.format_exc() + logger.error(f"Error in agent [{agent_id}]:\n{error_msg}") + self.result_pool[task_id] = Msg( + name="ERROR", + role="assistant", + __status="ERROR", + content=f"Error in agent [{agent_id}]:\n{error_msg}", + ) + with cond: + cond.notify_all() diff --git a/src/agentscope/service/__init__.py b/src/agentscope/service/__init__.py index dce26c195..67cb7ce73 100644 --- a/src/agentscope/service/__init__.py +++ b/src/agentscope/service/__init__.py @@ -21,6 +21,11 @@ from .sql_query.mongodb import query_mongodb from .web.search import bing_search, google_search from .web.arxiv import arxiv_search +from .web.dblp import ( + dblp_search_publications, + dblp_search_authors, + dblp_search_venues, +) from .service_response import ServiceResponse from .service_toolkit import ServiceToolkit from .service_toolkit import ServiceFactory @@ -70,6 +75,9 @@ def get_help() -> None: "load_web", "parse_html_to_text", "download_from_url", + "dblp_search_publications", + "dblp_search_authors", + "dblp_search_venues", # to be deprecated "ServiceFactory", ] diff --git a/src/agentscope/service/file/common.py b/src/agentscope/service/file/common.py index adeb5a0ad..ef8e8855b 100644 --- a/src/agentscope/service/file/common.py +++ b/src/agentscope/service/file/common.py @@ -1,10 +1,10 @@ # -*- coding: utf-8 -*- +# pylint: disable=C0301 """ Common operators for file and directory. 
""" import os import shutil from typing import List -from agentscope.utils.common import write_file from agentscope.service.service_response import ServiceResponse from agentscope.service.service_status import ServiceExecStatus @@ -29,7 +29,19 @@ def create_file(file_path: str, content: str = "") -> ServiceResponse: status=ServiceExecStatus.ERROR, content="FileExistsError: The file already exists.", ) - return write_file(content, file_path) + try: + with open(file_path, "w", encoding="utf-8") as file: + file.write(content) + return ServiceResponse( + status=ServiceExecStatus.SUCCESS, + content="Success", + ) + except Exception as e: + error_message = f"{e.__class__.__name__}: {e}" + return ServiceResponse( + status=ServiceExecStatus.ERROR, + content=error_message, + ) def delete_file(file_path: str) -> ServiceResponse: diff --git a/src/agentscope/service/web/dblp.py b/src/agentscope/service/web/dblp.py new file mode 100644 index 000000000..7d6ab9c1c --- /dev/null +++ b/src/agentscope/service/web/dblp.py @@ -0,0 +1,318 @@ +# -*- coding: utf-8 -*- +""" Search papers, authors and venues in DBLP API. +For detail usage of the DBLP API +please refer to https://dblp.org/faq/How+can+I+fetch+DBLP+data.html +""" +from agentscope.service.service_response import ( + ServiceResponse, + ServiceExecStatus, +) +from agentscope.utils.common import requests_get + + +def dblp_search_publications( + question: str, + num_results: int = 30, + start: int = 0, + num_completion: int = 10, +) -> ServiceResponse: + """Search publications in the DBLP database. + + Args: + question (`str`): + The search query string. + num_results (`int`, defaults to `30`): + The number of search results to return. + start (`int`, defaults to `0`): + The index of the first search result to return. + num_completion (`int`, defaults to `10`): + The number of completions to generate. + + Returns: + `ServiceResponse`: A dictionary containing `status` and `content`. 
+ The `status` attribute is from the ServiceExecStatus enum, + indicating success or error. + The `content` is a list of parsed publication data if successful, + or an error message if failed. + Each item in the list contains publication information + includes title, authors, venue, pages, year, type, DOI, and URL. + + Example: + .. code-block:: python + search_results = dblp_search_publications(question="Extreme + Learning Machine", + num_results=3, + results_per_page=1, + num_completion=1) + print(search_results) + + It returns the following structure: + + .. code-block:: python + + { + 'status': , + 'content': [ + { + 'title': 'Power transformers fault diagnosis + based on a meta-learning approach to kernel + extreme learning machine with opposition-based + learning sparrow search algorithm.', + 'venue': 'J. Intell. Fuzzy Syst.', + 'pages': '455-466', + 'year': '2023', + 'type': 'Journal Articles', + 'doi': '10.3233/JIFS-211862', + 'url': 'https://dblp.org/rec/journals/jifs/YuTZTCH23', + 'authors': 'Song Yu, Weimin Tan, Chengming Zhang, + Chao Tang, Lihong Cai, Dong Hu' + }, + { + 'title': 'Performance comparison of Extreme Learning + Machinesand other machine learning methods + on WBCD data set.', + 'venue': 'SIU', + 'pages': '1-4', + 'year': '2021', + 'type': 'Conference and Workshop Papers', + 'doi': '10.1109/SIU53274.2021.9477984', + 'url': 'https://dblp.org/rec/conf/siu/KeskinDAY21', + 'authors': 'Ömer Selim Keskin, Akif Durdu, + Muhammet Fatih Aslan, Abdullah Yusefi' + } + ] + } + """ + + url = "https://dblp.org/search/publ/api" + params = { + "q": question, + "format": "json", + "h": num_results, + "f": start, + "c": num_completion, + } + search_results = requests_get(url, params) + + if isinstance(search_results, str): + return ServiceResponse(ServiceExecStatus.ERROR, search_results) + + hits = search_results.get("result", {}).get("hits", {}).get("hit", []) + parsed_data = [] + for hit in hits: + info = hit.get("info", {}) + title = info.get("title", "No 
title available") + venue = info.get("venue", "No venue available") + pages = info.get("pages", "No page information") + year = info.get("year", "Year not specified") + pub_type = info.get("type", "Type not specified") + doi = info.get("doi", "No DOI available") + url = info.get("url", "No URL available") + authors = info.get("authors", {}).get("author", []) + authors_info = info.get("authors", {}).get("author", []) + if isinstance( + authors_info, + dict, + ): # Check if there's only one author in a dict format + authors_info = [authors_info] + authors = ", ".join( + [author["text"] for author in authors_info if "text" in author], + ) + data = { + "title": title, + "venue": venue, + "pages": pages, + "year": year, + "type": pub_type, + "doi": doi, + "url": url, + "authors": authors, + } + parsed_data.append(data) + return ServiceResponse(ServiceExecStatus.SUCCESS, parsed_data) + + +def dblp_search_authors( + question: str, + num_results: int = 30, + start: int = 0, + num_completion: int = 10, +) -> ServiceResponse: + """Search for author information in the DBLP database. + + Args: + question (`str`): + The search query string. + num_results (`int`, defaults to `30`): + The number of search results to return. + start (`int`, defaults to `0`): + The index of the first search result to return. + num_completion (`int`, defaults to `10`): + The number of completions to generate. + + + Returns: + `ServiceResponse`: A dictionary containing `status` and `content`. + The `status` attribute is from the + ServiceExecStatus enum, indicating the success or error of the search. + The `content` is a list of parsed author + data if successful, or an error message if failed. + Each item in the list contains author information + including their name, URL, and affiliations. + + Example: + .. 
code-block:: python + + search_results = dblp_search_authors(question="Liu ZiWei", + num_results=3, + results_per_page=1, + num_completion=1) + print(search_results) + + It returns the following structure: + + .. code-block:: python + + { + 'status': , + 'content': [ + { + 'author': 'Ziwei Liu 0001', + 'url': 'https://dblp.org/pid/05/6300-1', + 'affiliations': 'Advantech Singapore Pte Ltd, + Singapore; + National University of Singapore, + Department of Computer Science, Singapore' + }, + { + 'author': 'Ziwei Liu 0002', + 'url': 'https://dblp.org/pid/05/6300-2', + 'affiliations': 'Nanyang Technological University, + S-Lab, Singapore; + Chinese University of Hong Kong, + Department of Information Engineering, + Hong Kong' + } + ] + } + """ + url = "https://dblp.org/search/author/api" + params = { + "q": question, + "format": "json", + "h": num_results, + "f": start, + "c": num_completion, + } + search_results = requests_get(url, params) + if isinstance(search_results, str): + return ServiceResponse(ServiceExecStatus.ERROR, search_results) + hits = search_results.get("result", {}).get("hits", {}).get("hit", []) + parsed_data = [] + for hit in hits: + author = hit["info"]["author"] + author_url = hit["info"]["url"] + affiliations = [] + notes = hit["info"].get("notes", {}) + note_entries = notes.get("note", []) + if isinstance(note_entries, dict): + note_entries = [note_entries] + for note in note_entries: + if note["@type"] == "affiliation": + affiliations.append(note["text"]) + affiliations = "; ".join(affiliations) + entry_dict = { + "author": author, + "url": author_url, + "affiliations": affiliations, + } + parsed_data.append(entry_dict) + return ServiceResponse(ServiceExecStatus.SUCCESS, parsed_data) + + +def dblp_search_venues( + question: str, + num_results: int = 30, + start: int = 0, + num_completion: int = 10, +) -> ServiceResponse: + """Search for venue information in the DBLP database. + + Args: + question (`str`): + The search query string. 
+ num_results (`int`, defaults to `30`): + The number of search results to return. + start (`int`, defaults to `0`): + The index of the first search result to return. + num_completion (`int`, defaults to `10`): + The number of completions to generate. + + Returns: + `ServiceResponse`: A dictionary containing `status` and `content`. + The `status` attribute is from the ServiceExecStatus enum, + indicating the success or error of the search. + The `content` is a list of parsed venue data if successful, + or an error message if failed. + Each item in the list contains venue information including + its name, acronym, type, and URL. + + Example: + .. code-block:: python + + search_results = dblp_search_venues(question="AAAI", + num_results=1, + results_per_page=1, + num_completion=1) + print(search_results) + + It returns the following structure: + + .. code-block:: python + + { + 'status': , + 'content': [ + { + 'venue': 'AAAI Conference on Artificial Intelligence + (AAAI)', + 'acronym': 'AAAI', + 'type': 'Conference or Workshop', + 'url': 'https://dblp.org/db/conf/aaai/' + }, + { + 'venue': ''AAAI Fall Symposium Series', + 'acronym': 'No acronym available', + 'type': 'Conference or Workshop', + 'url': 'https://dblp.org/db/conf/aaaifs/' + } + ] + } + """ + url = "https://dblp.org/search/venue/api" + params = { + "q": question, + "format": "json", + "h": num_results, + "f": start, + "c": num_completion, + } + search_results = requests_get(url, params) + if isinstance(search_results, str): + return ServiceResponse(ServiceExecStatus.ERROR, search_results) + + hits = search_results.get("result", {}).get("hits", {}).get("hit", []) + parsed_data = [] + for hit in hits: + venue = hit["info"]["venue"] + acronym = hit["info"].get("acronym", "No acronym available") + venue_type = hit["info"].get("type", "Type not specified") + url = hit["info"]["url"] + entry_dict = { + "venue": venue, + "acronym": acronym, + "type": venue_type, + "url": url, + } + parsed_data.append(entry_dict) 
+ return ServiceResponse(ServiceExecStatus.SUCCESS, parsed_data) diff --git a/src/agentscope/service/web/search.py b/src/agentscope/service/web/search.py index fd72b7536..b5ff7e59f 100644 --- a/src/agentscope/service/web/search.py +++ b/src/agentscope/service/web/search.py @@ -1,7 +1,6 @@ # -*- coding: utf-8 -*- """Search question in the web""" from typing import Any - from agentscope.service.service_response import ServiceResponse from agentscope.utils.common import requests_get from agentscope.service.service_status import ServiceExecStatus diff --git a/src/agentscope/utils/monitor.py b/src/agentscope/utils/monitor.py index e1c9e98f4..08b1dc24b 100644 --- a/src/agentscope/utils/monitor.py +++ b/src/agentscope/utils/monitor.py @@ -288,6 +288,59 @@ def sqlite_cursor(db_path: str, timeout: float = 30.0) -> Generator: conn.close() +class DummyMonitor(MonitorBase): + """A monitor that does nothing""" + + def register( + self, + metric_name: str, + metric_unit: Optional[str] = None, + quota: Optional[float] = None, + ) -> bool: + return True + + def exists(self, metric_name: str) -> bool: + return True + + def add(self, metric_name: str, value: float) -> bool: + return True + + def update(self, values: dict, prefix: Optional[str] = None) -> None: + return None + + def clear(self, metric_name: str) -> bool: + return True + + def remove(self, metric_name: str) -> bool: + return True + + def get_value(self, metric_name: str) -> Optional[float]: + return 0.0 + + def get_unit(self, metric_name: str) -> Optional[str]: + return "" + + def get_quota(self, metric_name: str) -> Optional[float]: + return 0.0 + + def set_quota(self, metric_name: str, quota: float) -> bool: + return True + + def get_metric(self, metric_name: str) -> Optional[dict]: + return {} + + def get_metrics(self, filter_regex: Optional[str] = None) -> dict: + return {} + + def register_budget( + self, + model_name: str, + value: float, + prefix: Optional[str] = "local", + ) -> bool: + return True + + class 
SqliteMonitor(MonitorBase): """A monitor based on sqlite""" @@ -646,6 +699,8 @@ def get_monitor( if cls._instance is None: if impl_type is None or impl_type.lower() == "sqlite": cls._instance = SqliteMonitor(db_path=db_path) + elif impl_type == "dummy": + cls._instance = DummyMonitor() else: raise NotImplementedError( "Monitor with type [{type}] is not implemented.", diff --git a/src/agentscope/utils/tools.py b/src/agentscope/utils/tools.py index 75cc0c7cb..8888d99e6 100644 --- a/src/agentscope/utils/tools.py +++ b/src/agentscope/utils/tools.py @@ -3,9 +3,11 @@ import base64 import datetime import json +import os.path import secrets import string -from typing import Any, Literal, List +import socket +from typing import Any, Literal, List, Optional from urllib.parse import urlparse @@ -60,6 +62,42 @@ def to_dialog_str(item: dict) -> str: return f"{speaker}: {content}" +def find_available_port() -> int: + """Get an unoccupied socket port number.""" + with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s: + s.bind(("", 0)) + return s.getsockname()[1] + + +def check_port(port: Optional[int] = None) -> int: + """Check if the port is available. + + Args: + port (`int`): + the port number being checked. + + Returns: + `int`: the port number that passed the check. If the port is found + to be occupied, an available port number will be automatically + returned. 
+ """ + if port is None: + new_port = find_available_port() + logger.warning( + "agent server port is not provided, automatically select " + f"[{new_port}] as the port number.", + ) + return new_port + with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s: + if s.connect_ex(("localhost", port)) == 0: + new_port = find_available_port() + logger.warning( + f"Port [{port}] is occupied, use [{new_port}] instead", + ) + return new_port + return port + + def _guess_type_by_extension( url: str, ) -> Literal["image", "audio", "video", "file"]: @@ -129,7 +167,7 @@ def _to_openai_image_url(url: str) -> str: """ # See https://platform.openai.com/docs/guides/vision for details of # support image extensions. - image_extensions = ( + support_image_extensions = ( ".png", ".jpg", ".jpeg", @@ -139,16 +177,17 @@ def _to_openai_image_url(url: str) -> str: parsed_url = urlparse(url) - # Checking for HTTP(S) image links - if parsed_url.scheme in ["http", "https"]: - lower_path = parsed_url.path.lower() - if lower_path.endswith(image_extensions): + lower_url = url.lower() + + # Web url + if parsed_url.scheme != "": + if any(lower_url.endswith(_) for _ in support_image_extensions): return url # Check if it is a local file - elif parsed_url.scheme == "file" or not parsed_url.scheme: - if parsed_url.path.lower().endswith(image_extensions): - with open(parsed_url.path, "rb") as image_file: + elif os.path.exists(url) and os.path.isfile(url): + if any(lower_url.endswith(_) for _ in support_image_extensions): + with open(url, "rb") as image_file: base64_image = base64.b64encode(image_file.read()).decode( "utf-8", ) @@ -156,7 +195,7 @@ def _to_openai_image_url(url: str) -> str: mime_type = f"image/{extension}" return f"data:{mime_type};base64,{base64_image}" - raise TypeError(f"{url} should be end with {image_extensions}.") + raise TypeError(f"{url} should be end with {support_image_extensions}.") def _download_file(url: str, path_file: str, max_retries: int = 3) -> bool: @@ -294,3 
+333,39 @@ def _join_str_with_comma_and(elements: List[str]) -> str: return " and ".join(elements) else: return ", ".join(elements[:-1]) + f", and {elements[-1]}" + + +class ImportErrorReporter: + """Used as a placeholder for missing packages. + When called, an ImportError will be raised, prompting the user to install + the specified extras requirement. + """ + + def __init__(self, error: ImportError, extras_require: str = None) -> None: + """Init the ImportErrorReporter. + + Args: + error (`ImportError`): the original ImportError. + extras_require (`str`): the extras requirement. + """ + self.error = error + self.extras_require = extras_require + + def __call__(self, *args: Any, **kwds: Any) -> Any: + return self._raise_import_error() + + def __getattr__(self, name: str) -> Any: + return self._raise_import_error() + + def __getitem__(self, __key: Any) -> Any: + return self._raise_import_error() + + def _raise_import_error(self) -> Any: + """Raise the ImportError""" + err_msg = f"ImportError occurred: [{self.error.msg}]." + if self.extras_require is not None: + err_msg += ( + f" Please install [{self.extras_require}] version" + " of agentscope." 
+ ) + raise ImportError(err_msg) diff --git a/src/agentscope/web/workstation/workflow_node.py b/src/agentscope/web/workstation/workflow_node.py index f4b14d914..0005e04b8 100644 --- a/src/agentscope/web/workstation/workflow_node.py +++ b/src/agentscope/web/workstation/workflow_node.py @@ -826,6 +826,7 @@ def compile(self) -> dict: "dashscope_chat": ModelNode, "openai_chat": ModelNode, "post_api_chat": ModelNode, + "post_api_dall_e": ModelNode, "Message": MsgNode, "DialogAgent": DialogAgentNode, "UserAgent": UserAgentNode, diff --git a/tests/format_test.py b/tests/format_test.py index 589398837..07efa86ae 100644 --- a/tests/format_test.py +++ b/tests/format_test.py @@ -1,6 +1,7 @@ # -*- coding: utf-8 -*- """Unit test for prompt engineering strategies in format function.""" import unittest +from unittest import mock from unittest.mock import MagicMock, patch from agentscope.message import Msg @@ -9,8 +10,10 @@ OllamaChatWrapper, OllamaGenerationWrapper, GeminiChatWrapper, + ZhipuAIChatWrapper, DashScopeChatWrapper, DashScopeMultiModalWrapper, + LiteLLMChatWrapper, ) @@ -29,6 +32,27 @@ def setUp(self) -> None: ], ] + self.inputs_vision = [ + Msg("system", "You are a helpful assistant", role="system"), + [ + Msg( + "user", + "Describe the images", + role="user", + url="https://fakeweb/test.jpg", + ), + Msg( + "user", + "And this images", + "user", + url=[ + "/Users/xxx/abc.png", + "/Users/xxx/def.mp3", + ], + ), + ], + ] + self.wrong_inputs = [ Msg("system", "You are a helpful assistant", role="system"), [ @@ -37,6 +61,118 @@ def setUp(self) -> None: ], ] + @patch("builtins.open", mock.mock_open(read_data=b"abcdef")) + @patch("os.path.isfile") + @patch("os.path.exists") + @patch("openai.OpenAI") + def test_openai_chat_vision_with_wrong_model( + self, + mock_client: MagicMock, + mock_exists: MagicMock, + mock_isfile: MagicMock, + ) -> None: + """Unit test for the format function in openai chat api wrapper with + vision models""" + mock_exists.side_effect = lambda url: 
url == "/Users/xxx/abc.png" + mock_isfile.side_effect = lambda url: url == "/Users/xxx/abc.png" + + # Prepare the mock client + mock_client.return_value = "client_dummy" + + model = OpenAIChatWrapper( + config_name="", + model_name="gpt-4", + ) + + # correct format + ground_truth = [ + { + "role": "system", + "content": "You are a helpful assistant", + "name": "system", + }, + { + "role": "user", + "name": "user", + "content": "Describe the images", + }, + { + "role": "user", + "name": "user", + "content": "And this images", + }, + ] + + prompt = model.format(*self.inputs_vision) + self.assertListEqual(prompt, ground_truth) + + @patch("builtins.open", mock.mock_open(read_data=b"abcdef")) + @patch("os.path.isfile") + @patch("os.path.exists") + @patch("openai.OpenAI") + def test_openai_chat_vision( + self, + mock_client: MagicMock, + mock_exists: MagicMock, + mock_isfile: MagicMock, + ) -> None: + """Unit test for the format function in openai chat api wrapper with + vision models""" + mock_exists.side_effect = lambda url: url == "/Users/xxx/abc.png" + mock_isfile.side_effect = lambda url: url == "/Users/xxx/abc.png" + + # Prepare the mock client + mock_client.return_value = "client_dummy" + + model = OpenAIChatWrapper( + config_name="", + model_name="gpt-4o", + ) + + # correct format + ground_truth = [ + { + "role": "system", + "content": "You are a helpful assistant", + "name": "system", + }, + { + "role": "user", + "name": "user", + "content": [ + { + "type": "text", + "text": "Describe the images", + }, + { + "type": "image_url", + "image_url": { + "url": "https://fakeweb/test.jpg", + }, + }, + ], + }, + { + "role": "user", + "name": "user", + "content": [ + { + "type": "text", + "text": "And this images", + }, + { + "type": "image_url", + "image_url": { + "url": "", + }, + }, + ], + }, + ] + + prompt = model.format(*self.inputs_vision) + self.assertListEqual(prompt, ground_truth) + @patch("openai.OpenAI") def test_openai_chat(self, mock_client: MagicMock) -> 
None: """Unit test for the format function in openai chat api wrapper.""" @@ -83,9 +219,16 @@ def test_ollama_chat(self) -> None: # correct format ground_truth = [ - {"role": "system", "content": "You are a helpful assistant"}, - {"role": "user", "content": "What is the weather today?"}, - {"role": "assistant", "content": "It is sunny today"}, + { + "role": "system", + "content": ( + "You are a helpful assistant\n" + "\n" + "## Dialogue History\n" + "user: What is the weather today?\n" + "assistant: It is sunny today" + ), + }, ] prompt = model.format(*self.inputs) # type: ignore[arg-type] self.assertEqual(prompt, ground_truth) @@ -173,6 +316,62 @@ def test_dashscope_chat(self) -> None: with self.assertRaises(TypeError): model.format(*self.wrong_inputs) # type: ignore[arg-type] + def test_zhipuai_chat(self) -> None: + """Unit test for the format function in zhipu chat api wrapper.""" + model = ZhipuAIChatWrapper( + config_name="", + model_name="glm-4", + api_key="xxx", + ) + + ground_truth = [ + { + "content": "You are a helpful assistant", + "role": "system", + }, + { + "content": ( + "## Dialogue History\n" + "user: What is the weather today?\n" + "assistant: It is sunny today" + ), + "role": "user", + }, + ] + + prompt = model.format(*self.inputs) + self.assertListEqual(prompt, ground_truth) + + # wrong format + with self.assertRaises(TypeError): + model.format(*self.wrong_inputs) # type: ignore[arg-type] + + def test_litellm_chat(self) -> None: + """Unit test for the format function in litellm chat api wrapper.""" + model = LiteLLMChatWrapper( + config_name="", + model_name="gpt-3.5-turbo", + api_key="xxx", + ) + + ground_truth = [ + { + "role": "user", + "content": ( + "You are a helpful assistant\n\n" + "## Dialogue History\nuser: What is the weather today?\n" + "assistant: It is sunny today" + ), + }, + ] + + prompt = model.format(*self.inputs) + self.assertListEqual(prompt, ground_truth) + + # wrong format + with self.assertRaises(TypeError): + 
model.format(*self.wrong_inputs) # type: ignore[arg-type] + def test_dashscope_multimodal_image(self) -> None: """Unit test for the format function in dashscope multimodal conversation api wrapper for image.""" diff --git a/tests/litellm_test.py b/tests/litellm_test.py new file mode 100644 index 000000000..3ee4a8503 --- /dev/null +++ b/tests/litellm_test.py @@ -0,0 +1,61 @@ +# -*- coding: utf-8 -*- +"""litellm test""" +import unittest +from unittest.mock import patch, MagicMock + +import agentscope +from agentscope.models import load_model_by_config_name + + +class TestLiteLLMChatWrapper(unittest.TestCase): + """Test LiteLLM Chat Wrapper""" + + def setUp(self) -> None: + self.api_key = "test_api_key.secret_key" + self.messages = [ + {"role": "user", "content": "Hello, litellm!"}, + {"role": "assistant", "content": "How can I assist you?"}, + ] + + @patch("agentscope.models.litellm_model.litellm") + def test_chat(self, mock_litellm: MagicMock) -> None: + """ + Test chat""" + mock_response = MagicMock() + mock_response.model_dump.return_value = { + "choices": [ + {"message": {"content": "Hello, this is a mocked response!"}}, + ], + "usage": { + "prompt_tokens": 100, + "completion_tokens": 5, + "total_tokens": 105, + }, + } + mock_response.choices[ + 0 + ].message.content = "Hello, this is a mocked response!" 
+ + mock_litellm.completion.return_value = mock_response + + agentscope.init( + model_configs={ + "config_name": "test_config", + "model_type": "litellm_chat", + "model_name": "ollama/llama3:8b", + "api_key": self.api_key, + }, + ) + + model = load_model_by_config_name("test_config") + + response = model( + messages=self.messages, + api_base="http://localhost:11434", + ) + + self.assertEqual(response.text, "Hello, this is a mocked response!") + + +if __name__ == "__main__": + unittest.main() diff --git a/tests/memory_test.py b/tests/memory_test.py index 4a25badec..629fb45d9 100644 --- a/tests/memory_test.py +++ b/tests/memory_test.py @@ -3,10 +3,11 @@ Unit tests for memory classes and functions """ +import os import unittest from unittest.mock import patch, MagicMock -from agentscope.message import Msg +from agentscope.message import Msg, Tht from agentscope.memory import TemporaryMemory @@ -17,6 +18,8 @@ class TemporaryMemoryTest(unittest.TestCase): def setUp(self) -> None: self.memory = TemporaryMemory() + self.file_name_1 = "tmp_mem_file1.txt" + self.file_name_2 = "tmp_mem_file2.txt" self.msg_1 = Msg("user", "Hello", role="user") self.msg_2 = Msg( "agent", @@ -29,19 +32,15 @@ def setUp(self) -> None: role="assistant", ) - self.dict_1 = { - "name": "dict1", - "content": "dict 1", - "role": "assistant", - } - self.dict_2 = { - "name": "dict2", - "content": "dict 2", - "role": "assistant", - } - self.invalid = {"invalid_key": "invalid_value"} + def tearDown(self) -> None: + """Clean up before & after tests.""" + if os.path.exists(self.file_name_1): + os.remove(self.file_name_1) + if os.path.exists(self.file_name_2): + os.remove(self.file_name_2) + def test_add(self) -> None: """Test add different types of object""" # add msg @@ -51,18 +50,11 @@ def test_add(self) -> None: [self.msg_1], ) - # add dict - self.memory.add(self.dict_1) - self.assertEqual( - self.memory.get_memory(), - [self.msg_1, self.dict_1], - ) - # add list self.memory.add([self.msg_2, self.msg_3]) 
self.assertEqual( self.memory.get_memory(), - [self.msg_1, self.dict_1, self.msg_2, self.msg_3], + [self.msg_1, self.msg_2, self.msg_3], ) @patch("loguru.logger.warning") @@ -84,17 +76,11 @@ def test_delete(self, mock_logging: MagicMock) -> None: def test_invalid(self) -> None: """Test invalid operations for memory""" - self.memory.add(self.invalid) # test invalid add - self.assertEqual( - self.memory.get_memory(), - [self.invalid], - ) - - # test print - self.assertEqual( - self.memory.get_memory(), - [{"invalid_key": "invalid_value"}], + with self.assertRaises(Exception) as context: + self.memory.add(self.invalid) + self.assertTrue( + f"Cannot add {self.invalid} to memory" in str(context.exception), ) def test_load_export(self) -> None: @@ -102,11 +88,11 @@ def test_load_export(self) -> None: Test load and export function of TemporaryMemory """ memory = TemporaryMemory() - user_input = {"name": "user", "content": "Hello"} - agent_input = { - "name": "agent", - "content": "Hello! How can I help you?", - } + user_input = Msg(name="user", content="Hello") + agent_input = Msg( + name="agent", + content="Hello! 
How can I help you?", + ) memory.load([user_input, agent_input]) retrieved_mem = memory.export(to_mem=True) self.assertEqual( @@ -114,6 +100,44 @@ def test_load_export(self) -> None: [user_input, agent_input], ) + memory.export(file_path=self.file_name_1) + memory.clear() + self.assertEqual( + memory.get_memory(), + [], + ) + memory.load(self.file_name_1) + self.assertEqual( + memory.get_memory(), + [user_input, agent_input], + ) + + def test_tht_memory(self) -> None: + """ + Test temporary memory with Tht, + add, clear, export, loading + """ + memory = TemporaryMemory() + thought = Tht("testing") + memory.add(thought) + + self.assertEqual( + memory.get_memory(), + [thought], + ) + + memory.export(file_path=self.file_name_2) + memory.clear() + self.assertEqual( + memory.get_memory(), + [], + ) + memory.load(self.file_name_2) + self.assertEqual( + memory.get_memory(), + [thought], + ) + if __name__ == "__main__": unittest.main() diff --git a/tests/monitor_test.py b/tests/monitor_test.py index 63ade5f09..aa381927f 100644 --- a/tests/monitor_test.py +++ b/tests/monitor_test.py @@ -6,9 +6,12 @@ import unittest import uuid import os -from agentscope.utils import MonitorBase, QuotaExceededError, MonitorFactory +import shutil +from loguru import logger -from agentscope.utils.monitor import SqliteMonitor +import agentscope +from agentscope.utils import MonitorBase, QuotaExceededError, MonitorFactory +from agentscope.utils.monitor import SqliteMonitor, DummyMonitor class MonitorFactoryTest(unittest.TestCase): @@ -17,10 +20,10 @@ class MonitorFactoryTest(unittest.TestCase): def setUp(self) -> None: MonitorFactory._instance = None # pylint: disable=W0212 self.db_path = f"test-{uuid.uuid4()}.db" - _ = MonitorFactory.get_monitor(db_path=self.db_path) def test_get_monitor(self) -> None: """Test get monitor method of MonitorFactory.""" + _ = MonitorFactory.get_monitor(db_path=self.db_path) monitor1 = MonitorFactory.get_monitor() monitor2 = MonitorFactory.get_monitor() 
self.assertEqual(monitor1, monitor2) @@ -31,11 +34,59 @@ def test_get_monitor(self) -> None: self.assertTrue(monitor2.remove("token_num")) self.assertFalse(monitor1.exists("token_num")) + def test_monitor_type(self) -> None: + """Test getting different types of monitor""" + monitor = MonitorFactory.get_monitor(impl_type="dummy") + self.assertTrue(isinstance(monitor, DummyMonitor)) + MonitorFactory._instance = None # pylint: disable=W0212 + monitor = MonitorFactory.get_monitor( + impl_type="sqlite", + db_path=self.db_path, + ) + self.assertTrue(isinstance(monitor, SqliteMonitor)) + def tearDown(self) -> None: MonitorFactory._instance = None # pylint: disable=W0212 os.remove(self.db_path) +class DummyMonitorTest(unittest.TestCase): + """Test class for DummyMonitor""" + + def setUp(self) -> None: + MonitorFactory._instance = None # pylint: disable=W0212 + agentscope.init( + project="test", + name="monitor", + save_dir="./test_runs", + save_log=True, + use_monitor=False, + ) + + def test_dummy_monitor(self) -> None: + """Test the dummy monitor""" + monitor = MonitorFactory.get_monitor() + self.assertTrue( + monitor.register_budget( + model_name="qwen", + value=100.0, + prefix="xxx", + ), + ) + self.assertTrue( + monitor.register( + "prompt_tokens", + metric_unit="token", + ), + ) + monitor.update({"call_counter": 1}) + + def tearDown(self) -> None: + MonitorFactory._instance = None # pylint: disable=W0212 + logger.remove() + shutil.rmtree("./test_runs") + + class MonitorTestBase(unittest.TestCase): """An abstract test class for MonitorBase interface""" diff --git a/tests/msghub_test.py b/tests/msghub_test.py index d2034ac2b..75f61dccc 100644 --- a/tests/msghub_test.py +++ b/tests/msghub_test.py @@ -5,6 +5,7 @@ from agentscope.agents import AgentBase from agentscope import msghub +from agentscope.message import Msg class TestAgent(AgentBase): @@ -33,10 +34,10 @@ def setUp(self) -> None: def test_msghub_operation(self) -> None: """Test add, delete and broadcast operations""" - 
msg1 = {"msg": 1} - msg2 = {"msg": 2} - msg3 = {"msg": 3} - msg4 = {"msg": 4} + msg1 = Msg(name="a1", content="msg1") + msg2 = Msg(name="a2", content="msg2") + msg3 = Msg(name="a3", content="msg3") + msg4 = Msg(name="a4", content="msg4") with msghub(participants=[self.agent1, self.agent2]) as hub: self.agent1(msg1) @@ -68,11 +69,12 @@ def test_msghub(self) -> None: """msghub test.""" ground_truth = [ - { - "role": "wisper", - "content": "This secret that my password is 123456 can't be" + Msg( + name="w1", + content="This secret that my password is 123456 can't be" " leaked!", - }, + role="wisper", + ), ] with msghub(participants=[self.wisper, self.agent1, self.agent2]): diff --git a/tests/parser_test.py b/tests/parser_test.py new file mode 100644 index 000000000..384ef64cf --- /dev/null +++ b/tests/parser_test.py @@ -0,0 +1,232 @@ +# -*- coding: utf-8 -*- +"""Unit test for model response parser.""" +import unittest + +from agentscope.models import ModelResponse +from agentscope.parsers import ( + MarkdownJsonDictParser, + MarkdownJsonObjectParser, + MarkdownCodeBlockParser, + MultiTaggedContentParser, + TaggedContent, +) +from agentscope.parsers.parser_base import DictFilterMixin + + +class ModelResponseParserTest(unittest.TestCase): + """Unit test for model response parser.""" + + def setUp(self) -> None: + """Init for ExampleTest.""" + self.res_dict_1 = ModelResponse( + text=( + "```json\n" + '{"speak": "Hello, world!", ' + '"thought": "xxx", ' + '"end_discussion": true}\n```' + ), + ) + self.instruction_dict_1 = ( + "You should respond a json object in a json fenced code block " + "as follows:\n" + "```json\n" + '{"speak": "what you speak", ' + '"thought": "what you thought", ' + '"end_discussion": true/false}\n' + "```" + ) + self.res_dict_2 = ModelResponse( + text="[SPEAK]Hello, world![/SPEAK]\n" + "[THOUGHT]xxx[/THOUGHT]\n" + "[END_DISCUSSION]true[/END_DISCUSSION]", + ) + self.instruction_dict_2 = ( + "Respond with specific tags as outlined below, and the 
content " + "between [END_DISCUSSION] and [/END_DISCUSSION] MUST be a JSON " + "object:\n" + "[SPEAK]what you speak[/SPEAK]\n" + "[THOUGHT]what you thought[/THOUGHT]\n" + "[END_DISCUSSION]true/false[/END_DISCUSSION]" + ) + self.gt_dict = { + "speak": "Hello, world!", + "thought": "xxx", + "end_discussion": True, + } + self.hint_dict = ( + '{"speak": "what you speak", ' + '"thought": "what you thought", ' + '"end_discussion": true/false}' + ) + + self.gt_to_memory = {"speak": "Hello, world!", "thought": "xxx"} + self.gt_to_content = "Hello, world!" + self.gt_to_metadata = {"end_discussion": True} + + self.res_list = ModelResponse(text="""```json\n[1,2,3]\n```""") + self.instruction_list = ( + "You should respond a json object in a json fenced code block as " + "follows:\n" + "```json\n" + "{Your generated list of numbers}\n" + "```" + ) + self.gt_list = [1, 2, 3] + self.hint_list = "{Your generated list of numbers}" + + self.res_float = ModelResponse(text="""```json\n3.14\n```""") + self.instruction_float = ( + "You should respond a json object in a json fenced code block as " + "follows:\n" + "```json\n" + "{Your generated float number}\n" + "```" + ) + self.gt_float = 3.14 + self.hint_float = "{Your generated float number}" + + self.res_code = ModelResponse( + text="""```python\nprint("Hello, world!")\n```""", + ) + self.instruction_code = ( + "You should generate python code in a python fenced code block as " + "follows: \n" + "```python\n" + "${your_python_code}\n" + "```" + ) + self.instruction_code_with_hint = ( + "You should generate python code in a python fenced code block as " + "follows: \n" + "```python\n" + "abc\n" + "```" + ) + self.gt_code = """\nprint("Hello, world!")\n""" + + def test_markdownjsondictparser(self) -> None: + """Test for MarkdownJsonDictParser""" + parser = MarkdownJsonDictParser( + content_hint=self.hint_dict, + keys_to_memory=["speak", "thought"], + keys_to_content="speak", + keys_to_metadata=["end_discussion"], + ) + + 
self.assertEqual(parser.format_instruction, self.instruction_dict_1) + + res = parser.parse(self.res_dict_1) + + self.assertDictEqual(res.parsed, self.gt_dict) + + # test filter functions + self.assertDictEqual(parser.to_memory(res.parsed), self.gt_to_memory) + self.assertEqual(parser.to_content(res.parsed), self.gt_to_content) + self.assertDictEqual( + parser.to_metadata(res.parsed), + self.gt_to_metadata, + ) + + def test_markdownjsonobjectparser(self) -> None: + """Test for MarkdownJsonObjectParser""" + # list + parser_list = MarkdownJsonObjectParser(content_hint=self.hint_list) + + self.assertEqual(parser_list.format_instruction, self.instruction_list) + + res_list = parser_list.parse(self.res_list) + self.assertListEqual(res_list.parsed, self.gt_list) + + # float + parser_float = MarkdownJsonObjectParser(content_hint=self.hint_float) + + self.assertEqual( + parser_float.format_instruction, + self.instruction_float, + ) + + res_float = parser_float.parse(self.res_float) + self.assertEqual(res_float.parsed, self.gt_float) + + def test_markdowncodeblockparser(self) -> None: + """Test for MarkdownCodeBlockParser""" + parser = MarkdownCodeBlockParser(language_name="python") + + self.assertEqual(parser.format_instruction, self.instruction_code) + + res = parser.parse(self.res_code) + + self.assertEqual(res.parsed, self.gt_code) + + def test_markdowncodeblockparser_with_hint(self) -> None: + """Test for MarkdownCodeBlockParser""" + parser = MarkdownCodeBlockParser( + language_name="python", + content_hint="abc", + ) + + self.assertEqual( + parser.format_instruction, + self.instruction_code_with_hint, + ) + + res = parser.parse(self.res_code) + + self.assertEqual(res.parsed, self.gt_code) + + def test_multitaggedcontentparser(self) -> None: + """Test for MultiTaggedContentParser""" + parser = MultiTaggedContentParser( + TaggedContent( + "speak", + tag_begin="[SPEAK]", + content_hint="what you speak", + tag_end="[/SPEAK]", + ), + TaggedContent( + "thought", + 
tag_begin="[THOUGHT]", + content_hint="what you thought", + tag_end="[/THOUGHT]", + ), + TaggedContent( + "end_discussion", + tag_begin="[END_DISCUSSION]", + content_hint="true/false", + tag_end="[/END_DISCUSSION]", + parse_json=True, + ), + keys_to_memory=["speak", "thought"], + keys_to_content="speak", + keys_to_metadata=["end_discussion"], + ) + + self.assertEqual(parser.format_instruction, self.instruction_dict_2) + + res = parser.parse(self.res_dict_2) + + self.assertDictEqual(res.parsed, self.gt_dict) + + # test filter functions + self.assertDictEqual(parser.to_memory(res.parsed), self.gt_to_memory) + self.assertEqual(parser.to_content(res.parsed), self.gt_to_content) + self.assertDictEqual( + parser.to_metadata(res.parsed), + self.gt_to_metadata, + ) + + def test_DictFilterMixin_default_value(self) -> None: + """Test the default value of the DictFilterMixin class""" + mixin = DictFilterMixin( + keys_to_memory=True, + keys_to_content=True, + keys_to_metadata=False, + ) + + self.assertDictEqual(mixin.to_memory(self.gt_dict), self.gt_dict) + self.assertDictEqual(mixin.to_content(self.gt_dict), self.gt_dict) + self.assertEqual(mixin.to_metadata(self.gt_dict), None) + + +if __name__ == "__main__": + unittest.main() diff --git a/tests/rpc_agent_test.py b/tests/rpc_agent_test.py index 0319cf204..d70613268 100644 --- a/tests/rpc_agent_test.py +++ b/tests/rpc_agent_test.py @@ -8,8 +8,8 @@ from loguru import logger import agentscope -from agentscope.agents import AgentBase -from agentscope.agents.rpc_agent import RpcAgentServerLauncher +from agentscope.agents import AgentBase, DistConf +from agentscope.server import RpcAgentServerLauncher from agentscope.message import Msg from agentscope.message import PlaceholderMessage from agentscope.message import deserialize @@ -95,6 +95,55 @@ def reply(self, x: dict = None) -> dict: return x +class DemoGeneratorAgent(AgentBase): + """A demo agent to generate a number""" + + def __init__(self, name: str, value: int) -> None: + 
super().__init__(name) + self.value = value + + def reply(self, _: dict = None) -> dict: + time.sleep(1) + return Msg( + name=self.name, + role="assistant", + content={ + "value": self.value, + }, + ) + + +class DemoGatherAgent(AgentBase): + """A demo agent to gather value""" + + def __init__( + self, + name: str, + agents: list[DemoGeneratorAgent], + to_dist: dict = None, + ) -> None: + super().__init__(name, to_dist=to_dist) + self.agents = agents + + def reply(self, _: dict = None) -> dict: + result = [] + stime = time.time() + for agent in self.agents: + result.append(agent()) + value = 0 + for r in result: + value += r.content["value"] + etime = time.time() + return Msg( + name=self.name, + role="assistant", + content={ + "value": value, + "time": etime - stime, + }, + ) + + class DemoErrorAgent(AgentBase): """A demo Rpc agent that raise Error""" @@ -121,13 +170,9 @@ def tearDown(self) -> None: def test_single_rpc_agent_server(self) -> None: """test setup a single rpc agent""" - host = "localhost" - port = 12001 agent_a = DemoRpcAgent( name="a", - ).to_dist( - host=host, - port=port, + to_dist=True, ) self.assertIsNotNone(agent_a) msg = Msg( @@ -177,13 +222,10 @@ def test_connect_to_an_existing_rpc_server(self) -> None: """test connecting to an existing server""" launcher = RpcAgentServerLauncher( # choose port automatically - agent_class=DemoRpcAgent, - agent_kwargs={ - "name": "a", - }, - local_mode=False, host="127.0.0.1", port=12010, + local_mode=False, + custom_agents=[DemoRpcAgent], ) launcher.launch() agent_a = DemoRpcAgent( @@ -191,7 +233,6 @@ def test_connect_to_an_existing_rpc_server(self) -> None: ).to_dist( host="127.0.0.1", port=launcher.port, - launch_server=False, ) msg = Msg( name="System", @@ -226,29 +267,19 @@ def test_connect_to_an_existing_rpc_server(self) -> None: def test_multi_rpc_agent(self) -> None: """test setup multi rpc agent""" - host = "localhost" - port1 = 12001 - port2 = 12002 - port3 = 12003 agent_a = DemoRpcAgentAdd( name="a", 
).to_dist( - host=host, - port=port1, lazy_launch=False, ) agent_b = DemoRpcAgentAdd( name="b", ).to_dist( - host=host, - port=port2, lazy_launch=False, ) agent_c = DemoRpcAgentAdd( name="c", ).to_dist( - host=host, - port=port3, lazy_launch=False, ) @@ -292,17 +323,9 @@ def test_multi_rpc_agent(self) -> None: def test_mix_rpc_agent_and_local_agent(self) -> None: """test to use local and rpc agent simultaneously""" - host = "localhost" - # use the same port, agents should choose available ports - # automatically - port1 = 12001 - port2 = 12001 - # rpc agent a agent_a = DemoRpcAgentAdd( name="a", ).to_dist( - host=host, - port=port1, lazy_launch=False, ) # local agent b @@ -310,12 +333,11 @@ def test_mix_rpc_agent_and_local_agent(self) -> None: name="b", ) # rpc agent c - agent_c = DemoRpcAgentAdd( + agent_c = DemoRpcAgentAdd( # pylint: disable=E1123 name="c", - ).to_dist( - host=host, - port=port2, - lazy_launch=False, + to_dist=DistConf( + lazy_launch=False, + ), ) msg = Msg( name="System", @@ -339,7 +361,8 @@ def test_msghub_compatibility(self) -> None: ).to_dist() agent_c = DemoRpcAgentWithMemory( name="c", - ).to_dist() + to_dist=True, + ) participants = [agent_a, agent_b, agent_c] annonuncement_msgs = [ Msg(name="System", content="Announcement 1", role="system"), @@ -368,24 +391,16 @@ def test_standalone_multiprocess_init(self) -> None: """test compatibility with agentscope.init""" monitor = MonitorFactory.get_monitor() monitor.register("msg_num", quota=10) - host = "localhost" - # automatically - port1 = 12001 - port2 = 12002 # rpc agent a agent_a = DemoRpcAgentWithMonitor( name="a", ).to_dist( - host=host, - port=port1, lazy_launch=False, ) # local agent b agent_b = DemoRpcAgentWithMonitor( name="b", ).to_dist( - host=host, - port=port2, lazy_launch=False, ) msg = Msg(name="System", content={"msg_num": 0}, role="system") @@ -403,17 +418,13 @@ def test_standalone_multiprocess_init(self) -> None: logger.chat(msg) 
self.assertTrue(msg["content"]["quota_exceeded"]) - def test_multi_agent(self) -> None: + def test_multi_agent_in_same_server(self) -> None: """test agent server with multi agent""" launcher = RpcAgentServerLauncher( - # choose port automatically - agent_class=DemoRpcAgentWithMemory, - agent_kwargs={ - "name": "a", - }, - local_mode=False, host="127.0.0.1", port=12010, + local_mode=False, + custom_agents=[DemoRpcAgentWithMemory], ) launcher.launch() # although agent1 and agent2 connect to the same server @@ -425,16 +436,15 @@ def test_multi_agent(self) -> None: agent1 = agent1.to_dist( host="127.0.0.1", port=launcher.port, - launch_server=False, ) self.assertEqual(oid, agent1.agent_id) self.assertEqual(oid, agent1.client.agent_id) - agent2 = DemoRpcAgentWithMemory( + agent2 = DemoRpcAgentWithMemory( # pylint: disable=E1123 name="a", - ).to_dist( - host="127.0.0.1", - port=launcher.port, - launch_server=False, + to_dist={ + "host": "127.0.0.1", + "port": launcher.port, + }, ) # agent3 has the same agent id as agent1 # so it share the same memory with agent1 @@ -443,7 +453,6 @@ def test_multi_agent(self) -> None: ).to_dist( host="127.0.0.1", port=launcher.port, - launch_server=False, ) agent3._agent_id = agent1.agent_id # pylint: disable=W0212 agent3.client.agent_id = agent1.client.agent_id @@ -463,7 +472,7 @@ def test_multi_agent(self) -> None: agent2.client.delete_agent() msg2 = Msg(name="System", content="First Msg for agent2") res2 = agent2(msg2) - self.assertEqual(res2.content["mem_size"], 1) + self.assertRaises(ValueError, res2.__getattr__, "content") # should override remote default parameter(e.g. 
name field) agent4 = DemoRpcAgentWithMemory( @@ -471,7 +480,6 @@ def test_multi_agent(self) -> None: ).to_dist( host="127.0.0.1", port=launcher.port, - launch_server=False, ) msg5 = Msg(name="System", content="Second Msg for agent4") res5 = agent4(msg5) @@ -523,9 +531,76 @@ def test_clone_instances(self) -> None: self.assertNotEqual(agent4.agent_id, agent.agent_id) self.assertIsNone(agent3.server_launcher) self.assertIsNone(agent4.server_launcher) + msg3 = Msg(name="System", content="First Msg for agent3") + res3 = agent3(msg3) + self.assertEqual(res1.content["mem_size"], 1) + msg4 = Msg(name="System", content="First Msg for agent4") + res4 = agent4(msg4) + self.assertEqual(res3.content["mem_size"], 1) + self.assertEqual(res4.content["mem_size"], 1) def test_error_handling(self) -> None: """Test error handling""" agent = DemoErrorAgent(name="a").to_dist() x = agent() self.assertRaises(RuntimeError, x.__getattr__, "content") + + def test_agent_nesting(self) -> None: + """Test agent nesting""" + host = "localhost" + launcher1 = RpcAgentServerLauncher( + # choose port automatically + host=host, + port=12010, + local_mode=False, + custom_agents=[DemoGatherAgent, DemoGeneratorAgent], + ) + launcher2 = RpcAgentServerLauncher( + # choose port automatically + host=host, + port=12011, + local_mode=False, + custom_agents=[DemoGatherAgent, DemoGeneratorAgent], + ) + launcher1.launch() + launcher2.launch() + agents = [] + for i in range(8): + if i % 2: + agents.append( + DemoGeneratorAgent(name=f"a_{i}", value=i).to_dist( + host=host, + port=launcher1.port, + ), + ) + else: + agents.append( + DemoGeneratorAgent(name=f"a_{i}", value=i).to_dist( + host=host, + port=launcher2.port, + ), + ) + gather1 = DemoGatherAgent( # pylint: disable=E1123 + name="g1", + agents=agents[:4], + to_dist=DistConf( + host=host, + port=launcher1.port, + ), + ) + gather2 = DemoGatherAgent( # pylint: disable=E1123 + name="g2", + agents=agents[4:], + to_dist={ + "host": host, + "port": launcher2.port, + 
}, + ) + r1 = gather1() + r2 = gather2() + self.assertEqual(r1.content["value"], 6) + self.assertEqual(r2.content["value"], 22) + self.assertTrue(0.5 < r1.content["time"] < 2) + self.assertTrue(0.5 < r2.content["time"] < 2) + launcher1.shutdown() + launcher2.shutdown() diff --git a/tests/zhipu_test.py b/tests/zhipu_test.py new file mode 100644 index 000000000..8573c265b --- /dev/null +++ b/tests/zhipu_test.py @@ -0,0 +1,116 @@ +# -*- coding: utf-8 -*- +"""zhipuai test""" +import unittest +from unittest.mock import patch, MagicMock + +import agentscope +from agentscope.models import load_model_by_config_name + + +class TestZhipuAIChatWrapper(unittest.TestCase): + """Test ZhipuAI Chat Wrapper""" + + def setUp(self) -> None: + self.api_key = "test_api_key.secret_key" + self.messages = [ + {"role": "user", "content": "Hello, ZhipuAI!"}, + {"role": "assistant", "content": "How can I assist you?"}, + ] + + @patch("agentscope.models.zhipu_model.zhipuai") + def test_chat(self, mock_zhipuai: MagicMock) -> None: + """ + Test chat""" + mock_response = MagicMock() + mock_response.model_dump.return_value = { + "choices": [ + {"message": {"content": "Hello, this is a mocked response!"}}, + ], + "usage": { + "prompt_tokens": 100, + "completion_tokens": 5, + "total_tokens": 105, + }, + } + mock_response.choices[ + 0 + ].message.content = "Hello, this is a mocked response!" 
+ mock_zhipuai_client = MagicMock() + mock_zhipuai.ZhipuAI.return_value = mock_zhipuai_client + + mock_zhipuai_client.chat.completions.create.return_value = ( + mock_response + ) + + agentscope.init( + model_configs={ + "config_name": "test_config", + "model_type": "zhipuai_chat", + "model_name": "glm-4", + "api_key": self.api_key, + }, + ) + + model = load_model_by_config_name("test_config") + + response = model(messages=self.messages) + + self.assertEqual(response.text, "Hello, this is a mocked response!") + + mock_zhipuai_client.chat.completions.create.assert_called_once() + + +class TestZhipuAIEmbeddingWrapper(unittest.TestCase): + """Test ZhipuAI Embedding Wrapper""" + + def setUp(self) -> None: + self.api_key = "test_api_key" + self.model_name = "embedding-2" + self.text_to_embed = "This is a test sentence for embedding." + + @patch("agentscope.models.zhipu_model.zhipuai") + def test_embedding(self, mock_zhipuai: MagicMock) -> None: + """Test embedding API""" + mock_embedding_response = MagicMock() + mock_embedding_response.model_dump.return_value = { + "data": [ + {"embedding": [0.1, 0.2, 0.3]}, + ], + "usage": { + "prompt_tokens": 10, + "completion_tokens": 2, + "total_tokens": 12, + }, + } + + mock_zhipuai_client = MagicMock() + mock_zhipuai.ZhipuAI.return_value = mock_zhipuai_client + mock_zhipuai_client.embeddings.create.return_value = ( + mock_embedding_response + ) + + agentscope.init( + model_configs={ + "config_name": "test_embedding", + "model_type": "zhipuai_embedding", + "model_name": self.model_name, + "api_key": self.api_key, + }, + ) + + model = load_model_by_config_name("test_embedding") + + response = model(self.text_to_embed) + + expected_embedding = [[0.1, 0.2, 0.3]] + self.assertEqual(response.embedding, expected_embedding) + + mock_zhipuai_client.embeddings.create.assert_called_once_with( + input=self.text_to_embed, + model=self.model_name, + **{}, + ) + + +if __name__ == "__main__": + unittest.main()
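
Note on the `ImportErrorReporter` placeholder added earlier in this diff: its docstring says it stands in for a missing package and raises an `ImportError` once it is used. The sketch below shows one way such a placeholder might be wired into a guarded import. The class body is a condensed copy of the one in the diff. The module name `hypothetical_missing_pkg_xyz`, the extras name `"full"`, and the helper `describe_failure` are assumptions made up for illustration, not names from AgentScope.

```python
from typing import Any, Optional


class ImportErrorReporter:
    """Placeholder for a missing package; raises ImportError when used."""

    def __init__(
        self,
        error: ImportError,
        extras_require: Optional[str] = None,
    ) -> None:
        self.error = error
        self.extras_require = extras_require

    def __call__(self, *args: Any, **kwds: Any) -> Any:
        return self._raise_import_error()

    def __getattr__(self, name: str) -> Any:
        # Only reached for attributes that normal lookup cannot find,
        # so self.error and self.extras_require stay accessible.
        return self._raise_import_error()

    def __getitem__(self, key: Any) -> Any:
        return self._raise_import_error()

    def _raise_import_error(self) -> Any:
        err_msg = f"ImportError occurred: [{self.error.msg}]."
        if self.extras_require is not None:
            err_msg += (
                f" Please install [{self.extras_require}] version"
                " of agentscope."
            )
        raise ImportError(err_msg)


# Guarded import: the module name is hypothetical and assumed to be absent.
try:
    import hypothetical_missing_pkg_xyz as optional_pkg
except ImportError as import_error:
    # "full" is an assumed extras name, used only for this example.
    optional_pkg = ImportErrorReporter(import_error, "full")


def describe_failure() -> str:
    """Touch the placeholder and return the resulting error text."""
    try:
        optional_pkg.do_something()
    except ImportError as exc:
        return str(exc)
    return "package is installed"


print(describe_failure())
```

The point of the pattern is that the import failure is deferred: importing the surrounding module still succeeds, and the error only surfaces (with an install hint) when code actually calls, indexes, or reads an attribute of the missing package.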