diff --git a/docs/sphinx_doc/en/source/tutorial/203-parser.md b/docs/sphinx_doc/en/source/tutorial/203-parser.md
index bb2fae98e..5bbf46dd7 100644
--- a/docs/sphinx_doc/en/source/tutorial/203-parser.md
+++ b/docs/sphinx_doc/en/source/tutorial/203-parser.md
@@ -65,13 +65,15 @@ You should generate python code in a fenced code block as follows
AgentScope provides multiple built-in parsers, and developers can choose according to their needs.
-| Target Format | Parser Class | Description |
-| --- | --- |------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| String | `MarkdownCodeBlockParser` | Requires LLM to generate specified text within a Markdown code block marked by ```. The result is a string. |
-| Dictionary | `MarkdownJsonDictParser` | Requires LLM to produce a specified dictionary within the code block marked by \```json and \```. The result is a Python dictionary. |
-| | `MultiTaggedContentParser` | Requires LLM to generate specified content within multiple tags. Contents from different tags will be parsed into a single Python dictionary with different key-value pairs. |
+| Target Format | Parser Class | Description |
+|---------------------------|----------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| String | `MarkdownCodeBlockParser` | Requires LLM to generate specified text within a Markdown code block marked by ```. The result is a string. |
+| Dictionary | `MarkdownJsonDictParser` | Requires LLM to produce a specified dictionary within the code block marked by \```json and \```. The result is a Python dictionary. |
+| | `MultiTaggedContentParser` | Requires LLM to generate specified content within multiple tags. Contents from different tags will be parsed into a single Python dictionary with different key-value pairs. |
+| | `RegexTaggedContentParser` | For scenarios where the tag names and the number of tags are uncertain. Allows users to modify the regular expression. The result is a dictionary. |
| JSON / Python Object Type | `MarkdownJsonObjectParser` | Requires LLM to produce specified content within the code block marked by \```json and \```. The result will be converted into a Python object via json.loads. |
+
> **NOTE**: Compared to `MarkdownJsonDictParser`, `MultiTaggedContentParser` is more suitable for weak LLMs and when the required format is too complex.
> For example, when LLM is required to generate Python code, if the code is returned directly within a dictionary, LLM needs to be aware of escaping characters (\t, \n, ...), and the differences between double and single quotes when calling `json.loads`
>
@@ -263,12 +265,34 @@ In AgentScope, we achieve post-processing by calling the `to_content`, `to_memor
> None
> ```
+#### Parsers
-Next we will introduce two parsers for dictionary type.
+For dictionary type return values, AgentScope provides multiple parsers for developers to choose from according to their needs.
-#### MarkdownJsonDictParser
+##### RegexTaggedContentParser
-##### Initialization & Format Instruction Template
+###### Initialization
+
+`RegexTaggedContentParser` is designed for scenarios where 1) the tag name is uncertain, and 2) the number of tags is uncertain.
+In this case, the parser cannot provide a general response format instruction, so developers need to provide the corresponding response format instruction (`format_instruction`) when initializing.
+Alternatively, developers can handle the prompt engineering themselves.
+
+```python
+from agentscope.parsers import RegexTaggedContentParser
+
+parser = RegexTaggedContentParser(
+ format_instruction="""Respond with specific tags as outlined below
+<thought>what you thought</thought>
+<speak>what you speak</speak>
+""",
+ try_parse_json=True, # Try to parse the content of the tag as JSON object
+ required_keys=["thought", "speak"] # Required keys in the returned dictionary
+)
+```
+
+##### MarkdownJsonDictParser
+
+###### Initialization & Format Instruction Template
- `MarkdownJsonDictParser` requires LLM to generate dictionary within a code block fenced by \```json and \``` tags.
@@ -303,7 +327,7 @@ This parameter can be a string or a dictionary. For dictionary, it will be autom
```
````
-##### Validation
+###### Validation
The `content_hint` parameter in `MarkdownJsonDictParser` also supports type validation based on Pydantic. When initializing, you can set `content_hint` to a Pydantic model class, and AgentScope will modify the `instruction_format` attribute based on this class. Besides, Pydantic will be used to validate the dictionary returned by LLM during parsing.
@@ -346,11 +370,11 @@ parser.parser("""
""")
````
-#### MultiTaggedContentParser
+##### MultiTaggedContentParser
`MultiTaggedContentParser` asks LLM to generate specific content within multiple tag pairs. The content from different tag pairs will be parsed into a single Python dictionary. Its usage is similar to `MarkdownJsonDictParser`, but the initialization method is different, and it is more suitable for weak LLMs or complex return content.
-##### Initialization & Format Instruction Template
+###### Initialization & Format Instruction Template
Within `MultiTaggedContentParser`, each tag pair is specified by a `TaggedContent` object, which contains
- Tag name (`name`), the key value in the returned dictionary
@@ -393,7 +417,7 @@ Respond with specific tags as outlined below, and the content between [FINISH_DI
[FINISH_DISCUSSION]true/false, whether the discussion is finished[/FINISH_DISCUSSION]
```
-##### Parse Function
+###### Parse Function
- `MultiTaggedContentParser`'s parsing result is a dictionary, whose keys are the value of `name` in the `TaggedContent` objects.
The following is an example of parsing the LLM response in the werewolf game:
diff --git a/docs/sphinx_doc/zh_CN/source/tutorial/203-parser.md b/docs/sphinx_doc/zh_CN/source/tutorial/203-parser.md
index 8bad224e4..8ea13231c 100644
--- a/docs/sphinx_doc/zh_CN/source/tutorial/203-parser.md
+++ b/docs/sphinx_doc/zh_CN/source/tutorial/203-parser.md
@@ -12,13 +12,17 @@
- [Initialization](#初始化)
- [Format Instruction Template](#响应格式模版)
- [Parse Function](#解析函数)
- - [Dictionary Type](#字典dict类型)
- - [MarkdownJsonDictParser](#markdownjsondictparser)
- - [Initialization & Format Instruction Template](#初始化--响应格式模版)
- - [Validation](#类型校验)
- - [MultiTaggedContentParser](#multitaggedcontentparser)
- - [Initialization & Format Instruction Template](#初始化--响应格式模版-1)
- - [Parse Function](#解析函数-1)
+ - [Dictionary Type](#字典类型)
+ - [About DictFilterMixin](#关于-dictfiltermixin)
+ - [Parsers](#解析器)
+ - [RegexTaggedContentParser](#regextaggedcontentparser)
+ - [Initialization](#初始化)
+ - [MarkdownJsonDictParser](#markdownjsondictparser)
+ - [Initialization & Format Instruction Template](#初始化--响应格式模版)
+ - [Validation](#类型校验)
+ - [MultiTaggedContentParser](#multitaggedcontentparser)
+ - [Initialization & Format Instruction Template](#初始化--响应格式模版-1)
+ - [Parse Function](#解析函数-1)
- [JSON / Python Object Type](#json--python-对象类型)
- [MarkdownJsonObjectParser](#markdownjsonobjectparser)
- [Initialization & Format Instruction Template](#初始化--响应格式模版-2)
@@ -72,6 +76,7 @@ AgentScope提供了多种不同解析器,开发者可以根据自己的需求
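| String (`str`) | `MarkdownCodeBlockParser` | Requires LLM to generate specified text within a Markdown code block marked by ```. The result is a string. |
| Dictionary (`dict`) | `MarkdownJsonDictParser` | Requires LLM to produce a specified dictionary within the code block marked by \```json and \```. The result is a Python dictionary. |
| | `MultiTaggedContentParser` | Requires LLM to generate specified content within multiple tags. Contents from different tags will be parsed into a single Python dictionary with different key-value pairs. |
+| | `RegexTaggedContentParser` | For scenarios where the tag names and the number of tags are uncertain. Allows users to modify the regular expression. The result is a dictionary. |
| JSON / Python Object Type | `MarkdownJsonObjectParser` | Requires LLM to produce specified content within the code block marked by \```json and \```. The result will be converted into a Python object via `json.loads`. |
> **NOTE**: Compared to `MarkdownJsonDictParser`, `MultiTaggedContentParser` is better suited to weaker models and to cases where the content the LLM must return is too complex. For example, when the LLM returns Python code directly inside a dictionary, it has to escape special characters (\t, \n, ...) correctly and respect the distinction between double and single quotes that `json.loads` makes. `MultiTaggedContentParser` instead lets the model return each value in its own tag and then assembles the values into a dictionary, which makes the response easier for the LLM to produce.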
@@ -140,9 +145,13 @@ AgentScope提供了多种不同解析器,开发者可以根据自己的需求
print("hello world!")
```
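-### Dictionary (`dict`) Type
+### Dictionary Type
-Unlike strings and general JSON / Python objects, the dictionary is a data format commonly used in LLM applications, so AgentScope provides additional post-processing for the dictionary type. When initializing a parser, you can additionally set the `keys_to_content`, `keys_to_memory`, and `keys_to_metadata` parameters to filter the dictionary's key-value pairs when calling the parser's `to_content`, `to_memory`, and `to_metadata` methods.
+#### About DictFilterMixin
+
+Unlike strings and general JSON / Python objects, the dictionary is a data format commonly used in LLM applications, so AgentScope provides post-processing for dictionary parsing through the [`DictFilterMixin`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/parsers/parser_base.py#L77) class.
+
+When initializing a parser, you can additionally set the `keys_to_content`, `keys_to_memory`, and `keys_to_metadata` parameters to filter the dictionary's key-value pairs when calling the parser's `to_content`, `to_memory`, and `to_metadata` methods.
Where
- The key-value pairs specified by `keys_to_content` are placed in the `content` field of the returned `Msg` object. This field is returned to other agents, takes part in their prompt construction, and is also passed to the `self.speak` function for display.
- The key-value pairs specified by `keys_to_memory` are stored in the agent's memory.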
@@ -264,11 +273,33 @@ AgentScope中,我们通过调用`to_content`,`to_memory`和`to_metadata`方
> None
> ```
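-Next we will introduce two parsers for dictionary type.
+#### Parsers
-#### MarkdownJsonDictParser
+For dictionary type return values, AgentScope provides multiple parsers for developers to choose from according to their needs.
-##### Initialization & Format Instruction Template
+##### RegexTaggedContentParser
+
+###### Initialization
+
+`RegexTaggedContentParser` is designed for scenarios where 1) the tag names are uncertain, and 2) the number of tags is uncertain. In this case, the parser cannot provide a general response format instruction, so developers need to provide the corresponding response format instruction (`format_instruction`) when initializing.
+In addition, users can configure the parser's behavior through parameters such as `try_parse_json` and `required_keys`.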
+
+```python
+from agentscope.parsers import RegexTaggedContentParser
+
+parser = RegexTaggedContentParser(
+ format_instruction="""Respond with specific tags as outlined below
+<thought>what you thought</thought>
+<speak>what you speak</speak>
+""",
+ try_parse_json=True, # Try to parse the content of the tag as JSON object
+ required_keys=["thought", "speak"] # Keys that must be present
+)
+```
+
+##### MarkdownJsonDictParser
+
+###### 初始化 & 响应格式模版
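- `MarkdownJsonDictParser` requires LLM to generate a dictionary within a code block fenced by \```json and \``` tags.
- Besides the `to_content`, `to_memory`, and `to_metadata` parameters, the `content_hint` parameter can supply an example and explanation of the expected response, that is, a hint about what kind of dictionary the LLM should produce. This parameter can be a string or a dictionary; a dictionary is automatically converted to a string when the format instruction is built.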
@@ -300,7 +331,7 @@ AgentScope中,我们通过调用`to_content`,`to_memory`和`to_metadata`方
```
````
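-##### Validation
+###### Validation
The `content_hint` parameter in `MarkdownJsonDictParser` also supports type validation based on Pydantic. When initializing, you can set `content_hint` to a Pydantic model class, and AgentScope will modify the `instruction_format` attribute based on this class. Besides, Pydantic will be used to validate the dictionary returned by LLM during parsing.
This feature requires the LLM to understand prompts in JSON schema format, so it is suited to more capable models.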
@@ -344,11 +375,11 @@ parser.parser("""
""")
````
-#### MultiTaggedContentParser
+##### MultiTaggedContentParser
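`MultiTaggedContentParser` asks LLM to generate specific content within multiple tag pairs. The content from different tag pairs will be parsed into a single Python dictionary. Its usage is similar to `MarkdownJsonDictParser`, but the initialization method is different, and it is more suitable for weak LLMs or complex return content.
-##### Initialization & Format Instruction Template
+###### Initialization & Format Instruction Template
Within `MultiTaggedContentParser`, each tag pair is passed in as a `TaggedContent` object, which contains
- Tag name (`name`), the key in the returned dictionary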
@@ -391,7 +422,7 @@ Respond with specific tags as outlined below, and the content between [FINISH_DI
[FINISH_DISCUSSION]true/false, whether the discussion is finished[/FINISH_DISCUSSION]
```
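-##### Parse Function
+###### Parse Function
- `MultiTaggedContentParser`'s parsing result is a dictionary whose keys are the values of `name` in the `TaggedContent` objects. The following is an example of parsing the LLM response in the werewolf game: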
diff --git a/examples/conversation_with_react_agent/code/conversation_with_react_agent.py b/examples/conversation_with_react_agent/code/conversation_with_react_agent.py
index 8eb1205c0..7c196dd1e 100644
--- a/examples/conversation_with_react_agent/code/conversation_with_react_agent.py
+++ b/examples/conversation_with_react_agent/code/conversation_with_react_agent.py
@@ -70,6 +70,7 @@ def execute_python_code(code: str) -> ServiceResponse: # pylint: disable=C0301
agentscope.init(
model_configs=YOUR_MODEL_CONFIGURATION,
project="Conversation with ReActAgent",
+ save_api_invoke=True,
)
# Create agents
diff --git a/src/agentscope/agents/react_agent.py b/src/agentscope/agents/react_agent.py
index f354ac4ec..4476cb5e8 100644
--- a/src/agentscope/agents/react_agent.py
+++ b/src/agentscope/agents/react_agent.py
@@ -3,16 +3,15 @@
and act iteratively to solve problems. More details can be found in the paper
https://arxiv.org/abs/2210.03629.
"""
-from typing import Any, Optional, Union, Sequence
-
-from loguru import logger
+from typing import Optional, Union, Sequence
from agentscope.exception import ResponseParsingError, FunctionCallError
from agentscope.agents import AgentBase
from agentscope.message import Msg
-from agentscope.parsers import MarkdownJsonDictParser
+from agentscope.parsers.regex_tagged_content_parser import (
+ RegexTaggedContentParser,
+)
from agentscope.service import ServiceToolkit
-from agentscope.service.service_toolkit import ServiceFunction
INSTRUCTION_PROMPT = """## What You Should Do:
1. First, analyze the current situation, and determine your goal.
@@ -42,11 +41,10 @@ def __init__(
self,
name: str,
model_config_name: str,
- service_toolkit: ServiceToolkit = None,
+ service_toolkit: ServiceToolkit,
sys_prompt: str = "You're a helpful assistant. Your name is {name}.",
max_iters: int = 10,
verbose: bool = True,
- **kwargs: Any,
) -> None:
"""Initialize the ReAct agent with the given name, model config name
and tools.
@@ -74,39 +72,6 @@ def __init__(
model_config_name=model_config_name,
)
- # TODO: To compatible with the old version, which will be deprecated
- # soon
- if "tools" in kwargs:
- logger.warning(
- "The argument `tools` will be deprecated soon. "
- "Please use `service_toolkit` instead. Example refers to "
- "https://github.com/modelscope/agentscope/blob/main/"
- "examples/conversation_with_react_agent/code/"
- "conversation_with_react_agent.py",
- )
-
- service_funcs = {}
- for func, json_schema in kwargs["tools"]:
- name = json_schema["function"]["name"]
- service_funcs[name] = ServiceFunction(
- name=name,
- original_func=func,
- processed_func=func,
- json_schema=json_schema,
- )
-
- if service_toolkit is None:
- service_toolkit = ServiceToolkit()
- service_toolkit.service_funcs = service_funcs
- else:
- service_toolkit.service_funcs.update(service_funcs)
-
- elif service_toolkit is None:
- raise ValueError(
- "The argument `service_toolkit` is required to initialize "
- "the ReActAgent.",
- )
-
self.service_toolkit = service_toolkit
self.verbose = verbose
self.max_iters = max_iters
@@ -129,15 +94,22 @@ def __init__(
self.memory.add(Msg("system", self.sys_prompt, role="system"))
# Initialize a parser object to formulate the response from the model
- self.parser = MarkdownJsonDictParser(
- content_hint={
- "thought": "what you thought",
- "speak": "what you speak",
- "function": service_toolkit.tools_calling_format,
- },
- required_keys=["thought", "speak", "function"],
- # Only print the speak field when verbose is False
- keys_to_content=True if self.verbose else "speak",
+ self.parser = RegexTaggedContentParser(
+ format_instruction="""Respond with specific tags as outlined below:
+
+- When calling tool functions, note the "arg_name" should be replaced with the actual argument name:
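+<thought>what you thought</thought>
+<function>the function name you want to call</function>
+<arg_name>the value of the argument</arg_name>
+<arg_name>the value of the argument</arg_name>
+
+- When you want to generate a final response:
+<thought>what you thought</thought>
+<response>what you respond</response>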
+...""", # noqa
+ try_parse_json=True,
+ required_keys=["thought"],
+ keys_to_content="response",
)
def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg:
@@ -167,41 +139,38 @@ def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg:
try:
raw_response = self.model(prompt)
+ # Print the LLM-generated text (streaming or non-streaming mode)
if self.verbose:
# To be compatible with streaming and non-streaming mode
self.speak(raw_response.stream or raw_response.text)
res = self.parser.parse(raw_response)
- # Record the response in memory
- self.memory.add(
- Msg(
- self.name,
- self.parser.to_memory(res.parsed),
- "assistant",
- ),
- )
-
- # Print out the response
- msg_returned = Msg(
- self.name,
- self.parser.to_content(res.parsed),
- "assistant",
- )
-
- if not self.verbose:
- self.speak(msg_returned)
+ # Record the raw text in memory to prevent LLMs from learning
+ # from the previous response format
+ self.memory.add(Msg(self.name, res.text, "assistant"))
# Skip the next steps if no need to call tools
# The parsed field is a dictionary
- arg_function = res.parsed["function"]
+ arg_function = res.parsed.get("function", "")
if (
isinstance(arg_function, str)
and arg_function in ["[]", ""]
or isinstance(arg_function, list)
and len(arg_function) == 0
):
- # Only the speak field is exposed to users or other agents
+ # Only the response field is exposed to users or other
+ # agents
+ msg_returned = Msg(
+ self.name,
+ res.parsed.get("response", res.text),
+ "assistant",
+ )
+
+ if not self.verbose:
+ # Print out the returned message
+ self.speak(msg_returned)
+
return msg_returned
# Only catch the response parsing error and expose runtime
@@ -226,6 +195,20 @@ def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg:
# Parse, check and execute the tool functions in service toolkit
try:
+ # Reorganize the parsed response to the required format of the
+ # service toolkit
+ res.parsed["function"] = [
+ {
+ "name": res.parsed["function"],
+ "arguments": {
+ k: v
+ for k, v in res.parsed.items()
+ if k not in ["speak", "thought", "function"]
+ },
+ },
+ ]
+
+ # Execute the function
execute_results = self.service_toolkit.parse_and_call_func(
res.parsed["function"],
)
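The reorganization step in the hunk above can be illustrated in isolation. The following is a minimal sketch under stated assumptions, not the agent's actual code path: the helper name `to_function_calls` and the sample dictionary are hypothetical, and the reserved keys simply mirror the list used in the diff.

```python
# Keys that carry the agent's reasoning or the function name rather than
# tool arguments (mirrors the list used in the diff above).
RESERVED_KEYS = ("speak", "thought", "function")


def to_function_calls(parsed: dict) -> list:
    """Convert a flat tag dictionary into a one-element list of tool calls.

    Every key that is not reserved is treated as an argument of the function.
    """
    return [
        {
            "name": parsed["function"],
            "arguments": {
                key: value
                for key, value in parsed.items()
                if key not in RESERVED_KEYS
            },
        },
    ]


# Hypothetical parsed result, as a regex tagged parser might return it
parsed = {
    "thought": "I need to run some code",
    "function": "execute_python_code",
    "code": "print(1)",
}
calls = to_function_calls(parsed)
print(calls)
```

Because the tags are flat, every non-reserved tag becomes an argument, so a stray tag in the model output would be forwarded to the tool function as an unexpected keyword.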
diff --git a/src/agentscope/parsers/__init__.py b/src/agentscope/parsers/__init__.py
index db2b93d3a..9e434a18a 100644
--- a/src/agentscope/parsers/__init__.py
+++ b/src/agentscope/parsers/__init__.py
@@ -6,6 +6,7 @@
MarkdownJsonDictParser,
)
from .code_block_parser import MarkdownCodeBlockParser
+from .regex_tagged_content_parser import RegexTaggedContentParser
from .tagged_content_parser import (
TaggedContent,
MultiTaggedContentParser,
@@ -19,4 +20,5 @@
"MarkdownCodeBlockParser",
"TaggedContent",
"MultiTaggedContentParser",
+ "RegexTaggedContentParser",
]
diff --git a/src/agentscope/parsers/regex_tagged_content_parser.py b/src/agentscope/parsers/regex_tagged_content_parser.py
new file mode 100644
index 000000000..3850a7ad0
--- /dev/null
+++ b/src/agentscope/parsers/regex_tagged_content_parser.py
@@ -0,0 +1,170 @@
+# -*- coding: utf-8 -*-
+"""The parser for dynamic tagged content"""
+import json
+import re
+from typing import Union, Sequence, Optional, List
+
+from loguru import logger
+
+from ..exception import TagNotFoundError
+from ..models import ModelResponse
+from ..parsers import ParserBase
+from ..parsers.parser_base import DictFilterMixin
+
+
+class RegexTaggedContentParser(ParserBase, DictFilterMixin):
+ """A regex tagged content parser, which extracts tagged content according
+ to the provided regex pattern. Unlike other parsers, this parser can
+ extract multiple tagged contents without knowing the keys in advance.
+ The parsed result is a dictionary stored in the parsed field of the
+ model response.
+
+ Compared with other parsers, this parser is more flexible and can be used
+ in dynamic scenarios where
+ - the keys are not known in advance
+ - the number of the tagged content is not fixed
+
+ Note: Without knowing the keys in advance, it's hard to prepare a format
+ instruction template for different scenarios. Therefore, we ask the user
+ to provide the format instruction in the constructor. Alternatively, the
+ user can construct and manage the prompt themselves.
+
+ Example:
+ By default, the parser uses a regex pattern to extract tagged content
+ with the following format:
+ ```
+ <{name1}>{content1}</{name1}>
+ <{name2}>{content2}</{name2}>
+ ```
+ The parser will extract the content as the following dictionary:
+ ```
+ {
+ "name1": content1,
+ "name2": content2,
+ }
+ ```
+ """
+
+ def __init__(
+ self,
+ tagged_content_pattern: str = r"<(?P<name>[^>]+)>"
+ r"(?P<content>.*?)"
+ r"</\1?>",
+ format_instruction: Optional[str] = None,
+ try_parse_json: bool = True,
+ required_keys: Optional[List[str]] = None,
+ keys_to_memory: Union[str, bool, Sequence[str]] = True,
+ keys_to_content: Union[str, bool, Sequence[str]] = True,
+ keys_to_metadata: Union[str, bool, Sequence[str]] = False,
+ ) -> None:
+ """Initialize the regex tagged content parser.
+
+ Args:
+ tagged_content_pattern (`str`, defaults to
+ `"<(?P<name>[^>]+)>(?P<content>.*?)</\1?>"`):
+ The regex pattern to extract tagged content. The pattern should
+ contain two named groups: `name` and `content`. The `name`
+ group is used as the key of the tagged content, and the
+ `content` group is used as the value.
+ format_instruction (`Optional[str]`, defaults to `None`):
+ The instruction for the format of the tagged content, which
+ will be attached to the end of the prompt messages to remind
+ the LLM to follow the format.
+ try_parse_json (`bool`, defaults to `True`):
+ Whether to try to parse the tagged content as JSON. Note that
+ a failed JSON parse won't raise an exception; the raw string is kept.
+ required_keys (`Optional[List[str]]`, defaults to `None`):
+ The keys that are required in the tagged content.
+ keys_to_memory (`Union[str, bool, Sequence[str]]`,
+ defaults to `True`):
+ The keys to save to memory.
+ keys_to_content (`Union[str, bool, Sequence[str]]`,
+ defaults to `True`):
+ The keys to save to content.
+ keys_to_metadata (`Union[str, bool, Sequence[str]]`,
+ defaults to `False`):
+ The key or keys to be filtered in `to_metadata` method. If
+ it's
+ - `False`, `None` will be returned in the `to_metadata` method
+ - `str`, the corresponding value will be returned
+ - `List[str]`, a filtered dictionary will be returned
+ - `True`, the whole dictionary will be returned
+ """
+
+ DictFilterMixin.__init__(
+ self,
+ keys_to_memory=keys_to_memory,
+ keys_to_content=keys_to_content,
+ keys_to_metadata=keys_to_metadata,
+ )
+
+ assert (
+ "<name>" in tagged_content_pattern
+ ), "The tagged content pattern should contain a named group 'name'."
+ assert (
+ "<content>" in tagged_content_pattern
+ ), "The tagged content pattern should contain a named group 'content'."
+
+ self.tagged_content_pattern = tagged_content_pattern
+ self._format_instruction = format_instruction
+ self.try_parse_json = try_parse_json
+ self.required_keys = required_keys or []
+
+ @property
+ def format_instruction(self) -> str:
+ """The format instruction for the tagged content."""
+ if self._format_instruction is None:
+ raise ValueError(
+ "The format instruction is not provided. Please provide it in "
+ "the constructor of the parser.",
+ )
+ return self._format_instruction
+
+ def parse(self, response: ModelResponse) -> ModelResponse:
+ """Parse the response text by the regex pattern, and return a dict of
+ the content in the parsed field of the response.
+
+ Args:
+ response (`ModelResponse`):
+ The response to be parsed.
+
+ Returns:
+ `ModelResponse`: The response with the parsed field as the parsed
+ result.
+ """
+ assert response.text is not None, "The response text is None."
+
+ matches = re.finditer(
+ self.tagged_content_pattern,
+ response.text,
+ flags=re.DOTALL,
+ )
+
+ results = {}
+ for match in matches:
+ results[match.group("name")] = match.group("content")
+
+ keys_missing = [
+ key for key in self.required_keys if key not in results
+ ]
+
+ if len(keys_missing) > 0:
+ raise TagNotFoundError(
+ f"Failed to find tags: {', '.join(keys_missing)}",
+ response.text,
+ )
+
+ if self.try_parse_json:
+ keys_failed = []
+ for key in results:
+ try:
+ results[key] = json.loads(results[key])
+ except json.JSONDecodeError:
+ keys_failed.append(key)
+
+ logger.debug(
+ f'Failed to parse JSON for keys: {", ".join(keys_failed)}',
+ )
+
+ response.parsed = results
+ return response
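The parsing flow of `parse` above can be sketched without AgentScope, using only the standard library. This is a minimal sketch under stated assumptions rather than the library's implementation: the function name `parse_tagged_content` is hypothetical, a plain `ValueError` stands in for `TagNotFoundError`, and the default pattern is assumed to be `<(?P<name>[^>]+)>(?P<content>.*?)</\1?>`.

```python
import json
import re

# Assumed default pattern: <name>content</name>, where the closing tag
# refers back to the opening tag name through the backreference \1.
TAGGED_CONTENT_PATTERN = r"<(?P<name>[^>]+)>(?P<content>.*?)</\1?>"


def parse_tagged_content(text, required_keys=(), try_parse_json=True):
    """Extract <name>content</name> pairs from `text` into a dictionary."""
    results = {
        match.group("name"): match.group("content")
        for match in re.finditer(
            TAGGED_CONTENT_PATTERN,
            text,
            flags=re.DOTALL,  # let the content span multiple lines
        )
    }

    # Fail early when a required tag is absent
    missing = [key for key in required_keys if key not in results]
    if missing:
        raise ValueError(f"Failed to find tags: {', '.join(missing)}")

    if try_parse_json:
        for key, value in list(results.items()):
            try:
                results[key] = json.loads(value)
            except json.JSONDecodeError:
                pass  # not valid JSON, keep the raw string

    return results


demo = parse_tagged_content(
    "<thought>I should add</thought>\n"
    "<function>add</function>\n"
    "<a>1</a>\n"
    "<b>[1, 2]</b>",
    required_keys=["thought"],
)
print(demo)
```

Values that are not valid JSON, such as a free-form thought, stay as raw strings, while values like `1` or `[1, 2]` become Python objects.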
diff --git a/src/agentscope/service/service_toolkit.py b/src/agentscope/service/service_toolkit.py
index 28d93f5c9..299b23d3f 100644
--- a/src/agentscope/service/service_toolkit.py
+++ b/src/agentscope/service/service_toolkit.py
@@ -373,17 +373,9 @@ def _execute_func(self, cmds: List[dict]) -> str:
execute_results = []
for i, cmd in enumerate(cmds):
- func_name = cmd["name"]
service_func = self.service_funcs[cmd["name"]]
kwargs = cmd.get("arguments", {})
- print(f">>> Executing function {func_name} with arguments:")
- for key, value in kwargs.items():
- value = (
- value if len(str(value)) < 50 else str(value)[:50] + "..."
- )
- print(f">>> \t{key}: {value}")
-
# Execute the function
try:
func_res = service_func.processed_func(**kwargs)
@@ -393,8 +385,6 @@ def _execute_func(self, cmds: List[dict]) -> str:
content=str(e),
)
- print(">>> END ")
-
status = (
"SUCCESS"
if func_res.status == ServiceExecStatus.SUCCESS