Add new parsers for uncertain tag names and quantities (#341)

modelscope · Jul 23, 2024 · 8771ba9
1 parent 527d274

Showing 7 changed files with 308 additions and 107 deletions.
diff --git a/docs/sphinx_doc/en/source/tutorial/203-parser.md b/docs/sphinx_doc/en/source/tutorial/203-parser.md
@@ -65,13 +65,15 @@ You should generate python code in a fenced code block as follows
 
 AgentScope provides multiple built-in parsers, and developers can choose according to their needs.
 
-| Target Format | Parser Class | Description                                                                                                                                                                  |
-| --- | --- |------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| String | `MarkdownCodeBlockParser` | Requires LLM to generate specified text within a Markdown code block marked by ```. The result is a string.                                                                  |
-| Dictionary | `MarkdownJsonDictParser` | Requires LLM to produce a specified dictionary within the code block marked by \```json and \```. The result is a Python dictionary.                                         |
-|  | `MultiTaggedContentParser` | Requires LLM to generate specified content within multiple tags. Contents from different tags will be parsed into a single Python dictionary with different key-value pairs. |
+| Target Format             | Parser Class               | Description                                                                                                                                                                  |
+|---------------------------|----------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| String                    | `MarkdownCodeBlockParser`  | Requires LLM to generate specified text within a Markdown code block marked by ```. The result is a string.                                                                  |
+| Dictionary                | `MarkdownJsonDictParser`   | Requires LLM to produce a specified dictionary within the code block marked by \```json and \```. The result is a Python dictionary.                                         |
+|                           | `MultiTaggedContentParser` | Requires LLM to generate specified content within multiple tags. Contents from different tags will be parsed into a single Python dictionary with different key-value pairs. |
+|                           | `RegexTaggedContentParser` | Handles responses where the tag names and the number of tags are not known in advance. Users can customize the regular expression, and the result is a dictionary.           |
 | JSON / Python Object Type | `MarkdownJsonObjectParser` | Requires LLM to produce specified content within the code block marked by \```json and \```. The result will be converted into a Python object via json.loads.               |
 
+
 > **NOTE**: Compared to `MarkdownJsonDictParser`, `MultiTaggedContentParser` is more suitable for weak LLMs and for cases where the required format is too complex.
 > For example, when the LLM is required to generate Python code and the code is returned directly within a dictionary, the LLM must escape special characters (\t, \n, ...) correctly and respect the difference between double and single quotes, because the result is read with `json.loads`.
 >
@@ -263,12 +265,34 @@ In AgentScope, we achieve post-processing by calling the `to_content`, `to_memor
 >   None
 >   ```
 
+#### Parsers
 
-Next we will introduce two parsers for dictionary type.
+For dictionary-type return values, AgentScope provides multiple parsers, and developers can choose among them according to their needs.
 
-#### MarkdownJsonDictParser
+##### RegexTaggedContentParser
 
-##### Initialization & Format Instruction Template
+###### Initialization
+
+`RegexTaggedContentParser` is designed for scenarios where 1) the tag names are not known in advance, and 2) the number of tags is not known in advance.
+In this case, the parser cannot provide a general response format instruction, so developers need to supply one (`format_instruction`) when initializing the parser.
+Alternatively, developers can handle the prompt engineering themselves.
+
+```python
+from agentscope.parsers import RegexTaggedContentParser
+
+parser = RegexTaggedContentParser(
+    format_instruction="""Respond with specific tags as outlined below
+<thought>what you thought</thought>
+<speak>what you speak</speak>
+""",
+    try_parse_json=True,                    # Try to parse the content of the tag as a JSON object
+    required_keys=["thought", "speak"]      # Required keys in the returned dictionary
+)
+```
+
+##### MarkdownJsonDictParser
+
+###### Initialization & Format Instruction Template
 
 - `MarkdownJsonDictParser` requires the LLM to generate a dictionary within a code block fenced by \```json and \```.
 
@@ -303,7 +327,7 @@ This parameter can be a string or a dictionary. For dictionary, it will be autom
   ```
   ````
 
-##### Validation
+###### Validation
 
 The `content_hint` parameter in `MarkdownJsonDictParser` also supports type validation based on Pydantic. When initializing, you can set `content_hint` to a Pydantic model class, and AgentScope will modify the `instruction_format` attribute based on this class. In addition, Pydantic is used to validate the dictionary returned by the LLM during parsing.
 
@@ -346,11 +370,11 @@ parser.parser("""
 """)
 ````
 
-#### MultiTaggedContentParser
+##### MultiTaggedContentParser
 
 `MultiTaggedContentParser` asks the LLM to generate specific content within multiple tag pairs. The content from different tag pairs is parsed into a single Python dictionary. Its usage is similar to `MarkdownJsonDictParser`, but it is initialized differently and is better suited to weak LLMs or complex return content.
 
-##### Initialization & Format Instruction Template
+###### Initialization & Format Instruction Template
 
 Within `MultiTaggedContentParser`, each tag pair is specified by a `TaggedContent` object, which contains
 - Tag name (`name`), the key in the returned dictionary
@@ -393,7 +417,7 @@ Respond with specific tags as outlined below, and the content between [FINISH_DI
 [FINISH_DISCUSSION]true/false, whether the discussion is finished[/FINISH_DISCUSSION]
 ```
 
-##### Parse Function
+###### Parse Function
 
 - `MultiTaggedContentParser`'s parsing result is a dictionary whose keys are the `name` values of the `TaggedContent` objects.
 The following is an example of parsing the LLM response in the werewolf game:

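Note on the `RegexTaggedContentParser` documentation added in this file: the idea of collecting an unknown number of arbitrarily named tags into one dictionary can be illustrated with a minimal standalone sketch. It is not AgentScope's implementation. The pattern `TAG_PATTERN` and the helper `parse_tagged` are hypothetical names chosen for illustration; only the parameter names `try_parse_json` and `required_keys` are taken from the diff.

```python
import json
import re

# Hypothetical pattern (an assumption, not the library's default):
# an opening tag, lazily matched content, and the matching closing tag.
TAG_PATTERN = re.compile(
    r"<(?P<name>[^<>/\s]+)>(?P<content>.*?)</(?P=name)>",
    re.DOTALL,
)


def parse_tagged(text, try_parse_json=True, required_keys=()):
    """Collect every <tag>content</tag> pair in `text` into a dictionary."""
    result = {}
    for match in TAG_PATTERN.finditer(text):
        content = match.group("content").strip()
        if try_parse_json:
            try:
                content = json.loads(content)
            except json.JSONDecodeError:
                pass  # keep the raw string when the content is not valid JSON
        result[match.group("name")] = content

    # Fail loudly if the response lacks a tag the caller depends on.
    missing = [key for key in required_keys if key not in result]
    if missing:
        raise ValueError(f"Missing required tags: {missing}")
    return result


response = """<thought>I should greet the user</thought>
<speak>Hello!</speak>
<finish>true</finish>"""

parsed = parse_tagged(response, required_keys=["thought", "speak"])
print(parsed)
# {'thought': 'I should greet the user', 'speak': 'Hello!', 'finish': True}
```

In this sketch, content that is not valid JSON is kept as a raw string, and a missing required tag raises `ValueError`. How the library itself reports such failures is not shown in this diff.
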
diff --git a/docs/sphinx_doc/zh_CN/source/tutorial/203-parser.md b/docs/sphinx_doc/zh_CN/source/tutorial/203-parser.md
@@ -12,13 +12,17 @@
       - [Initialization](#初始化)
       - [Response Format Template](#响应格式模版)
       - [Parse Function](#解析函数)
-  - [Dictionary Type](#字典dict类型)
-    - [MarkdownJsonDictParser](#markdownjsondictparser)
-      - [Initialization & Response Format Template](#初始化--响应格式模版)
-      - [Type Validation](#类型校验)
-    - [MultiTaggedContentParser](#multitaggedcontentparser)
-      - [Initialization & Response Format Template](#初始化--响应格式模版-1)
-      - [Parse Function](#解析函数-1)
+  - [Dictionary Type](#字典类型)
+    - [About DictFilterMixin](#关于-dictfiltermixin)
+    - [Parsers](#解析器)
+      - [RegexTaggedContentParser](#regextaggedcontentparser)
+        - [Initialization](#初始化)
+      - [MarkdownJsonDictParser](#markdownjsondictparser)
+        - [Initialization & Response Format Template](#初始化--响应格式模版)
+        - [Type Validation](#类型校验)
+      - [MultiTaggedContentParser](#multitaggedcontentparser)
+        - [Initialization & Response Format Template](#初始化--响应格式模版-1)
+        - [Parse Function](#解析函数-1)
   - [JSON / Python Object Type](#json--python-对象类型)
     - [MarkdownJsonObjectParser](#markdownjsonobjectparser)
       - [Initialization & Response Format Template](#初始化--响应格式模版-2)
@@ -72,6 +76,7 @@ AgentScope提供了多种不同解析器，开发者可以根据自己的需求
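 | String (`str`) type | `MarkdownCodeBlockParser`  | Requires the LLM to generate the specified text in a Markdown code block marked by ```. The result is a string. |
 | Dictionary (`dict`) type | `MarkdownJsonDictParser`   | Requires the LLM to produce the specified dictionary in a code block marked by \```json and \```. The result is a Python dictionary. |
 |                   | `MultiTaggedContentParser` | Requires the LLM to produce the specified content within multiple tags. The contents of the different tags are parsed together into one Python dictionary and filled in as different key-value pairs. |
+|                   | `RegexTaggedContentParser` | For scenarios where the tag names and the number of tags are uncertain. Allows users to modify the regular expression; the result is a dictionary. |
 | JSON / Python object type | `MarkdownJsonObjectParser` | Requires the LLM to produce the specified content in a code block marked by \```json and \```. The result is converted into a Python object via `json.loads`. |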
 
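 > **NOTE**: Compared to `MarkdownJsonDictParser`, `MultiTaggedContentParser` is better suited to less capable models and to cases where the content the LLM must return is too complex. For example, when the LLM returns Python code directly inside a dictionary, it has to escape special characters (\t, \n, ...) and respect the distinction that `json.loads` makes between double and single quotes. `MultiTaggedContentParser` instead has the model return each value in its own tag and then assembles the values into a dictionary, which makes the response easier for the LLM to produce.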
@@ -140,9 +145,13 @@ AgentScope提供了多种不同解析器，开发者可以根据自己的需求
     print("hello world!")
     ```
 
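-### Dictionary (`dict`) Type
+### Dictionary Type
 
-Unlike strings and general JSON / Python objects, dictionaries are a data format commonly used in LLM applications, so AgentScope provides additional post-processing for the dictionary type. When initializing a parser, you can additionally set the three parameters `keys_to_content`, `keys_to_memory`, and `keys_to_metadata` to filter the dictionary's key-value pairs when calling the parser's `to_content`, `to_memory`, and `to_metadata` methods.
+#### About DictFilterMixin
+
+Unlike strings and general JSON / Python objects, dictionaries are a data format commonly used in LLM applications, so AgentScope provides post-processing for dictionary-type parsing through the [`DictFilterMixin`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/parsers/parser_base.py#L77) class.
+
+When initializing a parser, you can additionally set the three parameters `keys_to_content`, `keys_to_memory`, and `keys_to_metadata` to filter the dictionary's key-value pairs when calling the parser's `to_content`, `to_memory`, and `to_metadata` methods.
 Specifically,
   - The key-value pairs specified by `keys_to_content` are placed in the `content` field of the returned `Msg` object. This field is returned to other agents and takes part in their prompt construction; it is also used by the `self.speak` function for explicit output
   - The key-value pairs specified by `keys_to_memory` are stored in the agent's memory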
@@ -264,11 +273,33 @@ AgentScope中，我们通过调用`to_content`，`to_memory`和`to_metadata`方
 >   None
 >   ```
 
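-Next, we introduce two dictionary-type parsers in detail.
+#### Parsers
 
-#### MarkdownJsonDictParser
+For dictionary-type return values, AgentScope provides several parsers, and developers can choose among them according to their needs.
 
-##### Initialization & Response Format Template
+##### RegexTaggedContentParser
+
+###### Initialization
+
+`RegexTaggedContentParser` is mainly intended for scenarios where 1) the tag names are uncertain and 2) the number of tags is uncertain. In this case, the parser cannot provide a broadly applicable response format instruction, so developers need to supply the corresponding response format instruction (`format_instruction`) at initialization.
+In addition, users can configure the parser's behavior through parameters such as `try_parse_json` and `required_keys`.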
+
+```python
+from agentscope.parsers import RegexTaggedContentParser
+
+parser = RegexTaggedContentParser(
+    format_instruction="""Respond with specific tags as outlined below
+<thought>what you thought</thought>
+<speak>what you speak</speak>
+""",
+    try_parse_json=True,                    # Try to parse the tag content as a JSON object
+    required_keys=["thought", "speak"]      # Keys that must be present
+)
+```
+
+##### MarkdownJsonDictParser
+
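+###### Initialization & Response Format Template
 
 - `MarkdownJsonDictParser` requires the LLM to produce the specified dictionary in a code block marked by \```json and \```.
 - In addition to the `to_content`, `to_memory`, and `to_metadata` parameters, you can pass a `content_hint` parameter to provide a sample response and explanation, that is, to tell the LLM what kind of dictionary to produce. This parameter can be a string or a dictionary; it is automatically converted to a string and concatenated when the response format instruction is built.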
@@ -300,7 +331,7 @@ AgentScope中，我们通过调用`to_content`，`to_memory`和`to_metadata`方
   ```
   ````
 
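-##### Type Validation
+###### Type Validation
 
 The `content_hint` parameter of `MarkdownJsonDictParser` also supports Pydantic-based type validation. At initialization, you can set `content_hint` to a Pydantic model class; AgentScope then modifies the `instruction_format` attribute based on this class and uses Pydantic to validate the dictionary returned by the LLM during parsing.
 This feature requires the LLM to understand prompts in JSON schema format, so it is suited to more capable models.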
@@ -344,11 +375,11 @@ parser.parser("""
 """)
 ````
 
-#### MultiTaggedContentParser
+##### MultiTaggedContentParser
 
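 `MultiTaggedContentParser` requires the LLM to produce the specified content within multiple specified tag pairs; the contents of the different tags are parsed together into one Python dictionary. Its usage is similar to `MarkdownJsonDictParser`, except that it is initialized differently, and it is better suited to less capable LLMs or more complex return content.
 
-##### Initialization & Response Format Template
+###### Initialization & Response Format Template
 
 In `MultiTaggedContentParser`, each tag pair is passed in as a `TaggedContent` object, which contains
 - Tag name (`name`), the key in the returned dictionary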
@@ -391,7 +422,7 @@ Respond with specific tags as outlined below, and the content between [FINISH_DI
 [FINISH_DISCUSSION]true/false, whether the discussion is finished[/FINISH_DISCUSSION]
 ```
 
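-##### Parse Function
+###### Parse Function
 
 - The parsing result of `MultiTaggedContentParser` is a dictionary whose keys are the `name` values of the `TaggedContent` objects. The following is an example of parsing an LLM response in the werewolf game: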
 

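Note on the `keys_to_content` / `keys_to_memory` / `keys_to_metadata` paragraph in this file: the filtering it describes can be pictured with the standalone sketch below. This is an illustration rather than the `DictFilterMixin` code. The helper `filter_keys` is a hypothetical name, and the handling of `True`, `False`, and single-string values is an assumption that this diff does not confirm.

```python
from typing import Any, Sequence, Union

# Assumed shapes for a key specification (not confirmed by the diff):
# True -> whole dict, False -> nothing, str -> one value, list -> sub-dict.
KeySpec = Union[bool, str, Sequence[str]]


def filter_keys(parsed: dict, keys: KeySpec) -> Any:
    """Return the part of `parsed` selected by `keys`."""
    if keys is True:
        return parsed
    if keys is False:
        return None
    if isinstance(keys, str):
        return parsed[keys]
    return {k: v for k, v in parsed.items() if k in keys}


parsed = {
    "thought": "The user greeted me",
    "speak": "Hello!",
    "finish_discussion": True,
}

content = filter_keys(parsed, "speak")                   # shown to other agents
memory = filter_keys(parsed, ["thought", "speak"])       # stored in memory
metadata = filter_keys(parsed, ["finish_discussion"])    # control-flow flags
print(content, memory, metadata)
```

Keeping the three views separate lets one parsed response feed the message content, the agent's memory, and control-flow metadata without re-parsing.
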
diff --git a/examples/conversation_with_react_agent/code/conversation_with_react_agent.py b/examples/conversation_with_react_agent/code/conversation_with_react_agent.py
@@ -70,6 +70,7 @@ def execute_python_code(code: str) -> ServiceResponse:  # pylint: disable=C0301
 agentscope.init(
     model_configs=YOUR_MODEL_CONFIGURATION,
     project="Conversation with ReActAgent",
+    save_api_invoke=True,
 )
 
 # Create agents