Support Pydantic based validation within the MarkdownJsonDictParser (#…261)
modelscope · Jun 5, 2024 · 21c8826
1 parent a0064ea
Showing 6 changed files with 214 additions and 6 deletions.
diff --git a/docs/sphinx_doc/en/source/tutorial/203-parser.md b/docs/sphinx_doc/en/source/tutorial/203-parser.md
@@ -15,6 +15,7 @@
   - [Dictionary Type](#dictionary-type)
     - [MarkdownJsonDictParser](#markdownjsondictparser)
       - [Initialization & Format Instruction Template](#initialization--format-instruction-template)
+      - [Validation](#validation)
     - [MultiTaggedContentParser](#multitaggedcontentparser)
       - [Initialization & Format Instruction Template](#initialization--format-instruction-template-1)
       - [Parse Function](#parse-function-1)
@@ -77,6 +78,8 @@ AgentScope provides multiple built-in parsers, and developers can choose accordi
 > In contrast, `MultiTaggedContentParser` guides LLM to generate each key-value pair separately in individual tags and then combines them into a dictionary, thus reducing the difficulty.
 
 
+>**NOTE**: The built-in format instructions are only examples. In AgentScope, developers have complete control over prompt construction, so they can skip the format instruction provided by a parser and either write their own by hand or implement a new parser class.
+
 In the following sections, we will introduce the usage of these parsers based on different target formats.
 
 ### String Type
@@ -300,6 +303,49 @@ This parameter can be a string or a dictionary. For dictionary, it will be autom
   ```
   ````
 
+##### Validation
+
+The `content_hint` parameter in `MarkdownJsonDictParser` also supports type validation based on Pydantic. When initializing, you can set `content_hint` to a Pydantic model class, and AgentScope will build the `format_instruction` attribute from this class. In addition, Pydantic is used to validate the dictionary returned by the LLM during parsing.
+
+A simple example is as follows. In `Field(...)`, the `...` marks the field as required; more specific validation rules can be passed to `Field` as additional arguments, as described in the [Pydantic](https://docs.pydantic.dev/latest/) documentation.
+
+  ```python
+  from pydantic import BaseModel, Field
+  from agentscope.parsers import MarkdownJsonDictParser
+
+  class Schema(BaseModel):
+      thought: str = Field(..., description="what you thought")
+      speak: str = Field(..., description="what you speak")
+      end_discussion: bool = Field(..., description="whether the discussion is finished")
+
+  parser = MarkdownJsonDictParser(content_hint=Schema)
+  ```
+
+- The corresponding `format_instruction` attribute
+
+````
+Respond a JSON dictionary in a markdown's fenced code block as follows:
+```json
+{a_JSON_dictionary}
+```
+The generated JSON dictionary MUST follow this schema:
+{'properties': {'thought': {'description': 'what you thought', 'title': 'Thought', 'type': 'string'}, 'speak': {'description': 'what you speak', 'title': 'Speak', 'type': 'string'}, 'end_discussion': {'description': 'whether the discussion is finished', 'title': 'End Discussion', 'type': 'boolean'}}, 'required': ['thought', 'speak', 'end_discussion'], 'title': 'Schema', 'type': 'object'}
+````
+
+- During parsing, Pydantic is used for type validation, and an exception is raised if validation fails. Pydantic also provides some fault tolerance, such as converting the string `"true"` to Python's `True`. The `parse` method takes a `ModelResponse` (from `agentscope.models`):
+
+````
+parser.parse(ModelResponse(text="""
+```json
+{
+  "thought": "The others didn't realize I was a werewolf. I should end the discussion soon.",
+  "speak": "I agree with you.",
+  "end_discussion": "true"
+}
+```
+"""))
+````
+
 #### MultiTaggedContentParser
 
 `MultiTaggedContentParser` asks LLM to generate specific content within multiple tag pairs. The content from different tag pairs will be parsed into a single Python dictionary. Its usage is similar to `MarkdownJsonDictParser`, but the initialization method is different, and it is more suitable for weak LLMs or complex return content.
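
The coercion and failure behaviour described in the Validation section can be reproduced with Pydantic alone. The sketch below is a minimal illustration that does not import AgentScope; it reuses the `Schema` model from the tutorial, and the variable names (`coerced`, `failed`, `error_fields`) are illustrative assumptions rather than part of the library.

```python
from pydantic import BaseModel, Field, ValidationError


class Schema(BaseModel):
    thought: str = Field(..., description="what you thought")
    speak: str = Field(..., description="what you speak")
    end_discussion: bool = Field(
        ...,
        description="whether the discussion is finished",
    )


# Pydantic's default (lax) mode accepts the string "true" for a bool field.
ok = Schema(thought="xxx", speak="Hello, world!", end_discussion="true")
coerced = dict(ok)  # same model-to-dict conversion the parser applies

# A value that cannot be read as a boolean fails validation.
failed = False
error_fields = []
try:
    Schema(thought="xxx", speak="Hello, world!", end_discussion="maybe")
except ValidationError as err:
    failed = True
    # Each error entry records the location of the offending field.
    error_fields = [e["loc"][0] for e in err.errors()]
```

In the parser itself such a `ValidationError` is not propagated directly; per the diff to `json_object_parser.py`, it is caught and re-raised as one of AgentScope's parsing exceptions.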

diff --git a/docs/sphinx_doc/zh_CN/source/tutorial/203-parser.md b/docs/sphinx_doc/zh_CN/source/tutorial/203-parser.md
@@ -15,6 +15,7 @@
   - [Dictionary Type](#字典dict类型)
     - [MarkdownJsonDictParser](#markdownjsondictparser)
       - [Initialization & Format Instruction Template](#初始化--响应格式模版)
+      - [Validation](#类型校验)
     - [MultiTaggedContentParser](#multitaggedcontentparser)
       - [Initialization & Format Instruction Template](#初始化--响应格式模版-1)
       - [Parse Function](#解析函数-1)
@@ -75,6 +76,8 @@ AgentScope provides a variety of parsers, and developers can choose according to their needs
 
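 > **NOTE**: Compared with `MarkdownJsonDictParser`, `MultiTaggedContentParser` is better suited to less capable models and to cases where the content the LLM must return is too complex. For example, when the LLM returns Python code directly inside a dictionary, it has to take care of escaping special characters (\t, \n, ...) and of the distinction between double and single quotes when the result is read by `json.loads`. `MultiTaggedContentParser` instead lets the model return each key-value pair in its own tag and then combines them into a dictionary, which makes the response easier for the LLM to produce.
 
+> **NOTE**: The format instructions built into AgentScope are not necessarily the best choice. In AgentScope, developers have full control over prompt construction, so it is entirely feasible to skip the format instruction built into a parser and write a custom one instead, or to implement a new parser class.
+
 In the following sections, we will introduce the usage of these parsers based on different target formats.
 
 ### String (`str`) Type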
@@ -297,6 +300,50 @@ In AgentScope, by calling the `to_content`, `to_memory` and `to_metadata` methods
   ```
   ````
 
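+##### Validation
+
+The `content_hint` parameter of `MarkdownJsonDictParser` also supports Pydantic-based type validation. At initialization, `content_hint` can be set to a Pydantic model class; AgentScope then builds the `format_instruction` attribute from this class and uses Pydantic to validate the dictionary returned by the LLM during parsing.
+This feature requires the LLM to understand prompts written as a JSON schema, so it is best suited to more capable models.
+
+A simple example is as follows. In `Field(...)`, the `...` marks the field as required; more specific validation rules can be passed to `Field` as additional arguments, as described in the [Pydantic](https://docs.pydantic.dev/latest/) documentation.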
+
+  ```python
+  from pydantic import BaseModel, Field
+  from agentscope.parsers import MarkdownJsonDictParser
+
+  class Schema(BaseModel):
+      thought: str = Field(..., description="what you thought")
+      speak: str = Field(..., description="what you speak")
+      end_discussion: bool = Field(..., description="whether the discussion is finished")
+
+  parser = MarkdownJsonDictParser(content_hint=Schema)
+  ```
+
+- The corresponding `format_instruction` attribute
+
+````
+Respond a JSON dictionary in a markdown's fenced code block as follows:
+```json
+{a_JSON_dictionary}
+```
+The generated JSON dictionary MUST follow this schema:
+{'properties': {'thought': {'description': 'what you thought', 'title': 'Thought', 'type': 'string'}, 'speak': {'description': 'what you speak', 'title': 'Speak', 'type': 'string'}, 'end_discussion': {'description': 'whether the discussion is finished', 'title': 'End Discussion', 'type': 'boolean'}}, 'required': ['thought', 'speak', 'end_discussion'], 'title': 'Schema', 'type': 'object'}
+````
+
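+- During parsing, Pydantic is also used for type validation, and an exception is raised if validation fails. Pydantic additionally provides some fault tolerance, such as converting the string `"true"` to Python's `True`. The `parse` method takes a `ModelResponse` (from `agentscope.models`):
+
+````
+parser.parse(ModelResponse(text="""
+```json
+{
+  "thought": "The others didn't realize I was a werewolf. I should end the discussion soon.",
+  "speak": "I agree with you.",
+  "end_discussion": "true"
+}
+```
+"""))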
+````
+
 #### MultiTaggedContentParser
 
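 `MultiTaggedContentParser` asks the LLM to generate specified content within multiple specified tag pairs; the content of these tags is parsed together into a single Python dictionary. Its usage is similar to `MarkdownJsonDictParser`, except for the initialization method, and it is better suited to weaker LLMs or more complex return content.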
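
How the schema ends up in the format instruction can be sketched without AgentScope. The snippet below assumes Pydantic v2 (for `model_json_schema()`); `TEMPLATE` mirrors the `_format_instruction_with_schema` string from this commit, while `FENCE` and the `instruction_json` variant are assumptions introduced only for illustration.

```python
import json

from pydantic import BaseModel, Field

FENCE = "`" * 3  # three backticks, written this way to keep the example fence-safe

# Mirrors the template string added in this commit.
TEMPLATE = (
    "Respond a JSON dictionary in a markdown's fenced code block as follows:\n"
    + FENCE + "json\n"
    + "{content_hint}\n"
    + FENCE + "\n"
    + "The generated JSON dictionary MUST follow this schema: \n"
    + "{schema}"
)


class Schema(BaseModel):
    thought: str = Field(..., description="what you thought")
    speak: str = Field(..., description="what you speak")
    end_discussion: bool = Field(
        ...,
        description="whether the discussion is finished",
    )


schema_dict = Schema.model_json_schema()  # Pydantic v2 API

# str.format inserts str(schema_dict), i.e. a Python repr with single quotes.
instruction = TEMPLATE.format(
    content_hint="{a_JSON_dictionary}",
    schema=schema_dict,
)

# Alternative rendering (an assumption, not what the commit does): strict JSON.
instruction_json = TEMPLATE.format(
    content_hint="{a_JSON_dictionary}",
    schema=json.dumps(schema_dict, ensure_ascii=False),
)
```

Because the commit passes the schema dictionary straight to `str.format`, the instruction shows Python-style single quotes, as in the documented output above. Whether a strict JSON rendering would work better for a given model is an open design choice.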

diff --git a/setup.py b/setup.py
@@ -54,6 +54,7 @@
 # released requires
 minimal_requires = [
     "docstring_parser",
+    "pydantic",
     "loguru==0.6.0",
     "tiktoken",
     "Pillow",

diff --git a/src/agentscope/exception.py b/src/agentscope/exception.py
@@ -24,6 +24,10 @@ class JsonParsingError(ResponseParsingError):
     """The exception class for JSON parsing error."""
 
 
+class JsonDictValidationError(ResponseParsingError):
+    """The exception class for JSON dict validation error."""
+
+
 class JsonTypeError(ResponseParsingError):
     """The exception class for JSON type error."""
 
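
Since the new `JsonDictValidationError` derives from `ResponseParsingError`, code that already catches the base class will also catch it. The following is a rough sketch with stand-in classes, not the real `agentscope.exception` module: the constructor signature is an assumption inferred from the `message=` and `raw_response=` keywords used in the parser diff, and `handle` is a hypothetical helper.

```python
from typing import Optional


class ResponseParsingError(Exception):
    """Stand-in base class (assumed signature: message, raw_response)."""

    def __init__(self, message: str, raw_response: Optional[str] = None) -> None:
        super().__init__(message)
        self.message = message
        self.raw_response = raw_response


class JsonParsingError(ResponseParsingError):
    """Stand-in for the JSON parsing error."""


class JsonDictValidationError(ResponseParsingError):
    """Stand-in for the JSON dict validation error added in this commit."""


def handle(error: Exception) -> str:
    """Hypothetical handler: the specific subclass first, then the base."""
    try:
        raise error
    except JsonDictValidationError as e:
        return f"validation: {e.message}"
    except ResponseParsingError as e:
        return f"parsing: {e.message}"


validation_result = handle(JsonDictValidationError("bad type", "raw text"))
parsing_result = handle(JsonParsingError("bad json"))
```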

diff --git a/src/agentscope/parsers/json_object_parser.py b/src/agentscope/parsers/json_object_parser.py
@@ -1,10 +1,12 @@
 # -*- coding: utf-8 -*-
 """The parser for JSON object in the model response."""
+import inspect
 import json
 from copy import deepcopy
 from typing import Optional, Any, List, Sequence, Union
 
 from loguru import logger
+from pydantic import BaseModel
 
 from agentscope.exception import (
     TagNotFoundError,
@@ -139,11 +141,22 @@ class MarkdownJsonDictParser(MarkdownJsonObjectParser, DictFilterMixin):
     """Closing end for a code block."""
 
     _format_instruction = (
-        "You should respond a json object in a json fenced code block as "
+        "Respond a JSON dictionary in a markdown's fenced code block as "
         "follows:\n```json\n{content_hint}\n```"
     )
     """The instruction for the format of the json object."""
 
+    _format_instruction_with_schema = (
+        "Respond a JSON dictionary in a markdown's fenced code block as "
+        "follows:\n"
+        "```json\n"
+        "{content_hint}\n"
+        "```\n"
+        "The generated JSON dictionary MUST follow this schema: \n"
+        "{schema}"
+    )
+    """The schema instruction for the format of the json object."""
+
     required_keys: List[str]
     """A list of required keys in the JSON dictionary object. If the response
     misses any of the required keys, it will raise a
@@ -164,7 +177,8 @@ def __init__(
                 The hint used to remind LLM what should be filled between the
                 tags. If it is a string, it will be used as the content hint
                 directly. If it is a dict, it will be converted to a json
-                string and used as the content hint.
+                string and used as the content hint. If it's a Pydantic model,
+                the schema will be displayed in the instruction.
             required_keys (`List[str]`, defaults to `[]`):
                 A list of required keys in the JSON dictionary object. If the
                 response misses any of the required keys, it will raise a
@@ -177,7 +191,7 @@ def __init__(
                 - `str`, the corresponding value will be returned
                 - `List[str]`, a filtered dictionary will be returned
                 - `True`, the whole dictionary will be returned
-            keys_to_content (`Optional[Union[str, bool, Sequence[str]]`,
+            keys_to_content (`Optional[Union[str, bool, Sequence[str]]]`,
             defaults to `True`):
                 The key or keys to be filtered in `to_content` method. If
                 it's
@@ -195,8 +209,23 @@ def __init__(
                 - `True`, the whole dictionary will be returned
 
         """
-        # Initialize the markdown json object parser
-        MarkdownJsonObjectParser.__init__(self, content_hint)
+        self.pydantic_class = None
+
+        # Initialize the content_hint according to the type of content_hint
+        if inspect.isclass(content_hint) and issubclass(
+            content_hint,
+            BaseModel,
+        ):
+            self.pydantic_class = content_hint
+            self.content_hint = "{a_JSON_dictionary}"
+        elif content_hint is not None:
+            if isinstance(content_hint, str):
+                self.content_hint = content_hint
+            else:
+                self.content_hint = json.dumps(
+                    content_hint,
+                    ensure_ascii=False,
+                )
 
         # Initialize the mixin class to allow filtering the parsed response
         DictFilterMixin.__init__(
@@ -208,6 +237,21 @@ def __init__(
 
         self.required_keys = required_keys or []
 
+    @property
+    def format_instruction(self) -> str:
+        """Get the format instruction for the JSON object. If a Pydantic
+        model class was given as `content_hint`, its JSON schema is appended.
+        """
+        if self.pydantic_class is None:
+            return self._format_instruction.format(
+                content_hint=self.content_hint,
+            )
+        else:
+            return self._format_instruction_with_schema.format(
+                content_hint=self.content_hint,
+                schema=self.pydantic_class.model_json_schema(),
+            )
+
     def parse(self, response: ModelResponse) -> ModelResponse:
         """Parse the text field of the response to a JSON dictionary object,
         store it in the parsed field of the response object, and check if the
@@ -224,6 +268,16 @@ def parse(self, response: ModelResponse) -> ModelResponse:
                 response.text,
             )
 
+        # Requirement checking by Pydantic
+        if self.pydantic_class is not None:
+            try:
+                response.parsed = dict(self.pydantic_class(**response.parsed))
+            except Exception as e:
+                raise JsonParsingError(
+                    message=str(e),
+                    raw_response=response.text,
+                ) from None
+
         # Check if the required keys exist
         keys_missing = []
         for key in self.required_keys:
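
The parsing flow touched by this diff (extract the fenced JSON, load it, validate it with Pydantic, wrap any failure) can be sketched as a standalone function. This is a simplified illustration under stated assumptions, not AgentScope's implementation: `parse_json_dict`, `SketchParsingError`, and `FENCE` are hypothetical names, and the regular expression is only one possible way to locate the fenced block.

```python
import json
import re
from typing import Any, Dict, Optional, Type

from pydantic import BaseModel, Field

FENCE = "`" * 3  # three backticks

# Matches the content between an opening json fence and the closing fence.
_JSON_BLOCK = re.compile(
    re.escape(FENCE) + r"json\s*(.*?)\s*" + re.escape(FENCE),
    re.DOTALL,
)


class SketchParsingError(Exception):
    """Hypothetical stand-in for AgentScope's parsing exceptions."""

    def __init__(self, message: str, raw_response: str) -> None:
        super().__init__(message)
        self.raw_response = raw_response


def parse_json_dict(
    text: str,
    schema: Optional[Type[BaseModel]] = None,
) -> Dict[str, Any]:
    """Extract a fenced JSON dictionary and optionally validate it."""
    match = _JSON_BLOCK.search(text)
    if match is None:
        raise SketchParsingError("no fenced json block found", text)
    try:
        parsed = json.loads(match.group(1))
    except json.JSONDecodeError as e:
        raise SketchParsingError(str(e), text) from None
    if not isinstance(parsed, dict):
        raise SketchParsingError("the JSON value is not a dictionary", text)
    if schema is not None:
        try:
            # Same conversion as in the commit: validate, then back to a dict.
            parsed = dict(schema(**parsed))
        except Exception as e:  # broad on purpose, mirroring the commit
            raise SketchParsingError(str(e), text) from None
    return parsed


class Schema(BaseModel):
    speak: str = Field(description="what you speak")
    thought: str = Field(description="what you thought")
    end_discussion: bool = Field(description="whether the discussion ended")


reply = (
    FENCE
    + 'json\n{"speak": "Hello, world!", "thought": "xxx", '
    + '"end_discussion": "true"}\n'
    + FENCE
)
result = parse_json_dict(reply, Schema)
```

Note that in the diff above the Pydantic failure is wrapped in `JsonParsingError`; the sketch uses a single stand-in exception for every failure mode to stay short.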

diff --git a/tests/parser_test.py b/tests/parser_test.py
@@ -2,6 +2,8 @@
 """Unit test for model response parser."""
 import unittest
 
+from pydantic import BaseModel, Field
+
 from agentscope.models import ModelResponse
 from agentscope.parsers import (
     MarkdownJsonDictParser,
@@ -27,7 +29,7 @@ def setUp(self) -> None:
             ),
         )
         self.instruction_dict_1 = (
-            "You should respond a json object in a json fenced code block "
+            "Respond a JSON dictionary in a markdown's fenced code block "
             "as follows:\n"
             "```json\n"
             '{"speak": "what you speak", '
@@ -59,6 +61,22 @@ def setUp(self) -> None:
             '"end_discussion": true/false}'
         )
 
+        self.instruction_dict_3 = (
+            "Respond a JSON dictionary in a markdown's fenced code block as "
+            "follows:\n"
+            "```json\n"
+            "{a_JSON_dictionary}\n"
+            "```\n"
+            "The generated JSON dictionary MUST follow this schema: \n"
+            "{'properties': {'speak': {'description': 'what you speak', "
+            "'title': 'Speak', 'type': 'string'}, 'thought': {'description': "
+            "'what you thought', 'title': 'Thought', 'type': 'string'}, "
+            "'end_discussion': {'description': 'whether the discussion "
+            "reached an agreement or not', 'title': 'End Discussion', "
+            "'type': 'boolean'}}, 'required': ['speak', 'thought', "
+            "'end_discussion'], 'title': 'Schema', 'type': 'object'}"
+        )
+
         self.gt_to_memory = {"speak": "Hello, world!", "thought": "xxx"}
         self.gt_to_content = "Hello, world!"
         self.gt_to_metadata = {"end_discussion": True}
@@ -104,6 +122,44 @@ def setUp(self) -> None:
         )
         self.gt_code = """\nprint("Hello, world!")\n"""
 
+    def test_markdownjsondictparser_with_schema(self) -> None:
+        """Test for MarkdownJsonDictParser with schema"""
+
+        class Schema(BaseModel):  # pylint: disable=missing-class-docstring
+            speak: str = Field(description="what you speak")
+            thought: str = Field(description="what you thought")
+            end_discussion: bool = Field(
+                description="whether the discussion reached an agreement or "
+                "not",
+            )
+
+        parser = MarkdownJsonDictParser(
+            content_hint=Schema,
+            keys_to_memory=["speak", "thought"],
+            keys_to_content="speak",
+            keys_to_metadata=["end_discussion"],
+        )
+
+        self.assertEqual(parser.format_instruction, self.instruction_dict_3)
+
+        res = parser.parse(self.res_dict_1)
+
+        self.assertDictEqual(res.parsed, self.gt_dict)
+
+        res = parser.parse(
+            ModelResponse(
+                text="""```json
+        {
+            "speak" : "Hello, world!",
+            "thought" : "xxx",
+            "end_discussion" : "true"
+        }
+        ```""",
+            ),
+        )
+
+        self.assertDictEqual(res.parsed, self.gt_dict)
+
     def test_markdownjsondictparser(self) -> None:
         """Test for MarkdownJsonDictParser"""
         parser = MarkdownJsonDictParser(