Add table of contents
DavdGao committed May 13, 2024
1 parent 60efe5b commit 4085798
Showing 2 changed files with 78 additions and 29 deletions.
52 changes: 38 additions & 14 deletions docs/sphinx_doc/en/source/tutorial/203-parser.md
@@ -2,6 +2,32 @@

# Model Response Parser

## Table of Contents

---
- [Background](#background)
- [Parser Module](#parser-module)
- [Overview](#overview)
- [String Type](#string-type)
- [MarkdownCodeBlockParser](#summary-markdowncodeblockparser-summary)
- [Initialization](#initialization)
- [Format Instruction Template](#format-instruction-template)
- [Parse Function](#parse-function)
- [Dictionary Type](#dictionary-type)
- [MarkdownJsonDictParser](#summary-markdownjsondictparser-summary)
- [Initialization & Format Instruction Template](#initialization--format-instruction-template)
- [MultiTaggedContentParser](#summary-multitaggedcontentparser-summary)
- [Initialization & Format Instruction Template](#initialization--format-instruction-template-1)
- [Parse Function](#parse-function-1)
- [JSON / Python Object Type](#json--python-object-type)
- [MarkdownJsonObjectParser](#summary-markdownjsonobjectparser-summary)
- [Initialization & Format Instruction Template](#initialization--format-instruction-template-2)
- [Parse Function](#parse-function-2)
- [Typical Use Cases](#typical-use-cases)
- [WereWolf Game](#werewolf-game)
- [ReAct Agent and Tool Usage](#react-agent-and-tool-usage)
- [Customized Parser](#customized-parser)

## Background

When building LLM-empowered applications, parsing the LLM-generated string into a specific format and extracting the required information is an important step.
@@ -38,8 +64,6 @@ The main functions of the parser module include:

3. Post-processing for dictionary format. After parsing the text into a dictionary, different fields may have different uses.
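
The post-processing idea in item 3 can be sketched with plain Python. This is a minimal illustration, not AgentScope's implementation: the helper `select_fields` and the field names `thought`, `speak`, and `finish_discussion` are assumptions chosen for the example.

```python
from typing import Any, Dict, Sequence


def select_fields(parsed: Dict[str, Any], keys: Sequence[str]) -> Dict[str, Any]:
    """Keep only the requested fields of a parsed dictionary (hypothetical helper)."""
    return {key: parsed[key] for key in keys if key in parsed}


# A dictionary as it might look after parsing an LLM reply (illustrative values).
parsed = {
    "thought": "I suspect Player2.",
    "speak": "I am a villager.",
    "finish_discussion": True,
}

to_show = select_fields(parsed, ["speak"])                # shown to other agents
to_store = select_fields(parsed, ["thought", "speak"])    # kept in the agent's memory
to_control = select_fields(parsed, ["finish_discussion"]) # used for control flow
```

Routing each field by an explicit key list keeps private reasoning out of what other agents see while still making it available to the agent itself.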

### Built-in Parsers

AgentScope provides multiple built-in parsers, and developers can choose according to their needs.

| Target Format | Parser Class | Description |
@@ -61,9 +85,9 @@ In the following sections, we will introduce the usage of these parsers based on

<details>

<summary> MarkdownCodeBlockParser </summary>
#### <summary> MarkdownCodeBlockParser </summary>

#### Initialization
##### Initialization

- `MarkdownCodeBlockParser` requires the LLM to generate specific text within a specified code block in Markdown format. Different languages can be specified with the `language_name` parameter to make use of the large model's ability to produce the corresponding output. For example, when asking the large model to produce Python code, initialize as follows:

@@ -73,7 +97,7 @@ In the following sections, we will introduce the usage of these parsers based on
parser = MarkdownCodeBlockParser(language_name="python")
```

#### Format Instruction Template
##### Format Instruction Template

- `MarkdownCodeBlockParser` provides the following format instruction template. When the user accesses the `format_instruction` attribute, `{language_name}` is replaced with the string entered at initialization:

@@ -99,7 +123,7 @@ In the following sections, we will introduce the usage of these parsers based on
>
> \```

#### Parse Function
##### Parse Function

- `MarkdownCodeBlockParser` provides a `parse` method to parse the text generated by the LLM. Its input and output are both `ModelResponse` objects, and the parsing result is stored in the `parsed` attribute of the output object.
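
The extraction step behind such a parser can be sketched with the standard library alone. This is an illustrative approximation under stated assumptions, not the library's code: `extract_code_block` is a hypothetical function that works on plain strings instead of `ModelResponse` objects.

```python
import re

# Built programmatically so the example can sit inside a Markdown fence.
FENCE = "`" * 3


def extract_code_block(text: str, language_name: str) -> str:
    """Return the content of the first fenced block tagged with language_name."""
    pattern = re.escape(FENCE + language_name) + r"\s*\n(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, flags=re.DOTALL)
    if match is None:
        raise ValueError(f"no {language_name} code block found in the response")
    return match.group(1).strip()


# A reply that wraps the requested code in explanatory text (illustrative).
response_text = (
    "Sure, here it is:\n" + FENCE + "python\nprint('hi')\n" + FENCE + "\nDone."
)
code = extract_code_block(response_text, "python")
```

Raising an error when no block is found lets the caller decide whether to retry the request or fall back to the raw text.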

@@ -235,10 +259,10 @@ Next we will introduce two parsers for dictionary type.
<details>
<summary>MarkdownJsonDictParser</summary>
#### <summary> MarkdownJsonDictParser </summary>
#### Initialization & Format Instruction Template
##### Initialization & Format Instruction Template
- `MarkdownJsonDictParser` requires the LLM to generate a dictionary within a code block fenced by \```json and \``` tags.
@@ -278,11 +302,11 @@ This parameter can be a string or a dictionary. For dictionary, it will be autom
<details>
<summary>MultiTaggedContentParser</summary>
#### <summary> MultiTaggedContentParser </summary>
`MultiTaggedContentParser` asks the LLM to generate specific content within multiple tag pairs. The content from the different tag pairs is parsed into a single Python dictionary. Its usage is similar to `MarkdownJsonDictParser`, but the initialization differs, and it is better suited to weaker LLMs or complex return content.
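
The tag-pair idea can be sketched as follows. This is a minimal standard-library illustration, not the parser's actual implementation: `parse_tagged` is a hypothetical function, and the `[THOUGHT]` tag is an assumption added for the example, while `[FINISH_DISCUSSION]` follows the format instruction shown in this tutorial.

```python
import re
from typing import Dict, Tuple


def parse_tagged(text: str, tags: Dict[str, Tuple[str, str]]) -> Dict[str, str]:
    """Collect the text between each (begin, end) tag pair into one dictionary."""
    result = {}
    for name, (begin, end) in tags.items():
        pattern = re.escape(begin) + r"(.*?)" + re.escape(end)
        match = re.search(pattern, text, flags=re.DOTALL)
        if match is None:
            raise ValueError(f"missing tag pair for {name!r}")
        result[name] = match.group(1).strip()
    return result


# Each dictionary key maps to the pair of tags that delimits its content.
tags = {
    "thought": ("[THOUGHT]", "[/THOUGHT]"),
    "finish_discussion": ("[FINISH_DISCUSSION]", "[/FINISH_DISCUSSION]"),
}
reply = (
    "[THOUGHT]I should stay quiet.[/THOUGHT]\n"
    "[FINISH_DISCUSSION]true[/FINISH_DISCUSSION]"
)
parsed = parse_tagged(reply, tags)
```

Because each field is delimited independently, a model that struggles to emit valid JSON can still produce a usable result; note that every value in this sketch stays a string.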
#### Initialization & Format Instruction Template
##### Initialization & Format Instruction Template
Within `MultiTaggedContentParser`, each tag pair is specified by a `TaggedContent` object, which contains
- Tag name (`name`), the key in the returned dictionary
@@ -326,7 +350,7 @@ print(parser.format_instruction)
>
> [FINISH_DISCUSSION]true/false, whether the discussion is finished[/FINISH_DISCUSSION]
#### Parse Function
##### Parse Function
- `MultiTaggedContentParser`'s parsing result is a dictionary whose keys are the `name` values of the `TaggedContent` objects.
The following is an example of parsing the LLM response in the werewolf game:
@@ -360,11 +384,11 @@ print(res_dict)

<details>

<summary>MarkdownJsonObjectParser</summary>
#### <summary> MarkdownJsonObjectParser </summary>

`MarkdownJsonObjectParser` also uses the \```json and \``` tags in Markdown, but does not limit the content type. The content can be a list, dictionary, number, string, or anything else that `json.loads` can parse into a Python object.
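
A rough sketch of this behaviour, using only the standard library: the function `parse_json_block` is hypothetical and only approximates what such a parser does, namely locating the fenced block and passing its content to `json.loads`.

```python
import json
import re
from typing import Any

# Built programmatically so the example can sit inside a Markdown fence.
FENCE = "`" * 3


def parse_json_block(text: str) -> Any:
    """Load whatever JSON value sits inside the first json-tagged fenced block."""
    pattern = re.escape(FENCE + "json") + r"\s*\n(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no json code block found in the response")
    return json.loads(match.group(1))


# The same function handles a list and a dictionary (illustrative replies).
list_reply = FENCE + "json\n[1, 2, 3]\n" + FENCE
dict_reply = FENCE + 'json\n{"name": "Alice"}\n' + FENCE

as_list = parse_json_block(list_reply)
as_dict = parse_json_block(dict_reply)
```

Since the result type depends on the reply, callers that need a specific type should check it before use.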

#### Initialization & Format Instruction Template
##### Initialization & Format Instruction Template

```python
from agentscope.parsers import MarkdownJsonObjectParser
@@ -384,7 +408,7 @@ print(parser.format_instruction)
>
> \```
#### Parse Function
##### Parse Function

````python
res = parser.parse(
55 changes: 40 additions & 15 deletions docs/sphinx_doc/zh_CN/source/tutorial/203-parser.md
@@ -2,6 +2,33 @@

# Model Response Parser

## Table of Contents

---
- [Background](#背景)
- [Parser Module](#解析器模块)
- [Overview](#功能说明)
- [String Type](#字符串str类型)
- [MarkdownCodeBlockParser](#summary-markdowncodeblockparser-summary)
- [Initialization](#初始化)
- [Format Instruction Template](#响应格式模版)
- [Parse Function](#解析函数)
- [Dictionary Type](#字典dict类型)
- [MarkdownJsonDictParser](#summary-markdownjsondictparser-summary)
- [Initialization & Format Instruction Template](#初始化--响应格式模版)
- [MultiTaggedContentParser](#summary-multitaggedcontentparser-summary)
- [Initialization & Format Instruction Template](#初始化--响应格式模版-1)
- [Parse Function](#解析函数-1)
- [JSON / Python Object Type](#json--python-对象类型)
- [MarkdownJsonObjectParser](#summary-markdownjsonobjectparser-summary)
- [Initialization & Format Instruction Template](#初始化--响应格式模版-2)
- [Parse Function](#解析函数-2)
- [Typical Use Cases](#典型使用样例)
- [Werewolf Game](#狼人杀游戏)
- [ReAct Agent and Tool Usage](#react-智能体和工具使用)
- [Customized Parser](#自定义解析器)


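## Background

When building applications with LLMs, parsing the string generated by the LLM into a specified format and extracting the required information is a very important step.
@@ -39,9 +66,6 @@ In AgentScope, the design principles of the parser module are:

3. Post-processing for dictionary format. After parsing the text into a dictionary, different fields may have different uses.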


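### Parser Types

AgentScope provides several different parsers, and developers can choose among them according to their needs.

| Target Format | Parser | Description |
@@ -58,9 +82,10 @@ AgentScope provides several different parsers; based on their needs, developers can
### String (`str`) Type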

<details>
<summary> MarkdownCodeBlockParser </summary>

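#### Initialization
#### <summary> MarkdownCodeBlockParser </summary>

##### Initialization

- `MarkdownCodeBlockParser` uses Markdown code blocks, requiring the LLM to generate the specified text within a specified code block. Different languages can be specified with the `language_name` parameter to make use of the large model's coding ability to produce the corresponding output. For example, when asking the large model to produce Python code, initialize as follows:

@@ -70,7 +95,7 @@ AgentScope provides several different parsers; based on their needs, developers can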
parser = MarkdownCodeBlockParser(language_name="python")
```

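#### Format Instruction Template
##### Format Instruction Template

- The `MarkdownCodeBlockParser` class provides the following "format instruction" template. When the user accesses the `format_instruction` attribute, `{language_name}` is replaced with the string entered at initialization:

@@ -96,7 +121,7 @@ AgentScope provides several different parsers; based on their needs, developers can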
>
> \```

#### Parse Function
##### Parse Function

- The `MarkdownCodeBlockParser` class provides a `parse` method to parse the text generated by the LLM; it returns a string.

@@ -234,9 +259,9 @@ In AgentScope, by calling the `to_content`, `to_memory`, and `to_metadata`
<details>
<summary>MarkdownJsonDictParser</summary>
#### <summary> MarkdownJsonDictParser </summary>
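#### Initialization & Format Instruction Template
##### Initialization & Format Instruction Template
- `MarkdownJsonDictParser` requires the LLM to generate a dictionary with the specified content within a code block fenced by \```json and \``` tags.
- In addition to the `to_content`, `to_memory`, and `to_metadata` parameters, the `content_hint` parameter can be used to provide an example and explanation of the response, that is, to hint to the LLM what kind of dictionary it should produce. This parameter can be a string or a dictionary; a dictionary is automatically converted to a string when the format instruction is constructed.
@@ -273,11 +298,11 @@ In AgentScope, by calling the `to_content`, `to_memory`, and `to_metadata`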

<details>

<summary>MultiTaggedContentParser</summary>
#### <summary> MultiTaggedContentParser </summary>

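`MultiTaggedContentParser` asks the LLM to generate specific content within multiple specified tag pairs; the content of these tags is parsed together into a single Python dictionary. Its usage is similar to `MarkdownJsonDictParser`, but the initialization differs, and it is better suited to weaker LLMs or more complex return content.

#### Initialization & Format Instruction Description
##### Initialization & Format Instruction Template

In `MultiTaggedContentParser`, each tag pair is passed in as a `TaggedContent` object, which contains
- Tag name (`name`), the key in the returned dictionary
@@ -321,7 +346,7 @@ print(parser.format_instruction)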
>
> [FINISH_DISCUSSION]true/false, whether the discussion is finished[/FINISH_DISCUSSION]
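#### Parse Function
##### Parse Function

- The parsing result of `MultiTaggedContentParser` is a dictionary whose keys are the `name` values of the `TaggedContent` objects. The following is an example of parsing the LLM response in the werewolf game:

@@ -354,11 +379,11 @@ print(res_dict)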

<details>

<summary>MarkdownJsonObjectParser</summary>
#### <summary> MarkdownJsonObjectParser </summary>

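`MarkdownJsonObjectParser` also uses the Markdown \```json and \``` tags, but does not restrict the type of the parsed content: it can be a list, dictionary, number, string, or anything else that can be parsed from a string via `json.loads`.

#### Initialization & Format Instruction Description
##### Initialization & Format Instruction Template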

```python
from agentscope.parsers import MarkdownJsonObjectParser
@@ -378,7 +403,7 @@ print(parser.format_instruction)
>
> \```
#### Parse Function
##### Parse Function

````python
res = parser.parse(