docs: update contributing guide
mrdrivingduck committed Apr 8, 2024
1 parent 243bc69 commit 61dc4c6
Showing 9 changed files with 46 additions and 25 deletions.
1 change: 1 addition & 0 deletions docs/.vuepress/config.ts
@@ -124,6 +124,7 @@ export default defineUserConfig({
},
}),
mdEnhancePlugin({
katex: true,
footnote: true,
}),
registerComponentsPlugin({
16 changes: 8 additions & 8 deletions docs/contributing/contributing-polardb-docs.md
@@ -12,7 +12,7 @@ The PolarDB for PostgreSQL documentation uses [VuePress 2](https://v2.vuepress.vuejs.or

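## Local Documentation Development

If you find content or formatting errors in the documentation, or if you would like to contribute new documentation, you need to install and configure the documentation development environment locally. The documentation of this project is a Node.js project that uses [Yarn](https://yarnpkg.com/) as its package manager. [Node.js®](https://nodejs.org/en/) is a JavaScript runtime built on the Chrome V8 engine.
If you find content or formatting errors in the documentation, or if you would like to contribute new documentation, you need to install and configure the documentation development environment locally. The documentation of this project is a Node.js project that uses [pnpm](https://pnpm.io/) as its package manager. [Node.js®](https://nodejs.org/en/) is a JavaScript runtime built on the Chrome V8 engine.

### Preparing the Node Environment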

@@ -40,27 +40,27 @@ node -v
npm -v
```

Use `npm` to install the package manager `yarn` globally:
Use `npm` to install the package manager `pnpm` globally:

```bash:no-line-numbers
npm install -g yarn
yarn -v
npm install -g pnpm
pnpm -v
```

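### Installing Documentation Dependencies

Run the following command in the root directory of the PolarDB for PostgreSQL project. `yarn` installs all dependencies listed in `package.json`:
Run the following command in the root directory of the PolarDB for PostgreSQL project. `pnpm` installs all dependencies listed in `package.json`: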

```bash:no-line-numbers
yarn
pnpm install
```

### Running the Documentation Development Server

Run the following command in the root directory of the PolarDB for PostgreSQL project:

```bash:no-line-numbers
yarn docs:dev
pnpm run docs:dev
```

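The documentation development server runs at `http://localhost:8080/PolarDB-for-PostgreSQL/`. Open this address in a browser to view the site. After you modify a Markdown file, the changes appear on the page in real time.
@@ -94,7 +94,7 @@ The documentation resources of PolarDB for PostgreSQL are located in the `docs/` directory under the project root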

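The `.vuepress/` directory contains the global configuration of the documentation project:

- `config.js`: documentation configuration
- `config.ts`: documentation configuration
- `configs/`: documentation configuration modules (navigation bar / sidebar, English / Chinese, and so on)
- `public/`: public static assets
- `styles/`: overrides of the default theme styles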
2 changes: 1 addition & 1 deletion docs/theory/buffer-management.md
@@ -58,7 +58,7 @@ To apply the WAL records of a page up to a specified LSN, each read-only node ma

For a specific page, more changes mean more LSNs and a longer period of time required to apply WAL records. To minimize the number of WAL records that need to be applied for each page, PolarDB provides consistent LSNs.

After all changes that are made up to the consistent LSN of a page are written to the shared storage, the page is persistently stored. The primary node sends the write LSN and consistent LSN of the page to each read-only node, and each read-only node sends the apply LSN of the page to the primary node. The read-only nodes do not need to apply the WAL records that are generated before the consistent LSN of the page. Therefore, all LSNs that are smaller than the consistent LSN can be removed from the LogIndex of the page. This reduces the number of WAL records that the read-only nodes need to apply. This also reduces the storage space that is occupied by LogIndex records.
After all changes that are made up to the consistent LSN of a page are written to the shared storage, the page is persistently stored. The primary node sends the write LSN and consistent LSN of the page to each read-only node, and each read-only node sends the apply LSN and the min used LSN of the page to the primary node. When a read-only node reads the page from shared storage, it does not need to apply the WAL records that were generated before the consistent LSN of the page. However, when a read-only node replays an outdated page in its buffer pool, it may still need to apply WAL records that were generated before the consistent LSN. Therefore, only the LSNs that are smaller than both the consistent LSN and the min used LSN can be removed from the LogIndex of the page. This reduces the number of WAL records that the read-only nodes need to apply and the storage space that is occupied by LogIndex records.
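
The removal rule described above can be illustrated with a short Python sketch. It is not PolarDB code: LSNs are modeled as plain integers, the names `truncation_bound` and `prune_page_lsns` are hypothetical, and combining several read-only nodes by taking the minimum of their reported values is an assumption made for the illustration.

```python
from typing import Iterable, List


def truncation_bound(consistent_lsn: int, min_used_lsns: Iterable[int]) -> int:
    """Return the LSN below which LogIndex entries are assumed to be removable.

    An LSN is removable only if it is smaller than the consistent LSN AND
    smaller than the min used LSN reported by every read-only node, so the
    bound is the minimum of all of these values.
    """
    return min([consistent_lsn, *min_used_lsns])


def prune_page_lsns(page_lsns: List[int], bound: int) -> List[int]:
    """Keep only the LSNs of a page that are not smaller than the bound."""
    return [lsn for lsn in page_lsns if lsn >= bound]


# Hypothetical values: one primary, two read-only nodes.
bound = truncation_bound(consistent_lsn=500, min_used_lsns=[420, 610])
remaining = prune_page_lsns([100, 300, 420, 480, 700], bound)
```

In this example the bound is held back by the read-only node that is still using LSN 420, even though the consistent LSN has already advanced to 500.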

### Flush Lists

3 changes: 2 additions & 1 deletion docs/theory/logindex.md
@@ -80,7 +80,7 @@ After the data in an Inactive LogIndex Memtable is flushed to the disk, the LogI

![image.png](../imgs/58_LogIndex_10.png)

All modified data pages recorded in WAL logs before the LSN of consistent data are persisted to the shared storage based on the information described in [Buffer Management](./buffer-management.md). The LSN of consistent data is the LSN before which data is consistent between the primary node and read-only nodes. Read-only nodes do not need to replay WAL logs generated before the LSN of consistent data. In this case, the WAL logs for the LSNs that are smaller than the LSN of consistent data can be cleared from LogIndex Tables. This way, the primary node can truncate LogIndex Tables that are no longer used in the storage. This enables more efficient log replay for read-only nodes and reduces the space occupied by LogIndex Tables.
All modified data pages recorded in WAL logs before the consistent LSN are persisted to the shared storage based on the information described in [Buffer Management](./buffer-management.md). The primary node sends the write LSN and consistent LSN to each read-only node, and each read-only node sends the apply LSN and the min used LSN to the primary node. In this case, the WAL logs whose LSNs are smaller than the consistent LSN and the min used LSN can be cleared from LogIndex Tables. This way, the primary node can truncate LogIndex Tables that are no longer used in the storage. This enables more efficient log replay for read-only nodes and reduces the space occupied by LogIndex Tables.

## Log replay

@@ -90,6 +90,7 @@ For scenarios in which LogIndex Tables are used, the startup processes of read-o

- The background replay process replays WAL logs in the sequence of WAL logs. The process retrieves modified pages from LogIndex Memtables and LogIndex Tables based on the LSN of a page that you want to replay. If a page exists in a buffer pool, the page is replayed. Otherwise, the page is skipped. The background replay process replays WAL logs generated for the next LSN of a page in a buffer pool in the sequence of LSNs. This prevents a large number of LSNs for a single page that you want to replay from being accumulated.
- The backend process replays only the pages it must access. If the backend process must access a page that does not exist in a buffer pool, the process reads this page from the shared storage, writes the page to a buffer pool, and replays this page. If the page exists in a buffer pool and is marked as an outdated page, the process replays the most recent WAL logs of this page. The backend process retrieves the LSNs of the page from LogIndex Memtables and LogIndex Tables based on the value of PageTag. After the process retrieves the LSNs, the process generates the LSNs for the page in sequence. Then, the process reads the complete WAL logs from the shared storage based on the generated LSNs to replay the page.
- As the two points above show, both the background replay process and the backend processes use LogIndex information to apply WAL logs to pages. The min used LSN of an RO node is therefore defined as the minimum LSN among the WAL logs that are currently being applied by the background replay process and all backend processes. The RO node sends its current min used LSN to the primary node, which uses this LSN to truncate LogIndex Tables that are no longer used.
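
The definition of the min used LSN in the last point can be sketched as follows. This is a minimal Python illustration, not the PolarDB implementation: LSNs are plain integers, the function name `min_used_lsn` is hypothetical, and representing an idle backend process as `None` is an assumption.

```python
from typing import Iterable, Optional


def min_used_lsn(
    background_replay_lsn: int,
    backend_replay_lsns: Iterable[Optional[int]],
) -> int:
    """Return the smallest LSN currently being applied on one RO node.

    background_replay_lsn: LSN the background replay process is applying.
    backend_replay_lsns:   LSN each backend process is applying, or None
                           if that backend is not replaying a page (assumed).
    """
    active = [lsn for lsn in backend_replay_lsns if lsn is not None]
    return min([background_replay_lsn, *active])


# Hypothetical state of one RO node: the background process is at 900,
# one backend is idle, and two backends are replaying older pages.
reported = min_used_lsn(900, [None, 750, 820])
```

The value in `reported` is what the RO node would send to the primary node in this sketch. A backend that is replaying an old page keeps the value low, which in turn prevents the primary node from truncating LogIndex Tables that the backend still needs.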

![image.png](../imgs/59_LogIndex_11.png)

16 changes: 8 additions & 8 deletions docs/zh/contributing/contributing-polardb-docs.md
@@ -8,7 +8,7 @@ The PolarDB for PostgreSQL documentation uses [VuePress 2](https://v2.vuepress.vuejs.or

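## Local Documentation Development

If you find content or formatting errors in the documentation, or if you would like to contribute new documentation, you need to install and configure the documentation development environment locally. The documentation of this project is a Node.js project that uses [Yarn](https://www.yarnpkg.cn/) as its package manager. [Node.js®](https://nodejs.org/zh-cn/) is a JavaScript runtime built on the Chrome V8 engine.
If you find content or formatting errors in the documentation, or if you would like to contribute new documentation, you need to install and configure the documentation development environment locally. The documentation of this project is a Node.js project that uses [pnpm](https://pnpm.io/) as its package manager. [Node.js®](https://nodejs.org/zh-cn/) is a JavaScript runtime built on the Chrome V8 engine.

### Preparing the Node Environment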

@@ -36,27 +36,27 @@ node -v
npm -v
```

Use `npm` to install the package manager `yarn` globally:
Use `npm` to install the package manager `pnpm` globally:

```bash:no-line-numbers
npm install -g yarn
yarn -v
npm install -g pnpm
pnpm -v
```

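### Installing Documentation Dependencies

Run the following command in the root directory of the PolarDB for PostgreSQL project. `yarn` installs all dependencies listed in `package.json`:
Run the following command in the root directory of the PolarDB for PostgreSQL project. `pnpm` installs all dependencies listed in `package.json`: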

```bash:no-line-numbers
yarn
pnpm install
```

### Running the Documentation Development Server

Run the following command in the root directory of the PolarDB for PostgreSQL project:

```bash:no-line-numbers
yarn docs:dev
pnpm run docs:dev
```

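The documentation development server runs at `http://localhost:8080/PolarDB-for-PostgreSQL/`. Open this address in a browser to view the site. After you modify a Markdown file, the changes appear on the page in real time.
@@ -90,7 +90,7 @@ The documentation resources of PolarDB for PostgreSQL are located in the `docs/` directory under the project root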

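The `.vuepress/` directory contains the global configuration of the documentation project:

- `config.js`: documentation configuration
- `config.ts`: documentation configuration
- `configs/`: documentation configuration modules (navigation bar / sidebar, English / Chinese, and so on)
- `public/`: public static assets
- `styles/`: overrides of the default theme styles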
2 changes: 1 addition & 1 deletion docs/zh/theory/buffer-management.md
@@ -58,7 +58,7 @@ else

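As shown above, the more modifications are made to a data page, the more LSNs it has and the longer replay takes. To minimize the number of LSNs that must be replayed for a data page, PolarDB introduces the concept of the consistent LSN.

The consistent LSN indicates that all data pages modified by WAL records before this LSN have been persisted to storage. Between the primary and replica nodes, the primary node sends the current WAL write LSN and the consistent LSN to the replica node, and the replica node sends its current replay LSN to the primary node. Because all WAL modifications before the consistent LSN have been written to shared storage, the replica node no longer needs to replay WAL records before that LSN. Therefore, all LSNs in the LogIndex that are smaller than the consistent LSN can be removed, which both speeds up replay and reduces the space occupied by the LogIndex.
The consistent LSN indicates that all data pages modified by WAL records before this LSN have been persisted to storage. Between the primary and replica nodes, the primary node sends the current WAL write LSN and the consistent LSN to the replica node, and the replica node reports its current replay LSN and the minimum WAL LSN currently in use back to the primary node. Because all WAL modifications before the consistent LSN have been written to shared storage, the replica node does not need to replay WAL records before that LSN when it reads a new data page from storage. However, when the replica node replays a data page in the Buffer Pool that is marked as Outdate, it may need to replay WAL records before that LSN. Therefore, based on the "minimum WAL LSN currently in use" reported by the replica node and the consistent LSN, the primary node can remove all LSNs in the LogIndex that are smaller than both, which both speeds up replay and reduces the space occupied by the LogIndex.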

### FlushList

3 changes: 2 additions & 1 deletion docs/zh/theory/logindex.md
@@ -80,7 +80,7 @@ LogIndex is essentially a HashTable structure whose key is PageTag, which can identify a

![image.png](../imgs/58_LogIndex_10.png)

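As described in [Buffer Management](./buffer-management.md), all data pages modified by WAL records before the consistent LSN have been persisted to shared storage, so RO nodes do not need to replay WAL records before that LSN, and all LSNs in the LogIndex Table that are smaller than the consistent LSN can be cleared. Based on this, the RW node truncates LogIndex Tables on storage that are no longer in use, which speeds up replay on RO nodes and reduces the space occupied by LogIndex Tables.
As described in [Buffer Management](./buffer-management.md), all data pages modified by WAL records before the consistent LSN have been persisted to shared storage. Each RO node reports its current replay LSN and the minimum WAL LSN currently in use to the RW node through streaming replication, so all LSNs in the LogIndex Table that are smaller than the two LSNs can be cleared. Based on this, the RW node truncates LogIndex Tables on storage that are no longer in use, which speeds up replay on RO nodes and reduces the space occupied by LogIndex Tables.

## Log Replay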

@@ -90,6 +90,7 @@ Under the LogIndex mechanism, the Startup process of the RO node, based on the received WAL Meta,

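- The background replay process replays logs in WAL order. Based on the LSN to be replayed, it searches the LogIndex Memtable and LogIndex Table to obtain the Page List modified by that LSN. If a Page is in the Buffer Pool, the process replays it; otherwise, it skips the Page. The background replay process advances the LSNs of the pages in the Buffer Pool step by step in LSN order, which prevents too many LSNs from accumulating for a single Page;
- A Backend process replays only the Pages that it actually needs to access. When a Backend process needs to access a Page that is not in the Buffer Pool, it reads the Page into the Buffer Pool and then replays it. If the Page is already in the Buffer Pool and is marked as outdate, the process replays the Page to the latest state. The Backend process searches the LogIndex Memtable and LogIndex Table by Page TAG, generates the LSN List for that Page in order, and reads the complete WAL records from shared storage based on the LSN List to replay the Page.
- From the two points above, both the background replay process and the Backend processes search the LogIndex and use the information recorded in it to replay Pages. Each of these processes has a WAL LSN that it is currently replaying. We therefore define the minimum of the WAL LSNs currently being replayed by the background replay process and all Backend processes as the minimum WAL LSN currently in use on that RO node. The RO node sends this LSN back to the RW node through streaming replication, and the RW node uses it to decide whether LogIndex Tables on storage can be deleted.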

![image.png](../imgs/59_LogIndex_11.png)

3 changes: 2 additions & 1 deletion package.json
@@ -26,9 +26,10 @@
"@vuepress/plugin-docsearch": "^2.0.0-rc.9",
"@vuepress/plugin-register-components": "^2.0.0-rc.9",
"@vuepress/theme-default": "^2.0.0-rc.9",
"katex": "^0.16.10",
"prettier": "3.2.5",
"vue": "^3.4.0",
"vuepress": "^2.0.0-rc.9",
"prettier": "3.2.5",
"vuepress-plugin-md-enhance": "^2.0.0-rc.33"
}
}
25 changes: 21 additions & 4 deletions pnpm-lock.yaml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
