Merge pull request #469 from ApsaraDB/POLARDB_11_DEV
merge: 20231228
mrdrivingduck authored Dec 28, 2023
2 parents 8ad5ff6 + cbfe634 commit e6aa579
Showing 19 changed files with 291 additions and 37 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/regression-test.yml
@@ -44,7 +44,7 @@ jobs:
-t \
--name polardb_${{ matrix.container_image }} \
-v `pwd`:/home/postgres/PolarDB-for-PostgreSQL \
mrdrivingduck/polardb_pg_devel:${{ matrix.container_image }} \
polardb/polardb_pg_devel:${{ matrix.container_image }} \
bash && \
docker start polardb_${{ matrix.container_image }}
10 changes: 5 additions & 5 deletions README-CN.md
@@ -1,14 +1,14 @@
<div align="center">

[![logo](docs/.vuepress/public/images/polardb.png)](https://developer.aliyun.com/topic/polardb-for-pg)
[![logo](docs/.vuepress/public/images/polardb.png)](https://www.polardbpg.com/home)

# PolarDB for PostgreSQL

**A cloud-native database developed by Alibaba Cloud**

#### [English](README.md) | 简体中文

[![official](https://img.shields.io/badge/官方网站-blueviolet?style=for-the-badge&logo=alibabacloud)](https://developer.aliyun.com/topic/polardb-for-pg)
[![official](https://img.shields.io/badge/官方网站-blueviolet?style=for-the-badge&logo=alibabacloud)](https://www.polardbpg.com/home)

[![cirrus-ci-stable](https://img.shields.io/cirrus/github/ApsaraDB/PolarDB-for-PostgreSQL/POLARDB_11_STABLE?style=for-the-badge&logo=cirrusci)](https://cirrus-ci.com/github/ApsaraDB/PolarDB-for-PostgreSQL/POLARDB_11_STABLE)
[![cirrus-ci-dev](https://img.shields.io/cirrus/github/ApsaraDB/PolarDB-for-PostgreSQL/POLARDB_11_DEV?style=for-the-badge&logo=cirrusci)](https://cirrus-ci.com/github/ApsaraDB/PolarDB-for-PostgreSQL/POLARDB_11_DEV)
@@ -58,11 +58,11 @@ PolarDB 采用了基于 Shared-Storage 的存储计算分离架构。数据库

```bash
# pull the single-node PolarDB image
docker pull polardb/polardb_pg_local_instance:single
docker pull polardb/polardb_pg_local_instance
# create, run, and enter the container
docker run -it --cap-add=SYS_PTRACE --privileged=true --name polardb_pg_single polardb/polardb_pg_local_instance:single bash
docker run -it --rm polardb/polardb_pg_local_instance psql
# check that the instance is available
psql -h 127.0.0.1 -c 'select version();'
postgres=# SELECT version();
version
--------------------------------
PostgreSQL 11.9 (POLARDB 11.9)
10 changes: 5 additions & 5 deletions README.md
@@ -1,14 +1,14 @@
<div align="center">

[![logo](docs/.vuepress/public/images/polardb.png)](https://developer.aliyun.com/topic/polardb-for-pg)
[![logo](docs/.vuepress/public/images/polardb.png)](https://www.polardbpg.com/home)

# PolarDB for PostgreSQL

**A cloud-native database developed by Alibaba Cloud**

#### English | [简体中文](README-CN.md)

[![official](https://img.shields.io/badge/official%20site-blueviolet?style=for-the-badge&logo=alibabacloud)](https://developer.aliyun.com/topic/polardb-for-pg)
[![official](https://img.shields.io/badge/official%20site-blueviolet?style=for-the-badge&logo=alibabacloud)](https://www.polardbpg.com/home)

[![cirrus-ci-stable](https://img.shields.io/cirrus/github/ApsaraDB/PolarDB-for-PostgreSQL/POLARDB_11_STABLE?style=for-the-badge&logo=cirrusci)](https://cirrus-ci.com/github/ApsaraDB/PolarDB-for-PostgreSQL/POLARDB_11_STABLE)
[![cirrus-ci-dev](https://img.shields.io/cirrus/github/ApsaraDB/PolarDB-for-PostgreSQL/POLARDB_11_DEV?style=for-the-badge&logo=cirrusci)](https://cirrus-ci.com/github/ApsaraDB/PolarDB-for-PostgreSQL/POLARDB_11_DEV)
@@ -58,11 +58,11 @@ If you have Docker installed already,then you can pull the instance image of P

```bash
# pull the instance image from DockerHub
docker pull polardb/polardb_pg_local_instance:single
docker pull polardb/polardb_pg_local_instance
# create, run and enter the container
docker run -it --cap-add=SYS_PTRACE --privileged=true --name polardb_pg_single polardb/polardb_pg_local_instance:single bash
docker run -it --rm polardb/polardb_pg_local_instance psql
# check
psql -h 127.0.0.1 -c 'select version();'
postgres=# SELECT version();
version
--------------------------------
PostgreSQL 11.9 (POLARDB 11.9)
5 changes: 3 additions & 2 deletions docs/.vuepress/configs/navbar/zh.ts
@@ -67,10 +67,10 @@ export const zh: NavbarConfig = [
],
},
{
text: "内核增强功能",
text: "自研功能",
children: [
{
text: "文档入口",
text: "功能总览",
link: "/zh/features/",
},
{
@@ -81,6 +81,7 @@ export const zh: NavbarConfig = [
"/zh/features/v11/availability/",
"/zh/features/v11/security/",
"/zh/features/v11/epq/",
"/zh/features/v11/extensions/",
],
},
],
10 changes: 9 additions & 1 deletion docs/.vuepress/configs/sidebar/zh.ts
@@ -76,7 +76,7 @@ export const zh: SidebarConfig = {
],
"/zh/features": [
{
text: "内核增强功能",
text: "自研功能",
link: "/zh/features/",
children: [
{
@@ -122,6 +122,14 @@ export const zh: SidebarConfig = {
"/zh/features/v11/epq/epq-ctas-mtview-bulk-insert.md",
],
},
{
text: "第三方插件",
link: "/zh/features/v11/extensions/",
children: [
"/zh/features/v11/extensions/pgvector.md",
"/zh/features/v11/extensions/smlar.md",
],
},
],
},
],
4 changes: 2 additions & 2 deletions docs/deploying/fs-pfs.md
@@ -17,13 +17,13 @@ PolarDB File System,简称 PFS 或 PolarFS,是由阿里云自主研发的高
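We recommend the PolarDB for PostgreSQL [executable image](https://hub.docker.com/r/polardb/polardb_pg_binary/tags) on [DockerHub](https://hub.docker.com/u/polardb), which currently supports the `linux/amd64` and `linux/arm64` architectures. The image already contains the compiled PFS tools, so no manual compilation or installation is needed. Enter the container with the following commands: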

```shell:no-line-numbers
docker pull polardb/polardb_pg_binary:pfs
docker pull polardb/polardb_pg_binary
docker run -it \
--cap-add=SYS_PTRACE \
--privileged=true \
--name polardb_pg \
--shm-size=512m \
polardb/polardb_pg_binary:pfs \
polardb/polardb_pg_binary \
bash
```

4 changes: 2 additions & 2 deletions docs/operation/ro-online-promote.md
@@ -19,13 +19,13 @@ PolarDB for PostgreSQL 是一款存储与计算分离的云原生数据库,所
For convenience, this example uses an instance based on local disks. Pull the following image and start a container to get a local-disk-based HTAP instance:

```shell:no-line-numbers
docker pull polardb/polardb_pg_local_instance:htap
docker pull polardb/polardb_pg_local_instance
docker run -it \
--cap-add=SYS_PTRACE \
--privileged=true \
--name polardb_pg_htap \
--shm-size=512m \
polardb/polardb_pg_local_instance:htap \
polardb/polardb_pg_local_instance \
bash
```

8 changes: 4 additions & 4 deletions docs/operation/scale-out.md
@@ -17,13 +17,13 @@ PolarDB for PostgreSQL 是一款存储与计算分离的数据库,所有计算
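First, on the shared-storage cluster that has already been set up, initialize and start the first compute node, i.e. the read-write node, which can read from and write to the shared storage. The image below provides precompiled executables of the PolarDB for PostgreSQL kernel and its surrounding tools: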

```shell:no-line-numbers
$ docker pull polardb/polardb_pg_binary:pfs
$ docker pull polardb/polardb_pg_binary
$ docker run -it \
--cap-add=SYS_PTRACE \
--privileged=true \
--name polardb_pg \
--shm-size=512m \
polardb/polardb_pg_binary:pfs \
polardb/polardb_pg_binary \
bash
$ ls ~/tmp_basedir_polardb_pg_1100_bld/bin/
@@ -130,13 +130,13 @@ $HOME/tmp_basedir_polardb_pg_1100_bld/bin/psql \
Similarly, on the machine used to deploy the new compute node, pull the image and start a container that includes the executables:

```shell:no-line-numbers
docker pull polardb/polardb_pg_binary:pfs
docker pull polardb/polardb_pg_binary
docker run -it \
--cap-add=SYS_PTRACE \
--privileged=true \
--name polardb_pg \
--shm-size=512m \
polardb/polardb_pg_binary:pfs \
polardb/polardb_pg_binary \
bash
```

4 changes: 2 additions & 2 deletions docs/operation/tpch-test.md
@@ -23,13 +23,13 @@ minute: 20
Use Docker to quickly bring up a PolarDB for PostgreSQL cluster based on local storage:

```shell:no-line-numbers
docker pull polardb/polardb_pg_local_instance:htap
docker pull polardb/polardb_pg_local_instance
docker run -it \
--cap-add=SYS_PTRACE \
--privileged=true \
--name polardb_pg_htap \
--shm-size=512m \
polardb/polardb_pg_local_instance:htap \
polardb/polardb_pg_local_instance \
bash
```

3 changes: 2 additions & 1 deletion docs/zh/README.md
@@ -49,12 +49,13 @@ postgres=# SELECT version();
</div>

<div class="feature">
<h3>Kernel Enhancements</h3>
<h3>Self-Developed Features</h3>
<ul style="position: relative;z-index: 10;">
<li><a href="./features/v11/performance/">High Performance</a></li>
<li><a href="./features/v11/availability/">High Availability</a></li>
<li><a href="./features/v11/security/">Security</a></li>
<li><a href="./features/v11/epq/">Elastic Cross-Machine Parallel Query (ePQ)</a></li>
<li><a href="./features/v11/extensions/">Third-Party Extensions</a></li>
</ul>
</div>

4 changes: 2 additions & 2 deletions docs/zh/deploying/fs-pfs.md
@@ -17,13 +17,13 @@ PolarDB File System,简称 PFS 或 PolarFS,是由阿里云自主研发的高
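We recommend the PolarDB for PostgreSQL [executable image](https://hub.docker.com/r/polardb/polardb_pg_binary/tags) on [DockerHub](https://hub.docker.com/u/polardb), which currently supports the `linux/amd64` and `linux/arm64` architectures. The image already contains the compiled PFS tools, so no manual compilation or installation is needed. Enter the container with the following commands: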

```shell:no-line-numbers
docker pull polardb/polardb_pg_binary:pfs
docker pull polardb/polardb_pg_binary
docker run -it \
--cap-add=SYS_PTRACE \
--privileged=true \
--name polardb_pg \
--shm-size=512m \
polardb/polardb_pg_binary:pfs \
polardb/polardb_pg_binary \
bash
```

17 changes: 16 additions & 1 deletion docs/zh/features/README.md
@@ -1,4 +1,4 @@
# Kernel Enhancements
# Self-Developed Features

- [PolarDB for PostgreSQL 11](./v11/README.md)

@@ -118,5 +118,20 @@
<td style="text-align:center">/</td>
<td style="text-align:center"><a href="./v11/epq/epq-ctas-mtview-bulk-insert.html"><Badge type="tip" text="V11 / v1.1.30-" vertical="top" /></a></td>
</tr>
<tr>
<td><strong>Third-Party Extensions</strong></td>
<td style="text-align:center">...</td>
<td style="text-align:center"><a href="./v11/extensions/">...</a></td>
</tr>
<tr>
<td>pgvector</td>
<td style="text-align:center">/</td>
<td style="text-align:center"><a href="./v11/extensions/pgvector.html"><Badge type="tip" text="V11 / v1.1.35-" vertical="top" /></a></td>
</tr>
<tr>
<td>smlar</td>
<td style="text-align:center">/</td>
<td style="text-align:center"><a href="./v11/extensions/smlar.html"><Badge type="tip" text="V11 / v1.1.35-" vertical="top" /></a></td>
</tr>
</tbody>
</table>
3 changes: 2 additions & 1 deletion docs/zh/features/v11/README.md
@@ -1,6 +1,7 @@
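# Kernel Enhancements
# Self-Developed Features

- [High Performance](./performance/README.md)
- [High Availability](./availability/README.md)
- [Security](./security/README.md)
- [Elastic Cross-Machine Parallel Query (ePQ)](./epq/README.md)
- [Third-Party Extensions](./extensions/README.md)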
4 changes: 4 additions & 0 deletions docs/zh/features/v11/extensions/README.md
@@ -0,0 +1,4 @@
# Third-Party Extensions

- [pgvector](./pgvector.md) <Badge type="tip" text="V11 / v1.1.35-" vertical="top" />
- [smlar](./smlar.md) <Badge type="tip" text="V11 / v1.1.28-" vertical="top" />
81 changes: 81 additions & 0 deletions docs/zh/features/v11/extensions/pgvector.md
@@ -0,0 +1,81 @@
---
author: 山现
date: 2023/12/25
minute: 10
---

# pgvector

<Badge type="tip" text="V11 / v1.1.35-" vertical="top" />

<ArticleInfo :frontmatter=$frontmatter></ArticleInfo>

[[toc]]

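## Background

[`pgvector`](https://github.com/pgvector/pgvector) is an efficient vector database extension. Built on PostgreSQL's extension mechanism and implemented in C, it provides several vector data types and operations, and it can efficiently store and query AI embeddings represented as vectors.

`pgvector` supports the IVFFlat index. An IVFFlat index divides the vector space into a number of partitions, each containing some of the vectors, and builds an inverted index over them so that vectors similar to a given vector can be found quickly. IVFFlat is a simplified version of the IVFADC index. It suits scenarios that demand high recall but tolerate query latency on the order of 100 ms. Compared with other index types, IVFFlat offers high recall, high precision, a simple algorithm with few parameters, and a small storage footprint.

The algorithm used by the `pgvector` extension works as follows:

1. Points in a high-dimensional space have implicit clustering properties. The vectors are clustered with an algorithm such as K-Means, so that each cluster has a centroid.
2. To search, first compute the distance from the target vector to every cluster centroid and find the n nearest centroids.
3. Scan all elements in the clusters that belong to those n centroids, then sort them globally to obtain the k nearest vectors.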
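
The three steps above can be sketched in a few lines of Python. This is only an illustration of the general IVFFlat idea, not how `pgvector` is implemented (the extension is written in C and stores its lists on index pages). The names `build_ivfflat` and `search`, and the parameters `lists`, `probes`, and `k`, are hypothetical helpers chosen for this sketch.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(centroids, v):
    """Index of the centroid closest to v."""
    return min(range(len(centroids)), key=lambda i: l2(centroids[i], v))


def build_ivfflat(vectors, lists, iterations=10, seed=0):
    """Step 1: cluster the vectors with a few rounds of plain K-Means."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, lists)]
    for _ in range(iterations):
        buckets = [[] for _ in range(lists)]
        for v in vectors:
            buckets[nearest(centroids, v)].append(v)
        for i, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if a cluster went empty
                centroids[i] = [sum(col) / len(bucket) for col in zip(*bucket)]
    # Final assignment: one "inverted list" of vectors per centroid.
    buckets = [[] for _ in range(lists)]
    for v in vectors:
        buckets[nearest(centroids, v)].append(v)
    return centroids, buckets


def search(index, query, k, probes):
    """Steps 2 and 3: pick the closest clusters, then sort their members."""
    centroids, buckets = index
    order = sorted(range(len(centroids)), key=lambda i: l2(centroids[i], query))
    candidates = [v for i in order[:probes] for v in buckets[i]]
    return sorted(candidates, key=lambda v: l2(v, query))[:k]


data = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
index = build_ivfflat(data, lists=2)
print(search(index, [5.0, 5.1], k=2, probes=1))
```

Probing fewer clusters than exist makes the search approximate: a true neighbor that landed in an unprobed cluster is missed. Setting `probes` equal to `lists` scans every vector and gives the exact result.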

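## Usage

`pgvector` can search high-dimensional vectors either sequentially or through an index. For index types and further parameters, see the [README](https://github.com/pgvector/pgvector/blob/master/README.md) in the extension's source repository.

### Install the extension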

```sql:no-line-numbers
CREATE EXTENSION vector;
```

### Vector operations

Run the following command to create a table with a vector column:

```sql:no-line-numbers
CREATE TABLE t (val vector(3));
```

Run the following command to insert vector data:

```sql:no-line-numbers
INSERT INTO t (val) VALUES ('[0,0,0]'), ('[1,2,3]'), ('[1,1,1]'), (NULL);
```

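Create an IVFFlat index:

1. `val vector_ip_ops` means the index is built on the column `val` with the `vector_ip_ops` operator class, which measures similarity by inner product. For other metrics, `pgvector` provides separate operator classes: `vector_l2_ops` for Euclidean distance and `vector_cosine_ops` for cosine distance.
2. `WITH (lists = 1)` sets the number of partitions to 1, which means all vectors are assigned to the same partition. In practice, tune the number of partitions to the data size and the required query performance.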

```sql:no-line-numbers
CREATE INDEX ON t USING ivfflat (val vector_ip_ops) WITH (lists = 1);
```
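
As a starting point for `lists`, the upstream `pgvector` README suggests roughly `rows / 1000` for tables of up to one million rows and `sqrt(rows)` beyond that. The sketch below encodes that rule of thumb. The function name `suggest_lists` is hypothetical, and the thresholds are an assumption to check against the README of the version you run.

```python
import math


def suggest_lists(row_count):
    """Starting value for the ivfflat `lists` option (a heuristic, not a rule)."""
    if row_count <= 1_000_000:
        # Assumed guidance: about one list per thousand rows, at least one.
        return max(1, row_count // 1000)
    # Assumed guidance for larger tables: square root of the row count.
    return max(1, math.isqrt(row_count))


for rows in (3, 50_000, 4_000_000):
    print(rows, suggest_lists(rows))
```

For the four-row demo table the heuristic yields 1, which matches the `lists = 1` used in the statement above.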

Find the nearest vectors:

```sql:no-line-numbers
=> SELECT * FROM t ORDER BY val <#> '[3,3,3]';
val
---------
[1,2,3]
[1,1,1]
[0,0,0]

(4 rows)
```
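
The ordering in this result follows from how `<#>` is defined: it returns the inner product negated, so an ascending sort puts the largest inner product first. The sketch below reproduces the ordering in plain Python for the same four rows. The helper name `neg_inner_product` is hypothetical, and the `None` entry stands in for the `NULL` row, which PostgreSQL sorts last in ascending order by default.

```python
def neg_inner_product(a, b):
    """Value returned by pgvector's `<#>` operator: the inner product, negated."""
    return -sum(x * y for x, y in zip(a, b))


rows = [[0, 0, 0], [1, 2, 3], [1, 1, 1], None]
query = [3, 3, 3]

# Mimic ORDER BY ... ASC with PostgreSQL's default NULLS LAST.
ordered = sorted(
    rows,
    key=lambda v: (v is None, neg_inner_product(v, query) if v is not None else 0),
)
print(ordered)  # [[1, 2, 3], [1, 1, 1], [0, 0, 0], None]
```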

### Uninstall the extension

```sql:no-line-numbers
DROP EXTENSION vector;
```

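## Notes

- [ePQ](../epq/README.md) supports searching high-dimensional vectors by scanning and sorting them, but does not support index-based queries on vector types.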