Renaming updates
Signed-off-by: Dj Walker-Morgan <[email protected]>
djw-m committed Dec 11, 2024
1 parent 3f91fd4 commit 674421e
Showing 13 changed files with 209 additions and 182 deletions.
@@ -58,22 +58,22 @@ INSERT 0 9

So now we have a table with some data in it: food products and some very personal opinions about them.

## Registering a Retriever
## Creating a Retriever

The first step to using Pipelines with this data is to register a retriever. A retriever is a way to access the data in the table and use it in AI workflows.
The first step to using Pipelines with this data is to create a retriever. A retriever is a way to access the data in the table and use it in AI workflows.

```sql
select aidb.register_retriever_for_table('products_retriever', 't5', 'products', 'description', 'Text');
select aidb.create_retriever_for_table('products_retriever', 't5', 'products', 'description', 'Text');
__OUTPUT__
register_retriever_for_table
create_retriever_for_table
------------------------------
products_retriever
(1 row)
```

## Querying the retriever

Now that we have a retriever registered, we can query it to get similar results based on the data in the table.
Now that we have created a retriever, we can query it to get similar results based on the data in the table.

```sql
select * from aidb.retrieve_key('products_retriever','I like it',5);
@@ -31,17 +31,18 @@ Read more about [BERT on Wikipedia](https://en.wikipedia.org/wiki/BERT_(language
* sentence-transformers/paraphrase-multilingual-mpnet-base-v2
* sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

## Register the default implementation

## Creating the default model

```sql
SELECT aidb.register_model('my_bert_model', 'bert_local');
SELECT aidb.create_model('my_bert_model', 'bert_local');
```

## Register another model
## Creating a specific model

You can specify a model and revision in the options JSONB object. In this example, we are creating a `sentence-transformers/all-distilroberta-v1` model with the name `another_bert_model`:

```sql
select aidb.register_model(
select aidb.create_model(
'another_bert_model',
'bert_local',
'{"model": "sentence-transformers/all-distilroberta-v1", "revision": "main"}'::JSONB
@@ -24,15 +24,15 @@ Read more about [CLIP on OpenAI's website](https://openai.com/research/clip/).

* openai/clip-vit-base-patch32 (default)

## Register the default implementation
## Creating the default model

```sql
SELECT aidb.register_model('my_clip_model', 'clip_local');
SELECT aidb.create_model('my_clip_model', 'clip_local');
```

There is only one model, the default `openai/clip-vit-base-patch32`, so we do not need to specify the model in the configuration. No credentials are required for the CLIP model.

## Register another model
## Creating a specific model

There are no other model configurations available for the CLIP model.

@@ -21,18 +21,18 @@ See a list of supported OpenAI models [here](https://platform.openai.com/docs/mo

* Any text generation model that is supported by OpenAI. This includes models such as GPT-4o, GPT-4o mini, GPT-4 and GPT-3.5.

## Registering the default model
## Creating the default model

There is no default model for OpenAI Completions. You can register any supported OpenAI model using the `aidb.register_model` function. See [Registering a model](#registering-a-model).
There is no default model for OpenAI Completions. You can register any supported OpenAI model using the `aidb.create_model` function. See [Creating a specific model](#creating-a-specific-model).

## Registering a model
## Creating a specific model

You can register any supported OpenAI model using the `aidb.register_model` function.
You can register any supported OpenAI model using the `aidb.create_model` function.

In this example, we are registering a GPT-4o model with the name `my_openai_model`:

```sql
SELECT aidb.register_model(
SELECT aidb.create_model(
'my_openai_model',
'openai_completions',
'{"model": "gpt-4o"}'::JSONB,
@@ -22,22 +22,22 @@ See a list of supported OpenAI models [here](https://platform.openai.com/docs/gu
* Any text embedding model that is supported by OpenAI. This includes `text-embedding-3-small`, `text-embedding-3-large`, and `text-embedding-ada-002`.
* Defaults to `text-embedding-3-small`.

## Registering the default model
## Creating the default model

```sql
SELECT aidb.register_model('my_openai_embeddings',
SELECT aidb.create_model('my_openai_embeddings',
'openai_embeddings',
credentials=>'{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB);
```

As we are defaulting the model to `text-embedding-3-small`, we do not need to specify the model in the configuration. But we do need to pass an OpenAI API key in the credentials, and for that we have to pass credentials as a named parameter.

## Registering a model
## Creating a specific model

You can register any supported OpenAI embedding model using the `aidb.register_model` function. In this example, we are registering a `text-embedding-3-large` model with the name `my_openai_model`:
You can create any supported OpenAI embedding model using the `aidb.create_model` function. In this example, we are creating a `text-embedding-3-large` model with the name `my_openai_model`:

```sql
SELECT aidb.register_model(
SELECT aidb.create_model(
'my_openai_model',
'openai_embeddings',
'{"model": "text-embedding-3-large"}'::JSONB,
@@ -55,6 +55,20 @@ The following configuration settings are available for OpenAI models:
* `url` - The URL of the OpenAI model to use. This is optional and can be used to specify a custom model URL. Defaults to `https://api.openai.com/v1/chat/completions`.
* `max_concurrent_requests` - The maximum number of concurrent requests to make to the OpenAI model. Defaults to `25`.

## Available OpenAI Embeddings models

* sentence-transformers/all-MiniLM-L6-v2 (default)
* sentence-transformers/all-MiniLM-L6-v1
* sentence-transformers/all-MiniLM-L12-v1
* sentence-transformers/msmarco-bert-base-dot-v5
* sentence-transformers/multi-qa-MiniLM-L6-dot-v1
* sentence-transformers/paraphrase-TinyBERT-L6-v2
* sentence-transformers/all-distilroberta-v1
* sentence-transformers/all-MiniLM-L6-v2
* sentence-transformers/multi-qa-MiniLM-L6-cos-v1
* sentence-transformers/paraphrase-multilingual-mpnet-base-v2
* sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

## Model credentials

The following credentials are required for OpenAI models:
@@ -27,16 +27,16 @@ Read more about [T5 on Wikipedia](https://en.wikipedia.org/wiki/T5_(language_mod
* t5-3b
* t5-11b

## Registering the default model
## Creating the default model

```sql
SELECT aidb.register_model('my_t5_model', 't5_local');
SELECT aidb.create_model('my_t5_model', 't5_local');
```

## Registering a specific model
## Creating a specific model

```sql
SELECT aidb.register_model(
SELECT aidb.create_model(
'another_t5_model',
't5_local',
'{"model": "t5-large", "revision": "main"}'::JSONB
@@ -1,19 +1,19 @@
---
title: Using Models in AI Accelerator Pipelines
navTitle: Using Models
description: How to register and use models in AI Accelerator Pipelines.
description: How to create and use models in AI Accelerator Pipelines.
---

Pipelines has a model registry that manages configured instances of models. Any Pipelines functions that use models, such as embedding and retrieving, must reference a registered model.
Pipelines has a model registry that manages configured instances of models. Any Pipelines functions that use models, such as embedding and retrieving, must reference a model in this registry.

## Discover the preloaded models

Pipelines comes with a set of pre-registered models that you can use out of the box.
Pipelines comes with a set of pre-created models that you can use out of the box.

To find them, you can run the following query:

```sql
SELECT * FROM aidb.list_registered_models();
SELECT * FROM aidb.list_models();
```

This will return a list of all the models that are currently registered in the system. If you have not registered any models, you'll see the default models that come with Pipelines.
@@ -29,36 +29,56 @@ This will return a list of all the models that are currently registered in the s

The `bert`, `clip`, and `t5` models are all registered and ready to use. The `dummy` model is a placeholder model that can be used for testing purposes.

## Registering a Model
## Creating a Model

You can also register your own models. To do this, you can use the `aidb.register_model` function. Here is an example of how to register a model:
You can also create your own models. To do this, you can use the `aidb.create_model` function. Here is an example of how to create a model:

```sql
SELECT aidb.register_model('my_model', 'bert_local');
SELECT aidb.create_model('my_model', 'bert_local');
```

This will register a model named `my_model` that uses the default `bert_local` model provider. However, this is essentially the same as using the bert model that's already registered.
This will create a model named `my_model` that uses the default `bert_local` model provider. However, this is essentially the same as using the bert model that's already registered.

## Registering a Model with a Configuration
## Discovering the Model Providers

You can also find out what model providers are available by running the following query:

```sql
SELECT * FROM aidb.model_providers;
__OUTPUT__
server_name | server_options
--------------------+----------------
t5_local |
openai_embeddings |
openai_completions |
bert_local |
clip_local |
dummy |
```

This will return a list of all the model providers that are currently available in the system. You can find out more about these providers and their capabilities in the [Supported Models](./supported-models) section.

## Creating a Model with a Configuration

You can also pass options to the model when registering it. For example, you can specify the model configuration:

```sql
SELECT aidb.register_model('my_model',
SELECT aidb.create_model('my_model',
'bert_local',
'{"model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2", "revision": "main"}'::JSONB);
'{"model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
"revision": "main"}'::JSONB);
```

This will register a model named `my_model` that uses the `bert_local` model provider and the `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2` model from HuggingFace. The `revision` option specifies the version of the model to use. The options are passed as a JSONB object, with a single quoted string that is then cast to JSONB. Within the string are the key-value pairs that define the model configuration in a single JSON object.
This will create a model named `my_model` that uses the `bert_local` model provider and the `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2` model from HuggingFace. The `revision` option specifies the version of the model to use. The options are passed as a JSONB object, with a single quoted string that is then cast to JSONB. Within the string are the key-value pairs that define the model configuration in a single JSON object.

## Registering a Model with Configuration and Credentials

This is where the other [supported models](./supported-models) come in. You can register a different model by specifying the model name in the configuration. The `OpenAI Completions` and `OpenAI Embeddings` models are both models which you can register to make use of OpenAI's completions and embeddings APIs.
This is where the other [supported models](./supported-models) come in. You can create a different model by specifying the model name in the configuration. The `OpenAI Completions` and `OpenAI Embeddings` models are both models which you can create to make use of OpenAI's completions and embeddings APIs.

You need to provide more information to the `aidb.register_model` function when registering a model like these. Completions has a number of options, including selecting which model it will use on OpenAI. Both Completions and Embeddings require API credentials. Here is an example of how to register the OpenAI Completions model:
You need to provide more information to the `aidb.create_model` function when registering a model like these. Completions has a number of options, including selecting which model it will use on OpenAI. Both Completions and Embeddings require API credentials. Here is an example of how to create the OpenAI Completions model:

```sql
SELECT aidb.register_model(
SELECT aidb.create_model(
'my_openai_model',
'openai_completions',
'{"model": "gpt-4o"}'::JSONB,
@@ -68,22 +88,22 @@ SELECT aidb.register_model(

You should replace the `api_key` value with your own OpenAI API key. Now you can use the `my_openai_model` model in your Pipelines functions and, in this example, leverage the GPT-4o model from OpenAI.

You can also register the OpenAI Embeddings model in a similar way.
You can also create the OpenAI Embeddings model in a similar way.

```sql
SELECT aidb.register_model(
SELECT aidb.create_model(
'my_openai_embeddings',
'openai_embeddings',
'{"model": "text-embedding-3-large"}'::JSONB,
'{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB
);
```

This will register the `text-embedding-3-large` model with the name `my_openai_embeddings`. You can now use this model in your Pipelines functions to generate embeddings for text data.
This will create the `text-embedding-3-large` model with the name `my_openai_embeddings`. You can now use this model in your Pipelines functions to generate embeddings for text data.

## Using models with OpenAI compatible APIs

These OpenAI models work with any OpenAI compatible API. This allows you to connect and use an even wider range of models, just by passing the appropriate API endpoint to the `url` option in the `aidb.register_model` function's options.
These OpenAI models work with any OpenAI compatible API. This allows you to connect and use an even wider range of models, just by passing the appropriate API endpoint to the `url` option in the `aidb.create_model` function's options.
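
As a rough sketch of the idea above, the following call points the `openai_completions` provider at a different endpoint through the `url` option. The model name `my_local_llm`, the `llama3` model identifier, the `http://localhost:8000` endpoint, and the placeholder API key are all hypothetical values chosen for illustration. The four-argument form mirrors the OpenAI examples shown earlier.

```sql
-- Hypothetical example: 'my_local_llm', 'llama3', and the localhost URL
-- are illustrative assumptions, not values taken from the documentation.
SELECT aidb.create_model(
    'my_local_llm',                -- name to store the model under
    'openai_completions',          -- provider that speaks the OpenAI API
    '{"model": "llama3",
      "url": "http://localhost:8000/v1/chat/completions"}'::JSONB,
    '{"api_key": "placeholder-key"}'::JSONB  -- replace with a real key if the server requires one
);
```

Whether the endpoint accepts a dummy key depends on the server you are connecting to, so treat the credentials value as something to adapt.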

For more information about the OpenAI models, see the [OpenAI Completions](./supported-models/openai-completions) and [OpenAI Embeddings](./supported-models/openai-embeddings) pages.

@@ -25,7 +25,7 @@ Pipelines delivers its functionality through the Pipelines aidb extension, embed

Pipelines' aidb extension introduces the concept of a “retriever” that you can create for a given type and location of AI data. Currently, Pipelines supports unstructured plain text documents as well as a set of image formats. This data can either reside in regular columns of a Postgres table or it can reside in an S3-compatible object storage bucket.

A retriever encapsulates all processing that is needed to make the AI data in the provided source location searchable and retrievable through similarity. The application just needs to create a retriever via the `aidb.register_retriever_for_table()` function for Postgres tables or `aidb.register_retriever_for_volume` for externally stored data on S3 or local filesystems.
A retriever encapsulates all processing that is needed to make the AI data in the provided source location searchable and retrievable through similarity. The application just needs to create a retriever via the `aidb.create_retriever_for_table()` function for Postgres tables or `aidb.create_retriever_for_volume` for externally stored data on S3 or local filesystems.

### Auto embedding

12 changes: 6 additions & 6 deletions advocacy_docs/edb-postgres-ai/ai-accelerator/reference/index.mdx
@@ -16,10 +16,10 @@ navigation:
* [aidb.model_providers](models#aidbmodel_providers)

### Functions
* [aidb.register_model](models#aidbregister_model)
* [aidb.list_registered_models](models#aidblist_registered_models)
* [aidb.get_registered_model](models#aidbget_registered_model)
* [aidb.delete_registered_model](models#aidbdelete_registered_model)
* [aidb.create_model](models#aidbcreate_model)
* [aidb.list_models](models#aidblist_models)
* [aidb.get_model](models#aidbget_model)
* [aidb.delete_model](models#aidbdelete_model)

## Retrievers

@@ -29,8 +29,8 @@ navigation:

### Functions

* [aidb.register_retriever_for_table](retrievers#aidbregister_retriever_for_table)
* [aidb.register_retriever_for_volume](retrievers#aidbregister_retriever_for_volume)
* [aidb.create_retriever_for_table](retrievers#aidbcreate_retriever_for_table)
* [aidb.create_retriever_for_volume](retrievers#aidbcreate_retriever_for_volume)
* [aidb.enable_auto_embedding_for_table](retrievers#aidbenable_auto_embedding_for_table)
* [aidb.disable_auto_embedding_for_table](retrievers#aidbdisable_auto_embedding_for_table)
* [aidb.bulk_embedding](retrievers#aidbbulk_embedding)