diff --git a/docs/api_refs/blacklisted-entrypoints.json b/docs/api_refs/blacklisted-entrypoints.json index ea3491ec2294..15193d347808 100644 --- a/docs/api_refs/blacklisted-entrypoints.json +++ b/docs/api_refs/blacklisted-entrypoints.json @@ -56,6 +56,7 @@ "../../langchain/src/vectorstores/faiss.ts", "../../langchain/src/vectorstores/weaviate.ts", "../../langchain/src/vectorstores/lancedb.ts", + "../../langchain/src/vectorstores/mariadb.ts", "../../langchain/src/vectorstores/momento_vector_index.ts", "../../langchain/src/vectorstores/mongodb_atlas.ts", "../../langchain/src/vectorstores/pinecone.ts", diff --git a/docs/core_docs/.gitignore b/docs/core_docs/.gitignore index 235b78c4b8e2..0f902d0246fb 100644 --- a/docs/core_docs/.gitignore +++ b/docs/core_docs/.gitignore @@ -254,6 +254,8 @@ docs/integrations/vectorstores/pinecone.md docs/integrations/vectorstores/pinecone.mdx docs/integrations/vectorstores/pgvector.md docs/integrations/vectorstores/pgvector.mdx +docs/integrations/vectorstores/mariadb.md +docs/integrations/vectorstores/mariadb.mdx docs/integrations/vectorstores/mongodb_atlas.md docs/integrations/vectorstores/mongodb_atlas.mdx docs/integrations/vectorstores/memory.md diff --git a/docs/core_docs/docs/how_to/indexing.mdx b/docs/core_docs/docs/how_to/indexing.mdx index 846b59b504dc..ebb9de356cda 100644 --- a/docs/core_docs/docs/how_to/indexing.mdx +++ b/docs/core_docs/docs/how_to/indexing.mdx @@ -63,7 +63,7 @@ When content is mutated (e.g., the source PDF file was revised) there will be a b). 
delete by id (delete method with ids argument) Compatible Vectorstores: [`PGVector`](/docs/integrations/vectorstores/pgvector), [`Chroma`](/docs/integrations/vectorstores/chroma), [`CloudflareVectorize`](/docs/integrations/vectorstores/cloudflare_vectorize), -[`ElasticVectorSearch`](/docs/integrations/vectorstores/elasticsearch), [`FAISS`](/docs/integrations/vectorstores/faiss), [`MomentoVectorIndex`](/docs/integrations/vectorstores/momento_vector_index), +[`ElasticVectorSearch`](/docs/integrations/vectorstores/elasticsearch), [`FAISS`](/docs/integrations/vectorstores/faiss), [`MariaDB`](/docs/integrations/vectorstores/mariadb), [`MomentoVectorIndex`](/docs/integrations/vectorstores/momento_vector_index), [`Pinecone`](/docs/integrations/vectorstores/pinecone), [`SupabaseVectorStore`](/docs/integrations/vectorstores/supabase), [`VercelPostgresVectorStore`](/docs/integrations/vectorstores/vercel_postgres), [`Weaviate`](/docs/integrations/vectorstores/weaviate), [`Xata`](/docs/integrations/vectorstores/xata) diff --git a/docs/core_docs/docs/integrations/vectorstores/mariadb.ipynb b/docs/core_docs/docs/integrations/vectorstores/mariadb.ipynb new file mode 100644 index 000000000000..313490f8cb3b --- /dev/null +++ b/docs/core_docs/docs/integrations/vectorstores/mariadb.ipynb @@ -0,0 +1,508 @@ +{ + "cells": [ + { + "cell_type": "raw", + "id": "1957f5cb", + "metadata": { + "vscode": { + "languageId": "raw" + } + }, + "source": [ + "---\n", + "sidebar_label: MariaDB\n", + "sidebar_class_name: node-only\n", + "---" + ] + }, + { + "cell_type": "markdown", + "id": "ef1f0986", + "metadata": {}, + "source": [ + "# MariaDB\n", + "\n", + "```{=mdx}\n", + ":::tip Compatibility\n", + "Only available on Node.js.\n", + ":::\n", + "```\n", + "\n", + "This integration requires MariaDB 11.7 or later.\n", + "\n", + "This guide provides a quick overview for getting started with MariaDB [vector stores](/docs/concepts/#vectorstores). 
For detailed documentation of all `MariaDBStore` features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain_community_vectorstores_mariadb.MariaDBStore.html)." + ] + }, + { + "cell_type": "markdown", + "id": "c824838d", + "metadata": {}, + "source": [ + "## Overview\n", + "\n", + "### Integration details\n", + "\n", + "| Class | Package | [PY support](https://python.langchain.com/docs/integrations/vectorstores/mariadb/) | Package latest |\n", + "| :--- | :--- | :---: | :---: |\n", + "| [`MariaDBStore`](https://api.js.langchain.com/classes/langchain_community_vectorstores_mariadb.MariaDBStore.html) | [`@langchain/community`](https://npmjs.com/@langchain/community) | ✅ | ![NPM - Version](https://img.shields.io/npm/v/@langchain/community?style=flat-square&label=%20&) |" + ] + }, + { + "cell_type": "markdown", + "id": "36fdc060", + "metadata": {}, + "source": [ + "## Setup\n", + "\n", + "To use MariaDB vector stores, you'll need a MariaDB instance running version 11.7 or later, with the [`mariadb`](https://www.npmjs.com/package/mariadb) connector installed as a peer dependency.\n", + "\n", + "This guide will also use [OpenAI embeddings](/docs/integrations/text_embedding/openai), which require you to install the `@langchain/openai` integration package. 
You can also use [other supported embedding models](/docs/integrations/text_embedding) if you wish.\n", + "\n", + "We'll also use the [`uuid`](https://www.npmjs.com/package/uuid) package to generate ids in the required format.\n", + "\n", + "```{=mdx}\n", + "import IntegrationInstallTooltip from \"@mdx_components/integration_install_tooltip.mdx\";\n", + "import Npm2Yarn from \"@theme/Npm2Yarn\";\n", + "\n", + "<IntegrationInstallTooltip></IntegrationInstallTooltip>\n", + "\n", + "<Npm2Yarn>\n", + " @langchain/community @langchain/openai @langchain/core mariadb uuid\n", + "</Npm2Yarn>\n", + "```\n", + "\n", + "### Setting up an instance\n", + "\n", + "Create a file named `docker-compose.yml` with the content below:\n", + "\n", + "```yaml\n", + "# Run this command to start the database:\n", + "# docker-compose up --build\n", + "version: \"3\"\n", + "services:\n", + " db:\n", + " hostname: 127.0.0.1\n", + " image: mariadb/mariadb:11.7-rc\n", + " ports:\n", + " - 3306:3306\n", + " restart: always\n", + " environment:\n", + " - MARIADB_DATABASE=api\n", + " - MARIADB_USER=myuser\n", + " - MARIADB_PASSWORD=ChangeMe\n", + " - MARIADB_ROOT_PASSWORD=ChangeMe\n", + " volumes:\n", + " - ./init.sql:/docker-entrypoint-initdb.d/init.sql\n", + "```\n", + "\n", + "Then, in the same directory, run `docker compose up` to start the container.\n", + "\n", + "### Credentials\n", + "\n", + "To connect to your MariaDB instance, you'll need the corresponding credentials. 
For a full list of supported options, see the [`mariadb` docs](https://github.com/mariadb-corporation/mariadb-connector-nodejs/blob/master/documentation/promise-api.md#connection-options).\n", + "\n", + "If you are using OpenAI embeddings for this guide, you'll need to set your OpenAI key as well:\n", + "\n", + "```typescript\n", + "process.env.OPENAI_API_KEY = \"YOUR_API_KEY\";\n", + "```\n", + "\n", + "If you want to get automated tracing of your model calls you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:\n", + "\n", + "```typescript\n", + "// process.env.LANGCHAIN_TRACING_V2=\"true\"\n", + "// process.env.LANGCHAIN_API_KEY=\"your-api-key\"\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "93df377e", + "metadata": {}, + "source": [ + "## Instantiation\n", + "\n", + "To instantiate the vector store, call the `.initialize()` static method. This will automatically check for the presence of a table, given by `tableName` in the passed `config`. 
If it is not there, it will create it with the required columns.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "dc37144c-208d-4ab3-9f3a-0407a69fe052", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "import { OpenAIEmbeddings } from \"@langchain/openai\";\n", + "import { FilterExpressionBuilder } from \"@langchain/core/filter\";\n", + "import {\n", + " DistanceStrategy,\n", + " MariaDBStore,\n", + "} from \"@langchain/community/vectorstores/mariadb\";\n", + "import { PoolConfig } from \"mariadb\";\n", + "\n", + "const config = {\n", + " connectionOptions: {\n", + " type: \"mariadb\",\n", + " host: \"127.0.0.1\",\n", + " port: 3306,\n", + " user: \"myuser\",\n", + " password: \"ChangeMe\",\n", + " database: \"api\",\n", + " } as PoolConfig,\n", + " distanceStrategy: 'EUCLIDEAN' as DistanceStrategy,\n", + "};\n", + + "const vectorStore = await MariaDBStore.initialize(\n", + " new OpenAIEmbeddings(),\n", + " config\n", + ");" + ] + }, + { + "cell_type": "markdown", + "id": "ac6071d4", + "metadata": {}, + "source": [ + "## Manage vector store\n", + "\n", + "### Add items to vector store" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "17f5efc0", + "metadata": {}, + "outputs": [], + "source": [ + "import { v4 as uuidv4 } from \"uuid\";\n", + "import type { Document } from \"@langchain/core/documents\";\n", + "\n", + "const document1: Document = {\n", + " pageContent: \"The powerhouse of the cell is the mitochondria\",\n", + " metadata: { source: \"https://example.com\" }\n", + "};\n", + "\n", + "const document2: Document = {\n", + " pageContent: \"Buildings are made out of brick\",\n", + " metadata: { source: \"https://example.com\" }\n", + "};\n", + "\n", + "const document3: Document = {\n", + " pageContent: \"Mitochondria are made out of lipids\",\n", + " metadata: { source: \"https://example.com\" }\n", + "};\n", + "\n", + "const document4: Document = {\n", + " pageContent: \"The 2024 Olympics 
are in Paris\",\n", + " metadata: { source: \"https://example.com\" }\n", + "}\n", + "\n", + "const documents = [document1, document2, document3, document4];\n", + "\n", + "const ids = [uuidv4(), uuidv4(), uuidv4(), uuidv4()]\n", + "\n", + "// ids are not mandatory; they are passed explicitly here for the example\n", + "await vectorStore.addDocuments(documents, { ids: ids });" + ] + }, + { + "cell_type": "markdown", + "id": "dcf1b905", + "metadata": {}, + "source": [ + "### Delete items from vector store" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "ef61e188", + "metadata": {}, + "outputs": [], + "source": [ + "const id4 = ids[ids.length - 1];\n", + "\n", + "await vectorStore.delete({ ids: [id4] });" + ] + }, + { + "cell_type": "markdown", + "id": "c3620501", + "metadata": {}, + "source": [ + "## Query vector store\n", + "\n", + "Once your vector store has been created and the relevant documents have been added you will most likely wish to query it during the running of your chain or agent. 
\n", + "\n", + "### Query directly\n", + "\n", + "Performing a simple similarity search can be done as follows:" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "aa0a16fa", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "* The powerhouse of the cell is the mitochondria [{\"year\": 2021}]\n", + "* Mitochondria are made out of lipids [{\"year\": 2022}]\n" + ] + } + ], + "source": [ + "const b = new FilterExpressionBuilder();\n", + "const filter = b.gte(\"year\", 2021); // year >= 2021\n", + "\n", + "const similaritySearchResults = await vectorStore.similaritySearch(\"biology\", 2, filter);\n", + "\n", + "for (const doc of similaritySearchResults) {\n", + " console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "3ed9d733", + "metadata": {}, + "source": [ + "The filter syntax above can also express more complex conditions:\n", + "\n", + "```typescript\n", + "// name = 'martin' OR firstname = 'john'\n", + "const anotherFilter = b.or(b.eq(\"name\", \"martin\"), b.eq(\"firstname\", \"john\"))\n", + "```\n", + "\n", + "If you want to execute a similarity search and receive the corresponding scores you can run:" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "5efd2eaa", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "* [SIM=0.835] The powerhouse of the cell is the mitochondria [{\"source\":\"https://example.com\"}]\n", + "* [SIM=0.852] Mitochondria are made out of lipids [{\"source\":\"https://example.com\"}]\n" + ] + } + ], + "source": [ + "const similaritySearchWithScoreResults = await vectorStore.similaritySearchWithScore(\"biology\", 2)\n", + "\n", + "for (const [doc, score] of similaritySearchWithScoreResults) {\n", + " console.log(`* [SIM=${score.toFixed(3)}] ${doc.pageContent} [${JSON.stringify(doc.metadata)}]`);\n", + "}" + ] + }, + { + 
"cell_type": "markdown", + "id": 
"0c235cdc", + "metadata": {}, + "source": [ + "### Query by turning into retriever\n", + "\n", + "You can also transform the vector store into a [retriever](/docs/concepts/retrievers) for easier usage in your chains. " + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "f3460093", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[\n", + " Document {\n", + " pageContent: 'The powerhouse of the cell is the mitochondria',\n", + " metadata: { source: 'https://example.com' },\n", + " id: undefined\n", + " },\n", + " Document {\n", + " pageContent: 'Mitochondria are made out of lipids',\n", + " metadata: { source: 'https://example.com' },\n", + " id: undefined\n", + " }\n", + "]\n" + ] + } + ], + "source": [ + "const retriever = vectorStore.asRetriever({\n", + " // Optional filter\n", + " filter: filter,\n", + " k: 2,\n", + "});\n", + "await retriever.invoke(\"biology\");" + ] + }, + { + "cell_type": "markdown", + "id": "e2e0a211", + "metadata": {}, + "source": [ + "### Usage for retrieval-augmented generation\n", + "\n", + "For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections:\n", + "\n", + "- [Tutorials: working with external knowledge](/docs/tutorials/#working-with-external-knowledge).\n", + "- [How-to: Question and answer with RAG](/docs/how_to/#qa-with-rag)\n", + "- [Retrieval conceptual docs](/docs/concepts/retrieval)" + ] + }, + { + "cell_type": "markdown", + "id": "371727a8", + "metadata": {}, + "source": [ + "## Advanced: reusing connections\n", + "\n", + "You can reuse connections by creating a pool, then creating new `MariaDBStore` instances directly via the constructor.\n", + "\n", + "Note that you should call `.initialize()` to set up your database at least once to set up your tables properly before using the constructor." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "09efeac4", + "metadata": {}, + "outputs": [], + "source": [ + "import { OpenAIEmbeddings } from \"@langchain/openai\";\n", + "import { MariaDBStore } from \"@langchain/community/vectorstores/mariadb\";\n", + "import mariadb from \"mariadb\";\n", + "\n", + "// First, follow set-up instructions at\n", + "// https://js.langchain.com/docs/modules/indexes/vector_stores/integrations/mariadb\n", + "\n", + "const reusablePool = mariadb.createPool({\n", + " host: \"127.0.0.1\",\n", + " port: 3306,\n", + " user: \"myuser\",\n", + " password: \"ChangeMe\",\n", + " database: \"api\",\n", + "});\n", + "\n", + "const originalConfig = {\n", + " pool: reusablePool,\n", + " tableName: \"testlangchainjs\",\n", + " collectionName: \"sample\",\n", + " collectionTableName: \"collections\",\n", + " columns: {\n", + " idColumnName: \"id\",\n", + " vectorColumnName: \"vect\",\n", + " contentColumnName: \"content\",\n", + " metadataColumnName: \"metadata\",\n", + " },\n", + "};\n", + "\n", + "// Set up the DB.\n", + "// Can skip this step if you've already initialized the DB.\n", + "// await MariaDBStore.initialize(new OpenAIEmbeddings(), originalConfig);\n", + "const mariadbStore = new MariaDBStore(new OpenAIEmbeddings(), originalConfig);\n", + "\n", + "await mariadbStore.addDocuments([\n", + " { pageContent: \"what's this\", metadata: { a: 2 } },\n", + " { pageContent: \"Cat drinks milk\", metadata: { a: 1 } },\n", + "]);\n", + "\n", + "const results = await mariadbStore.similaritySearch(\"water\", 1);\n", + "\n", + "console.log(results);\n", + "\n", + "/*\n", + " [ Document { pageContent: 'Cat drinks milk', metadata: { a: 1 } } ]\n", + "*/\n", + "\n", + "const mariadbStore2 = new MariaDBStore(new OpenAIEmbeddings(), {\n", + " pool: reusablePool,\n", + " tableName: \"testlangchainjs\",\n", + " collectionTableName: \"collections\",\n", + " collectionName: \"some_other_collection\",\n", + " columns: {\n", + " 
idColumnName: \"id\",\n", + " vectorColumnName: \"vect\",\n", + " contentColumnName: \"content\",\n", + " metadataColumnName: \"metadata\",\n", + " },\n", + "});\n", + "\n", + "const results2 = await mariadbStore2.similaritySearch(\"water\", 1);\n", + "\n", + "console.log(results2);\n", + "\n", + "/*\n", + " []\n", + "*/\n", + "\n", + "await reusablePool.end();" + ] + }, + { + "cell_type": "markdown", + "id": "069f1b5f", + "metadata": {}, + "source": [ + "## Closing connections\n", + "\n", + "Make sure you close the connection when you are finished to avoid excessive resource consumption:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f71ce986", + "metadata": {}, + "outputs": [], + "source": [ + "await vectorStore.end();" + ] + }, + { + "cell_type": "markdown", + "id": "8a27244f", + "metadata": {}, + "source": [ + "## API reference\n", + "\n", + "For detailed documentation of all `MariaDBStore` features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain_community_vectorstores_mariadb.MariaDBStore.html)." 
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "TypeScript", + "language": "typescript", + "name": "tslab" + }, + "language_info": { + "codemirror_mode": { + "mode": "typescript", + "name": "javascript", + "typescript": true + }, + "file_extension": ".ts", + "mimetype": "text/typescript", + "name": "typescript", + "version": "3.7.2" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} \ No newline at end of file diff --git a/docs/core_docs/src/theme/FeatureTables.js b/docs/core_docs/src/theme/FeatureTables.js index b2dec1539d53..c36ac434db1c 100644 --- a/docs/core_docs/src/theme/FeatureTables.js +++ b/docs/core_docs/src/theme/FeatureTables.js @@ -673,6 +673,19 @@ const FEATURE_TABLES = { local: true, idsInAddDocuments: false, }, + { + name: "mariadb", + link: "mariadb", + deleteById: true, + filtering: true, + searchByVector: true, + searchWithScore: true, + async: true, + passesStandardTests: false, + multiTenancy: false, + local: true, + idsInAddDocuments: false, + }, { name: "Milvus", link: "milvus", diff --git a/examples/package.json b/examples/package.json index 177aa6433317..267c5390d7c4 100644 --- a/examples/package.json +++ b/examples/package.json @@ -94,6 +94,7 @@ "js-yaml": "^4.1.0", "langchain": "workspace:*", "langsmith": "^0.2.8", + "mariadb": "^3.4.0", "mongodb": "^6.3.0", "pg": "^8.11.0", "pickleparser": "^0.2.1", diff --git a/examples/src/indexes/vector_stores/mariadb_vectorstore/docker-compose.example.yml b/examples/src/indexes/vector_stores/mariadb_vectorstore/docker-compose.example.yml new file mode 100644 index 000000000000..c27be10e888a --- /dev/null +++ b/examples/src/indexes/vector_stores/mariadb_vectorstore/docker-compose.example.yml @@ -0,0 +1,10 @@ +services: + db: + image: mariadb/mariadb:11.7-rc + ports: + - 3306:3306 + environment: + - MARIADB_USER=myuser + - MARIADB_PASSWORD=ChangeMe + - MARIADB_ROOT_PASSWORD=ChangeMe + - MARIADB_DATABASE=api diff --git a/examples/src/indexes/vector_stores/mariadb_vectorstore/mariadb.ts 
b/examples/src/indexes/vector_stores/mariadb_vectorstore/mariadb.ts new file mode 100644 index 000000000000..06aff473f4dd --- /dev/null +++ b/examples/src/indexes/vector_stores/mariadb_vectorstore/mariadb.ts @@ -0,0 +1,67 @@ +import { OpenAIEmbeddings } from "@langchain/openai"; +import { FilterExpressionBuilder } from "@langchain/core/filter"; +import { + DistanceStrategy, + MariaDBStore, +} from "@langchain/community/vectorstores/mariadb"; +import { PoolConfig } from "mariadb"; + +// First, follow set-up instructions at +// https://js.langchain.com/docs/modules/indexes/vector_stores/integrations/mariadb + +const config = { + connectionOptions: { + type: "mariadb", + host: "127.0.0.1", + port: 3306, + user: "myuser", + password: "ChangeMe", + database: "api", + } as PoolConfig, + distanceStrategy: 'EUCLIDEAN' as DistanceStrategy, +}; + +const vectorStore = await MariaDBStore.initialize( + new OpenAIEmbeddings(), + config +); + +await vectorStore.addDocuments([ + { + pageContent: "what's this", + metadata: { country: "EN", year: 2021, city: "london" }, + }, + { pageContent: "Cat drinks milk", metadata: { country: "GE", year: 2020 } }, +]); + +const results = await vectorStore.similaritySearch("water", 1); + +console.log(results); +// [ Document { pageContent: 'Cat drinks milk', metadata: { country: 'GE', year: 2020 }, id: ... 
} ] + + // Filtering is supported +const b = new FilterExpressionBuilder(); +let filter = b.gte("year", 2021); // year >= 2021 +const results2 = await vectorStore.similaritySearch("water", 1, filter); +console.log(results2); +// [ Document { pageContent: 'what's this', metadata: { country: 'EN', year: 2021, city: 'london' } } ] + +// more complex filter +filter = b.and(b.gte("year", 2021), b.in("country", ["US", "EN"])); // year >= 2021 AND country IN ['US', 'EN'] +const results3 = await vectorStore.similaritySearch("water", 1, filter); +console.log(results3); +// [ Document { pageContent: 'what's this', metadata: { country: 'EN', year: 2021, city: 'london' }, id: ... } ] + +await vectorStore.delete({ filter: b.gte("year", 2021) }); + +const results4 = await vectorStore.similaritySearch("water", 1); +console.log(results4); +// [ Document { pageContent: 'Cat drinks milk', metadata: { country: 'GE', year: 2020 }, id: ... } ] + +// Filtering using array is supported +const results5 = await vectorStore.similaritySearch("water", 1, b.in("b", ["tag1"])); + +console.log(results5); +// [ ] + +await vectorStore.end(); diff --git a/examples/src/indexes/vector_stores/mariadb_vectorstore/mariadb_pool.ts b/examples/src/indexes/vector_stores/mariadb_vectorstore/mariadb_pool.ts new file mode 100644 index 000000000000..f40304038ed8 --- /dev/null +++ b/examples/src/indexes/vector_stores/mariadb_vectorstore/mariadb_pool.ts @@ -0,0 +1,62 @@ +import { OpenAIEmbeddings } from "@langchain/openai"; +import { MariaDBStore } from "@langchain/community/vectorstores/mariadb"; +import mariadb from "mariadb"; + +// First, follow set-up instructions at +// https://js.langchain.com/docs/modules/indexes/vector_stores/integrations/mariadb + +const reusablePool = mariadb.createPool({ + host: "127.0.0.1", + port: 3306, + user: "myuser", + password: "ChangeMe", + database: "api", +}); + +const originalConfig = { + pool: reusablePool, + tableName: "testlangchain", + collectionName: "sample", + 
collectionTableName: "collections", + columns: { + idColumnName: "id", + vectorColumnName: "vect", + contentColumnName: "content", + metadataColumnName: "metadata", + }, +}; + +// Set up the DB. +// Can skip this step if you've already initialized the DB. +const vectorStore = await MariaDBStore.initialize(new OpenAIEmbeddings(), originalConfig); +// const vectorStore = new MariaDBStore(new OpenAIEmbeddings(), originalConfig); + +await vectorStore.addDocuments([ + { pageContent: "what's this", metadata: { a: 2 } }, + { pageContent: "Cat drinks milk", metadata: { a: 1 } }, +]); + +const results = await vectorStore.similaritySearch("water", 1); + +console.log(results); +// [ Document { pageContent: 'Cat drinks milk', metadata: { a: 1 }, id: ... } ] + +const vectorStore2 = new MariaDBStore(new OpenAIEmbeddings(), { + pool: reusablePool, + tableName: "testlangchain", + collectionTableName: "collections", + collectionName: "some_other_collection", + columns: { + idColumnName: "id", + vectorColumnName: "vect", + contentColumnName: "content", + metadataColumnName: "metadata", + }, +}); + +const results2 = await vectorStore2.similaritySearch("water", 1); + +console.log(results2); +// [] + +await reusablePool.end(); diff --git a/langchain-core/.gitignore b/langchain-core/.gitignore index 6876afce9643..3fee89cb3d66 100644 --- a/langchain-core/.gitignore +++ b/langchain-core/.gitignore @@ -54,6 +54,10 @@ example_selectors.cjs example_selectors.js example_selectors.d.ts example_selectors.d.cts +filter.cjs +filter.js +filter.d.ts +filter.d.cts indexing.cjs indexing.js indexing.d.ts diff --git a/langchain-core/langchain.config.js b/langchain-core/langchain.config.js index d51e73e0b122..67236bb53586 100644 --- a/langchain-core/langchain.config.js +++ b/langchain-core/langchain.config.js @@ -26,6 +26,7 @@ export const config = { "document_loaders/langsmith": "document_loaders/langsmith", embeddings: "embeddings", example_selectors: "example_selectors/index", + filter: "filter", 
indexing: "indexing/index", "language_models/base": "language_models/base", "language_models/chat_models": "language_models/chat_models", diff --git a/langchain-core/package.json b/langchain-core/package.json index 4fc519cf2922..f26acebec489 100644 --- a/langchain-core/package.json +++ b/langchain-core/package.json @@ -215,6 +215,15 @@ "import": "./example_selectors.js", "require": "./example_selectors.cjs" }, + "./filter": { + "types": { + "import": "./filter.d.ts", + "require": "./filter.d.cts", + "default": "./filter.d.ts" + }, + "import": "./filter.js", + "require": "./filter.cjs" + }, "./indexing": { "types": { "import": "./indexing.d.ts", @@ -680,6 +689,10 @@ "example_selectors.js", "example_selectors.d.ts", "example_selectors.d.cts", + "filter.cjs", + "filter.js", + "filter.d.ts", + "filter.d.cts", "indexing.cjs", "indexing.js", "indexing.d.ts", diff --git a/langchain-core/src/filter.ts b/langchain-core/src/filter.ts new file mode 100644 index 000000000000..e669210a8d44 --- /dev/null +++ b/langchain-core/src/filter.ts @@ -0,0 +1,577 @@ +/** + * A flexible, runtime-based metadata filtering system that creates + * platform-independent filter expressions. This generative approach + * allows defining search filters that can be later translated into + * specific vector database query languages. 
+ * + * Supports standard comparison operations like: + * - Equality and inequality (==, !=) + * - Numeric comparisons (<, <=, >, >=) + * - Inclusion and exclusion checks (IN/NOT IN) + * - Logical combinations using AND and OR operators + */ + +/** + * Comprehensive metadata filtering expression operations: + * + * Comparison Operations: + * - Supports exact matching (Equal) and inequality comparisons + * - Includes comparison types: + * - Greater Than (GT) + * - Greater Than or Equal (GTE) + * - Less Than (LT) + * - Less Than or Equal (LTE) + * - These operations follow the pattern: "Key Operator Value" + * + * Logical Combination Operations: + * - AND and OR operators for combining multiple filter expressions + * - Can combine individual expressions or grouped expressions + * - Allows complex nested filtering logic + * + * Collection Membership Checks: + * - IN operator: Checks if a value is present in a collection + * - NOT IN (NIN) operator: Checks if a value is absent from a collection + * - Supports checking key membership against an array of values + */ +// test +export const enum Operator { + AND, + OR, + EQ, + NE, + GT, + GTE, + LT, + LTE, + IN, + NIN, + NOT, +} + +export type Operand = Key | Value | Expression | Group; + +export class Key { + constructor(public key: string) {} +} + +export class Value { + constructor( + public value: number | string | boolean | number[] | string[] | boolean[] + ) {} +} + +/** + * Represents a boolean filter expression with a specific structure: + * - Consists of a left operand, an operator, and an optional right operand + * - Enables construction of complex filtering logic using different types of comparisons + * + * The expression follows the pattern: `left operator right` + * (Note: Some operators may only require a left operand) + */ +export class Expression { + constructor( + public type: Operator, + public left: Operand, + public right?: Operand + ) {} +} + +/** + * Represents a grouped collection of filter 
expressions that should be evaluated together + * - Enables creating complex, nested filtering logic with specific evaluation precedence + * - Analogous to parentheses in mathematical or logical expressions + * - Allows nested or complex filtering conditions to be treated as a single logical unit + */ +export class Group { + constructor(public content: Expression) {} +} + +/** + * Fluent builder for creating flexible and composable filter expressions + * + * Purpose: + * - Provides an intuitive, method-chaining approach to constructing complex filter conditions + * - Supports various comparison and logical operations for metadata filtering + * + * Features: + * - Equality and inequality checks + * - Numeric comparisons (greater than, less than, etc.) + * - Logical combinations (AND, OR, NOT) + * - Collection membership tests (IN, NOT IN) + * - Expression grouping for complex nested conditions + * + * Examples: + * ```typescript + * // Simple equality filter + * const catGenreFilter = b.eq('genre', 'cat'); + * + * // Complex compound filter + * const advancedFilter = b.and( + * b.eq('genre', 'dog'), + * b.gte('birth', 2023) + * ); + * // Translates to: (genre == "dog") AND (birth >= 2023) + * ``` + */ +export class FilterExpressionBuilder { + eq( + key: string, + value: number | string | boolean | number[] | string[] | boolean[] + ): Expression { + return new Expression( + Operator.EQ, + new Key(key), + value ? new Value(value) : undefined + ); + } + + ne( + key: string, + value: number | string | boolean | number[] | string[] | boolean[] + ): Expression { + return new Expression( + Operator.NE, + new Key(key), + value ? new Value(value) : undefined + ); + } + + gt(key: string, value: number | string): Expression { + return new Expression( + Operator.GT, + new Key(key), + value ? new Value(value) : undefined + ); + } + + gte(key: string, value: number | string): Expression { + return new Expression( + Operator.GTE, + new Key(key), + value ? 
new Value(value) : undefined + ); + } + + lt(key: string, value: number | string): Expression { + return new Expression( + Operator.LT, + new Key(key), + value ? new Value(value) : undefined + ); + } + + lte(key: string, value: number | string): Expression { + return new Expression( + Operator.LTE, + new Key(key), + value ? new Value(value) : undefined + ); + } + + and(left: Operand, right: Operand): Expression { + return new Expression(Operator.AND, left, right); + } + + or(left: Operand, right: Operand): Expression { + return new Expression(Operator.OR, left, right); + } + + in(key: string, values: number[] | string[] | boolean[]): Expression { + return new Expression( + Operator.IN, + new Key(key), + values ? new Value(values) : undefined + ); + } + + nin(key: string, values: number[] | string[] | boolean[]): Expression { + return new Expression( + Operator.NIN, + new Key(key), + values ? new Value(values) : undefined + ); + } + + group(content: Expression): Group { + return new Group(content); + } + + not(content: Expression): Expression { + return new Expression(Operator.NOT, content); + } +} + +/** + * Simple StringBuilder + */ +export class StringBuilder { + buffer: string[] = []; + + append(str: string): void { + this.buffer.push(str); + } + + toString(): string { + return this.buffer.join(""); + } +} + +/** + * Defines a contract for converting filter expressions into various string-based query representations + * + * Purpose: + * - Provides a flexible mechanism for translating abstract filter expressions + * - Supports conversion of complex filter logic across different query languages or databases + * + * Key Conversion Methods: + * - Transform expressions, operands, keys, values into string representations + * - Handle nested expressions and grouped conditions + * - Support range and list value conversions + */ +export interface FilterExpressionConverter { + /** + * Converts a complete expression into its string representation + * @param expression The 
filter expression to convert + * @returns A string query representation of the expression + */ + convertExpression(expression: Expression): string; + + /** + * Determines the appropriate operation symbol for a given expression + * @param exp The expression to analyze + * @param context The string builder to append the representation + */ + convertSymbolToContext(exp: Expression, context: StringBuilder): void; + + /** + * Converts an operand into a string representation within a given context + * @param operand The operand to convert (Key, Value, Expression, or Group) + * @param context The string builder to append the representation + */ + convertOperandToContext( + operand: Key | Value | Expression | Group, + context: StringBuilder + ): void; + + // Additional conversion methods for specific components... + convertExpressionToContext(expression: Expression, context: StringBuilder): void; + convertKeyToContext(filterKey: Key, context: StringBuilder): void; + convertValueToContext(filterValue: Value, context: StringBuilder): void; + convertSingleValueToContext( + value: number | string | boolean | number[] | string[] | boolean[], + context: StringBuilder + ): void; + + // Group and range handling methods + writeGroupStart(group: Group, context: StringBuilder): void; + writeGroupEnd(group: Group, context: StringBuilder): void; + writeValueRangeStart(listValue: Value, context: StringBuilder): void; + writeValueRangeEnd(listValue: Value, context: StringBuilder): void; + writeValueRangeSeparator(listValue: Value, context: StringBuilder): void; +} + +// Define the negation map +const TYPE_NEGATION_MAP: Record<Operator, Operator> = { + [Operator.AND]: Operator.OR, + [Operator.OR]: Operator.AND, + [Operator.EQ]: Operator.NE, + [Operator.NE]: Operator.EQ, + [Operator.GT]: Operator.LTE, + [Operator.GTE]: Operator.LT, + [Operator.LT]: Operator.GTE, + [Operator.LTE]: Operator.GT, + [Operator.IN]: Operator.NIN, + [Operator.NIN]: Operator.IN, + [Operator.NOT]: Operator.NOT, +}; + 
+/** + * Abstract 
base class for converting filter expressions into string representations + * + * Purpose: + * - Provides a flexible, extensible framework for converting complex filter expressions + * - Defines a standard conversion process with pluggable implementation details + * + * Key Features: + * - Supports conversion of various operand types (Keys, Values, Expressions, Groups) + * - Handles different operator types and conversion strategies + * - Provides default implementations with extensible methods + */ +export class BaseFilterExpressionConverter { + /** + * Transforms a filter expression into a string + * @param expression The filter condition to convert + * @returns A string version of the expression + */ + convertExpression(expression: Expression): string { + return this.convertOperand(expression); + } + + /** + * Converts an operand to a string using a StringBuilder + * @param operand The filter component to convert + * @returns The string representation of the operand + */ + private convertOperand(operand: Operand): string { + const context = new StringBuilder(); + this.convertOperandToContext(operand, context); + return context.toString(); + } + + /** + * Appends the standard symbol for the expression's logical or comparison operator to the context + * @param exp The expression whose operator symbol is appended + * @param context The StringBuilder to append the symbol to + * @throws Error if the expression type has no known symbol + */ + convertSymbolToContext(exp: Expression, context: StringBuilder): void { + const symbolMap = { + [Operator.AND]: " AND ", + [Operator.OR]: " OR ", + [Operator.EQ]: " = ", + [Operator.NE]: " != ", + [Operator.LT]: " < ", + [Operator.LTE]: " <= ", + [Operator.GT]: " > ", + [Operator.GTE]: " >= ", + [Operator.IN]: " IN ", + [Operator.NOT]: " NOT IN ", + [Operator.NIN]: " NOT IN ", + }; + context.append( + symbolMap[exp.type] || + (() => { + throw new Error(`Unsupported expression type: ${exp.type}`); + })() + ); + } + + /** + * Converts different types of operands (groups, keys, values, expressions) to strings + * @param operand The operand to
convert + * @param context The StringBuilder to append the conversion result + */ + convertOperandToContext( + operand: Key | Value | Expression | Group, + context: StringBuilder + ): void { + const conversionMap = { + [Group.name]: () => this.convertGroupToContext(operand as Group, context), + [Key.name]: () => this.convertKeyToContext(operand as Key, context), + [Value.name]: () => this.convertValueToContext(operand as Value, context), + [Expression.name]: () => { + const exp = operand as Expression; + + // Validate expression structure + if ( + exp.type !== Operator.NOT && + exp.type !== Operator.AND && + exp.type !== Operator.OR && + // eslint-disable-next-line no-instanceof/no-instanceof + !(exp.right instanceof Value) + ) { + throw new Error( + "Non AND/OR expression must have Value right argument!" + ); + } + + // Handle different expression types + // eslint-disable-next-line no-unused-expressions + exp.type === Operator.NOT + ? this.convertNotExpressionToContext(exp, context) + : this.convertExpressionToContext(exp, context); + }, + }; + + const converter = conversionMap[operand.constructor.name]; + if (converter) { + converter(); + } else { + throw new Error("Unexpected operand type"); + } + } + + /** + * Transforms a NOT expression into its logically equivalent form + * @param expression The NOT expression to convert + * @param context The context to append the converted expression + */ + convertNotExpressionToContext(expression: Expression, context: StringBuilder): void { + this.convertOperandToContext(this.negateOperand(expression), context); + } + + /** + * Reverses the logic of an operand + * Handles complex negation scenarios for different types of expressions + * @param operand The operand to negate + * @returns The logically negated operand + */ + /* eslint-disable no-instanceof/no-instanceof */ + negateOperand(operand: Operand): Operand { + if (operand instanceof Group) { + let inEx = this.negateOperand(operand.content); + + // If the negated 
content is another group, extract its content + if (inEx instanceof Group) { + inEx = inEx.content; + } + + return new Group(inEx as Expression); + } else if (operand instanceof Expression) { + const exp = operand as Expression; + + switch (exp.type) { + case Operator.NOT: // NOT(NOT(a)) = a + return this.negateOperand(exp.left as Expression); + + case Operator.AND: // NOT(a AND b) = NOT(a) OR NOT(b) + case Operator.OR: // NOT(a OR b) = NOT(a) AND NOT(b) + return new Expression( + TYPE_NEGATION_MAP[exp.type], + this.negateOperand(exp.left as Expression) as Expression, + this.negateOperand(exp.right as Expression) as Expression + ); + + case Operator.EQ: // NOT(e EQ b) = e NE b + case Operator.NE: // NOT(e NE b) = e EQ b + case Operator.GT: // NOT(e GT b) = e LTE b + case Operator.GTE: // NOT(e GTE b) = e LT b + case Operator.LT: // NOT(e LT b) = e GTE b + case Operator.LTE: // NOT(e LTE b) = e GT b + case Operator.IN: // NOT(e IN [...]) = e NIN [...] + case Operator.NIN: // NOT(e NIN [...]) = e IN [...] 
+ return new Expression( + TYPE_NEGATION_MAP[exp.type], + exp.left, + exp.right + ); + + default: + throw new Error(`Unknown expression type: ${exp.type}`); + } + } else { + throw new Error(`Cannot negate operand of type: ${operand}`); + } + } + /* eslint-enable no-instanceof/no-instanceof */ + + // Abstract methods to be implemented by subclasses + /** + * Convert the given expression into a string representation + * @param expression the expression to convert + * @param context the context to append the string representation to + */ + // eslint-disable-next-line @typescript-eslint/ban-ts-comment + // @ts-ignore + convertExpressionToContext(expression: Expression, context: StringBuilder): void { + throw new Error("must be implemented in derived class"); + } + + /** + * Convert the given key into a string representation + * @param filterKey the key to convert + * @param context the context to append the string representation to + */ + // eslint-disable-next-line @typescript-eslint/ban-ts-comment + // @ts-ignore + convertKeyToContext(filterKey: Key, context: StringBuilder): void { + throw new Error("must be implemented in derived class"); + } + + /** + * Converts a value (single or list) to its string representation + * @param filterValue The value to convert + * @param context The context to append the conversion result + */ + convertValueToContext(filterValue: Value, context: StringBuilder): void { + if (Array.isArray(filterValue.value)) { + this.writeValueRangeStart(filterValue, context); + for (let i = 0; i < filterValue.value.length; i += 1) { + this.convertSingleValueToContext(filterValue.value[i], context); + if (i < filterValue.value.length - 1) { + this.writeValueRangeSeparator(filterValue, context); + } + } + this.writeValueRangeEnd(filterValue, context); + } else { + this.convertSingleValueToContext(filterValue.value, context); + } + } + + /** + * Convert a single value into a string representation + * @param value the value to convert + * @param context 
the context to append the string representation to + */ + convertSingleValueToContext( + value: number | string | boolean, + context: StringBuilder + ): void { + if (typeof value === "string") { + context.append(`'${value}'`); + } else { + context.append(value.toString()); + } + } + + /** + * Convert a group into a string representation + * @param group the group to convert + * @param context the context to append the string representation to + */ + private convertGroupToContext(group: Group, context: StringBuilder): void { + this.writeGroupStart(group, context); + this.convertOperandToContext(group.content, context); + this.writeGroupEnd(group, context); + } + + /** + * Start group representation + * @param group the group to convert + * @param context the context to append the string representation to + */ + // eslint-disable-next-line @typescript-eslint/ban-ts-comment + // @ts-ignore + writeGroupStart(group: Group, context: StringBuilder): void {} + + /** + * End group representation + * @param group the group to convert + * @param context the context to append the string representation to + */ + // eslint-disable-next-line @typescript-eslint/ban-ts-comment + // @ts-ignore + writeGroupEnd(group: Group, context: StringBuilder): void {} + + /** + * Start value range representation + * @param listValue the value range to convert + * @param context the context to append the string representation to + */ + // eslint-disable-next-line @typescript-eslint/ban-ts-comment + // @ts-ignore + writeValueRangeStart(listValue: Value, context: StringBuilder): void { + context.append("["); + } + + /** + * End value range representation + * @param listValue the value range to convert + * @param context the context to append the string representation to + */ + // eslint-disable-next-line @typescript-eslint/ban-ts-comment + // @ts-ignore + writeValueRangeEnd(listValue: Value, context: StringBuilder): void { + context.append("]"); + } + + /** + * Add value range splitter + * @param 
listValue the value range to convert + * @param context the context to append the string representation to + */ + // eslint-disable-next-line @typescript-eslint/ban-ts-comment + // @ts-ignore + writeValueRangeSeparator(listValue: Value, context: StringBuilder): void { + context.append(","); + } +} diff --git a/langchain-core/src/tests/filter.test.ts b/langchain-core/src/tests/filter.test.ts new file mode 100644 index 000000000000..850b21ca1f45 --- /dev/null +++ b/langchain-core/src/tests/filter.test.ts @@ -0,0 +1,144 @@ +import { test, expect, describe } from "@jest/globals"; +import { + FilterExpressionBuilder, + BaseFilterExpressionConverter, + StringBuilder, + FilterExpressionConverter, + Expression, + Key, + Value, + Group, +} from "../filter.js"; + +class TestFilterExpressionConverter + extends BaseFilterExpressionConverter + implements FilterExpressionConverter +{ + constructor() { + super(); + } + + convertExpression(expression: Expression): string { + return super.convertExpression(expression); + } + + convertExpressionToContext(expression: Expression, context: StringBuilder): void { + super.convertOperandToContext(expression.left, context); + super.convertSymbolToContext(expression, context); + super.convertOperandToContext(expression.right!, context); + } + + convertKeyToContext(key: Key, context: StringBuilder): void { + context.append(`'$.${key.key}'`); + } + + writeValueRangeStart(_listValue: Value, context: StringBuilder): void { + context.append("["); + } + + writeValueRangeEnd(_listValue: Value, context: StringBuilder): void { + context.append("]"); + } + + writeGroupStart(_group: Group, context: StringBuilder): void { + context.append("("); + } + + writeGroupEnd(_group: Group, context: StringBuilder): void { + context.append(")"); + } +} +describe("filter test", () => { + const b = new FilterExpressionBuilder(); + const impl = new TestFilterExpressionConverter(); + + test("filter base", async () => { + expect(impl.convertExpression(b.eq("a", 
2015))).toEqual("'$.a' = 2015"); + expect(impl.convertExpression(b.ne("a", 2015))).toEqual("'$.a' != 2015"); + expect(impl.convertExpression(b.gte("a", 2015))).toEqual("'$.a' >= 2015"); + expect(impl.convertExpression(b.in("a", [2015, 2018]))).toEqual( + "'$.a' IN [2015,2018]" + ); + expect(impl.convertExpression(b.nin("a", [2015, 2018]))).toEqual( + "'$.a' NOT IN [2015,2018]" + ); + expect(impl.convertExpression(b.lte("a", 2015))).toEqual("'$.a' <= 2015"); + expect(impl.convertExpression(b.gt("a", 2015))).toEqual("'$.a' > 2015"); + expect(impl.convertExpression(b.lt("a", 2015))).toEqual("'$.a' < 2015"); + }); + + test("filter negate base", async () => { + expect(impl.convertExpression(b.not(b.eq("a", 2015)))).toEqual( + "'$.a' != 2015" + ); + expect(impl.convertExpression(b.not(b.ne("a", 2015)))).toEqual( + "'$.a' = 2015" + ); + expect(impl.convertExpression(b.not(b.gte("a", 2015)))).toEqual( + "'$.a' < 2015" + ); + expect(impl.convertExpression(b.not(b.in("a", [2015, 2018])))).toEqual( + "'$.a' NOT IN [2015,2018]" + ); + expect(impl.convertExpression(b.not(b.nin("a", [2015, 2018])))).toEqual( + "'$.a' IN [2015,2018]" + ); + expect(impl.convertExpression(b.not(b.lte("a", 2015)))).toEqual( + "'$.a' > 2015" + ); + expect(impl.convertExpression(b.not(b.gt("a", 2015)))).toEqual( + "'$.a' <= 2015" + ); + expect(impl.convertExpression(b.not(b.lt("a", 2015)))).toEqual( + "'$.a' >= 2015" + ); + }); + + test("filter Group", async () => { + expect( + impl.convertExpression( + b.and(b.eq("name", "martin"), b.eq("firstname", "john")) + ) + ).toEqual("'$.name' = 'martin' AND '$.firstname' = 'john'"); + expect( + impl.convertExpression( + b.or(b.eq("name", "martin"), b.eq("firstname", "john")) + ) + ).toEqual("'$.name' = 'martin' OR '$.firstname' = 'john'"); + + expect( + impl.convertExpression( + b.not(b.and(b.eq("name", "martin"), b.eq("firstname", "john"))) + ) + ).toEqual("'$.name' != 'martin' OR '$.firstname' != 'john'"); + expect( + impl.convertExpression( + 
b.not(b.or(b.eq("name", "martin"), b.eq("firstname", "john"))) + ) + ).toEqual("'$.name' != 'martin' AND '$.firstname' != 'john'"); + + expect( + impl.convertExpression( + b.and( + b.eq("name", "martin"), + b.group(b.or(b.eq("firstname", "john"), b.eq("firstname", "jack"))) + ) + ) + ).toEqual( + "'$.name' = 'martin' AND ('$.firstname' = 'john' OR '$.firstname' = 'jack')" + ); + + expect( + impl.convertExpression( + b.not( + b.and( + b.eq("name", "martin"), + b.group(b.or(b.eq("firstname", "john"), b.eq("firstname", "jack"))) + ) + ) + ) + ).toEqual( + "'$.name' != 'martin' OR ('$.firstname' != 'john' AND '$.firstname' != 'jack')" + ); + }); +}); diff --git a/libs/langchain-community/.gitignore b/libs/langchain-community/.gitignore index 5064c1f14c79..005a27082efc 100644 --- a/libs/langchain-community/.gitignore +++ b/libs/langchain-community/.gitignore @@ -386,6 +386,10 @@ vectorstores/libsql.cjs vectorstores/libsql.js vectorstores/libsql.d.ts vectorstores/libsql.d.cts +vectorstores/mariadb.cjs +vectorstores/mariadb.js +vectorstores/mariadb.d.ts +vectorstores/mariadb.d.cts vectorstores/milvus.cjs vectorstores/milvus.js vectorstores/milvus.d.ts diff --git a/libs/langchain-community/langchain.config.js b/libs/langchain-community/langchain.config.js index b0207b8612ab..4b354e36db07 100644 --- a/libs/langchain-community/langchain.config.js +++ b/libs/langchain-community/langchain.config.js @@ -134,6 +134,7 @@ export const config = { "vectorstores/hanavector": "vectorstores/hanavector", "vectorstores/lancedb": "vectorstores/lancedb", "vectorstores/libsql": "vectorstores/libsql", + "vectorstores/mariadb": "vectorstores/mariadb", "vectorstores/milvus": "vectorstores/milvus", "vectorstores/momento_vector_index": "vectorstores/momento_vector_index", "vectorstores/mongodb_atlas": "vectorstores/mongodb_atlas", diff --git a/libs/langchain-community/package.json b/libs/langchain-community/package.json index 6cdbe97e2664..0ee6df472ea7 100644 --- 
a/libs/langchain-community/package.json +++ b/libs/langchain-community/package.json @@ -110,6 +110,7 @@ "@tensorflow/tfjs-backend-cpu": "^3", "@tensorflow/tfjs-converter": "^3.6.0", "@tensorflow/tfjs-core": "^3.6.0", + "@testcontainers/mariadb": "^10.16.0", "@tsconfig/recommended": "^1.0.2", "@types/better-sqlite3": "^7.6.10", "@types/crypto-js": "^4.2.2", @@ -186,6 +187,7 @@ "lodash": "^4.17.21", "lunary": "^0.7.10", "mammoth": "^1.6.0", + "mariadb": "^3.4.0", "mongodb": "^5.2.0", "mysql2": "^3.9.8", "neo4j-driver": "^5.17.0", @@ -318,6 +320,7 @@ "lodash": "^4.17.21", "lunary": "^0.7.10", "mammoth": "^1.6.0", + "mariadb": "^3.4.0", "mongodb": ">=5.2.0", "mysql2": "^3.9.8", "neo4j-driver": "*", @@ -629,6 +632,9 @@ "mammoth": { "optional": true }, + "mariadb": { + "optional": true + }, "mongodb": { "optional": true }, @@ -1585,6 +1591,15 @@ "import": "./vectorstores/libsql.js", "require": "./vectorstores/libsql.cjs" }, + "./vectorstores/mariadb": { + "types": { + "import": "./vectorstores/mariadb.d.ts", + "require": "./vectorstores/mariadb.d.cts", + "default": "./vectorstores/mariadb.d.ts" + }, + "import": "./vectorstores/mariadb.js", + "require": "./vectorstores/mariadb.cjs" + }, "./vectorstores/milvus": { "types": { "import": "./vectorstores/milvus.d.ts", @@ -3507,6 +3522,10 @@ "vectorstores/libsql.js", "vectorstores/libsql.d.ts", "vectorstores/libsql.d.cts", + "vectorstores/mariadb.cjs", + "vectorstores/mariadb.js", + "vectorstores/mariadb.d.ts", + "vectorstores/mariadb.d.cts", "vectorstores/milvus.cjs", "vectorstores/milvus.js", "vectorstores/milvus.d.ts", diff --git a/libs/langchain-community/src/load/import_map.ts b/libs/langchain-community/src/load/import_map.ts index cfc0af93456c..056beadc2a47 100644 --- a/libs/langchain-community/src/load/import_map.ts +++ b/libs/langchain-community/src/load/import_map.ts @@ -40,6 +40,7 @@ export * as llms__friendli from "../llms/friendli.js"; export * as llms__ollama from "../llms/ollama.js"; export * as llms__togetherai 
from "../llms/togetherai.js"; export * as llms__yandex from "../llms/yandex.js"; +export * as vectorstores__mariadb from "../vectorstores/mariadb.js"; export * as vectorstores__prisma from "../vectorstores/prisma.js"; export * as vectorstores__turbopuffer from "../vectorstores/turbopuffer.js"; export * as vectorstores__vectara from "../vectorstores/vectara.js"; diff --git a/libs/langchain-community/src/vectorstores/mariadb.ts b/libs/langchain-community/src/vectorstores/mariadb.ts new file mode 100644 index 000000000000..f098add63e81 --- /dev/null +++ b/libs/langchain-community/src/vectorstores/mariadb.ts @@ -0,0 +1,760 @@ +import mariadb, { type Pool, type PoolConfig } from "mariadb"; +import { VectorStore } from "@langchain/core/vectorstores"; +import type { EmbeddingsInterface } from "@langchain/core/embeddings"; +import { Document } from "@langchain/core/documents"; +import { getEnvironmentVariable } from "@langchain/core/utils/env"; +import { + FilterExpressionConverter, + StringBuilder, + BaseFilterExpressionConverter, + Expression, + Value, + Key, + Group, +} from "@langchain/core/filter"; + +type Metadata = Record; + +export type DistanceStrategy = "COSINE" | "EUCLIDEAN"; + +/** + * Converts Expression into JSON metadata filter expression format + * for MariaDB + */ +class MariaDBFilterExpressionConverter + extends BaseFilterExpressionConverter + implements FilterExpressionConverter +{ + private metadataFieldName: string; + + constructor(metadataFieldName: string) { + super(); + this.metadataFieldName = metadataFieldName; + } + + convertExpressionToContext( + expression: Expression, + context: StringBuilder + ): void { + super.convertOperandToContext(expression.left, context); + super.convertSymbolToContext(expression, context); + if (expression.right) { + super.convertOperandToContext(expression.right, context); + } + } + + convertKeyToContext(key: Key, context: StringBuilder): void { + context.append(`JSON_VALUE(${this.metadataFieldName}, 
'$.${key.key}')`); + } + + writeValueRangeStart(_listValue: Value, context: StringBuilder): void { + context.append("("); + } + + writeValueRangeEnd(_listValue: Value, context: StringBuilder): void { + context.append(")"); + } + + writeGroupStart(_group: Group, context: StringBuilder): void { + context.append("("); + } + + writeGroupEnd(_group: Group, context: StringBuilder): void { + context.append(")"); + } +} + +/** + * Interface that defines the arguments required to create a + * `MariaDBStore` instance. It includes MariaDB connection options, + * table name and verbosity level. + */ +export interface MariaDBStoreArgs { + connectionOptions?: PoolConfig; + pool?: Pool; + tableName?: string; + collectionTableName?: string; + collectionName?: string; + collectionMetadata?: Metadata | null; + schemaName?: string | null; + columns?: { + idColumnName?: string; + vectorColumnName?: string; + contentColumnName?: string; + metadataColumnName?: string; + }; + verbose?: boolean; + /** + * The amount of documents to chunk by when + * adding vectors. + * @default 500 + */ + chunkSize?: number; + ids?: string[]; + distanceStrategy?: DistanceStrategy; +} + +/** + * MariaDB vector store integration. + * + * Setup: + * Install `@langchain/community` and `mariadb`. + * + * If you wish to generate ids, you should also install the `uuid` package. + * + * ```bash + * npm install @langchain/community mariadb uuid + * ``` + * + * ## [Constructor args](https://api.js.langchain.com/classes/_langchain_community.vectorstores_mariadb.MariaDB.html#constructor) + * + *
+ * Instantiate + * + * ```typescript + * import { + * MariaDBStore, + * DistanceStrategy, + * } from "@langchain/community/vectorstores/mariadb"; + * + * // Or other embeddings + * import { OpenAIEmbeddings } from "@langchain/openai"; + * import { PoolConfig } from "mariadb"; + * + * const embeddings = new OpenAIEmbeddings({ + * model: "text-embedding-3-small", + * }); + * + * // Sample config + * const config = { + * connectionOptions: { + * host: "127.0.0.1", + * port: 3306, + * user: "myuser", + * password: "ChangeMe", + * database: "api", + * } as PoolConfig, + * tableName: "testlangchainjs", + * columns: { + * idColumnName: "id", + * vectorColumnName: "vector", + * contentColumnName: "content", + * metadataColumnName: "metadata", + * }, + * // supported distance strategies: COSINE (default) or EUCLIDEAN + * distanceStrategy: "COSINE" as DistanceStrategy, + * }; + * + * const vectorStore = await MariaDBStore.initialize(embeddings, config); + * ``` + *
+ * + *
+ * + *
+ * Add documents + * + * ```typescript + * import type { Document } from '@langchain/core/documents'; + * + * const document1 = { pageContent: "foo", metadata: { baz: "bar" } }; + * const document2 = { pageContent: "thud", metadata: { bar: "baz" } }; + * const document3 = { pageContent: "i will be deleted :(", metadata: {} }; + * + * const documents: Document[] = [document1, document2, document3]; + * const ids = ["1", "2", "3"]; + * await vectorStore.addDocuments(documents, { ids }); + * ``` + *
+ * + *
+ * + *
+ * Delete documents + * + * ```typescript + * await vectorStore.delete({ ids: ["3"] }); + * ``` + *
+ * + *
+ * + *
+ * Similarity search + * + * ```typescript + * const results = await vectorStore.similaritySearch("thud", 1); + * for (const doc of results) { + * console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`); + * } + * // Output: * thud [{"bar":"baz"}] + * ``` + *
+ * + *
+ * + * + *
+ * Similarity search with filter + * + * ```typescript + * // Keep only documents whose metadata has baz = "bar" + * const filter = new Expression(Operator.EQ, new Key("baz"), new Value("bar")); + * const resultsWithFilter = await vectorStore.similaritySearch("thud", 1, filter); + * + * for (const doc of resultsWithFilter) { + * console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`); + * } + * // Output: * foo [{"baz":"bar"}] + * ``` + *
+ * + *
+ * + * + *
+ * Similarity search with score + * + * ```typescript + * const resultsWithScore = await vectorStore.similaritySearchWithScore("qux", 1); + * for (const [doc, score] of resultsWithScore) { + * console.log(`* [SIM=${score.toFixed(6)}] ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`); + * } + * // Output: * [SIM=0.000000] qux [{"bar":"baz","baz":"bar"}] + * ``` + *
+ * + *
+ * + *
+ * As a retriever + * + * ```typescript + * const retriever = vectorStore.asRetriever({ + * searchType: "mmr", // Leave blank for standard similarity search + * k: 1, + * }); + * const resultAsRetriever = await retriever.invoke("thud"); + * console.log(resultAsRetriever); + * + * // Output: [Document({ metadata: { "bar":"baz" }, pageContent: "thud" })] + * ``` + *
+ * + *
+ */ +export class MariaDBStore extends VectorStore { + tableName: string; + + collectionTableName?: string; + + collectionName = "langchain"; + + collectionId?: string; + + collectionMetadata: Metadata | null; + + schemaName: string | null; + + idColumnName: string; + + vectorColumnName: string; + + contentColumnName: string; + + metadataColumnName: string; + + _verbose?: boolean; + + pool: Pool; + + chunkSize = 500; + + distanceStrategy: DistanceStrategy; + + expressionConverter: MariaDBFilterExpressionConverter; + + constructor(embeddings: EmbeddingsInterface, config: MariaDBStoreArgs) { + super(embeddings, config); + this.tableName = this.escapeId(config.tableName ?? "langchain", false); + if ( + config.collectionName !== undefined && + config.collectionTableName === undefined + ) { + throw new Error( + `If supplying a "collectionName", you must also supply a "collectionTableName".` + ); + } + + this.collectionTableName = config.collectionTableName + ? this.escapeId(config.collectionTableName, false) + : undefined; + + this.collectionName = config.collectionName + ? this.escapeId(config.collectionName, false) + : "langchaincol"; + + this.collectionMetadata = config.collectionMetadata ?? null; + this.schemaName = config.schemaName + ? this.escapeId(config.schemaName, false) + : null; + + this.vectorColumnName = this.escapeId( + config.columns?.vectorColumnName ?? "embedding", + false + ); + this.contentColumnName = this.escapeId( + config.columns?.contentColumnName ?? "text", + false + ); + this.idColumnName = this.escapeId( + config.columns?.idColumnName ?? "id", + false + ); + this.metadataColumnName = this.escapeId( + config.columns?.metadataColumnName ?? "metadata", + false + ); + this.expressionConverter = new MariaDBFilterExpressionConverter( + this.metadataColumnName + ); + + if (!config.connectionOptions && !config.pool) { + throw new Error( + "You must provide either a `connectionOptions` object or a `pool` instance." 
+ ); + } + + const langchainVerbose = getEnvironmentVariable("LANGCHAIN_VERBOSE"); + + if (langchainVerbose === "true") { + this._verbose = true; + } else if (langchainVerbose === "false") { + this._verbose = false; + } else { + this._verbose = config.verbose; + } + + if (config.pool) { + this.pool = config.pool; + } else { + const poolConf = { ...config.connectionOptions, rowsAsArray: true }; + // add query to log if verbose + if (this._verbose) poolConf.logger = { query: console.log }; + this.pool = mariadb.createPool(poolConf); + } + this.chunkSize = config.chunkSize ?? 500; + + this.distanceStrategy = + config.distanceStrategy ?? ("COSINE" as DistanceStrategy); + } + + get computedTableName() { + return this.schemaName == null + ? this.tableName + : `${this.schemaName}.${this.tableName}`; + } + + get computedCollectionTableName() { + return this.schemaName == null + ? `${this.collectionTableName}` + : `"${this.schemaName}"."${this.collectionTableName}"`; + } + + /** + * Escape identifier + * + * @param identifier identifier value + * @param alwaysQuote must identifier be quoted if not required + */ + private escapeId(identifier: string, alwaysQuote: boolean): string { + if (!identifier || identifier === "") + throw new Error("Identifier is required"); + + const len = identifier.length; + const simpleIdentifier = /^[0-9a-zA-Z$_]*$/; + if (simpleIdentifier.test(identifier)) { + if (len < 1 || len > 64) { + throw new Error("Invalid identifier length"); + } + if (alwaysQuote) return `\`${identifier}\``; + + // Identifier names may begin with a numeral, but can't only contain numerals unless quoted. 
+ if (/^\d+$/.test(identifier)) { + // identifier containing only numerals must be quoted + return `\`${identifier}\``; + } + // simple identifier: no quoting needed + return identifier; + } else { + if (identifier.includes("\u0000")) { + throw new Error("Invalid name - containing u0000 character"); + } + let ident = identifier; + if (/^`.+`$/.test(identifier)) { + ident = identifier.substring(1, identifier.length - 1); + } + if (len < 1 || len > 64) { + throw new Error("Invalid identifier length"); + } + return `\`${ident.replace(/`/g, "``")}\``; + } + } + + private printable(definition: string): string { + return definition.replaceAll(/[^0-9a-zA-Z_]/g, ""); + } + + /** + * Static method to create a new `MariaDBStore` instance. It ensures + * the required tables exist (creating them if needed), loads the + * collection id, and returns the new instance of `MariaDBStore`. + * + * @param embeddings - Embeddings instance. + * @param config - `MariaDBStoreArgs` instance + * @param config.dimensions Number of dimensions in your vector data type. Defaults to 1536. + * @returns A new instance of `MariaDBStore`. + */ + static async initialize( + embeddings: EmbeddingsInterface, + config: MariaDBStoreArgs & { dimensions?: number } + ): Promise<MariaDBStore> { + const { dimensions, ...rest } = config; + const mariadbStore = new MariaDBStore(embeddings, rest); + await mariadbStore.ensureTableInDatabase(dimensions); + await mariadbStore.ensureCollectionTableInDatabase(); + await mariadbStore.loadCollectionId(); + + return mariadbStore; + } + + /** + * Static method to create a new `MariaDBStore` instance from an + * array of texts and their metadata. It converts the texts into + * `Document` instances and adds them to the store. + * + * @param texts - Array of texts. + * @param metadatas - Array of metadata objects or a single metadata object. + * @param embeddings - Embeddings instance. + * @param dbConfig - `MariaDBStoreArgs` instance.
+ * @returns Promise that resolves with a new instance of `MariaDBStore`. + */ + static async fromTexts( + texts: string[], + metadatas: object[] | object, + embeddings: EmbeddingsInterface, + dbConfig: MariaDBStoreArgs & { dimensions?: number } + ): Promise<MariaDBStore> { + const docs = []; + for (let i = 0; i < texts.length; i += 1) { + const metadata = Array.isArray(metadatas) ? metadatas[i] : metadatas; + const newDoc = new Document({ + pageContent: texts[i], + metadata, + }); + docs.push(newDoc); + } + + return MariaDBStore.fromDocuments(docs, embeddings, dbConfig); + } + + /** + * Static method to create a new `MariaDBStore` instance from an + * array of `Document` instances. It adds the documents to the store. + * + * @param docs - Array of `Document` instances. + * @param embeddings - Embeddings instance. + * @param dbConfig - `MariaDBStoreArgs` instance. + * @returns Promise that resolves with a new instance of `MariaDBStore`. + */ + static async fromDocuments( + docs: Document[], + embeddings: EmbeddingsInterface, + dbConfig: MariaDBStoreArgs & { dimensions?: number } + ): Promise<MariaDBStore> { + const instance = await MariaDBStore.initialize(embeddings, dbConfig); + await instance.addDocuments(docs, { ids: dbConfig.ids }); + return instance; + } + + _vectorstoreType(): string { + return "mariadb"; + } + + /** + * Method to add documents to the vector store. It converts the documents into + * vectors, and adds them to the store. + * + * @param documents - Array of `Document` instances. + * @param options - Optional arguments for adding documents + * @returns Promise that resolves when the documents have been added.
+ */ + async addDocuments( + documents: Document[], + options?: { ids?: string[] } + ): Promise<void> { + const texts = documents.map(({ pageContent }) => pageContent); + + return this.addVectors( + await this.embeddings.embedDocuments(texts), + documents, + options + ); + } + + /** + * Inserts a row for the collectionName provided at initialization if it does not + * exist, and sets the collectionId. + */ + private async loadCollectionId(): Promise<void> { + if (this.collectionId) { + return; + } + + if (this.collectionTableName) { + const queryResult = await this.pool.query( + { + sql: `SELECT uuid from ${this.computedCollectionTableName} WHERE label = ?`, + rowsAsArray: true, + }, + [this.collectionName] + ); + if (queryResult.length > 0) { + this.collectionId = queryResult[0][0]; + } else { + const insertString = `INSERT INTO ${this.computedCollectionTableName}(label, cmetadata) VALUES (?, ?) RETURNING uuid`; + const insertResult = await this.pool.query( + { sql: insertString, rowsAsArray: true }, + [this.collectionName, this.collectionMetadata] + ); + this.collectionId = insertResult[0][0]; + } + } + } + + /** + * Method to add vectors to the vector store. It converts the vectors into + * rows and inserts them into the database. + * + * @param vectors - Array of vectors. + * @param documents - Array of `Document` instances. + * @param options - Optional arguments for adding documents + * @returns Promise that resolves when the vectors have been added. + */ + async addVectors( + vectors: number[][], + documents: Document[], + options?: { ids?: string[] } + ): Promise<void> { + const ids = options?.ids; + + // Either all documents have ids or none of them do to avoid confusion. + if (ids !== undefined && ids.length !== vectors.length) { + throw new Error( + "The number of ids must match the number of vectors provided."
+ ); + } + await this.loadCollectionId(); + + const insertQuery = `INSERT INTO ${this.computedTableName}(${ + this.idColumnName + },${this.contentColumnName},${this.metadataColumnName},${ + this.vectorColumnName + }${this.collectionId ? ",collection_id" : ""}) VALUES (${ + ids ? "?" : "UUID_v7()" + }, ?, ?, ?${this.collectionId ? ", ?" : ""})`; + + try { + const batchParams = []; + for (let i = 0; i < vectors.length; i += 1) { + const param = [ + ids ? ids[i] : null, + documents[i].pageContent, + documents[i].metadata, + this.getFloat32Buffer(vectors[i]), + this.collectionId, + ]; + if (!ids) param.shift(); + if (!this.collectionId) param.pop(); + batchParams.push(param); + } + await this.pool.batch(insertQuery, batchParams); + } catch (e) { + console.error(e); + throw new Error(`Error inserting: ${(e as Error).message}`); + } + } + + /** + * Convert float array to binary value + * @param vector embedding value + * @private + */ + private getFloat32Buffer(vector: number[]) { + return Buffer.from(new Float32Array(vector).buffer); + } + + /** + * Method to delete documents from the vector store. It deletes the + * documents that match the provided ids or, if no ids are given, the filter. + * + * @param params - Object with either `ids` (array of ids) or `filter` (filter expression). + * @returns Promise that resolves when the documents have been deleted. + * @example + * await vectorStore.delete({ ids: ["id1", "id2"] }); + */ + async delete(params: { ids?: string[]; filter?: Expression }): Promise<void> { + const { ids, filter } = params; + + if (!(ids || filter)) { + throw new Error( + "You must specify either ids or a filter when deleting documents." + ); + } + await this.loadCollectionId(); + + if (ids) { + // delete by ids + await this.pool.query( + `DELETE FROM ${this.computedTableName} WHERE ${ + this.idColumnName + } IN (?) ${this.collectionId ? " AND collection_id = ?" 
: ""}`, + [ids, this.collectionId] + ); + } else if (filter) { + // delete by filter + const filterPart = this.expressionConverter.convertExpression(filter); + if (filterPart.length === 0) throw new Error("Wrong filter."); + await this.pool.execute( + `DELETE FROM ${this.computedTableName} WHERE ${filterPart} ${ + this.collectionId ? " AND collection_id = ?" : "" + }`, + [this.collectionId] + ); + } + } + + /** + * Method to perform a similarity search in the vector store. It returns + * the `k` most similar documents to the query vector, along with their + * similarity scores. + * + * @param query - Query vector. + * @param k - Number of most similar documents to return. + * @param filter - Optional filter to apply to the search. + * @returns Promise that resolves with an array of tuples, each containing a `Document` and its similarity score. + */ + async similaritySearchVectorWithScore( + query: number[], + k: number, + filter?: Expression + ): Promise<[Document, number][]> { + // eslint-disable-next-line @typescript-eslint/no-explicit-any + const parameters: unknown[] = [this.getFloat32Buffer(query)]; + const whereClauses = []; + + await this.loadCollectionId(); + + if (this.collectionId) { + whereClauses.push("collection_id = ?"); + parameters.push(this.collectionId); + } + + if (filter) { + const filterPart = this.expressionConverter.convertExpression(filter); + whereClauses.push(filterPart); + } + + // limit + parameters.push(k); + + const whereClause = whereClauses.length + ? `WHERE ${whereClauses.join(" AND ")}` + : ""; + + const queryString = `SELECT ${this.idColumnName},${this.contentColumnName},${this.metadataColumnName},VEC_DISTANCE_${this.distanceStrategy}(${this.vectorColumnName}, ?) 
as distance FROM ${this.computedTableName} ${whereClause} ORDER BY distance ASC LIMIT ?`; + + const documents = await this.pool.execute( + { sql: queryString, rowsAsArray: true }, + parameters + ); + + const results = [] as [Document, number][]; + for (const doc of documents) { + if (doc[3] != null && doc[1] != null) { + const document = new Document({ + id: doc[0], + pageContent: doc[1], + metadata: doc[2], + }); + results.push([document, doc[3]]); + } + } + return results; + } + + /** + * Method to ensure the existence of the table in the database. It creates + * the table if it does not already exist. + * @param dimensions Number of dimensions in your vector data type. Defaults to 1536. + * @returns Promise that resolves when the table has been ensured. + */ + async ensureTableInDatabase(dimensions = 1536): Promise<void> { + const tableQuery = `CREATE TABLE IF NOT EXISTS ${this.computedTableName}(${ + this.idColumnName + } UUID NOT NULL DEFAULT UUID_v7() PRIMARY KEY,${ + this.contentColumnName + } TEXT,${this.metadataColumnName} JSON,${ + this.vectorColumnName + } VECTOR(${dimensions}) NOT NULL, VECTOR INDEX ${this.printable( + this.tableName + "_" + this.vectorColumnName + )}_idx (${this.vectorColumnName}) ) ENGINE=InnoDB`; + await this.pool.query(tableQuery); + } + + /** + * Method to ensure the existence of the collection table in the database. + * It creates the table if it does not already exist. + * + * @returns Promise that resolves when the collection table has been ensured.
+ */ + async ensureCollectionTableInDatabase(): Promise<void> { + try { + if (this.collectionTableName != null) { + // Create the collection table first: the foreign key added below + // references it, so the two statements must not run concurrently. + await this.pool.query( + `CREATE TABLE IF NOT EXISTS ${ + this.computedCollectionTableName + }(uuid UUID NOT NULL DEFAULT UUID_v7() PRIMARY KEY, + label VARCHAR(256), cmetadata JSON, UNIQUE KEY idx_${this.printable( + this.collectionTableName + )}_label + (label))` + ); + await this.pool.query( + `ALTER TABLE ${this.computedTableName} + ADD COLUMN IF NOT EXISTS collection_id uuid, + ADD CONSTRAINT FOREIGN KEY IF NOT EXISTS ${this.printable( + this.tableName + )}_collection_id_fkey (collection_id) + REFERENCES ${ + this.computedCollectionTableName + }(uuid) ON DELETE CASCADE` + ); + } + } catch (e) { + console.error(e); + throw new Error( + `Error adding column or creating index: ${(e as Error).message}` + ); + } + } + + /** + * Close the pool. + * + * @returns Promise that resolves when the pool is terminated. + */ + async end(): Promise<void> { + return this.pool.end(); + } +} diff --git a/libs/langchain-community/src/vectorstores/tests/mariadb.int.test.ts b/libs/langchain-community/src/vectorstores/tests/mariadb.int.test.ts new file mode 100644 index 000000000000..6ba24f7c0834 --- /dev/null +++ b/libs/langchain-community/src/vectorstores/tests/mariadb.int.test.ts @@ -0,0 +1,308 @@ +import { + MariaDbContainer, + StartedMariaDbContainer, +} from "@testcontainers/mariadb"; +import { OpenAIEmbeddings } from "@langchain/openai"; +import { FilterExpressionBuilder } from "@langchain/core/filter"; +import { type Pool, PoolConfig } from "mariadb"; +import { MariaDBStore, MariaDBStoreArgs } from "../mariadb.js"; + +const isFullyQualifiedTableExists = async ( + pool: Pool, + schema: string, + tableName: string +): Promise<boolean> => { + const sql = + "SELECT EXISTS (SELECT * FROM information_schema.tables WHERE table_schema = ? AND table_name = ?) 
as results"; + const res = await pool.query(sql, [schema, tableName]); + return res[0][0] as boolean; +}; +const removeQuotes = (field: string): string => { + if (field.charAt(0) === "`") return field.substring(1, field.length - 1); + return field; +}; +const areColumnsExisting = async ( + pool: Pool, + schema: string, + tableName: string, + fieldNames: string[] +): Promise<boolean> => { + const sql = + "SELECT EXISTS (SELECT * FROM information_schema.columns WHERE table_schema = ? AND table_name = ? AND column_name = ?)"; + + for (let i = 0; i < fieldNames.length; i += 1) { + const res = await pool.query(sql, [ + schema, + removeQuotes(tableName), + removeQuotes(fieldNames[i]), + ]); + if (res[0][0]) continue; + return false; + } + return true; +}; + +describe("MariaDBVectorStore", () => { + let container: StartedMariaDbContainer; + + beforeAll(async () => { + container = await new MariaDbContainer("mariadb:11.7-rc").start(); + }); + + afterAll(async () => { + await container.stop(); + }); + + describe("automatic table creation", () => { + it.each([ + ["myTable", "myId", "myVector", "myContent", "myMetadata", undefined], + [ + "myTable 2", + "myId 2", + "myVector 2", + "myContent 2", + "myMetadata 2", + undefined, + ], + [ + "myTable", + "myId", + "myVector", + "myContent", + "myMetadata", + "myCollectionTableName", + ], + [ + "myTable` 2", + "myId` 2", + "myVector` 2", + "myContent` 2", + "myMetadata` 2", + "myCollectionTableName` 2", + ], + ])( + "automatic table %p %p %p %p %p", + async ( + tableName: string, + idColumnName: string, + vectorColumnName: string, + contentColumnName: string, + metadataColumnName: string, + collectionTableName?: string + ) => { + const localStore = await MariaDBStore.initialize( + new OpenAIEmbeddings(), + { + connectionOptions: { + host: container.getHost(), + port: container.getFirstMappedPort(), + user: container.getUsername(), + password: container.getUserPassword(), + database: container.getDatabase(), + } as PoolConfig, + tableName, + 
columns: { + idColumnName, + vectorColumnName, + contentColumnName, + metadataColumnName, + }, + collectionTableName, + distanceStrategy: "EUCLIDEAN", + } as MariaDBStoreArgs + ); + expect( + isFullyQualifiedTableExists( + localStore.pool, + container.getDatabase(), + "myTable" + ) + ).toBeTruthy(); + expect( + areColumnsExisting( + localStore.pool, + container.getDatabase(), + "myTable", + ["myId", "myVector", "myContent", "myMetadata"] + ) + ).toBeTruthy(); + await localStore.similaritySearch("hello", 10); + await localStore.delete({ + ids: ["63ae8c92-799a-11ef-98b2-f859713e4be4"], + }); + const documents = [ + { pageContent: "hello", metadata: { a: 2023, country: "US" } }, + ]; + await localStore.addDocuments(documents); + await localStore.pool.query("DROP TABLE " + localStore.tableName); + } + ); + }); + + describe("without collection", () => { + let store: MariaDBStore; + + beforeAll(async () => { + store = await MariaDBStore.initialize(new OpenAIEmbeddings(), { + connectionOptions: { + type: "mariadb", + host: container.getHost(), + port: container.getFirstMappedPort(), + user: container.getUsername(), + password: container.getUserPassword(), + database: container.getDatabase(), + } as PoolConfig, + } as MariaDBStoreArgs); + }); + + const documents = [ + { pageContent: "hello", metadata: { a: 2023, country: "US" } }, + { pageContent: "Cat drinks milk", metadata: { a: 2025, country: "EN" } }, + { pageContent: "hi", metadata: { a: 2025, country: "FR" } }, + ]; + const ids = [ + "cd41294a-afb0-11df-bc9b-00241dd75637", + "a2443495-1b94-415b-b6fa-fe8e79ba4812", + "63ae8c92-799a-11ef-98b2-f859713e4be4", + ]; + beforeEach(async () => { + await store.pool.query("TRUNCATE TABLE " + store.tableName); + await store.addDocuments(documents, { ids }); + }); + test("similarity limit", async () => { + let results = await store.similaritySearch("hello", 10); + expect(results.length).toEqual(3); + expect(results[0].pageContent).toEqual("hello"); + 
expect(results[0].metadata.a).toEqual(2023); + + results = await store.similaritySearch("hello", 1); + expect(results.length).toEqual(1); + expect(results[0].pageContent).toEqual("hello"); + expect(results[0].metadata.a).toEqual(2023); + }); + + test("similarity with filter", async () => { + const b = new FilterExpressionBuilder(); + let results = await store.similaritySearch("hi", 10, b.eq("a", 2025)); + expect(results.length).toEqual(2); + expect(results[0].pageContent).toEqual("hi"); + expect(results[0].metadata.a).toEqual(2025); + + results = await store.similaritySearch( + "hi", + 10, + b.and(b.gte("a", 2025), b.in("country", ["GE", "FR"])) + ); + expect(results.length).toEqual(1); + expect(results[0].pageContent).toEqual("hi"); + expect(results[0].metadata.a).toEqual(2025); + }); + + test("deletion with filter", async () => { + const b = new FilterExpressionBuilder(); + try { + await store.delete({}); + throw new Error("expected to fails"); + } catch (e) { + expect((e as Error).message).toEqual( + "You must specify either ids or a filter when deleting documents." 
+ ); + } + + await store.delete({ filter: b.eq("a", 2023) }); + let res = await store.pool.query( + "SELECT COUNT(*) as a FROM " + store.tableName + ); + expect(res[0][0]).toEqual(2n); + + await store.delete({ ids: ["63ae8c92-799a-11ef-98b2-f859713e4be4"] }); + res = await store.pool.query("SELECT COUNT(*) FROM " + store.tableName); + expect(res[0][0]).toEqual(1n); + }); + }); + + describe("with collection", () => { + let store: MariaDBStore; + + beforeAll(async () => { + store = await MariaDBStore.initialize(new OpenAIEmbeddings(), { + connectionOptions: { + type: "mariadb", + host: container.getHost(), + port: container.getFirstMappedPort(), + user: container.getUsername(), + password: container.getUserPassword(), + database: container.getDatabase(), + } as PoolConfig, + collectionTableName: "myCollectionTable", + } as MariaDBStoreArgs); + }); + + const documents = [ + { pageContent: "hello", metadata: { a: 2023, country: "US" } }, + { pageContent: "Cat drinks milk", metadata: { a: 2025, country: "EN" } }, + { pageContent: "hi", metadata: { a: 2025, country: "FR" } }, + ]; + const ids = [ + "cd41294a-afb0-11df-bc9b-00241dd75637", + "a2443495-1b94-415b-b6fa-fe8e79ba4812", + "63ae8c92-799a-11ef-98b2-f859713e4be4", + ]; + + beforeEach(async () => { + await store.pool.query("TRUNCATE TABLE " + store.tableName); + await store.addDocuments(documents, { ids }); + }); + + test("similarity limit", async () => { + let results = await store.similaritySearch("hello", 10); + expect(results.length).toEqual(3); + expect(results[0].pageContent).toEqual("hello"); + expect(results[0].metadata.a).toEqual(2023); + + results = await store.similaritySearch("hello", 1); + expect(results.length).toEqual(1); + expect(results[0].pageContent).toEqual("hello"); + expect(results[0].metadata.a).toEqual(2023); + }); + + test("similarity with filter", async () => { + const b = new FilterExpressionBuilder(); + let results = await store.similaritySearch("hi", 10, b.eq("a", 2025)); + 
expect(results.length).toEqual(2); + expect(results[0].pageContent).toEqual("hi"); + expect(results[0].metadata.a).toEqual(2025); + + results = await store.similaritySearch( + "hi", + 10, + b.and(b.gte("a", 2025), b.in("country", ["GE", "FR"])) + ); + expect(results.length).toEqual(1); + expect(results[0].pageContent).toEqual("hi"); + expect(results[0].metadata.a).toEqual(2025); + }); + + test("deletion with filter", async () => { + const b = new FilterExpressionBuilder(); + try { + await store.delete({}); + throw new Error("expected to fails"); + } catch (e) { + expect((e as Error).message).toEqual( + "You must specify either ids or a filter when deleting documents." + ); + } + + await store.delete({ filter: b.eq("a", 2023) }); + let res = await store.pool.query( + "SELECT COUNT(*) as a FROM " + store.tableName + ); + expect(res[0][0]).toEqual(2n); + + await store.delete({ ids: ["63ae8c92-799a-11ef-98b2-f859713e4be4"] }); + res = await store.pool.query("SELECT COUNT(*) FROM " + store.tableName); + expect(res[0][0]).toEqual(1n); + }); + }); +}); diff --git a/yarn.lock b/yarn.lock index 466e5b0d4763..1d2399ec705a 100644 --- a/yarn.lock +++ b/yarn.lock @@ -8605,6 +8605,13 @@ __metadata: languageName: node linkType: hard +"@balena/dockerignore@npm:^1.0.2": + version: 1.0.2 + resolution: "@balena/dockerignore@npm:1.0.2" + checksum: 0d39f8fbcfd1a983a44bced54508471ab81aaaa40e2c62b46a9f97eac9d6b265790799f16919216db486331dedaacdde6ecbd6b7abe285d39bc50de111991699 + languageName: node + linkType: hard + "@bcherny/json-schema-ref-parser@npm:10.0.5-fork": version: 10.0.5-fork resolution: "@bcherny/json-schema-ref-parser@npm:10.0.5-fork" @@ -11835,6 +11842,7 @@ __metadata: "@tensorflow/tfjs-backend-cpu": ^3 "@tensorflow/tfjs-converter": ^3.6.0 "@tensorflow/tfjs-core": ^3.6.0 + "@testcontainers/mariadb": ^10.16.0 "@tsconfig/recommended": ^1.0.2 "@types/better-sqlite3": ^7.6.10 "@types/crypto-js": ^4.2.2 @@ -11917,6 +11925,7 @@ __metadata: lodash: ^4.17.21 lunary: ^0.7.10 
mammoth: ^1.6.0 + mariadb: ^3.4.0 mongodb: ^5.2.0 mysql2: ^3.9.8 neo4j-driver: ^5.17.0 @@ -12051,6 +12060,7 @@ __metadata: lodash: ^4.17.21 lunary: ^0.7.10 mammoth: ^1.6.0 + mariadb: ^3.4.0 mongodb: ">=5.2.0" mysql2: ^3.9.8 neo4j-driver: "*" @@ -12267,6 +12277,8 @@ __metadata: optional: true mammoth: optional: true + mariadb: + optional: true mongodb: optional: true mysql2: @@ -18917,6 +18929,15 @@ __metadata: languageName: node linkType: hard +"@testcontainers/mariadb@npm:^10.16.0": + version: 10.16.0 + resolution: "@testcontainers/mariadb@npm:10.16.0" + dependencies: + testcontainers: ^10.16.0 + checksum: 25c4ffd53bce4a1a81f2eb0668a0c5523162e1c8f792cc1c09dcc8e45846d371027bc84247dab4eb251e88be6f310b2413529f2c9e06697dd2f2860b83304ae8 + languageName: node + linkType: hard + "@tinyhttp/content-disposition@npm:^2.2.0": version: 2.2.2 resolution: "@tinyhttp/content-disposition@npm:2.2.2" @@ -19159,6 +19180,27 @@ __metadata: languageName: node linkType: hard +"@types/docker-modem@npm:*": + version: 3.0.6 + resolution: "@types/docker-modem@npm:3.0.6" + dependencies: + "@types/node": "*" + "@types/ssh2": "*" + checksum: cc58e8189f6ec5a2b8ca890207402178a97ddac8c80d125dc65d8ab29034b5db736de15e99b91b2d74e66d14e26e73b6b8b33216613dd15fd3aa6b82c11a83ed + languageName: node + linkType: hard + +"@types/dockerode@npm:^3.3.29": + version: 3.3.32 + resolution: "@types/dockerode@npm:3.3.32" + dependencies: + "@types/docker-modem": "*" + "@types/node": "*" + "@types/ssh2": "*" + checksum: 17bfa92511cdc6ab51a67cb4678931b43670feffd737ba593c3ff90a6f71673aa04f8a81524690dddc08b483628d657a338a176171ff131de9e0efba4c3ecc11 + languageName: node + linkType: hard + "@types/dompurify@npm:^3.0.5": version: 3.0.5 resolution: "@types/dompurify@npm:3.0.5" @@ -19273,6 +19315,13 @@ __metadata: languageName: node linkType: hard +"@types/geojson@npm:^7946.0.14": + version: 7946.0.15 + resolution: "@types/geojson@npm:7946.0.15" + checksum: 
226d7ab59540632b19f7889c76c4c586a5104c18c43a81f32974aa035eafe557f86bd5a79ca5568bb63cbe5bfa9014c8e9a29cb0bb3d2f0bd71b0cc13ad8ccb3 + languageName: node + linkType: hard + "@types/glob@npm:^7.1.3": version: 7.2.0 resolution: "@types/glob@npm:7.2.0" @@ -19686,6 +19735,15 @@ __metadata: languageName: node linkType: hard +"@types/node@npm:^22.5.4": + version: 22.10.2 + resolution: "@types/node@npm:22.10.2" + dependencies: + undici-types: ~6.20.0 + checksum: b22401e6e7d1484e437d802c72f5560e18100b1257b9ad0574d6fe05bebe4dbcb620ea68627d1f1406775070d29ace8b6b51f57e7b1c7b8bafafe6da7f29c843 + languageName: node + linkType: hard + "@types/node@npm:~10.14.19": version: 10.14.22 resolution: "@types/node@npm:10.14.22" @@ -20011,6 +20069,34 @@ __metadata: languageName: node linkType: hard +"@types/ssh2-streams@npm:*": + version: 0.1.12 + resolution: "@types/ssh2-streams@npm:0.1.12" + dependencies: + "@types/node": "*" + checksum: aa0aa45e40cfca34b4443dafa8d28ff49196c05c71867cbf0a8cdd5127be4d8a3840819543fcad16535653ca8b0e29217671ed6500ff1e7a3ad2442c5d1b40a6 + languageName: node + linkType: hard + +"@types/ssh2@npm:*": + version: 1.15.1 + resolution: "@types/ssh2@npm:1.15.1" + dependencies: + "@types/node": ^18.11.18 + checksum: 6a10b4da60817f2939cac18006a7ccbc6421facf2370a263072fc5290b1f5d445b385c5f309e93ce447bb33ad92dac18f562ccda20f092076da1c1a55da299fb + languageName: node + linkType: hard + +"@types/ssh2@npm:^0.5.48": + version: 0.5.52 + resolution: "@types/ssh2@npm:0.5.52" + dependencies: + "@types/node": "*" + "@types/ssh2-streams": "*" + checksum: bc1c76ac727ad73ddd59ba849cf0ea3ed2e930439e7a363aff24f04f29b74f9b1976369b869dc9a018223c9fb8ad041c09a0f07aea8cf46a8c920049188cddae + languageName: node + linkType: hard + "@types/stack-utils@npm:^2.0.0": version: 2.0.1 resolution: "@types/stack-utils@npm:2.0.1" @@ -21486,6 +21572,36 @@ __metadata: languageName: node linkType: hard +"archiver-utils@npm:^5.0.0, archiver-utils@npm:^5.0.2": + version: 5.0.2 + resolution: 
"archiver-utils@npm:5.0.2" + dependencies: + glob: ^10.0.0 + graceful-fs: ^4.2.0 + is-stream: ^2.0.1 + lazystream: ^1.0.0 + lodash: ^4.17.15 + normalize-path: ^3.0.0 + readable-stream: ^4.0.0 + checksum: 7dc4f3001dc373bd0fa7671ebf08edf6f815cbc539c78b5478a2eaa67e52e3fc0e92f562cdef2ba016c4dcb5468d3d069eb89535c6844da4a5bb0baf08ad5720 + languageName: node + linkType: hard + +"archiver@npm:^7.0.1": + version: 7.0.1 + resolution: "archiver@npm:7.0.1" + dependencies: + archiver-utils: ^5.0.2 + async: ^3.2.4 + buffer-crc32: ^1.0.0 + readable-stream: ^4.0.0 + readdir-glob: ^1.1.2 + tar-stream: ^3.0.0 + zip-stream: ^6.0.1 + checksum: f93bcc00f919e0bbb6bf38fddf111d6e4d1ed34721b73cc073edd37278303a7a9f67aa4abd6fd2beb80f6c88af77f2eb4f60276343f67605e3aea404e5ad93ea + languageName: node + linkType: hard + "are-we-there-yet@npm:^2.0.0": version: 2.0.0 resolution: "are-we-there-yet@npm:2.0.0" @@ -21762,6 +21878,15 @@ __metadata: languageName: node linkType: hard +"asn1@npm:^0.2.6": + version: 0.2.6 + resolution: "asn1@npm:0.2.6" + dependencies: + safer-buffer: ~2.1.0 + checksum: 39f2ae343b03c15ad4f238ba561e626602a3de8d94ae536c46a4a93e69578826305366dc09fbb9b56aec39b4982a463682f259c38e59f6fa380cd72cd61e493d + languageName: node + linkType: hard + "assemblyai@npm:^4.6.0": version: 4.6.0 resolution: "assemblyai@npm:4.6.0" @@ -21801,6 +21926,13 @@ __metadata: languageName: node linkType: hard +"async-lock@npm:^1.4.1": + version: 1.4.1 + resolution: "async-lock@npm:1.4.1" + checksum: 29e70cd892932b7c202437786cedc39ff62123cb6941014739bd3cabd6106326416e9e7c21285a5d1dc042cad239a0f7ec9c44658491ee4a615fd36a21c1d10a + languageName: node + linkType: hard + "async-mutex@npm:^0.5.0": version: 0.5.0 resolution: "async-mutex@npm:0.5.0" @@ -21826,6 +21958,13 @@ __metadata: languageName: node linkType: hard +"async@npm:^3.2.4": + version: 3.2.6 + resolution: "async@npm:3.2.6" + checksum: 
ee6eb8cd8a0ab1b58bd2a3ed6c415e93e773573a91d31df9d5ef559baafa9dab37d3b096fa7993e84585cac3697b2af6ddb9086f45d3ac8cae821bb2aab65682 + languageName: node + linkType: hard + "asynciterator.prototype@npm:^1.0.0": version: 1.0.0 resolution: "asynciterator.prototype@npm:1.0.0" @@ -22258,6 +22397,15 @@ __metadata: languageName: node linkType: hard +"bcrypt-pbkdf@npm:^1.0.2": + version: 1.0.2 + resolution: "bcrypt-pbkdf@npm:1.0.2" + dependencies: + tweetnacl: ^0.14.3 + checksum: 4edfc9fe7d07019609ccf797a2af28351736e9d012c8402a07120c4453a3b789a15f2ee1530dc49eee8f7eb9379331a8dd4b3766042b9e502f74a68e7f662291 + languageName: node + linkType: hard + "before-after-hook@npm:^2.2.0": version: 2.2.3 resolution: "before-after-hook@npm:2.2.3" @@ -22657,6 +22805,13 @@ __metadata: languageName: node linkType: hard +"buffer-crc32@npm:^1.0.0": + version: 1.0.0 + resolution: "buffer-crc32@npm:1.0.0" + checksum: bc114c0e02fe621249e0b5093c70e6f12d4c2b1d8ddaf3b1b7bbe3333466700100e6b1ebdc12c050d0db845bc582c4fce8c293da487cc483f97eea027c480b23 + languageName: node + linkType: hard + "buffer-crc32@npm:~0.2.3": version: 0.2.13 resolution: "buffer-crc32@npm:0.2.13" @@ -22722,6 +22877,13 @@ __metadata: languageName: node linkType: hard +"buildcheck@npm:~0.0.6": + version: 0.0.6 + resolution: "buildcheck@npm:0.0.6" + checksum: ad61759dc98d62e931df2c9f54ccac7b522e600c6e13bdcfdc2c9a872a818648c87765ee209c850f022174da4dd7c6a450c00357c5391705d26b9c5807c2a076 + languageName: node + linkType: hard + "builtins@npm:^5.0.0": version: 5.0.1 resolution: "builtins@npm:5.0.1" @@ -22758,6 +22920,13 @@ __metadata: languageName: node linkType: hard +"byline@npm:^5.0.0": + version: 5.0.0 + resolution: "byline@npm:5.0.0" + checksum: 737ca83e8eda2976728dae62e68bc733aea095fab08db4c6f12d3cee3cf45b6f97dce45d1f6b6ff9c2c947736d10074985b4425b31ce04afa1985a4ef3d334a7 + languageName: node + linkType: hard + "bytes@npm:3.0.0": version: 3.0.0 resolution: "bytes@npm:3.0.0" @@ -23857,6 +24026,19 @@ __metadata: languageName: node 
linkType: hard +"compress-commons@npm:^6.0.2": + version: 6.0.2 + resolution: "compress-commons@npm:6.0.2" + dependencies: + crc-32: ^1.2.0 + crc32-stream: ^6.0.0 + is-stream: ^2.0.1 + normalize-path: ^3.0.0 + readable-stream: ^4.0.0 + checksum: 37d79a54f91344ecde352588e0a128f28ce619b085acd4f887defd76978a0640e3454a42c7dcadb0191bb3f971724ae4b1f9d6ef9620034aa0427382099ac946 + languageName: node + linkType: hard + "compressible@npm:^2.0.12, compressible@npm:~2.0.16": version: 2.0.18 resolution: "compressible@npm:2.0.18" @@ -24281,6 +24463,36 @@ __metadata: languageName: node linkType: hard +"cpu-features@npm:~0.0.10": + version: 0.0.10 + resolution: "cpu-features@npm:0.0.10" + dependencies: + buildcheck: ~0.0.6 + nan: ^2.19.0 + node-gyp: latest + checksum: ab17e25cea0b642bdcfd163d3d872be4cc7d821e854d41048557799e990d672ee1cc7bd1d4e7c4de0309b1683d4c001d36ba8569b5035d1e7e2ff2d681f681d7 + languageName: node + linkType: hard + +"crc-32@npm:^1.2.0": + version: 1.2.2 + resolution: "crc-32@npm:1.2.2" + bin: + crc32: bin/crc32.njs + checksum: ad2d0ad0cbd465b75dcaeeff0600f8195b686816ab5f3ba4c6e052a07f728c3e70df2e3ca9fd3d4484dc4ba70586e161ca5a2334ec8bf5a41bf022a6103ff243 + languageName: node + linkType: hard + +"crc32-stream@npm:^6.0.0": + version: 6.0.0 + resolution: "crc32-stream@npm:6.0.0" + dependencies: + crc-32: ^1.2.0 + readable-stream: ^4.0.0 + checksum: e6edc2f81bc387daef6d18b2ac18c2ffcb01b554d3b5c7d8d29b177505aafffba574658fdd23922767e8dab1183d1962026c98c17e17fb272794c33293ef607c + languageName: node + linkType: hard + "create-jest@npm:^29.7.0": version: 29.7.0 resolution: "create-jest@npm:29.7.0" @@ -25726,6 +25938,38 @@ __metadata: languageName: node linkType: hard +"docker-compose@npm:^0.24.8": + version: 0.24.8 + resolution: "docker-compose@npm:0.24.8" + dependencies: + yaml: ^2.2.2 + checksum: 48f3564c46490f1f51899a144deb546b61450a76bffddb378379ac7702aa34b055e0237e0dc77507df94d7ad6f1f7daeeac27730230bce9aafe2e35efeda6b45 + languageName: node + linkType: hard + 
+"docker-modem@npm:^3.0.0": + version: 3.0.8 + resolution: "docker-modem@npm:3.0.8" + dependencies: + debug: ^4.1.1 + readable-stream: ^3.5.0 + split-ca: ^1.0.1 + ssh2: ^1.11.0 + checksum: e3675c9b1ad800be8fb1cb9c5621fbef20a75bfedcd6e01b69808eadd7f0165681e4e30d1700897b788a67dbf4769964fcccd19c3d66f6d2499bb7aede6b34df + languageName: node + linkType: hard + +"dockerode@npm:^3.3.5": + version: 3.3.5 + resolution: "dockerode@npm:3.3.5" + dependencies: + "@balena/dockerignore": ^1.0.2 + docker-modem: ^3.0.0 + tar-fs: ~2.0.1 + checksum: 7f6650422b07fa7ea9d5801f04b1a432634446b5fe37b995b8302b953b64e93abf1bb4596c2fb574ba47aafee685ef2ab959cc86c9654add5a26d09541bbbcc6 + languageName: node + linkType: hard + "doctrine@npm:^2.1.0": version: 2.1.0 resolution: "doctrine@npm:2.1.0" @@ -29147,6 +29391,13 @@ __metadata: languageName: node linkType: hard +"get-port@npm:^5.1.1": + version: 5.1.1 + resolution: "get-port@npm:5.1.1" + checksum: 0162663ffe5c09e748cd79d97b74cd70e5a5c84b760a475ce5767b357fb2a57cb821cee412d646aa8a156ed39b78aab88974eddaa9e5ee926173c036c0713787 + languageName: node + linkType: hard + "get-stdin@npm:^8.0.0": version: 8.0.0 resolution: "get-stdin@npm:8.0.0" @@ -29360,6 +29611,22 @@ __metadata: languageName: node linkType: hard +"glob@npm:^10.0.0": + version: 10.4.5 + resolution: "glob@npm:10.4.5" + dependencies: + foreground-child: ^3.1.0 + jackspeak: ^3.1.2 + minimatch: ^9.0.4 + minipass: ^7.1.2 + package-json-from-dist: ^1.0.0 + path-scurry: ^1.11.1 + bin: + glob: dist/esm/bin.mjs + checksum: 0bc725de5e4862f9f387fd0f2b274baf16850dcd2714502ccf471ee401803997983e2c05590cb65f9675a3c6f2a58e7a53f9e365704108c6ad3cbf1d60934c4a + languageName: node + linkType: hard + "glob@npm:^10.2.2": version: 10.3.12 resolution: "glob@npm:10.3.12" @@ -31538,7 +31805,7 @@ __metadata: languageName: node linkType: hard -"is-stream@npm:^2.0.0": +"is-stream@npm:^2.0.0, is-stream@npm:^2.0.1": version: 2.0.1 resolution: "is-stream@npm:2.0.1" checksum: 
b8e05ccdf96ac330ea83c12450304d4a591f9958c11fd17bed240af8d5ffe08aedafa4c0f4cfccd4d28dc9d4d129daca1023633d5c11601a6cbc77521f6fae66 @@ -32012,6 +32279,19 @@ __metadata: languageName: node linkType: hard +"jackspeak@npm:^3.1.2": + version: 3.4.3 + resolution: "jackspeak@npm:3.4.3" + dependencies: + "@isaacs/cliui": ^8.0.2 + "@pkgjs/parseargs": ^0.11.0 + dependenciesMeta: + "@pkgjs/parseargs": + optional: true + checksum: be31027fc72e7cc726206b9f560395604b82e0fddb46c4cbf9f97d049bcef607491a5afc0699612eaa4213ca5be8fd3e1e7cd187b3040988b65c9489838a7c00 + languageName: node + linkType: hard + "javascript-stringify@npm:^2.0.1": version: 2.1.0 resolution: "javascript-stringify@npm:2.1.0" @@ -33486,6 +33766,15 @@ __metadata: languageName: node linkType: hard +"lazystream@npm:^1.0.0": + version: 1.0.1 + resolution: "lazystream@npm:1.0.1" + dependencies: + readable-stream: ^2.0.5 + checksum: 822c54c6b87701a6491c70d4fabc4cafcf0f87d6b656af168ee7bb3c45de9128a801cb612e6eeeefc64d298a7524a698dd49b13b0121ae50c2ae305f0dcc5310 + languageName: node + linkType: hard + "leac@npm:^0.6.0": version: 0.6.0 resolution: "leac@npm:0.6.0" @@ -33903,7 +34192,7 @@ __metadata: languageName: node linkType: hard -"lodash@npm:4.17.21, lodash@npm:^4.17.19, lodash@npm:^4.17.20, lodash@npm:^4.17.21": +"lodash@npm:4.17.21, lodash@npm:^4.17.15, lodash@npm:^4.17.19, lodash@npm:^4.17.20, lodash@npm:^4.17.21": version: 4.17.21 resolution: "lodash@npm:4.17.21" checksum: eb835a2e51d381e561e508ce932ea50a8e5a68f4ebdd771ea240d3048244a8d13658acbd502cd4829768c56f2e16bdd4340b9ea141297d472517b83868e677f7 @@ -34086,6 +34375,13 @@ __metadata: languageName: node linkType: hard +"lru-cache@npm:^10.3.0": + version: 10.4.3 + resolution: "lru-cache@npm:10.4.3" + checksum: 6476138d2125387a6d20f100608c2583d415a4f64a0fecf30c9e2dda976614f09cad4baa0842447bd37dd459a7bd27f57d9d8f8ce558805abd487c583f3d774a + languageName: node + linkType: hard + "lru-cache@npm:^5.1.1": version: 5.1.1 resolution: "lru-cache@npm:5.1.1" @@ -34307,6 
+34603,19 @@ __metadata: languageName: node linkType: hard +"mariadb@npm:^3.4.0": + version: 3.4.0 + resolution: "mariadb@npm:3.4.0" + dependencies: + "@types/geojson": ^7946.0.14 + "@types/node": ^22.5.4 + denque: ^2.1.0 + iconv-lite: ^0.6.3 + lru-cache: ^10.3.0 + checksum: 89e27ae2911541fa8ff5e5dfb20d5c4dd47005323027bf6bf2975a0710a2d4cde28e20cef4d7825058411ad673c9de1f64bc1f409f6b0e92237ddb0a0cc9d46f + languageName: node + linkType: hard + "markdown-escapes@npm:^1.0.0": version: 1.0.4 resolution: "markdown-escapes@npm:1.0.4" @@ -34655,7 +34964,7 @@ __metadata: languageName: node linkType: hard -"minimatch@npm:^5.0.1": +"minimatch@npm:^5.0.1, minimatch@npm:^5.1.0": version: 5.1.6 resolution: "minimatch@npm:5.1.6" dependencies: @@ -34810,6 +35119,13 @@ __metadata: languageName: node linkType: hard +"minipass@npm:^7.1.2": + version: 7.1.2 + resolution: "minipass@npm:7.1.2" + checksum: 2bfd325b95c555f2b4d2814d49325691c7bee937d753814861b0b49d5edcda55cbbf22b6b6a60bb91eddac8668771f03c5ff647dcd9d0f798e9548b9cdc46ee3 + languageName: node + linkType: hard + "minizlib@npm:^2.0.0, minizlib@npm:^2.1.1, minizlib@npm:^2.1.2": version: 2.1.2 resolution: "minizlib@npm:2.1.2" @@ -35123,6 +35439,15 @@ __metadata: languageName: node linkType: hard +"nan@npm:^2.19.0, nan@npm:^2.20.0": + version: 2.22.0 + resolution: "nan@npm:2.22.0" + dependencies: + node-gyp: latest + checksum: 222e3a090e326c72f6782d948f44ee9b81cfb2161d5fe53216f04426a273fd094deee9dcc6813096dd2397689a2b10c1a92d3885d2e73fd2488a51547beb2929 + languageName: node + linkType: hard + "nanoid@npm:^3.3.6": version: 3.3.6 resolution: "nanoid@npm:3.3.6" @@ -36568,6 +36893,13 @@ __metadata: languageName: node linkType: hard +"package-json-from-dist@npm:^1.0.0": + version: 1.0.1 + resolution: "package-json-from-dist@npm:1.0.1" + checksum: 58ee9538f2f762988433da00e26acc788036914d57c71c246bf0be1b60cdbd77dd60b6a3e1a30465f0b248aeb80079e0b34cb6050b1dfa18c06953bb1cbc7602 + languageName: node + linkType: hard + 
"package-json@npm:^10.0.0": version: 10.0.1 resolution: "package-json@npm:10.0.1" @@ -36863,6 +37195,16 @@ __metadata: languageName: node linkType: hard +"path-scurry@npm:^1.11.1": + version: 1.11.1 + resolution: "path-scurry@npm:1.11.1" + dependencies: + lru-cache: ^10.2.0 + minipass: ^5.0.0 || ^6.0.2 || ^7.0.0 + checksum: 890d5abcd593a7912dcce7cf7c6bf7a0b5648e3dee6caf0712c126ca0a65c7f3d7b9d769072a4d1baf370f61ce493ab5b038d59988688e0c5f3f646ee3c69023 + languageName: node + linkType: hard + "path-scurry@npm:^1.7.0": version: 1.9.2 resolution: "path-scurry@npm:1.9.2" @@ -38150,6 +38492,15 @@ __metadata: languageName: node linkType: hard +"properties-reader@npm:^2.3.0": + version: 2.3.0 + resolution: "properties-reader@npm:2.3.0" + dependencies: + mkdirp: ^1.0.4 + checksum: cbf59e862dc507f8ce1f8d7641ed9737119f16a1d4dad8e79f17b303aaca1c6af7d36ddfef0f649cab4d200ba4334ac159af0b238f6978a085f5b1b5126b6cc3 + languageName: node + linkType: hard + "property-information@npm:^5.0.0, property-information@npm:^5.3.0": version: 5.6.0 resolution: "property-information@npm:5.6.0" @@ -38778,7 +39129,7 @@ __metadata: languageName: node linkType: hard -"readable-stream@npm:3, readable-stream@npm:^3.0.6": +"readable-stream@npm:3, readable-stream@npm:^3.0.6, readable-stream@npm:^3.5.0": version: 3.6.2 resolution: "readable-stream@npm:3.6.2" dependencies: @@ -38789,7 +39140,7 @@ __metadata: languageName: node linkType: hard -"readable-stream@npm:>=4.0.0, readable-stream@npm:^4.5.2": +"readable-stream@npm:>=4.0.0, readable-stream@npm:^4.0.0, readable-stream@npm:^4.5.2": version: 4.5.2 resolution: "readable-stream@npm:4.5.2" dependencies: @@ -38802,7 +39153,7 @@ __metadata: languageName: node linkType: hard -"readable-stream@npm:^2.0.0, readable-stream@npm:^2.0.1, readable-stream@npm:^2.3.0, readable-stream@npm:^2.3.5, readable-stream@npm:~2.3.6": +"readable-stream@npm:^2.0.0, readable-stream@npm:^2.0.1, readable-stream@npm:^2.0.5, readable-stream@npm:^2.3.0, readable-stream@npm:^2.3.5, 
readable-stream@npm:~2.3.6": version: 2.3.8 resolution: "readable-stream@npm:2.3.8" dependencies: @@ -38837,6 +39188,15 @@ __metadata: languageName: node linkType: hard +"readdir-glob@npm:^1.1.2": + version: 1.1.3 + resolution: "readdir-glob@npm:1.1.3" + dependencies: + minimatch: ^5.1.0 + checksum: 1dc0f7440ff5d9378b593abe9d42f34ebaf387516615e98ab410cf3a68f840abbf9ff1032d15e0a0dbffa78f9e2c46d4fafdbaac1ca435af2efe3264e3f21874 + languageName: node + linkType: hard + "readdirp@npm:~3.6.0": version: 3.6.0 resolution: "readdirp@npm:3.6.0" @@ -39822,7 +40182,7 @@ __metadata: languageName: node linkType: hard -"safer-buffer@npm:>= 2.1.2 < 3, safer-buffer@npm:>= 2.1.2 < 3.0.0": +"safer-buffer@npm:>= 2.1.2 < 3, safer-buffer@npm:>= 2.1.2 < 3.0.0, safer-buffer@npm:~2.1.0": version: 2.1.2 resolution: "safer-buffer@npm:2.1.2" checksum: cab8f25ae6f1434abee8d80023d7e72b598cf1327164ddab31003c51215526801e40b66c5e65d658a0af1e9d6478cadcb4c745f4bd6751f97d8644786c0978b0 @@ -40756,6 +41116,13 @@ __metadata: languageName: node linkType: hard +"split-ca@npm:^1.0.1": + version: 1.0.1 + resolution: "split-ca@npm:1.0.1" + checksum: 1e7409938a95ee843fe2593156a5735e6ee63772748ee448ea8477a5a3e3abde193c3325b3696e56a5aff07c7dcf6b1f6a2f2a036895b4f3afe96abb366d893f + languageName: node + linkType: hard + "split2@npm:^4.1.0": version: 4.2.0 resolution: "split2@npm:4.2.0" @@ -40813,6 +41180,33 @@ __metadata: languageName: node linkType: hard +"ssh-remote-port-forward@npm:^1.0.4": + version: 1.0.4 + resolution: "ssh-remote-port-forward@npm:1.0.4" + dependencies: + "@types/ssh2": ^0.5.48 + ssh2: ^1.4.0 + checksum: c6c04c5ddfde7cb06e9a8655a152bd28fe6771c6fe62ff0bc08be229491546c410f30b153c968b8d6817a57d38678a270c228f30143ec0fe1be546efc4f6b65a + languageName: node + linkType: hard + +"ssh2@npm:^1.11.0, ssh2@npm:^1.4.0": + version: 1.16.0 + resolution: "ssh2@npm:1.16.0" + dependencies: + asn1: ^0.2.6 + bcrypt-pbkdf: ^1.0.2 + cpu-features: ~0.0.10 + nan: ^2.20.0 + dependenciesMeta: + cpu-features: + 
optional: true + nan: + optional: true + checksum: c024c4a432aae2457852037f31c0d9bec323fb062ace3a31e4a6dd6c55842246c80e7d20ff93ffed22dde1e523250d8438bc2f7d4a1450cf4fa4887818176f0e + languageName: node + linkType: hard + "ssri@npm:^10.0.0": version: 10.0.5 resolution: "ssri@npm:10.0.5" @@ -41548,6 +41942,18 @@ __metadata: languageName: node linkType: hard +"tar-fs@npm:~2.0.1": + version: 2.0.1 + resolution: "tar-fs@npm:2.0.1" + dependencies: + chownr: ^1.1.1 + mkdirp-classic: ^0.5.2 + pump: ^3.0.0 + tar-stream: ^2.0.0 + checksum: 26cd297ed2421bc8038ce1a4ca442296b53739f409847d495d46086e5713d8db27f2c03ba2f461d0f5ddbc790045628188a8544f8ae32cbb6238b279b68d0247 + languageName: node + linkType: hard + "tar-stream@npm:^1.5.2": version: 1.6.2 resolution: "tar-stream@npm:1.6.2" @@ -41563,7 +41969,7 @@ __metadata: languageName: node linkType: hard -"tar-stream@npm:^2.1.4": +"tar-stream@npm:^2.0.0, tar-stream@npm:^2.1.4": version: 2.2.0 resolution: "tar-stream@npm:2.2.0" dependencies: @@ -41576,6 +41982,17 @@ __metadata: languageName: node linkType: hard +"tar-stream@npm:^3.0.0": + version: 3.1.7 + resolution: "tar-stream@npm:3.1.7" + dependencies: + b4a: ^1.6.4 + fast-fifo: ^1.2.0 + streamx: ^2.15.0 + checksum: 6393a6c19082b17b8dcc8e7fd349352bb29b4b8bfe1075912b91b01743ba6bb4298f5ff0b499a3bbaf82121830e96a1a59d4f21a43c0df339e54b01789cb8cc6 + languageName: node + linkType: hard + "tar-stream@npm:^3.1.5, tar-stream@npm:^3.1.6": version: 3.1.6 resolution: "tar-stream@npm:3.1.6" @@ -41689,6 +42106,29 @@ __metadata: languageName: node linkType: hard +"testcontainers@npm:^10.16.0": + version: 10.16.0 + resolution: "testcontainers@npm:10.16.0" + dependencies: + "@balena/dockerignore": ^1.0.2 + "@types/dockerode": ^3.3.29 + archiver: ^7.0.1 + async-lock: ^1.4.1 + byline: ^5.0.0 + debug: ^4.3.5 + docker-compose: ^0.24.8 + dockerode: ^3.3.5 + get-port: ^5.1.1 + proper-lockfile: ^4.1.2 + properties-reader: ^2.3.0 + ssh-remote-port-forward: ^1.0.4 + tar-fs: ^3.0.6 + tmp: ^0.2.3 + undici: 
^5.28.4 + checksum: 69d56bc68daf9b3aa0fef524133d5ab85b6028210f4fd456c05fff5f48b9452c6ff44274be281dbf45690595df8b6914eaf5c3fade75191aeaaa3087abd4a23d + languageName: node + linkType: hard + "text-decoder@npm:^1.1.0": version: 1.1.1 resolution: "text-decoder@npm:1.1.1" @@ -41810,6 +42250,13 @@ __metadata: languageName: node linkType: hard +"tmp@npm:^0.2.3": + version: 0.2.3 + resolution: "tmp@npm:0.2.3" + checksum: 73b5c96b6e52da7e104d9d44afb5d106bb1e16d9fa7d00dbeb9e6522e61b571fbdb165c756c62164be9a3bbe192b9b268c236d370a2a0955c7689cd2ae377b95 + languageName: node + linkType: hard + "tmpl@npm:1.0.5": version: 1.0.5 resolution: "tmpl@npm:1.0.5" @@ -42238,6 +42685,13 @@ __metadata: languageName: node linkType: hard +"tweetnacl@npm:^0.14.3": + version: 0.14.5 + resolution: "tweetnacl@npm:0.14.5" + checksum: 6061daba1724f59473d99a7bb82e13f211cdf6e31315510ae9656fefd4779851cb927adad90f3b488c8ed77c106adc0421ea8055f6f976ff21b27c5c4e918487 + languageName: node + linkType: hard + "type-check@npm:^0.4.0, type-check@npm:~0.4.0": version: 0.4.0 resolution: "type-check@npm:0.4.0" @@ -42844,6 +43298,13 @@ __metadata: languageName: node linkType: hard +"undici-types@npm:~6.20.0": + version: 6.20.0 + resolution: "undici-types@npm:6.20.0" + checksum: b7bc50f012dc6afbcce56c9fd62d7e86b20a62ff21f12b7b5cbf1973b9578d90f22a9c7fe50e638e96905d33893bf2f9f16d98929c4673c2480de05c6c96ea8b + languageName: node + linkType: hard + "undici@npm:5.27.2": version: 5.27.2 resolution: "undici@npm:5.27.2" @@ -42871,7 +43332,7 @@ __metadata: languageName: node linkType: hard -"undici@npm:~5.28.4": +"undici@npm:^5.28.4, undici@npm:~5.28.4": version: 5.28.4 resolution: "undici@npm:5.28.4" dependencies: @@ -44534,6 +44995,15 @@ __metadata: languageName: node linkType: hard +"yaml@npm:^2.2.2": + version: 2.6.1 + resolution: "yaml@npm:2.6.1" + bin: + yaml: bin.mjs + checksum: 5cf2627f121dcf04ccdebce8e6cbac7c9983d465c4eab314f6fbdc13cda8a07f4e8f9c2252a382b30bcabe05ee3c683647293afd52eb37cbcefbdc7b6ebde9ee + 
languageName: node + linkType: hard + "yaml@npm:^2.4.5": version: 2.4.5 resolution: "yaml@npm:2.4.5" @@ -44653,6 +45123,17 @@ __metadata: languageName: node linkType: hard +"zip-stream@npm:^6.0.1": + version: 6.0.1 + resolution: "zip-stream@npm:6.0.1" + dependencies: + archiver-utils: ^5.0.0 + compress-commons: ^6.0.2 + readable-stream: ^4.0.0 + checksum: aa5abd6a89590eadeba040afbc375f53337f12637e5e98330012a12d9886cde7a3ccc28bd91aafab50576035bbb1de39a9a316eecf2411c8b9009c9f94f0db27 + languageName: node + linkType: hard + "zod-to-json-schema@npm:3.20.3": version: 3.20.3 resolution: "zod-to-json-schema@npm:3.20.3"