diff --git a/site/en/about/comparison.md b/site/en/about/comparison.md index f7c0ed0e2..4a3bd3016 100644 --- a/site/en/about/comparison.md +++ b/site/en/about/comparison.md @@ -52,7 +52,7 @@ Although both serve similar functions as vector databases, the domain-specific t | Deployment Modes | SaaS-only | Milvus Lite, On-prem Standalone & Cluster, Zilliz Cloud Saas & BYOC | | Embedding Functions | Not available | Support with pymilvus[model] | | Data Types | String, Number, Bool, List of String | String, VarChar, Number (Int, Float, Double), Bool, Array, JSON, Float Vector, Binary Vector, BFloat16, Float16, Sparse Vector | -| Metric and Index Types | Cos, Dot, Euclidean
P-family, S-family | Cosine, IP (Dot), L2 (Euclidean), Hamming, Jaccard
FLAT, IVF_FLAT, IVF_SQ8, IVF_PQ, HNSW, SCANN, GPU Indexes | +| Metric and Index Types | Cos, Dot, Euclidean
P-family, S-family | Cosine, IP (Dot), L2 (Euclidean), Hamming, Jaccard
FLAT, IVF_FLAT, IVF_SQ8, IVF_PQ, HNSW, SCANN, GPU Indexes | | Schema Design | Flexible mode | Flexible mode, Strict mode | | Multiple Vector Fields | N/A | Multi-vector and hybrid search | | Tools | Datasets, text utilities, spark connector | Attu, Birdwatcher, Backup, CLI, CDC, Spark and Kafka connectors | diff --git a/site/en/about/roadmap.md b/site/en/about/roadmap.md index 977e10c91..a96754adc 100644 --- a/site/en/about/roadmap.md +++ b/site/en/about/roadmap.md @@ -22,28 +22,28 @@ Welcome to the Milvus Roadmap! Join us on our continuous journey to enhance and - AI-developer Friendly
A developer-friendly technology stack, enhanced with the latest AI innovations - Multi-Vectors & Hybrid Search
Framework for multiplex recall and fusion

GPU Index Acceleration
Support for higher QPS and faster index creation

Model Library in PyMilvus
Integrated embedding models for Milvus - Sparse Vector (GA)
Local feature extraction and keyword search

Milvus Lite (GA)
A lightweight, in-memory version of Milvus

Embedding Models Gallery
Support for image and multi-modal embeddings and reranker models in model libraries - Original Data-In and Data-Out
Support for Blob data types

Data Clustering
Data co-locality

Scenario-oriented Vector Search
e.g. Multi-target search & NN filtering

Support Embedding & Reranker Endpoint + AI-developer Friendly
A developer-friendly technology stack, enhanced with the latest AI innovations + Multi-Vectors & Hybrid Search
Framework for multiplex recall and fusion

GPU Index Acceleration
Support for higher QPS and faster index creation

Model Library in PyMilvus
Integrated embedding models for Milvus + Sparse Vector (GA)
Local feature extraction and keyword search

Milvus Lite (GA)
A lightweight, in-memory version of Milvus

Embedding Models Gallery
Support for image and multi-modal embeddings and reranker models in model libraries + Original Data-In and Data-Out
Support for Blob data types

Data Clustering
Data co-locality

Scenario-oriented Vector Search
e.g. Multi-target search & NN filtering

Support Embedding & Reranker Endpoint - Rich Functionality
Enhanced retrieval and data management features - Support for FP16, BF16 Datatypes
These ML datatypes can help reduce memory usage

Grouping Search
Aggregate split embeddings

Fuzzy Match and Inverted Index
Support for fuzzy matching and inverted indexing for scalar types like varchar and int - Inverted Index for Array & JSON
Indexing for array and partial support JSON

Bitset Index
Improved execution speed and future data aggregation

Truncate Collection
Allows data clearance while preserving metadata

Support for NULL and Default Values - Support for More Datatypes
e.g. Datetime, GIS

Advanced Text Filtering
e.g. Match Phrase

Primary Key Deduplication + Rich Functionality
Enhanced retrieval and data management features + Support for FP16, BF16 Datatypes
These ML datatypes can help reduce memory usage

Grouping Search
Aggregate split embeddings

Fuzzy Match and Inverted Index
Support for fuzzy matching and inverted indexing for scalar types like varchar and int + Inverted Index for Array & JSON
Indexing for arrays and partial support for JSON

Bitset Index
Improved execution speed and future data aggregation

Truncate Collection
Allows data clearance while preserving metadata

Support for NULL and Default Values + Support for More Datatypes
e.g. Datetime, GIS

Advanced Text Filtering
e.g. Match Phrase

Primary Key Deduplication - Cost Efficiency & Architecture
Advanced systems emphasizing stability, cost efficiency, scalability, and performance - Support for More Collections/Partitions
Handles over 10,000 collections in smaller clusters

Mmap Optimization
Balances reduced memory consumption with latency

Bulk Insert Optimazation
Simplifies importing large datasets - Lazy Load
Data is loaded on-demand through read operations

Major Compaction
Re-distributes data based on configuration to enhance read performance

Mmap for Growing Data
Mmap files for expanding data segments - Memory Control
Reduces out-of-memory issues and provides global memory management

LogNode Introduction
Ensures global consistency and addresses the single-point bottleneck in root coordination

Storage Format V2
Universal format design lays the groundwork for disk-based data access + Cost Efficiency & Architecture
Advanced systems emphasizing stability, cost efficiency, scalability, and performance + Support for More Collections/Partitions
Handles over 10,000 collections in smaller clusters

Mmap Optimization
Balances reduced memory consumption with latency

Bulk Insert Optimization
Simplifies importing large datasets + Lazy Load
Data is loaded on-demand through read operations

Major Compaction
Re-distributes data based on configuration to enhance read performance

Mmap for Growing Data
Mmap files for expanding data segments + Memory Control
Reduces out-of-memory issues and provides global memory management

LogNode Introduction
Ensures global consistency and addresses the single-point bottleneck in root coordination

Storage Format V2
Universal format design lays the groundwork for disk-based data access - Enterprise Ready
Designed to meet the needs of enterprise production environments - Milvus CDC
Capability for data replication

Accesslog Enhancement
Detailed recording for audit and tracing - New Resource Group
Enhanced resource management

Storage Hook
Support for Bring Your Own Key (BYOK) encryption - Dynamic Replica Number Adjustment
Facilitates dynamic changes to the number of replicas

Dynamic Schema Modification
e.g., Add/delete fields, modify varchar lengths

Rust and C# SDKs + Enterprise Ready
Designed to meet the needs of enterprise production environments + Milvus CDC
Capability for data replication

Accesslog Enhancement
Detailed recording for audit and tracing + New Resource Group
Enhanced resource management

Storage Hook
Support for Bring Your Own Key (BYOK) encryption + Dynamic Replica Number Adjustment
Facilitates dynamic changes to the number of replicas

Dynamic Schema Modification
e.g., Add/delete fields, modify varchar lengths

Rust and C# SDKs diff --git a/site/en/getstarted/run-milvus-gpu/install_cluster-helm-gpu.md b/site/en/getstarted/run-milvus-gpu/install_cluster-helm-gpu.md index cb3b1f28b..0024338d8 100644 --- a/site/en/getstarted/run-milvus-gpu/install_cluster-helm-gpu.md +++ b/site/en/getstarted/run-milvus-gpu/install_cluster-helm-gpu.md @@ -174,7 +174,7 @@ In addition to a single GPU device, you can also assign multiple GPU devices to diff --git a/site/en/getstarted/run-milvus-k8s/install_cluster-helm.md b/site/en/getstarted/run-milvus-k8s/install_cluster-helm.md index 1db2327e8..38c077d81 100644 --- a/site/en/getstarted/run-milvus-k8s/install_cluster-helm.md +++ b/site/en/getstarted/run-milvus-k8s/install_cluster-helm.md @@ -85,7 +85,7 @@ The command above deploys a Milvus cluster with its components and dependencies diff --git a/site/en/migrate/es2m.md b/site/en/migrate/es2m.md index 88317c25e..e521541a7 100644 --- a/site/en/migrate/es2m.md +++ b/site/en/migrate/es2m.md @@ -118,7 +118,7 @@ The following table describes the parameters in the example config file. For a f | Parameter | Description | | --- | --- | - | `target.mode` | Storage location for dumped files. Valid values:
- `local`: Store dumped files on local disks.
- `remote`: Store dumped files on object storage. | + | `target.mode` | Storage location for dumped files. Valid values:
- `local`: Store dumped files on local disks.
- `remote`: Store dumped files on object storage. | | `target.remote.outputDir` | Output directory path in the cloud storage bucket. | | `target.remote.cloud` | Cloud storage service provider. Example values: `aws`, `gcp`, `azure`. | | `target.remote.region` | Cloud storage region. It can be any value if you use local MinIO. | diff --git a/site/en/migrate/f2m.md b/site/en/migrate/f2m.md index 637d7123c..77534510d 100644 --- a/site/en/migrate/f2m.md +++ b/site/en/migrate/f2m.md @@ -87,7 +87,7 @@ The following table describes the parameters in the example config file. For a f | Parameter | Description | | --- | --- | - | `source.mode` | Specifies where the source files are read from. Valid values:
- `local`: reads files from a local disk.
- `remote`: reads files from remote storage. | + | `source.mode` | Specifies where the source files are read from. Valid values:
- `local`: reads files from a local disk.
- `remote`: reads files from remote storage. | | `source.local.faissFile` | The directory path where the source files are located. For example, `/db/faiss.index`. | - `target` @@ -98,7 +98,7 @@ The following table describes the parameters in the example config file. For a f | `target.create.collection.shardsNums` | Number of shards to be created in the collection. For more information on shards, refer to [Terminology](https://milvus.io/docs/glossary.md#Shard). | | `target.create.collection.dim` | Dimension of the vector field. | | `target.create.collection.metricType` | Metric type used to measure similarities between vectors. For more information, refer to [Terminology](https://milvus.io/docs/glossary.md#Metric-type). | - | `target.mode` | Storage location for dumped files. Valid values:
- `local`: Store dumped files on local disks.
- `remote`: Store dumped files on object storage. | + | `target.mode` | Storage location for dumped files. Valid values:
- `local`: Store dumped files on local disks.
- `remote`: Store dumped files on object storage. | | `target.remote.outputDir` | Output directory path in the cloud storage bucket. | | `target.remote.cloud` | Cloud storage service provider. Example values: `aws`, `gcp`, `azure`. | | `target.remote.endpoint` | Endpoint of Milvus 2.x storage. | diff --git a/site/en/migrate/m2m.md b/site/en/migrate/m2m.md index c01a32535..caddd36d1 100644 --- a/site/en/migrate/m2m.md +++ b/site/en/migrate/m2m.md @@ -114,14 +114,14 @@ The following table describes the parameters in the example config file. For a f | Parameter | Description | | --- | --- | - | `source.mode` | Specifies where the source files are read from. Valid values:
- `local`: reads files from a local disk.
- `remote`: reads files from remote storage. | + | `source.mode` | Specifies where the source files are read from. Valid values:
- `local`: reads files from a local disk.
- `remote`: reads files from remote storage. | | `source.local.tablesDir` | The directory path where the source files are located. For example, `/db/tables/`. | - `target` | Parameter | Description | | --- | --- | - | `target.mode` | Storage location for dumped files. Valid values:
- `local`: Store dumped files on local disks.
- `remote`: Store dumped files on object storage. | + | `target.mode` | Storage location for dumped files. Valid values:
- `local`: Store dumped files on local disks.
- `remote`: Store dumped files on object storage. | | `target.remote.outputDir` | Output directory path in the cloud storage bucket. | | `target.remote.ak` | Access key for Milvus 2.x storage. | | `target.remote.sk` | Secret key for Milvus 2.x storage. | diff --git a/site/en/reference/disk_index.md b/site/en/reference/disk_index.md index 6e1aa1a36..746c10762 100644 --- a/site/en/reference/disk_index.md +++ b/site/en/reference/disk_index.md @@ -61,10 +61,10 @@ DiskIndex: | Parameter | Description | Value Range | Default Value | | --- | --- | --- | --- | -| `MaxDegree` | Maximum degree of the Vamana graph.
A larger value offers a higher recall rate but increases the size of and time to build the index. | [1, 512] | 56 | -| `SearchListSize` | Size of the candidate list.
A larger value increases the time spent on building the index but offers a higher recall rate.
Set it to a value smaller than `MaxDegree` unless you need to reduce the index-building time. | [1, int32_max] | 100 | -| `PQCodeBugetGBRatio` | Size limit on the PQ code.
A larger value offers a higher recall rate but increases memory usage. | (0.0, 0.25] | 0.125 | -| `SearchCacheBudgetGBRatio` | Ratio of cached node numbers to raw data.
A larger value improves index-building performance with increased memory usage. | [0.0, 0.3) | 0.10 | +| `MaxDegree` | Maximum degree of the Vamana graph.
A larger value offers a higher recall rate but increases the size of and time to build the index. | [1, 512] | 56 | +| `SearchListSize` | Size of the candidate list.
A larger value increases the time spent on building the index but offers a higher recall rate.
Set it to a value smaller than `MaxDegree` unless you need to reduce the index-building time. | [1, int32_max] | 100 | +| `PQCodeBugetGBRatio` | Size limit on the PQ code.
A larger value offers a higher recall rate but increases memory usage. | (0.0, 0.25] | 0.125 | +| `SearchCacheBudgetGBRatio` | Ratio of cached node numbers to raw data.
A larger value improves index-building performance with increased memory usage. | [0.0, 0.3) | 0.10 | | `BeamWidthRatio` | Ratio between the maximum number of IO requests per search iteration and CPU number. | [1, max(128 / CPU number, 16)] | 4.0 | ## Troubleshooting diff --git a/site/en/userGuide/manage-collections.md b/site/en/userGuide/manage-collections.md index 6af8c27c2..54414853a 100644 --- a/site/en/userGuide/manage-collections.md +++ b/site/en/userGuide/manage-collections.md @@ -335,11 +335,11 @@ export fields='[{ \ auto_id - Determines if the primary field automatically increments.
Setting this to True makes the primary field automatically increment. In this case, the primary field should not be included in the data to insert to avoid errors. The auto-generated IDs have a fixed length and cannot be altered. + Determines if the primary field automatically increments.
Setting this to True makes the primary field automatically increment. In this case, the primary field should not be included in the data to insert to avoid errors. The auto-generated IDs have a fixed length and cannot be altered. enable_dynamic_field - Determines if Milvus saves the values of undefined fields in a dynamic field if the data being inserted into the target collection includes fields that are not defined in the collection's schema.
When you set this to True, Milvus will create a field called $meta to store any undefined fields and their values from the data that is inserted. + Determines if Milvus saves the values of undefined fields in a dynamic field if the data being inserted into the target collection includes fields that are not defined in the collection's schema.
When you set this to True, Milvus will create a field called $meta to store any undefined fields and their values from the data that is inserted. field_name @@ -351,11 +351,11 @@ export fields='[{ \ is_primary - Whether the current field is the primary field in a collection.
Each collection has only one primary field. A primary field should be of either the DataType.INT64 type or the DataType.VARCHAR type. + Whether the current field is the primary field in a collection.
Each collection has only one primary field. A primary field should be of either the DataType.INT64 type or the DataType.VARCHAR type. dim - The dimension of the vector embeddings.
This is mandatory for a field of the DataType.FLOAT_VECTOR, DataType.BINARY_VECTOR, DataType.FLOAT16_VECTOR, or DataType.BFLOAT16_VECTOR type. If you use DataType.SPARSE_FLOAT_VECTOR, omit this parameter. + The dimension of the vector embeddings.
This is mandatory for a field of the DataType.FLOAT_VECTOR, DataType.BINARY_VECTOR, DataType.FLOAT16_VECTOR, or DataType.BFLOAT16_VECTOR type. If you use DataType.SPARSE_FLOAT_VECTOR, omit this parameter. @@ -378,15 +378,15 @@ export fields='[{ \ isPrimaryKey - Whether the current field is the primary field in a collection.
Each collection has only one primary field. A primary field should be of either the DataType.Int64 type or the DataType.VarChar type. + Whether the current field is the primary field in a collection.
Each collection has only one primary field. A primary field should be of either the DataType.Int64 type or the DataType.VarChar type. autoID - Whether allows the primary field to automatically increment.
Setting this to true makes the primary field automatically increment. In this case, the primary field should not be included in the data to insert to avoid errors. + Whether to allow the primary field to automatically increment.
Setting this to true makes the primary field automatically increment. In this case, the primary field should not be included in the data to insert to avoid errors. dimension - The dimension of the vector embeddings.
This is mandatory for a field of the DataType.FloatVector, DataType.BinaryVector, DataType.Float16Vector, or DataType.BFloat16Vector type. + The dimension of the vector embeddings.
This is mandatory for a field of the DataType.FloatVector, DataType.BinaryVector, DataType.Float16Vector, or DataType.BFloat16Vector type. @@ -409,15 +409,15 @@ export fields='[{ \ is_primary_key - Whether the current field is the primary field in a collection.
Each collection has only one primary field. A primary field should be of either the DataType.INT64 type or the DataType.VARCHAR type. + Whether the current field is the primary field in a collection.
Each collection has only one primary field. A primary field should be of either the DataType.INT64 type or the DataType.VARCHAR type. auto_id - Whether the primary field automatically increments upon data insertions into this collection.
The value defaults to False. Setting this to True makes the primary field automatically increment. Skip this parameter if you need to set up a collection with a customized schema. + Whether the primary field automatically increments upon data insertions into this collection.
The value defaults to False. Setting this to True makes the primary field automatically increment. Skip this parameter if you need to set up a collection with a customized schema. dim - The dimensionality of the collection field that holds vector embeddings.
The value should be an integer greater than 1 and is usually determined by the model you use to generate vector embeddings. + The dimensionality of the collection field that holds vector embeddings.
The value should be an integer greater than 1 and is usually determined by the model you use to generate vector embeddings. @@ -980,11 +980,11 @@ Use [createCollection()](https://milvus.io/api-reference/node/v2.4.x/Collections schema - The schema of this collection.
Setting this to None indicates this collection will be created with default settings.
To set up a collection with a customized schema, you need to create a CollectionSchema object and reference it here. In this case, Milvus ignores all other schema-related settings carried in the request. + The schema of this collection.
Setting this to None indicates this collection will be created with default settings.
To set up a collection with a customized schema, you need to create a CollectionSchema object and reference it here. In this case, Milvus ignores all other schema-related settings carried in the request. index_params - The parameters for building the index on the vector field in this collection. To set up a collection with a customized schema and automatically load the collection to memory, you need to create an IndexParams object and reference it here.
You should at least add an index for the vector field in this collection. You can also skip this parameter if you prefer to set up the index parameters later on. + The parameters for building the index on the vector field in this collection. To set up a collection with a customized schema and automatically load the collection to memory, you need to create an IndexParams object and reference it here.
You should at least add an index for the vector field in this collection. You can also skip this parameter if you prefer to set up the index parameters later on. @@ -1003,7 +1003,7 @@ Use [createCollection()](https://milvus.io/api-reference/node/v2.4.x/Collections collectionSchema - The schema of this collection.
Leaving it empty indicates this collection will be created with default settings. To set up a collection with a customized schema, you need to create a CollectionSchema object and reference it here. + The schema of this collection.
Leaving it empty indicates this collection will be created with default settings. To set up a collection with a customized schema, you need to create a CollectionSchema object and reference it here. indexParams diff --git a/site/en/userGuide/search-query-get/single-vector-search.md b/site/en/userGuide/search-query-get/single-vector-search.md index fccbd81d1..96e1ecd2a 100644 --- a/site/en/userGuide/search-query-get/single-vector-search.md +++ b/site/en/userGuide/search-query-get/single-vector-search.md @@ -478,15 +478,15 @@ console.log(res.results) data - A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones. + A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones. limit - The total number of entities to return.
You can use this parameter in combination with offset in param to enable pagination.
The sum of this value and offset in param should be less than 16,384. + The total number of entities to return.
You can use this parameter in combination with offset in param to enable pagination.
The sum of this value and offset in param should be less than 16,384. search_params - The parameter settings specific to this operation.
+ The parameter settings specific to this operation.
@@ -505,11 +505,11 @@ console.log(res.results) data - A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones. + A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones. topK - The number of records to return in the search result. This parameter uses the same syntax as the limit parameter, so you should only set one of them.
You can use this parameter in combination with offset in param to enable pagination.
The sum of this value and offset in param should be less than 16,384. + The number of records to return in the search result. This parameter uses the same syntax as the limit parameter, so you should only set one of them.
You can use this parameter in combination with offset in param to enable pagination.
The sum of this value and offset in param should be less than 16,384. @@ -528,11 +528,11 @@ console.log(res.results) data - A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones. + A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones. limit - The total number of entities to return.
You can use this parameter in combination with offset in param to enable pagination.
The sum of this value and offset in param should be less than 16,384. + The total number of entities to return.
You can use this parameter in combination with offset in param to enable pagination.
The sum of this value and offset in param should be less than 16,384. diff --git a/site/en/userGuide/search-query-get/with-iterators.md b/site/en/userGuide/search-query-get/with-iterators.md index e32d3eba3..cdcb1d8b6 100644 --- a/site/en/userGuide/search-query-get/with-iterators.md +++ b/site/en/userGuide/search-query-get/with-iterators.md @@ -362,7 +362,7 @@ System.out.println(results.size()); data - A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones. + A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones. anns_field @@ -370,19 +370,19 @@ System.out.println(results.size()); batch_size - The number of entities to return each time you call next() on the current iterator.
The value defaults to 1000. Set it to a proper value to control the number of entities to return per iteration. + The number of entities to return each time you call next() on the current iterator.
The value defaults to 1000. Set it to a proper value to control the number of entities to return per iteration. param - The parameter settings specific to this operation.
+ The parameter settings specific to this operation.
output_fields - A list of field names to include in each entity in return.
The value defaults to None. If left unspecified, only the primary field is included. + A list of field names to include in each entity in return.
The value defaults to None. If left unspecified, only the primary field is included. limit - The total number of entities to return.
The value defaults to -1, indicating all matching entities will be in return. + The total number of entities to return.
The value defaults to -1, indicating that all matching entities are returned. @@ -409,7 +409,7 @@ System.out.println(results.size()); withBatchSize - The number of entities to return each time you call next() on the current iterator.
The value defaults to 1000. Set it to a proper value to control the number of entities to return per iteration. + The number of entities to return each time you call next() on the current iterator.
The value defaults to 1000. Set it to a proper value to control the number of entities to return per iteration. withParams @@ -551,19 +551,19 @@ while (true) { batch_size - The number of entities to return each time you call next() on the current iterator.
The value defaults to 1000. Set it to a proper value to control the number of entities to return per iteration. + The number of entities to return each time you call next() on the current iterator.
The value defaults to 1000. Set it to a proper value to control the number of entities to return per iteration. expr - A scalar filtering condition to filter matching entities.
The value defaults to None, indicating that scalar filtering is ignored. To build a scalar filtering condition, refer to Boolean Expression Rules. + A scalar filtering condition to filter matching entities.
The value defaults to None, indicating that scalar filtering is ignored. To build a scalar filtering condition, refer to Boolean Expression Rules. output_fields - A list of field names to include in each entity in return.
The value defaults to None. If left unspecified, only the primary field is included. + A list of field names to include in each entity in return.
The value defaults to None. If left unspecified, only the primary field is included. limit - The total number of entities to return.
The value defaults to -1, indicating all matching entities will be in return. + The total number of entities to return.
The value defaults to -1, indicating that all matching entities are returned. @@ -586,7 +586,7 @@ while (true) { withBatchSize - The number of entities to return each time you call next() on the current iterator.
The value defaults to 1000. Set it to a proper value to control the number of entities to return per iteration. + The number of entities to return each time you call next() on the current iterator.
The value defaults to 1000. Set it to a proper value to control the number of entities to return per iteration. addOutField diff --git a/site/zh/about/comparison.md b/site/zh/about/comparison.md index f7c0ed0e2..4a3bd3016 100644 --- a/site/zh/about/comparison.md +++ b/site/zh/about/comparison.md @@ -52,7 +52,7 @@ Although both serve similar functions as vector databases, the domain-specific t | Deployment Modes | SaaS-only | Milvus Lite, On-prem Standalone & Cluster, Zilliz Cloud Saas & BYOC | | Embedding Functions | Not available | Support with pymilvus[model] | | Data Types | String, Number, Bool, List of String | String, VarChar, Number (Int, Float, Double), Bool, Array, JSON, Float Vector, Binary Vector, BFloat16, Float16, Sparse Vector | -| Metric and Index Types | Cos, Dot, Euclidean
P-family, S-family | Cosine, IP (Dot), L2 (Euclidean), Hamming, Jaccard
FLAT, IVF_FLAT, IVF_SQ8, IVF_PQ, HNSW, SCANN, GPU Indexes | +| Metric and Index Types | Cos, Dot, Euclidean
P-family, S-family | Cosine, IP (Dot), L2 (Euclidean), Hamming, Jaccard
FLAT, IVF_FLAT, IVF_SQ8, IVF_PQ, HNSW, SCANN, GPU Indexes | | Schema Design | Flexible mode | Flexible mode, Strict mode | | Multiple Vector Fields | N/A | Multi-vector and hybrid search | | Tools | Datasets, text utilities, spark connector | Attu, Birdwatcher, Backup, CLI, CDC, Spark and Kafka connectors | diff --git a/site/zh/about/roadmap.md b/site/zh/about/roadmap.md index 977e10c91..a96754adc 100644 --- a/site/zh/about/roadmap.md +++ b/site/zh/about/roadmap.md @@ -22,28 +22,28 @@ Welcome to the Milvus Roadmap! Join us on our continuous journey to enhance and - AI-developer Friendly
A developer-friendly technology stack, enhanced with the latest AI innovations - Multi-Vectors & Hybrid Search
Framework for multiplex recall and fusion

GPU Index Acceleration
Support for higher QPS and faster index creation

Model Library in PyMilvus
Integrated embedding models for Milvus - Sparse Vector (GA)
Local feature extraction and keyword search

Milvus Lite (GA)
A lightweight, in-memory version of Milvus

Embedding Models Gallery
Support for image and multi-modal embeddings and reranker models in model libraries - Original Data-In and Data-Out
Support for Blob data types

Data Clustering
Data co-locality

Scenario-oriented Vector Search
e.g. Multi-target search & NN filtering

Support Embedding & Reranker Endpoint + AI-developer Friendly
A developer-friendly technology stack, enhanced with the latest AI innovations + Multi-Vectors & Hybrid Search
Framework for multiplex recall and fusion

GPU Index Acceleration
Support for higher QPS and faster index creation

Model Library in PyMilvus
Integrated embedding models for Milvus + Sparse Vector (GA)
Local feature extraction and keyword search

Milvus Lite (GA)
A lightweight, in-memory version of Milvus

Embedding Models Gallery
Support for image and multi-modal embeddings and reranker models in model libraries + Original Data-In and Data-Out
Support for Blob data types

Data Clustering
Data co-locality

Scenario-oriented Vector Search
e.g. Multi-target search & NN filtering

Support Embedding & Reranker Endpoint - Rich Functionality
Enhanced retrieval and data management features - Support for FP16, BF16 Datatypes
These ML datatypes can help reduce memory usage

Grouping Search
Aggregate split embeddings

Fuzzy Match and Inverted Index
Support for fuzzy matching and inverted indexing for scalar types like varchar and int - Inverted Index for Array & JSON
Indexing for array and partial support JSON

Bitset Index
Improved execution speed and future data aggregation

Truncate Collection
Allows data clearance while preserving metadata

Support for NULL and Default Values - Support for More Datatypes
e.g. Datetime, GIS

Advanced Text Filtering
e.g. Match Phrase

Primary Key Deduplication + Rich Functionality
Enhanced retrieval and data management features + Support for FP16, BF16 Datatypes
These ML datatypes can help reduce memory usage

Grouping Search
Aggregate split embeddings

Fuzzy Match and Inverted Index
Support for fuzzy matching and inverted indexing for scalar types like varchar and int + Inverted Index for Array & JSON
Indexing for arrays and partial support for JSON

Bitset Index
Improved execution speed and future data aggregation

Truncate Collection
Allows data clearance while preserving metadata

Support for NULL and Default Values + Support for More Datatypes
e.g. Datetime, GIS

Advanced Text Filtering
e.g. Match Phrase

Primary Key Deduplication - Cost Efficiency & Architecture
Advanced systems emphasizing stability, cost efficiency, scalability, and performance - Support for More Collections/Partitions
Handles over 10,000 collections in smaller clusters

Mmap Optimization
Balances reduced memory consumption with latency

Bulk Insert Optimization
Simplifies importing large datasets - Lazy Load
Data is loaded on-demand through read operations

Major Compaction
Re-distributes data based on configuration to enhance read performance

Mmap for Growing Data
Mmap files for expanding data segments - Memory Control
Reduces out-of-memory issues and provides global memory management

LogNode Introduction
Ensures global consistency and addresses the single-point bottleneck in root coordination

Storage Format V2
Universal format design lays the groundwork for disk-based data access + Cost Efficiency & Architecture
Advanced systems emphasizing stability, cost efficiency, scalability, and performance + Support for More Collections/Partitions
Handles over 10,000 collections in smaller clusters

Mmap Optimization
Balances reduced memory consumption with latency

Bulk Insert Optimization
Simplifies importing large datasets + Lazy Load
Data is loaded on-demand through read operations

Major Compaction
Re-distributes data based on configuration to enhance read performance

Mmap for Growing Data
Mmap files for expanding data segments + Memory Control
Reduces out-of-memory issues and provides global memory management

LogNode Introduction
Ensures global consistency and addresses the single-point bottleneck in root coordination

Storage Format V2
Universal format design lays the groundwork for disk-based data access - Enterprise Ready
Designed to meet the needs of enterprise production environments - Milvus CDC
Capability for data replication

Accesslog Enhancement
Detailed recording for audit and tracing - New Resource Group
Enhanced resource management

Storage Hook
Support for Bring Your Own Key (BYOK) encryption - Dynamic Replica Number Adjustment
Facilitates dynamic changes to the number of replicas

Dynamic Schema Modification
e.g., Add/delete fields, modify varchar lengths

Rust and C# SDKs + Enterprise Ready
Designed to meet the needs of enterprise production environments + Milvus CDC
Capability for data replication

Accesslog Enhancement
Detailed recording for audit and tracing + New Resource Group
Enhanced resource management

Storage Hook
Support for Bring Your Own Key (BYOK) encryption + Dynamic Replica Number Adjustment
Facilitates dynamic changes to the number of replicas

Dynamic Schema Modification
e.g., Add/delete fields, modify varchar lengths

Rust and C# SDKs diff --git a/site/zh/getstarted/run-milvus-gpu/install_cluster-helm-gpu.md b/site/zh/getstarted/run-milvus-gpu/install_cluster-helm-gpu.md index cb3b1f28b..0024338d8 100644 --- a/site/zh/getstarted/run-milvus-gpu/install_cluster-helm-gpu.md +++ b/site/zh/getstarted/run-milvus-gpu/install_cluster-helm-gpu.md @@ -174,7 +174,7 @@ In addition to a single GPU device, you can also assign multiple GPU devices to diff --git a/site/zh/getstarted/run-milvus-k8s/install_cluster-helm.md b/site/zh/getstarted/run-milvus-k8s/install_cluster-helm.md index 1db2327e8..38c077d81 100644 --- a/site/zh/getstarted/run-milvus-k8s/install_cluster-helm.md +++ b/site/zh/getstarted/run-milvus-k8s/install_cluster-helm.md @@ -85,7 +85,7 @@ The command above deploys a Milvus cluster with its components and dependencies diff --git a/site/zh/migrate/es2m.md b/site/zh/migrate/es2m.md index 88317c25e..e521541a7 100644 --- a/site/zh/migrate/es2m.md +++ b/site/zh/migrate/es2m.md @@ -118,7 +118,7 @@ The following table describes the parameters in the example config file. For a f | Parameter | Description | | --- | --- | - | `target.mode` | Storage location for dumped files. Valid values:
- `local`: Store dumped files on local disks.
- `remote`: Store dumped files on object storage. | + | `target.mode` | Storage location for dumped files. Valid values:
- `local`: Store dumped files on local disks.
- `remote`: Store dumped files on object storage. | | `target.remote.outputDir` | Output directory path in the cloud storage bucket. | | `target.remote.cloud` | Cloud storage service provider. Example values: `aws`, `gcp`, `azure`. | | `target.remote.region` | Cloud storage region. It can be any value if you use local MinIO. | diff --git a/site/zh/migrate/f2m.md b/site/zh/migrate/f2m.md index 637d7123c..77534510d 100644 --- a/site/zh/migrate/f2m.md +++ b/site/zh/migrate/f2m.md @@ -87,7 +87,7 @@ The following table describes the parameters in the example config file. For a f | Parameter | Description | | --- | --- | - | `source.mode` | Specifies where the source files are read from. Valid values:
- `local`: reads files from a local disk.
- `remote`: reads files from remote storage. | + | `source.mode` | Specifies where the source files are read from. Valid values:
- `local`: reads files from a local disk.
- `remote`: reads files from remote storage. | | `source.local.faissFile` | The directory path where the source files are located. For example, `/db/faiss.index`. | - `target` @@ -98,7 +98,7 @@ The following table describes the parameters in the example config file. For a f | `target.create.collection.shardsNums` | Number of shards to be created in the collection. For more information on shards, refer to [Terminology](https://milvus.io/docs/glossary.md#Shard). | | `target.create.collection.dim` | Dimension of the vector field. | | `target.create.collection.metricType` | Metric type used to measure similarities between vectors. For more information, refer to [Terminology](https://milvus.io/docs/glossary.md#Metric-type). | - | `target.mode` | Storage location for dumped files. Valid values:
- `local`: Store dumped files on local disks.
- `remote`: Store dumped files on object storage. | + | `target.mode` | Storage location for dumped files. Valid values:
- `local`: Store dumped files on local disks.
- `remote`: Store dumped files on object storage. | | `target.remote.outputDir` | Output directory path in the cloud storage bucket. | | `target.remote.cloud` | Cloud storage service provider. Example values: `aws`, `gcp`, `azure`. | | `target.remote.endpoint` | Endpoint of Milvus 2.x storage. | diff --git a/site/zh/migrate/m2m.md b/site/zh/migrate/m2m.md index c01a32535..caddd36d1 100644 --- a/site/zh/migrate/m2m.md +++ b/site/zh/migrate/m2m.md @@ -114,14 +114,14 @@ The following table describes the parameters in the example config file. For a f | Parameter | Description | | --- | --- | - | `source.mode` | Specifies where the source files are read from. Valid values:
- `local`: reads files from a local disk.
- `remote`: reads files from remote storage. | + | `source.mode` | Specifies where the source files are read from. Valid values:
- `local`: reads files from a local disk.
- `remote`: reads files from remote storage. | | `source.local.tablesDir` | The directory path where the source files are located. For example, `/db/tables/`. | - `target` | Parameter | Description | | --- | --- | - | `target.mode` | Storage location for dumped files. Valid values:
- `local`: Store dumped files on local disks.
- `remote`: Store dumped files on object storage. | + | `target.mode` | Storage location for dumped files. Valid values:
- `local`: Store dumped files on local disks.
- `remote`: Store dumped files on object storage. | | `target.remote.outputDir` | Output directory path in the cloud storage bucket. | | `target.remote.ak` | Access key for Milvus 2.x storage. | | `target.remote.sk` | Secret key for Milvus 2.x storage. | diff --git a/site/zh/reference/disk_index.md b/site/zh/reference/disk_index.md index 6e1aa1a36..746c10762 100644 --- a/site/zh/reference/disk_index.md +++ b/site/zh/reference/disk_index.md @@ -61,10 +61,10 @@ DiskIndex: | Parameter | Description | Value Range | Default Value | | --- | --- | --- | --- | -| `MaxDegree` | Maximum degree of the Vamana graph.
A larger value offers a higher recall rate but increases the size of and time to build the index. | [1, 512] | 56 | -| `SearchListSize` | Size of the candidate list.
A larger value increases the time spent on building the index but offers a higher recall rate.
Set it to a value smaller than `MaxDegree` unless you need to reduce the index-building time. | [1, int32_max] | 100 | -| `PQCodeBugetGBRatio` | Size limit on the PQ code.
A larger value offers a higher recall rate but increases memory usage. | (0.0, 0.25] | 0.125 | -| `SearchCacheBudgetGBRatio` | Ratio of cached node numbers to raw data.
A larger value improves index-building performance with increased memory usage. | [0.0, 0.3) | 0.10 | +| `MaxDegree` | Maximum degree of the Vamana graph.
A larger value offers a higher recall rate but increases the size of and time to build the index. | [1, 512] | 56 | +| `SearchListSize` | Size of the candidate list.
A larger value increases the time spent on building the index but offers a higher recall rate.
Set it to a value smaller than `MaxDegree` unless you need to reduce the index-building time. | [1, int32_max] | 100 | +| `PQCodeBugetGBRatio` | Size limit on the PQ code.
A larger value offers a higher recall rate but increases memory usage. | (0.0, 0.25] | 0.125 | +| `SearchCacheBudgetGBRatio` | Ratio of cached node numbers to raw data.
A larger value improves index-building performance with increased memory usage. | [0.0, 0.3) | 0.10 | | `BeamWidthRatio` | Ratio between the maximum number of IO requests per search iteration and CPU number. | [1, max(128 / CPU number, 16)] | 4.0 | ## Troubleshooting diff --git a/site/zh/userGuide/manage-collections.md b/site/zh/userGuide/manage-collections.md index 6af8c27c2..54414853a 100644 --- a/site/zh/userGuide/manage-collections.md +++ b/site/zh/userGuide/manage-collections.md @@ -335,11 +335,11 @@ export fields='[{ \ auto_id - Determines if the primary field automatically increments.
Setting this to True makes the primary field automatically increment. In this case, the primary field should not be included in the data to insert to avoid errors. The auto-generated IDs have a fixed length and cannot be altered. + Determines if the primary field automatically increments.
Setting this to True makes the primary field automatically increment. In this case, the primary field should not be included in the data to insert to avoid errors. The auto-generated IDs have a fixed length and cannot be altered. enable_dynamic_field - Determines if Milvus saves the values of undefined fields in a dynamic field if the data being inserted into the target collection includes fields that are not defined in the collection's schema.
When you set this to True, Milvus will create a field called $meta to store any undefined fields and their values from the data that is inserted. + Determines if Milvus saves the values of undefined fields in a dynamic field if the data being inserted into the target collection includes fields that are not defined in the collection's schema.
When you set this to True, Milvus will create a field called $meta to store any undefined fields and their values from the data that is inserted. field_name @@ -351,11 +351,11 @@ export fields='[{ \ is_primary - Whether the current field is the primary field in a collection.
Each collection has only one primary field. A primary field should be of either the DataType.INT64 type or the DataType.VARCHAR type. + Whether the current field is the primary field in a collection.
Each collection has only one primary field. A primary field should be of either the DataType.INT64 type or the DataType.VARCHAR type. dim - The dimension of the vector embeddings.
This is mandatory for a field of the DataType.FLOAT_VECTOR, DataType.BINARY_VECTOR, DataType.FLOAT16_VECTOR, or DataType.BFLOAT16_VECTOR type. If you use DataType.SPARSE_FLOAT_VECTOR, omit this parameter. + The dimension of the vector embeddings.
This is mandatory for a field of the DataType.FLOAT_VECTOR, DataType.BINARY_VECTOR, DataType.FLOAT16_VECTOR, or DataType.BFLOAT16_VECTOR type. If you use DataType.SPARSE_FLOAT_VECTOR, omit this parameter. @@ -378,15 +378,15 @@ export fields='[{ \ isPrimaryKey - Whether the current field is the primary field in a collection.
Each collection has only one primary field. A primary field should be of either the DataType.Int64 type or the DataType.VarChar type. + Whether the current field is the primary field in a collection.
Each collection has only one primary field. A primary field should be of either the DataType.Int64 type or the DataType.VarChar type. autoID - Whether to allow the primary field to automatically increment.
Setting this to true makes the primary field automatically increment. In this case, the primary field should not be included in the data to insert to avoid errors. + Whether to allow the primary field to automatically increment.
Setting this to true makes the primary field automatically increment. In this case, the primary field should not be included in the data to insert to avoid errors. dimension - The dimension of the vector embeddings.
This is mandatory for a field of the DataType.FloatVector, DataType.BinaryVector, DataType.Float16Vector, or DataType.BFloat16Vector type. + The dimension of the vector embeddings.
This is mandatory for a field of the DataType.FloatVector, DataType.BinaryVector, DataType.Float16Vector, or DataType.BFloat16Vector type. @@ -409,15 +409,15 @@ export fields='[{ \ is_primary_key - Whether the current field is the primary field in a collection.
Each collection has only one primary field. A primary field should be of either the DataType.INT64 type or the DataType.VARCHAR type. + Whether the current field is the primary field in a collection.
Each collection has only one primary field. A primary field should be of either the DataType.INT64 type or the DataType.VARCHAR type. auto_id - Whether the primary field automatically increments upon data insertions into this collection.
The value defaults to False. Setting this to True makes the primary field automatically increment. Skip this parameter if you need to set up a collection with a customized schema. + Whether the primary field automatically increments upon data insertions into this collection.
The value defaults to False. Setting this to True makes the primary field automatically increment. Skip this parameter if you need to set up a collection with a customized schema. dim - The dimensionality of the collection field that holds vector embeddings.
The value should be an integer greater than 1 and is usually determined by the model you use to generate vector embeddings. + The dimensionality of the collection field that holds vector embeddings.
The value should be an integer greater than 1 and is usually determined by the model you use to generate vector embeddings. @@ -980,11 +980,11 @@ Use [createCollection()](https://milvus.io/api-reference/node/v2.4.x/Collections schema - The schema of this collection.
Setting this to None indicates this collection will be created with default settings.
To set up a collection with a customized schema, you need to create a CollectionSchema object and reference it here. In this case, Milvus ignores all other schema-related settings carried in the request. + The schema of this collection.
Setting this to None indicates this collection will be created with default settings.
To set up a collection with a customized schema, you need to create a CollectionSchema object and reference it here. In this case, Milvus ignores all other schema-related settings carried in the request. index_params - The parameters for building the index on the vector field in this collection. To set up a collection with a customized schema and automatically load the collection to memory, you need to create an IndexParams object and reference it here.
You should at least add an index for the vector field in this collection. You can also skip this parameter if you prefer to set up the index parameters later on. + The parameters for building the index on the vector field in this collection. To set up a collection with a customized schema and automatically load the collection to memory, you need to create an IndexParams object and reference it here.
You should at least add an index for the vector field in this collection. You can also skip this parameter if you prefer to set up the index parameters later on. @@ -1003,7 +1003,7 @@ Use [createCollection()](https://milvus.io/api-reference/node/v2.4.x/Collections collectionSchema - The schema of this collection.
Leaving it empty indicates this collection will be created with default settings. To set up a collection with a customized schema, you need to create a CollectionSchema object and reference it here. + The schema of this collection.
Leaving it empty indicates this collection will be created with default settings. To set up a collection with a customized schema, you need to create a CollectionSchema object and reference it here. indexParams diff --git a/site/zh/userGuide/search-query-get/single-vector-search.md b/site/zh/userGuide/search-query-get/single-vector-search.md index fccbd81d1..96e1ecd2a 100644 --- a/site/zh/userGuide/search-query-get/single-vector-search.md +++ b/site/zh/userGuide/search-query-get/single-vector-search.md @@ -478,15 +478,15 @@ console.log(res.results) data - A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones. + A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones. limit - The total number of entities to return.
You can use this parameter in combination with offset in param to enable pagination.
The sum of this value and offset in param should be less than 16,384. + The total number of entities to return.
You can use this parameter in combination with offset in param to enable pagination.
The sum of this value and offset in param should be less than 16,384. search_params - The parameter settings specific to this operation.
+ The parameter settings specific to this operation.
@@ -505,11 +505,11 @@ console.log(res.results) data - A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones. + A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones. topK - The number of records to return in the search result. This parameter uses the same syntax as the limit parameter, so you should only set one of them.
You can use this parameter in combination with offset in param to enable pagination.
The sum of this value and offset in param should be less than 16,384. + The number of records to return in the search result. This parameter uses the same syntax as the limit parameter, so you should only set one of them.
You can use this parameter in combination with offset in param to enable pagination.
The sum of this value and offset in param should be less than 16,384. @@ -528,11 +528,11 @@ console.log(res.results) data - A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones. + A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones. limit - The total number of entities to return.
You can use this parameter in combination with offset in param to enable pagination.
The sum of this value and offset in param should be less than 16,384. + The total number of entities to return.
You can use this parameter in combination with offset in param to enable pagination.
The sum of this value and offset in param should be less than 16,384. diff --git a/site/zh/userGuide/search-query-get/with-iterators.md b/site/zh/userGuide/search-query-get/with-iterators.md index e32d3eba3..cdcb1d8b6 100644 --- a/site/zh/userGuide/search-query-get/with-iterators.md +++ b/site/zh/userGuide/search-query-get/with-iterators.md @@ -362,7 +362,7 @@ System.out.println(results.size()); data - A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones. + A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones. anns_field @@ -370,19 +370,19 @@ System.out.println(results.size()); batch_size - The number of entities to return each time you call next() on the current iterator.
The value defaults to 1000. Set it to a proper value to control the number of entities to return per iteration. + The number of entities to return each time you call next() on the current iterator.
The value defaults to 1000. Set it to a proper value to control the number of entities to return per iteration. param - The parameter settings specific to this operation.
+ The parameter settings specific to this operation.
output_fields - A list of field names to include in each returned entity.
The value defaults to None. If left unspecified, only the primary field is included. + A list of field names to include in each returned entity.
The value defaults to None. If left unspecified, only the primary field is included. limit - The total number of entities to return.
The value defaults to -1, indicating that all matching entities will be returned. + The total number of entities to return.
The value defaults to -1, indicating that all matching entities will be returned. @@ -409,7 +409,7 @@ System.out.println(results.size()); withBatchSize - The number of entities to return each time you call next() on the current iterator.
The value defaults to 1000. Set it to a proper value to control the number of entities to return per iteration. + The number of entities to return each time you call next() on the current iterator.
The value defaults to 1000. Set it to a proper value to control the number of entities to return per iteration. withParams @@ -551,19 +551,19 @@ while (true) { batch_size - The number of entities to return each time you call next() on the current iterator.
The value defaults to 1000. Set it to a proper value to control the number of entities to return per iteration. + The number of entities to return each time you call next() on the current iterator.
The value defaults to 1000. Set it to a proper value to control the number of entities to return per iteration. expr - A scalar filtering condition to filter matching entities.
The value defaults to None, indicating that scalar filtering is ignored. To build a scalar filtering condition, refer to Boolean Expression Rules. + A scalar filtering condition to filter matching entities.
The value defaults to None, indicating that scalar filtering is ignored. To build a scalar filtering condition, refer to Boolean Expression Rules. output_fields - A list of field names to include in each returned entity.
The value defaults to None. If left unspecified, only the primary field is included. + A list of field names to include in each returned entity.
The value defaults to None. If left unspecified, only the primary field is included. limit - The total number of entities to return.
The value defaults to -1, indicating that all matching entities will be returned. + The total number of entities to return.
The value defaults to -1, indicating that all matching entities will be returned. @@ -586,7 +586,7 @@ while (true) { withBatchSize - The number of entities to return each time you call next() on the current iterator.
The value defaults to 1000. Set it to a proper value to control the number of entities to return per iteration. + The number of entities to return each time you call next() on the current iterator.
The value defaults to 1000. Set it to a proper value to control the number of entities to return per iteration. addOutField