Merge branch 'main' into patch-1
kolchfa-aws authored Jul 30, 2024
2 parents 35c53ce + 8e03f53 commit 5cc2512
Showing 9 changed files with 229 additions and 70 deletions.
1 change: 1 addition & 0 deletions .github/vale/styles/Vocab/OpenSearch/Products/accept.txt
@@ -86,6 +86,7 @@ RPM Package Manager
Ruby
Simple Schema for Observability
Tableau
Textract
TorchScript
Tribuo
VisBuilder
2 changes: 1 addition & 1 deletion _api-reference/document-apis/bulk.md
@@ -59,7 +59,7 @@ routing | String | Routes the request to the specified shard.
timeout | Time | How long to wait for the request to return. Default `1m`.
type | String | (Deprecated) The default document type for documents that don't specify a type. Default is `_doc`. We highly recommend ignoring this parameter and using a type of `_doc` for all indexes.
wait_for_active_shards | String | Specifies the number of active shards that must be available before OpenSearch processes the bulk request. Default is 1 (only the primary shard). Set to `all` or a positive integer. Values greater than 1 require replicas. For example, if you specify a value of 3, the index must have two replicas distributed across two additional nodes for the request to succeed.
- batch_size | Integer | Specifies the number of documents to be batched and sent to an ingest pipeline to be processed together. Default is `1` (documents are ingested by an ingest pipeline one at a time). If the bulk request doesn't explicitly specify an ingest pipeline or the index doesn't have a default ingest pipeline, then this parameter is ignored. Only documents with `create`, `index`, or `update` actions can be grouped into batches.
+ batch_size | Integer | **(Deprecated)** Specifies the number of documents to be batched and sent to an ingest pipeline to be processed together. Default is `2147483647` (documents are ingested by an ingest pipeline all at once). If the bulk request doesn't explicitly specify an ingest pipeline or the index doesn't have a default ingest pipeline, then this parameter is ignored. Only documents with `create`, `index`, or `update` actions can be grouped into batches.
{% comment %}_source | List | asdf
_source_excludes | list | asdf
_source_includes | list | asdf{% endcomment %}
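
As a hedged illustration of the parameters in this table, the following sketch sends a bulk request through an ingest pipeline. The pipeline name `my-pipeline`, the index name `example-index`, the sample documents, and the cluster URL are assumptions made for illustration and are not part of the change itself. Because `batch_size` is marked as deprecated here, the sketch leaves it unset.

```sh
# Assumption: an unsecured cluster at localhost:9200 and an existing
# ingest pipeline named "my-pipeline" (hypothetical name).
OPENSEARCH_URL="${OPENSEARCH_URL:-http://localhost:9200}"

# Bulk bodies are newline-delimited JSON: one action line, then one document line.
# The body must end with a newline, which the heredoc provides.
BULK_FILE="$(mktemp)"
cat > "$BULK_FILE" <<'EOF'
{ "index": { "_index": "example-index", "_id": "1" } }
{ "name": "John Doe", "age": 30 }
{ "create": { "_index": "example-index", "_id": "2" } }
{ "name": "Jane Doe", "age": 28 }
EOF

# Send the file as-is; --data-binary keeps the newlines intact.
curl -s -XPOST "$OPENSEARCH_URL/_bulk?pipeline=my-pipeline&timeout=1m" \
  -H "Content-Type: application/x-ndjson" \
  --data-binary "@$BULK_FILE" \
  || echo "Bulk request failed; is the cluster reachable?" >&2
```

Only the `index` and `create` actions are used in this sketch; both are among the action types that the table says can be grouped into batches.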
1 change: 1 addition & 0 deletions _ingest-pipelines/processors/sparse-encoding.md
@@ -41,6 +41,7 @@ The following table lists the required and optional parameters for the `sparse_e
`field_map.<vector_field>` | String | Required | The name of the vector field in which to store the generated vector embeddings.
`description` | String | Optional | A brief description of the processor. |
`tag` | String | Optional | An identifier tag for the processor. Useful for debugging to distinguish between processors of the same type. |
`batch_size` | Integer | Optional | Specifies the number of documents to be batched and processed each time. Default is `1`. |

## Using the processor

1 change: 1 addition & 0 deletions _ingest-pipelines/processors/text-embedding.md
@@ -41,6 +41,7 @@ The following table lists the required and optional parameters for the `text_emb
`field_map.<vector_field>` | String | Required | The name of the vector field in which to store the generated text embeddings.
`description` | String | Optional | A brief description of the processor. |
`tag` | String | Optional | An identifier tag for the processor. Useful for debugging to distinguish between processors of the same type. |
`batch_size` | Integer | Optional | Specifies the number of documents to be batched and processed each time. Default is `1`. |

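The following is a minimal sketch of a pipeline definition that sets `batch_size` on the `text_embedding` processor. The pipeline name `nlp-ingest-pipeline`, the field names `passage_text` and `passage_embedding`, the batch size of `5`, and the cluster URL are assumptions chosen for illustration. `<model_id>` is a placeholder that must be replaced with the ID of a deployed model.

```sh
# Assumption: an unsecured cluster at localhost:9200.
OPENSEARCH_URL="${OPENSEARCH_URL:-http://localhost:9200}"

# Hypothetical pipeline body: field names and batch size are illustrative only.
PIPELINE_BODY='{
  "description": "Text embedding pipeline that processes documents in batches",
  "processors": [
    {
      "text_embedding": {
        "model_id": "<model_id>",
        "field_map": {
          "passage_text": "passage_embedding"
        },
        "batch_size": 5
      }
    }
  ]
}'

curl -s -XPUT "$OPENSEARCH_URL/_ingest/pipeline/nlp-ingest-pipeline" \
  -H "Content-Type: application/json" \
  -d "$PIPELINE_BODY" \
  || echo "Pipeline request failed; is the cluster reachable?" >&2
```

With this setting, the processor would receive up to five documents at a time instead of the default of one.
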
## Using the processor

40 changes: 40 additions & 0 deletions _install-and-configure/additional-plugins/index.md
@@ -0,0 +1,40 @@
---
layout: default
title: Additional plugins
parent: Installing plugins
nav_order: 10
---

# Additional plugins

In addition to the plugins bundled with the standard OpenSearch distribution, many more plugins are available. These additional plugins have been built by OpenSearch developers or members of the OpenSearch community. An exhaustive list isn't possible because many plugins are not maintained in an OpenSearch GitHub repository. The following plugins, available in the [OpenSearch/plugins](https://github.com/opensearch-project/OpenSearch/tree/main/plugins) directory on GitHub, can be installed using one of the installation options, for example, by running the command `bin/opensearch-plugin install <plugin-name>`.


| Plugin name | Earliest available version |
| :--- | :--- |
| analysis-icu | 1.0.0 |
| analysis-kuromoji | 1.0.0 |
| analysis-nori | 1.0.0 |
| analysis-phonetic | 1.0.0 |
| analysis-smartcn | 1.0.0 |
| analysis-stempel | 1.0.0 |
| analysis-ukrainian | 1.0.0 |
| discovery-azure-classic | 1.0.0 |
| discovery-ec2 | 1.0.0 |
| discovery-gce | 1.0.0 |
| ingest-attachment | 1.0.0 |
| mapper-annotated-text | 1.0.0 |
| mapper-murmur3 | 1.0.0 |
| [`mapper-size`]({{site.url}}{{site.baseurl}}/install-and-configure/additional-plugins/mapper-size-plugin/) | 1.0.0 |
| query-insights | 2.12.0 |
| repository-azure | 1.0.0 |
| repository-gcs | 1.0.0 |
| repository-hdfs | 1.0.0 |
| repository-s3 | 1.0.0 |
| store-smb | 1.0.0 |
| transport-nio | 1.0.0 |

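As a sketch of how several of the plugins listed above might be installed in one pass, the following loop calls `opensearch-plugin install` for each name. The choice of `analysis-icu` and `repository-s3` is arbitrary, and the relative path assumes that the commands are run from the OpenSearch home directory.

```sh
# Assumption: run from the OpenSearch home directory on each node.
# The plugin names are examples taken from the table; pick the ones you need.
PLUGINS="analysis-icu repository-s3"

for plugin in $PLUGINS; do
  # --batch skips the interactive confirmation prompt, which helps in scripts.
  ./bin/opensearch-plugin install --batch "$plugin" \
    || echo "Failed to install $plugin" >&2
done

# List the installed plugins to confirm the result.
./bin/opensearch-plugin list \
  || echo "opensearch-plugin not found; run this from the OpenSearch home directory" >&2
```
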

## Related articles

- [Installing plugins]({{site.url}}{{site.baseurl}}/install-and-configure/plugins/)
- [`mapper-size` plugin]({{site.url}}{{site.baseurl}}/install-and-configure/additional-plugins/mapper-size-plugin/)
100 changes: 100 additions & 0 deletions _install-and-configure/additional-plugins/mapper-size-plugin.md
@@ -0,0 +1,100 @@
---
layout: default
title: Mapper-size plugin
parent: Installing plugins
nav_order: 20

---

# Mapper-size plugin

The `mapper-size` plugin enables the use of the `_size` field in OpenSearch indexes. The `_size` field stores the size, in bytes, of each document.

## Installing the plugin

You can install the `mapper-size` plugin using the following command:

```sh
./bin/opensearch-plugin install mapper-size
```
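
Plugins are loaded when a node starts, so a running node needs to be restarted after installation, and the plugin needs to be installed on every node in the cluster. The following sketch checks which nodes report the plugin by using the `_cat/plugins` API. The cluster URL is an assumption.

```sh
# Assumption: an unsecured cluster at localhost:9200.
OPENSEARCH_URL="${OPENSEARCH_URL:-http://localhost:9200}"
PLUGIN_NAME="mapper-size"

# Print one line per node that has the plugin installed.
curl -s "$OPENSEARCH_URL/_cat/plugins?v" | grep "$PLUGIN_NAME" \
  || echo "$PLUGIN_NAME was not reported by any node" >&2
```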

## Examples

After starting up a cluster, you can create an index with size mapping enabled, index a document, and search for documents, as shown in the following examples.

### Create an index with size mapping enabled

```sh
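# Assumes an unsecured cluster at localhost:9200; adjust the URL and add credentials as needed.
curl -XPUT "http://localhost:9200/example-index" -H "Content-Type: application/json" -d '{
  "mappings": {
    "_size": {
      "enabled": true
    },
    "properties": {
      "name": {
        "type": "text"
      },
      "age": {
        "type": "integer"
      }
    }
  }
}'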
```

### Index a document

```sh
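# Assumes an unsecured cluster at localhost:9200; adjust the URL and add credentials as needed.
curl -XPOST "http://localhost:9200/example-index/_doc" -H "Content-Type: application/json" -d '{
  "name": "John Doe",
  "age": 30
}'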
```

### Query the index

```sh
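# Assumes an unsecured cluster at localhost:9200; adjust the URL and add credentials as needed.
curl -XGET "http://localhost:9200/example-index/_search" -H "Content-Type: application/json" -d '{
  "query": {
    "match_all": {}
  },
  "stored_fields": ["_size", "_source"]
}'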
```

### Query results

In the following example, the `_size` field is included in the query results and shows the size, in bytes, of the indexed document:

```json
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "example-index",
        "_id": "Pctw0I8BLto8I5f_NLKK",
        "_score": 1.0,
        "_size": 37,
        "_source": {
          "name": "John Doe",
          "age": 30
        }
      }
    ]
  }
}
```
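
Beyond returning `_size` as a stored field, the field can typically also be used in queries and for sorting. The following is a minimal sketch, assuming the `example-index` index created earlier and an unsecured cluster at `localhost:9200`. The threshold of 10 bytes is an arbitrary value chosen for illustration.

```sh
# Assumption: an unsecured cluster at localhost:9200 and the example-index
# index from the previous examples.
OPENSEARCH_URL="${OPENSEARCH_URL:-http://localhost:9200}"

# Find documents larger than 10 bytes and return the largest first.
SIZE_QUERY='{
  "query": {
    "range": {
      "_size": {
        "gt": 10
      }
    }
  },
  "sort": [
    { "_size": { "order": "desc" } }
  ],
  "stored_fields": ["_size", "_source"]
}'

curl -s -XGET "$OPENSEARCH_URL/example-index/_search" \
  -H "Content-Type: application/json" \
  -d "$SIZE_QUERY" \
  || echo "Search request failed; is the cluster reachable?" >&2
```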
