Commit

Updated USER_GUIDE.md and Added 6 Guides (#162)
Signed-off-by: Theo Truong <[email protected]>
nhtruong authored Apr 4, 2023
1 parent a8c421d commit e45dd59
Showing 7 changed files with 893 additions and 37 deletions.
48 changes: 11 additions & 37 deletions USER_GUIDE.md
@@ -1,8 +1,7 @@
- [User Guide](#user-guide)
- [Setup](#setup)
- [Sample code](#sample-code)
- [Basic Usage](#basic-usage)
- [Point in Time](#point-in-time)
- [Basic Usage](#basic-usage)
- [Guides by Topics](#guides-by-topics)
- [Amazon OpenSearch Service](#amazon-opensearch-service)

# User Guide
@@ -25,16 +24,14 @@ Import the client:

`require 'opensearch'`

## Sample code
<a name="basic-usage" /></a>
### Basic Usage
## Basic Usage
```ruby
require 'opensearch'

client = OpenSearch::Client.new(
host: 'https://localhost:9200',
user: 'admin',
password: 'admin'
password: 'admin',
transport_options: { ssl: { verify: false } } # For testing only. Use certificate for validation.
)

@@ -109,36 +106,13 @@ response = client.indices.delete(
puts response
```

### Point in Time
Refer to OpenSearch [documentation](https://opensearch.org/docs/latest/point-in-time-api/) for more information on point in time.
```ruby
require 'opensearch-ruby'
client = OpenSearch::Client.new({ host: 'localhost' })
index = :movies
client.indices.create(index: 'movies')

# CREATE 3 PITS
client.create_pit index: index, keep_alive: '1m'
client.create_pit index: index, keep_alive: '1m'
client.create_pit index: index, keep_alive: '1m'

# GET ALL PITS
pits = client.get_all_pits
puts pits

# DELETE FIRST PIT
client.delete_pit body: { pit_id: [pits.dig('pits', 0, 'pit_id')] }

# ALL PITS SEGMENTS
puts client.cat.all_pit_segments

# SEGMENTS FOR A SPECIFIC PIT
puts client.cat.pit_segments body: { pit_id: [pits.dig('pits', 1, 'pit_id')] }


# DELETE ALL PITS
puts client.delete_all_pits
```
## Guides by Topics
- [Index Lifecycle](guides/index_lifecycle.md)
- [Document Lifecycle](guides/document_lifecycle.md)
- [Search](guides/search.md)
- [Bulk](guides/bulk.md)
- [Advanced Index Actions](guides/advanced_index_actions.md)
- [Index Templates](guides/index_template.md)

## Amazon OpenSearch Service

90 changes: 90 additions & 0 deletions guides/advanced_index_actions.md
@@ -0,0 +1,90 @@
# Advanced Index Actions
In this guide, we will look at some advanced index actions that are not covered in the [Index Lifecycle](index_lifecycle.md) guide.


## Setup
Let's create a client instance, and an index named `movies`:
```ruby
require 'opensearch-ruby'
client = OpenSearch::Client.new(
  host: 'https://admin:admin@localhost:9200',
  transport_options: { ssl: { verify: false } }
)
client.indices.create(index: :movies)
```
## API Actions
### Clear index cache
You can clear the cache of an index or indices by using the `indices.clear_cache` API action. The following example clears the cache of the `movies` index:

```ruby
client.indices.clear_cache(index: :movies)
```

By default, the `indices.clear_cache` API action clears all types of cache. To clear specific types of cache, pass the `query`, `fielddata`, or `request` parameter to the API action:

```ruby
client.indices.clear_cache(index: :movies, query: true)
client.indices.clear_cache(index: :movies, fielddata: true, request: true)
```
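
When the cache types to clear are decided at runtime, the keyword arguments can be assembled from a list. The following is a minimal sketch under that assumption: `clear_cache_args` and `CACHE_TYPES` are hypothetical names, not part of the client, and the helper only builds the hash that would be passed to `indices.clear_cache`.

```ruby
# Hypothetical helper (not part of opensearch-ruby): builds the keyword
# arguments for `indices.clear_cache` from a list of cache types.
CACHE_TYPES = %i[query fielddata request].freeze

def clear_cache_args(index, *types)
  unknown = types - CACHE_TYPES
  raise ArgumentError, "unknown cache type(s): #{unknown.join(', ')}" unless unknown.empty?

  # Each selected cache type becomes a `type: true` flag.
  types.to_h { |type| [type, true] }.merge(index: index)
end

args = clear_cache_args(:movies, :fielddata, :request)
# With the client from the Setup section: client.indices.clear_cache(**args)
```

Passing no types yields only the `index` argument, which matches the default behavior of clearing every cache type.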

### Flush index
Sometimes you might want to flush an index or indices to make sure that all data in the transaction log is persisted to the index. To flush an index or indices, use the `indices.flush` API action. The following example flushes the `movies` index:

```ruby
client.indices.flush(index: :movies)
```

### Refresh index
You can refresh an index or indices to make sure that all changes are available for search. To refresh an index or indices, use the `indices.refresh` API action:

```ruby
client.indices.refresh(index: :movies)
```

### Open/Close index
You can close an index to prevent read and write operations on the index. A closed index does not have to maintain certain data structures that an open index requires, reducing the memory and disk space required by the index. The following example closes and reopens the `movies` index:

```ruby
client.indices.close(index: :movies)
client.indices.open(index: :movies)
```
### Force merge index
You can force merge an index or indices to reduce the number of segments in the index. This can be useful if you have a large number of small segments in the index. Merging segments reduces the memory footprint of the index. Note that this action is resource-intensive and is recommended only for read-only indices. The following example force merges the `movies` index:

```ruby
client.indices.forcemerge(index: :movies)
```

### Clone index
You can clone an index to create a new index with the same mappings, data, and most of the settings. The source index must be in a read-only state for cloning. The following example blocks write operations on the `movies` index, clones it to create a new index named `movies_clone`, then re-enables writes:

```ruby
client.indices.add_block(index: :movies, block: :write)
client.indices.clone(index: :movies, target: :movies_clone)
client.indices.put_settings(index: :movies, body: { index: { blocks: { write: false } } })
```
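
If the clone call raises, the write block above would stay in place. One way to guard against that is to wrap the block/unblock pair in a method with `ensure`. This is only a sketch: `with_write_block` is a hypothetical helper, not part of the client, and it uses the same `add_block` and `put_settings` calls as the example above.

```ruby
# Hypothetical helper: blocks writes on `index`, yields, and always
# re-enables writes afterwards, even if the block raises.
def with_write_block(client, index)
  client.indices.add_block(index: index, block: :write)
  yield
ensure
  client.indices.put_settings(index: index, body: { index: { blocks: { write: false } } })
end

# Usage, assuming `client` from the Setup section:
# with_write_block(client, :movies) do
#   client.indices.clone(index: :movies, target: :movies_clone)
# end
```

Note that the `ensure` clause also runs when `add_block` itself fails, in which case it simply resets a write block that was never set.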

### Split index
You can split an index into another index with more primary shards. The source index must be in a read-only state for splitting. The following example creates the read-only `books` index with 30 routing shards and 5 primary shards (5 is a factor of 30), splits it into `bigger_books` with 10 shards (a multiple of 5 and also a factor of 30), then re-enables writes:

```ruby
client.indices.create(
  index: :books,
  body: {
    settings: {
      index: {
        number_of_shards: 5,
        number_of_routing_shards: 30,
        blocks: { write: true }
      }
    }
  }
)

client.indices.split(
  index: :books,
  target: :bigger_books,
  body: { settings: { index: { number_of_shards: 10 } } }
)

client.indices.put_settings(index: :books, body: { index: { blocks: { write: false } } })
```
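
The shard arithmetic can be checked before calling `indices.split`. The sketch below assumes the split rules as commonly documented for OpenSearch: the target shard count must be larger than the source count, a multiple of it, and a factor of `number_of_routing_shards`. `valid_split_target?` is a hypothetical helper and does not query the cluster.

```ruby
# Hypothetical helper: checks whether `target` is a usable shard count when
# splitting an index with `source` primary shards and `routing` routing shards.
def valid_split_target?(source:, target:, routing:)
  target > source &&            # a split must increase the shard count
    (target % source).zero? &&  # target is a multiple of the source count
    (routing % target).zero?    # target is a factor of the routing shards
end

candidates = (1..30).select { |n| valid_split_target?(source: 5, target: n, routing: 30) }
puts candidates.inspect # prints [10, 15, 30]
```

For the `books` index above, 10 is therefore one of three valid targets under these assumptions.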

## Cleanup

Let's delete all the indices we created in this guide:
```ruby
client.indices.delete(index: %i[movies books movies_clone bigger_books])
```
141 changes: 141 additions & 0 deletions guides/bulk.md
@@ -0,0 +1,141 @@
# Bulk

In this guide, you'll learn how to use the OpenSearch Ruby Client API to perform bulk operations. You'll learn how to index, update, and delete multiple documents in a single request.

## Setup
First, create a client instance with the following code:

```ruby
require 'opensearch-ruby'
client = OpenSearch::Client.new({ host: 'localhost' })
```

Next, create an index named `movies` and another named `books` with the default settings:

```ruby
movies = 'movies'
books = 'books'
client.indices.create(index: movies) unless client.indices.exists?(index: movies)
client.indices.create(index: books) unless client.indices.exists?(index: books)
```


## Bulk API

The `bulk` API action allows you to perform multiple document operations in a single request. The body of the request is an array of objects that contains the bulk operations and the target documents to index, create, update, or delete.

### Indexing multiple documents
The following code creates two documents in the `movies` index and one document in the `books` index:

```ruby
client.bulk(
  body: [
    { index: { _index: movies, _id: 1 } },
    { title: 'Beauty and the Beast', year: 1991 },
    { index: { _index: movies, _id: 2 } },
    { title: 'Beauty and the Beast - Live Action', year: 2017 },
    { index: { _index: books, _id: 1 } },
    { title: 'The Lion King', year: 1994 }
  ]
)
```
As you can see, each bulk operation consists of two objects. The first object contains the operation type and the target document's `_index` and `_id`. The second object contains the document's data. As a result, the body of the request above contains six objects for three index actions.
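
The alternating action/document layout is easy to generate from a list of documents. The following is a small sketch, not a client feature: `index_pairs` is a hypothetical helper that only builds the array to pass as `body`.

```ruby
# Hypothetical helper: turns [index, id, document] triples into the
# alternating action/document array used by the example above.
def index_pairs(entries)
  entries.flat_map do |index, id, doc|
    [{ index: { _index: index, _id: id } }, doc]
  end
end

body = index_pairs([
  ['movies', 1, { title: 'Beauty and the Beast', year: 1991 }],
  ['books', 1, { title: 'The Lion King', year: 1994 }]
])
# `body` now holds four objects: two per index action.
# With a client from the Setup section: client.bulk(body: body)
```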

Alternatively, the `bulk` method can accept an array of hashes where each hash represents a single operation. The following code is equivalent to the previous example:

```ruby
client.bulk(
  body: [
    { index: { _index: movies, _id: 1, data: { title: 'Beauty and the Beast', year: 1991 } } },
    { index: { _index: movies, _id: 2, data: { title: 'Beauty and the Beast - Live Action', year: 2017 } } },
    { index: { _index: books, _id: 1, data: { title: 'The Lion King', year: 1994 } } }
  ]
)
```

We will use this format for the rest of the examples in this guide.
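
Both formats end up as the same newline-delimited JSON on the wire, which is what the Bulk API expects. The sketch below only illustrates that mapping; the client performs its own conversion internally, and `to_ndjson` is a hypothetical name used here for demonstration.

```ruby
require 'json'

# Illustrative only: shows how an action hash with a `data` key maps onto
# newline-delimited JSON (one metadata line, then an optional document line).
def to_ndjson(actions)
  lines = actions.flat_map do |action|
    op, meta = action.first    # e.g. [:index, { _index: ..., data: ... }]
    meta = meta.dup            # avoid mutating the caller's hash
    data = meta.delete(:data)  # split the document from the metadata
    [{ op => meta }, data].compact.map(&:to_json)
  end
  "#{lines.join("\n")}\n"
end

puts to_ndjson([
  { index: { _index: 'movies', _id: 1, data: { title: 'Beauty and the Beast', year: 1991 } } },
  { delete: { _index: 'movies', _id: 2 } }
])
```

The `delete` action produces a single line because it carries no document.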

### Creating multiple documents

Similarly, instead of calling the `create` method for each document, you can use the `bulk` API to create multiple documents in a single request. The following code creates three documents in the `movies` index and one in the `books` index:

```ruby
client.bulk(
  index: movies,
  body: [
    { create: { data: { title: 'Beauty and the Beast 2', year: 2030 } } },
    { create: { data: { title: 'Beauty and the Beast 3', year: 2031 } } },
    { create: { data: { title: 'Beauty and the Beast 4', year: 2049 } } },
    { create: { _index: books, data: { title: 'The Lion King 2', year: 1998 } } }
  ]
)
```
Note that we specified `_index` only for the last document in the request body. This is because the `bulk` method accepts an `index` parameter that specifies the default `_index` for all bulk operations in the request body. Moreover, we omit the `_id` for each document and let OpenSearch generate one for us in this example, just like we can with the `create` method.
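
The default-versus-override behavior can be mirrored when building actions from plain documents. The following is a sketch only: `create_actions` is a hypothetical helper that sets `_index` just for documents that should not use the request-level default.

```ruby
# Hypothetical helper: wraps documents in `create` actions. When `index` is
# omitted, the action relies on the `index:` argument passed to `bulk`.
def create_actions(docs, index: nil)
  docs.map do |doc|
    meta = { data: doc }
    meta[:_index] = index if index
    { create: meta }
  end
end

body = create_actions([{ title: 'Beauty and the Beast 2', year: 2030 }]) +
       create_actions([{ title: 'The Lion King 2', year: 1998 }], index: 'books')
# With a client from the Setup section: client.bulk(index: 'movies', body: body)
```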

### Updating multiple documents
```ruby
client.bulk(
  index: movies,
  body: [
    { update: { _id: 1, data: { doc: { year: 1992 } } } },
    { update: { _id: 2, data: { doc: { year: 2018 } } } }
  ]
)
```
Note that the updated data is specified in the `doc` field of the `data` object.


### Deleting multiple documents
```ruby
client.bulk(
  index: movies,
  body: [
    { delete: { _id: 1 } },
    { delete: { _id: 2 } }
  ]
)
```

### Mix and match operations
You can mix and match the different operations in a single request. The following code creates two documents, updates one document, and deletes another document:

```ruby
client.bulk(
  index: movies,
  body: [
    { create: { data: { title: 'Beauty and the Beast 5', year: 2050 } } },
    { create: { data: { title: 'Beauty and the Beast 6', year: 2051 } } },
    { update: { _id: 3, data: { doc: { year: 2052 } } } },
    { delete: { _id: 4 } }
  ]
)
```

### Handling errors
The `bulk` API returns a response whose `items` array holds one result per operation in the request body, in the same order. Each result contains a `status` field that indicates whether the operation succeeded. If it did, `status` is a `2xx` code. Otherwise, the result contains an error message in the `error` field.

The following code shows how to look for errors in the response:

```ruby
response = client.bulk(
  index: movies,
  body: [
    { create: { _id: 1, data: { title: 'Beauty and the Beast', year: 1991 } } },
    { create: { _id: 2, data: { title: 'Beauty and the Beast 2', year: 2030 } } },
    { create: { _id: 1, data: { title: 'Beauty and the Beast 3', year: 2031 } } }, # document already exists error
    { create: { _id: 2, data: { title: 'Beauty and the Beast 4', year: 2049 } } }  # document already exists error
  ]
)

# Print the failure reason for every operation that did not return a 2xx status.
response['items'].each do |item|
  next if item.dig('create', 'status').between?(200, 299)

  puts item.dig('create', 'error', 'reason')
end
```
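
The loop above only inspects `create` results. For mixed requests, the same check can be written without hard-coding the operation type. This is a sketch under stated assumptions: `bulk_failures` is a hypothetical helper, and `sample_response` is hand-written to mimic only the fields discussed here (a real response contains more, and the `reason` strings are placeholders).

```ruby
# Hypothetical helper: collects failures from a bulk response regardless of
# operation type. Returns an empty array when the response reports no errors.
def bulk_failures(response)
  return [] unless response['errors']

  response['items'].flat_map(&:to_a).filter_map do |op, result|
    next if result['status'].between?(200, 299)

    { op: op, id: result['_id'], reason: result.dig('error', 'reason') }
  end
end

# Hand-written stand-in for a bulk response (placeholder values).
sample_response = {
  'errors' => true,
  'items' => [
    { 'create' => { '_id' => '1', 'status' => 201 } },
    { 'create' => { '_id' => '1', 'status' => 409,
                    'error' => { 'reason' => 'document already exists' } } },
    { 'update' => { '_id' => '9', 'status' => 404,
                    'error' => { 'reason' => 'document missing' } } }
  ]
}

bulk_failures(sample_response).each { |failure| puts failure }
```

Each item in `items` is a single-key hash, so `flat_map(&:to_a)` yields one `[operation, result]` pair per item.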

## Cleanup
To clean up the resources created in this guide, delete the `movies` and `books` indices:

```ruby
client.indices.delete(index: [movies, books])
```