Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CCI][GUIDE] Added Guide for Snapshot API(snapshot.md) #486

Closed
wants to merge 81 commits into from

Conversation

roma2023
Copy link
Contributor

@roma2023 roma2023 commented Sep 5, 2023

Description

Added Snapshot Actions Guide (guides/snapshot.md) with code examples

Issues Resolved

issue #406 resolved

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@codecov
Copy link

codecov bot commented Sep 5, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.12%. Comparing base (7b0b58d) to head (dc0cf68).
Report is 75 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #486   +/-   ##
=======================================
  Coverage   72.12%   72.12%           
=======================================
  Files          89       89           
  Lines        7939     7939           
=======================================
  Hits         5726     5726           
  Misses       2213     2213           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will you add a working sample into samples based on this guide? Something that creates, gets, cleans up a snapshot?

CHANGELOG.md Outdated
@@ -42,6 +42,7 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
- Improved CI performance of integration with unreleased OpenSearch ([#318](https://github.com/opensearch-project/opensearch-py/pull/318))
- Added k-NN guide and samples ([#449](https://github.com/opensearch-project/opensearch-py/pull/449))
- Added the ability to run tests matching a pattern to `.ci/run-tests` ([#454](https://github.com/opensearch-project/opensearch-py/pull/454))
- Added new guide: `snapshot.md` for Snapshot API. ([#486](https://github.com/opensearch-project/opensearch-py/pull/429))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say "Added a guide for taking snapshots"

guides/snapshot.md Show resolved Hide resolved
"location": "/path/to/repo",
}
}
client.snapshot.create_repository(repository='my_repository', body=repo_body)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the response from this? Should print something out as a result.

guides/snapshot.md Show resolved Hide resolved
repo_body = {
"type": "fs", # Replace 'fs' with the appropriate repository type
"settings": {
"location": "/path/to/repo", # Replace with the desired repository location
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not really sure if I need to keep this as a location example for it to be replaced by the client or if is there a way to retrieve the default location for a file system repository.

Copy link
Member

@dblock dblock Sep 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would tempfile.TemporaryDirectory work for the sample against a locally running / docker OpenSearch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the prerequisites for running samples? I ran docker and followed the instructions in the samples/README.md, but constantly having the following error on the Mac:
photo_2023-09-12_22-59-30
So far, I have tried it on both Mac and Windows.
This is the result on Windows:
image
I would appreciate your insight!

Copy link
Collaborator

@saimedhi saimedhi Sep 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For windows may be use cd opensearch-py/samples

Copy link
Contributor Author

@roma2023 roma2023 Sep 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I apologize for the wrong screenshot. Yes, I have tried it within samples directory. I get the following error:

$ poetry run hello/hello.py
"hello" is not an internal or external
command, executable program, or batch file.
(base)

And I am running this on my main branch, which is up-to-date with my remote origin. Could you please let me know if there are any pre-steps I need to take besides the ones on ./samples/README.md?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dblock Could you please take a look at this issue?

Copy link
Collaborator

@saimedhi saimedhi Sep 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@roma2023 roma2023 Sep 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dblock @saimedhi How can I run the docker with the OpenSearch server write permission?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am trying to run snapshot volume following the steps specified here: https://opensearch.org/docs/latest/tuning-your-cluster/availability-and-recovery/snapshots/snapshot-restore/,

but the poetry run python snapshot/snapshot_sample fails with the following error:

$ poetry run python snapshot/snapshot_sample.py
Traceback (most recent call last):
  File "C:\Users\HP\Desktop\AmazonOpenSearchService\Opensearch-py-ml\opensearch-py\samples\snapshot\snapshot_sample.py", line 39, in <module>
    response = client.snapshot.create_repository(repository = repository_name, body = repo_body)
  File "C:\Users\HP\AppData\Local\pypoetry\Cache\virtualenvs\package-yvpjSqWs-py3.9\lib\site-packages\opensearchpy\client\utils.py", line 179, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "C:\Users\HP\AppData\Local\pypoetry\Cache\virtualenvs\package-yvpjSqWs-py3.9\lib\site-packages\opensearchpy\client\snapshot.py", line 185, in create_repository
    return self.transport.perform_request(
  File "C:\Users\HP\AppData\Local\pypoetry\Cache\virtualenvs\package-yvpjSqWs-py3.9\lib\site-packages\opensearchpy\transport.py", line 409, in perform_request
    raise e
  File "C:\Users\HP\AppData\Local\pypoetry\Cache\virtualenvs\package-yvpjSqWs-py3.9\lib\site-packages\opensearchpy\transport.py", line 370, in perform_request
    status, headers_response, data = connection.perform_request(
  File "C:\Users\HP\AppData\Local\pypoetry\Cache\virtualenvs\package-yvpjSqWs-py3.9\lib\site-packages\opensearchpy\connection\http_urllib3.py", line 266, in perform_request
    self._raise_error(
  File "C:\Users\HP\AppData\Local\pypoetry\Cache\virtualenvs\package-yvpjSqWs-py3.9\lib\site-packages\opensearchpy\connection\base.py", line 301, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
opensearchpy.exceptions.TransportError: TransportError(500, 'repository_verification_exception', 'failed to create blob container')

I believe this is because docker is denying the write permission.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the details of how this feature works, but I imagine that the docker container will need something like --env path.repo=/tmp. It would be totally fine to document that to run this sample you need to create a docker container in a certain way.

dblock
dblock previously approved these changes Sep 15, 2023
Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I nitpick? :)

We have a 2 variables that aren't reused temp_repo and temp_repo_location in more than 1 place, and 2 names for the same thing: temp_repo, temp_repo_location.

Is a 1-liner cleaner?

repo_body = {
   ...
   "location": tempfile.TemporaryDirectory().name 
}

Ignore me if you don't agree ;)

@dblock
Copy link
Member

dblock commented Nov 9, 2023

@roma2023 Want to finish this?

@roma2023
Copy link
Contributor Author

Sure, but I got stuck with this issue: How can I run the docker with the OpenSearch server write permission? Here is more about the details: #486 (comment). Would appreciate your guidance)

@dblock
Copy link
Member

dblock commented Nov 21, 2023

@roma2023 Do you need more help? Let's finish this one?

@roma2023
Copy link
Contributor Author

I am trying to use shared file system as a snapshot in samples/snapshot/snapshot_sample.py I followed the steps specified here (https://opensearch.org/docs/latest/tuning-your-cluster/availability-and-recovery/snapshots/snapshot-restore/#shared-file-system) and added path.repo: ["/mnt/snapshots"] to opensearch.yml , then when running the code, I get the following error:

$ poetry run python snapshot/snapshot_sample.py
Traceback (most recent call last):
  File "C:\Users\HP\Desktop\AmazonOpenSearchService\Opensearch-py-ml\opensearch-py\samples\snapshot\snapshot_sample.py", line 39, in <module>
    response = client.snapshot.create_repository(repository = repository_name, body = repo_body)
  File "C:\Users\HP\AppData\Local\pypoetry\Cache\virtualenvs\package-yvpjSqWs-py3.9\lib\site-packages\opensearchpy\client\utils.py", line 179, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "C:\Users\HP\AppData\Local\pypoetry\Cache\virtualenvs\package-yvpjSqWs-py3.9\lib\site-packages\opensearchpy\client\snapshot.py", line 185, in create_repository
    return self.transport.perform_request(
  File "C:\Users\HP\AppData\Local\pypoetry\Cache\virtualenvs\package-yvpjSqWs-py3.9\lib\site-packages\opensearchpy\transport.py", line 409, in perform_request
    raise e
  File "C:\Users\HP\AppData\Local\pypoetry\Cache\virtualenvs\package-yvpjSqWs-py3.9\lib\site-packages\opensearchpy\transport.py", line 370, in perform_request
    status, headers_response, data = connection.perform_request(
  File "C:\Users\HP\AppData\Local\pypoetry\Cache\virtualenvs\package-yvpjSqWs-py3.9\lib\site-packages\opensearchpy\connection\http_urllib3.py", line 266, in perform_request
    self._raise_error(
  File "C:\Users\HP\AppData\Local\pypoetry\Cache\virtualenvs\package-yvpjSqWs-py3.9\lib\site-packages\opensearchpy\connection\base.py", line 301, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
opensearchpy.exceptions.TransportError: TransportError(500, 'repository_verification_exception', 'failed to create blob container')

It seems to me that the OpenSearch process does not have read and write permissions to /mnt/snapshots directory.

I do not know how to solve this issue. If someone could pick this up from here that would be amazing.

@dblock
Copy link
Member

dblock commented Nov 21, 2023

Yes, you will need to chmod it inside the container. Here's what I did.

  1. Created a new Dockerfile with the following.
FROM opensearchproject/opensearch:2.11.0

ARG OPENSEARCH_HOME=/usr/share/opensearch
ARG UID=1000
ARG GID=1000

RUN echo 'path.repo: ["/usr/share/opensearch/backups"]' >> $OPENSEARCH_HOME/config/opensearch.yml
RUN mkdir -p $OPENSEARCH_HOME/backups 
RUN chown -Rv $UID:$GID $OPENSEARCH_HOME/backups
  1. docker buildx build -t opensearch-snapshot-restore .
  2. docker run -d -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" opensearch-snapshot-restore
  3. Change temp_repo_location = "/usr/share/opensearch/backups" in your code.

Code ran:

$ poetry run python snapshot/snapshot_sample.py 
/usr/share/opensearch/backups
{'acknowledged': True}
{'accepted': True}
{'snapshots': [{'snapshot': 'my_snapshot', 'uuid': 'Q4_hA4jATWmsb6ex38qRsQ', 'version_id': 136327827, 'version': '2.11.0', 'remote_store_index_shallow_copy': False, 'indices': ['test-snapshot'], 'data_streams': [], 'include_global_state': True, 'state': 'IN_PROGRESS', 'start_time': '2023-11-21T20:28:44.288Z', 'start_time_in_millis': 1700598524288, 'end_time': '1970-01-01T00:00:00.000Z', 'end_time_in_millis': 0, 'duration_in_millis': 0, 'failures': [], 'shards': {'total': 0, 'failed': 0, 'successful': 0}}]}

LMK if this doesn't unblock you.

@dblock
Copy link
Member

dblock commented Dec 12, 2023

@roma2023 want to finish this?

@roma2023
Copy link
Contributor Author

@dblock Yes, I will finish this soon.

@roma2023
Copy link
Contributor Author

Apologize for the delay

@roma2023 roma2023 force-pushed the snapshot-md-guide branch 2 times, most recently from cf0faba to fd1c9df Compare December 14, 2023 08:26
dblock
dblock previously approved these changes Dec 15, 2023
Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebase, update nit?

USER_GUIDE.md Outdated
@@ -153,6 +153,7 @@ print(response)
- [Search](guides/search.md)
- [Point in Time](guides/point_in_time.md)
- [Using a Proxy](guides/proxy.md)
- [Taking a Snapshot](guides/snapshot.md)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doc is more than taking, let's say "Working with Snapshots" or "Snapshot Actions" to match what's in the doc?

Signed-off-by: Raman Saparkhan <[email protected]>
Signed-off-by: roma2023 <[email protected]>
dblock and others added 23 commits December 28, 2023 20:35
* Prepare for next developer iteration, 2.4.1.

Signed-off-by: dblock <[email protected]>

* Fix: sync opensearchpy without iohttp.

Signed-off-by: dblock <[email protected]>

* Use nox to run tests.

Signed-off-by: dblock <[email protected]>

---------

Signed-off-by: dblock <[email protected]>
Signed-off-by: roma2023 <[email protected]>
* Fix: TypeError on calling parallel_bulk.

Signed-off-by: dblock <[email protected]>

* Added a sample that uses a bulk function generator.

Signed-off-by: dblock <[email protected]>

---------

Signed-off-by: dblock <[email protected]>
Signed-off-by: roma2023 <[email protected]>
…roject#604)

* Fix invalid value type.

Signed-off-by: dblock <[email protected]>

* Workaround Incompatible types in assignment (expression has type float, variable has type Double)  [assignment]

Signed-off-by: dblock <[email protected]>

---------

Signed-off-by: dblock <[email protected]>
Signed-off-by: roma2023 <[email protected]>
* Added pylint.

Signed-off-by: dblock <[email protected]>

* Enforce pylint:invalid-name.

Signed-off-by: dblock <[email protected]>

* Updated the generated code header to prevent broken links.

Signed-off-by: dblock <[email protected]>

* Swapped order of messages.

Signed-off-by: dblock <[email protected]>

---------

Signed-off-by: dblock <[email protected]>
Signed-off-by: roma2023 <[email protected]>
…ensearch-project#579)

* Added a guide & sample for a custom logger client implementation.

Signed-off-by: Djcarrillo6 <[email protected]>

Black formatter

Signed-off-by: Djcarrillo6 <[email protected]>

* Changes from PR review

Signed-off-by: Djcarrillo6 <[email protected]>

Fixed import formatting in sample code for gudie.

Signed-off-by: Djcarrillo6 <[email protected]>

Fixed nox formatting of log collection sample module.

Signed-off-by: Djcarrillo6 <[email protected]>

Added types to log_collection_sample.py

Signed-off-by: Djcarrillo6 <[email protected]>

Added type ignore to StramHandler class

Signed-off-by: Djcarrillo6 <[email protected]>

Added formatting change

Signed-off-by: Djcarrillo6 <[email protected]>

* Added PR review changes.

Signed-off-by: Djcarrillo6 <[email protected]>

Fixed typo in CHANGELOG.

Signed-off-by: Djcarrillo6 <[email protected]>

Requested changes.

Signed-off-by: Djcarrillo6 <[email protected]>

Requested changes again.

Signed-off-by: Djcarrillo6 <[email protected]>

Added link in USER_GUIDE.md.

Signed-off-by: Djcarrillo6 <[email protected]>

---------

Signed-off-by: Djcarrillo6 <[email protected]>
Signed-off-by: roma2023 <[email protected]>
…earch-project#614)

* Add GitHub action for opensearch-py release

Signed-off-by: Zelin Hao <[email protected]>

* Generate GitHub release at the end

Signed-off-by: Zelin Hao <[email protected]>

* Update CHANGELOG

Signed-off-by: Zelin Hao <[email protected]>

* Update CHANGELOG

Signed-off-by: Zelin Hao <[email protected]>

---------

Signed-off-by: Zelin Hao <[email protected]>
Signed-off-by: roma2023 <[email protected]>
* remove unnecessary utf-8 header in .py files

Signed-off-by: samuel orji <[email protected]>

* review feedback: add link to changelog

Signed-off-by: samuel orji <[email protected]>

---------

Signed-off-by: samuel orji <[email protected]>
Signed-off-by: roma2023 <[email protected]>
…pensearch-project#615) (opensearch-project#617)

* License tools: remove utf-8 coding declaration from license_headers.py check/fix script; since opensearch-project#615 it is no longer used in the library's codebase

UTF-8 is the default encoding used to read source code files for Python3 - see https://docs.python.org/3/howto/unicode.html#unicode-literals-in-python-source-code

Signed-off-by: James Addison <[email protected]>

* Cleanup: remove utf-8 coding declaration from the license_headers.py script itself

Signed-off-by: James Addison <[email protected]>

* Update CHANGELOG.md

Signed-off-by: James Addison <[email protected]>

---------

Signed-off-by: James Addison <[email protected]>
Signed-off-by: roma2023 <[email protected]>
* Add OpenSearch 2.11.1 integration tests.

Signed-off-by: dblock <[email protected]>

* Exclude flaky integration tests with OpenSearch 2.0.1.

Signed-off-by: dblock <[email protected]>

---------

Signed-off-by: dblock <[email protected]>
Signed-off-by: roma2023 <[email protected]>
…oject#616)

Updated CHANGELOG.md.

nox formatting applied.

Added new unit test for actions scan function.

Added type hints & nox formatting.

Added fix to async scan function & added matching unit tests for async.

Signed-off-by: Djcarrillo6 <[email protected]>
Signed-off-by: roma2023 <[email protected]>
…ch-project#625)

* Update pytest-asyncio requirement from <=0.21.1 to <=0.23.2

Updates the requirements on [pytest-asyncio](https://github.com/pytest-dev/pytest-asyncio) to permit the latest version.
- [Release notes](https://github.com/pytest-dev/pytest-asyncio/releases)
- [Commits](pytest-dev/pytest-asyncio@v0.1.1...v0.23.2)

---
updated-dependencies:
- dependency-name: pytest-asyncio
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update changelog

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>
Signed-off-by: roma2023 <[email protected]>
…pensearch-project#640)

* updated files with docstrings to pass pylint

Signed-off-by: Mark Cohen <[email protected]>

* updated samples to prepare for enabling missing-docstring linter; will continue to work on this before committing setup.cfg

Signed-off-by: Mark Cohen <[email protected]>

* removed missing-function-docstring from setup.cfg so the linter doesn't fail while work on docstrings continues

Signed-off-by: Mark Cohen <[email protected]>

* corrected unnecessary return docstring values

Signed-off-by: Mark Cohen <[email protected]>

* fixing failure in 'black' on reformatting

Signed-off-by: Mark Cohen <[email protected]>

---------

Signed-off-by: Mark Cohen <[email protected]>
Signed-off-by: roma2023 <[email protected]>
Signed-off-by: roma2023 <[email protected]>
Signed-off-by: roma2023 <[email protected]>
Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I think we need to explain how to use Dockerfile that you created.

  1. Move samples/Dockerfile into samples/snapshot.
  2. Add a "Working Sample" section to the guide with something like "See .../snapshot for a working sample. To run this sample, use the associated Dockerfile as follows." and include some bash output of how one would start the docker container and run the sample.
    Add something to guides/snapshot.md

@saimedhi
Copy link
Collaborator

Hello @roma2023, could you please take care of the failing DCO and make the requested changes? Thank you for your patience.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.