Skip to content

Commit

Permalink
Added support for AWS Sigv4 for UrlLib3. (opensearch-project#547)
Browse files Browse the repository at this point in the history
* WIP: Added support for AWS Sigv4 for UrlLib3.

Signed-off-by: dblock <[email protected]>

* Refactored common implementation.

Signed-off-by: dblock <[email protected]>

* Added sigv4 samples.

Signed-off-by: dblock <[email protected]>

* Updated CHANGELOG.

Signed-off-by: dblock <[email protected]>

* Add documentation.

Signed-off-by: dblock <[email protected]>

* Use the correct class in tests.

Signed-off-by: dblock <[email protected]>

* Renamed samples.

Signed-off-by: dblock <[email protected]>

* Split up requests and urllib3 unit tests.

Signed-off-by: dblock <[email protected]>

* Rename AWSV4Signer.

Signed-off-by: dblock <[email protected]>

* Clarified documentation of when to use Urllib3AWSV4SignerAuth vs. RequestHttpConnection.

Signed-off-by: dblock <[email protected]>

* Move fetch_url inside the signer class.

Signed-off-by: dblock <[email protected]>

* Added unit test for Urllib3AWSV4SignerAuth adding headers.

Signed-off-by: dblock <[email protected]>

* Added unit test for signing to include query string.

Signed-off-by: dblock <[email protected]>

---------

Signed-off-by: dblock <[email protected]>
Signed-off-by: roma2023 <[email protected]>
  • Loading branch information
dblock authored and roma2023 committed Dec 28, 2023
1 parent 48b22d3 commit 93b4698
Show file tree
Hide file tree
Showing 19 changed files with 1,677 additions and 1,766 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
- Added `pool_maxsize` for `Urllib3HttpConnection` ([#535](https://github.com/opensearch-project/opensearch-py/pull/535))
- Added benchmarks ([#537](https://github.com/opensearch-project/opensearch-py/pull/537))
- Added guide on making raw JSON REST requests ([#542](https://github.com/opensearch-project/opensearch-py/pull/542))
- Added support for AWS SigV4 for urllib3 ([#547](https://github.com/opensearch-project/opensearch-py/pull/547))
### Changed
- Generate `tasks` client from API specs ([#508](https://github.com/opensearch-project/opensearch-py/pull/508))
- Generate `ingest` client from API specs ([#513](https://github.com/opensearch-project/opensearch-py/pull/513))
Expand Down
16 changes: 14 additions & 2 deletions DEVELOPER_GUIDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,19 @@ docker run -d -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" opensear

Tests require a live instance of OpenSearch running in docker.

This will start a new instance and run tests against the latest version of OpenSearch.
If you have one running.

```
python setup.py test
```

To run tests in a specific test file.

```
python setup.py test -s test_opensearchpy/test_connection.py
```

If you want to auto-start one, the following will start a new instance and run tests against the latest version of OpenSearch.

```
./.ci/run-tests
Expand Down Expand Up @@ -76,7 +88,7 @@ You can also run individual tests matching a pattern (`pytest -k [pattern]`).
```
./.ci/run-tests true 1.3.0 test_no_http_compression
test_opensearchpy/test_connection.py::TestUrllib3Connection::test_no_http_compression PASSED [ 33%]
test_opensearchpy/test_connection.py::TestUrllib3HttpConnection::test_no_http_compression PASSED [ 33%]
test_opensearchpy/test_connection.py::TestRequestsConnection::test_no_http_compression PASSED [ 66%]
test_opensearchpy/test_async/test_connection.py::TestAIOHttpConnection::test_no_http_compression PASSED [100%]
```
Expand Down
13 changes: 9 additions & 4 deletions guides/auth.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
- [Authentication](#authentication)
- [IAM Authentication](#iam-authentication)
- [IAM Authentication with a Synchronous Client](#iam-authentication-with-a-synchronous-client)
- [IAM Authentication with an Async Client](#iam-authentication-with-an-async-client)
- [Kerberos](#kerberos)

Expand All @@ -9,24 +10,28 @@ OpenSearch allows you to use different methods for the authentication via `conne

## IAM Authentication

Opensearch-py supports IAM-based authentication via `AWSV4SignerAuth`, which uses `RequestHttpConnection` as the transport class for communicating with OpenSearch clusters running in Amazon Managed OpenSearch and OpenSearch Serverless, and works in conjunction with [botocore](https://pypi.org/project/botocore/).
This library supports IAM-based authentication when communicating with OpenSearch clusters running in Amazon Managed OpenSearch and OpenSearch Serverless.

## IAM Authentication with a Synchronous Client

For `Urllib3HttpConnection` use `Urllib3AWSV4SignerAuth`, and for `RequestHttpConnection` use `RequestsAWSV4SignerAuth`.

```python
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth
from opensearchpy import OpenSearch, Urllib3HttpConnection, Urllib3AWSV4SignerAuth
import boto3

host = '' # cluster endpoint, for example: my-test-domain.us-east-1.es.amazonaws.com
region = 'us-west-2'
service = 'es' # 'aoss' for OpenSearch Serverless
credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAuth(credentials, region, service)
auth = Urllib3AWSV4SignerAuth(credentials, region, service)

client = OpenSearch(
hosts = [{'host': host, 'port': 443}],
http_auth = auth,
use_ssl = True,
verify_certs = True,
connection_class = RequestsHttpConnection,
connection_class = Urllib3HttpConnection,
pool_maxsize = 20
)

Expand Down
9 changes: 8 additions & 1 deletion opensearchpy/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,12 @@
UnknownDslObject,
ValidationException,
)
from .helpers import AWSV4SignerAsyncAuth, AWSV4SignerAuth
from .helpers import (
AWSV4SignerAsyncAuth,
AWSV4SignerAuth,
RequestsAWSV4SignerAuth,
Urllib3AWSV4SignerAuth,
)
from .helpers.aggs import A
from .helpers.analysis import analyzer, char_filter, normalizer, token_filter, tokenizer
from .helpers.document import Document, InnerDoc, MetaField
Expand Down Expand Up @@ -166,6 +171,8 @@
"OpenSearchWarning",
"OpenSearchDeprecationWarning",
"AWSV4SignerAuth",
"Urllib3AWSV4SignerAuth",
"RequestsAWSV4SignerAuth",
"AWSV4SignerAsyncAuth",
"A",
"AttrDict",
Expand Down
21 changes: 17 additions & 4 deletions opensearchpy/connection/http_urllib3.py
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@
import ssl
import time
import warnings
from typing import Callable

import urllib3 # type: ignore
from urllib3.exceptions import ReadTimeoutError
Expand Down Expand Up @@ -128,10 +129,17 @@ def __init__(
opaque_id=opaque_id,
**kwargs
)
if http_auth is not None:
if isinstance(http_auth, (tuple, list)):
http_auth = ":".join(http_auth)
self.headers.update(urllib3.make_headers(basic_auth=http_auth))

self.http_auth = http_auth
if self.http_auth is not None:
if isinstance(self.http_auth, Callable):
pass
elif isinstance(self.http_auth, (tuple, list)):
self.headers.update(
urllib3.make_headers(basic_auth=":".join(http_auth))
)
else:
self.headers.update(urllib3.make_headers(basic_auth=http_auth))

pool_class = urllib3.HTTPConnectionPool
kw = {}
Expand Down Expand Up @@ -218,6 +226,7 @@ def perform_request(
url = "%s?%s" % (url, urlencode(params))

full_url = self.host + url

start = time.time()
orig_body = body
try:
Expand All @@ -240,6 +249,10 @@ def perform_request(
body = self._gzip_compress(body)
request_headers["content-encoding"] = "gzip"

if self.http_auth is not None:
if isinstance(self.http_auth, Callable):
request_headers.update(self.http_auth(method, full_url, body))

response = self.pool.urlopen(
method, url, body, retries=Retry(False), headers=request_headers, **kw
)
Expand Down
4 changes: 3 additions & 1 deletion opensearchpy/helpers/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@
)
from .asyncsigner import AWSV4SignerAsyncAuth
from .errors import BulkIndexError, ScanError
from .signer import AWSV4SignerAuth
from .signer import AWSV4SignerAuth, RequestsAWSV4SignerAuth, Urllib3AWSV4SignerAuth

__all__ = [
"BulkIndexError",
Expand All @@ -54,6 +54,8 @@
"_process_bulk_chunk",
"AWSV4SignerAuth",
"AWSV4SignerAsyncAuth",
"RequestsAWSV4SignerAuth",
"Urllib3AWSV4SignerAuth",
]


Expand Down
1 change: 1 addition & 0 deletions opensearchpy/helpers/__init__.pyi
Original file line number Diff line number Diff line change
Expand Up @@ -48,5 +48,6 @@ try:
from .._async.helpers.actions import async_streaming_bulk as async_streaming_bulk
from .asyncsigner import AWSV4SignerAsyncAuth as AWSV4SignerAsyncAuth
from .signer import AWSV4SignerAuth as AWSV4SignerAuth
from .signer import RequestsAWSV4SignerAuth, Urllib3AWSV4SignerAuth
except (ImportError, SyntaxError):
pass
125 changes: 79 additions & 46 deletions opensearchpy/helpers/signer.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
# GitHub history for details.

import sys
from typing import Any, Callable, Dict

import requests

Expand All @@ -17,38 +18,12 @@
from urllib.parse import parse_qs, urlencode, urlparse


def fetch_url(prepared_request): # type: ignore
class AWSV4Signer:
"""
This is a util method that helps in reconstructing the request url.
:param prepared_request: unsigned request
:return: reconstructed url
Generic AWS V4 Request Signer.
"""
url = urlparse(prepared_request.url)
path = url.path or "/"

# fetch the query string if present in the request
querystring = ""
if url.query:
querystring = "?" + urlencode(
parse_qs(url.query, keep_blank_values=True), doseq=True
)

# fetch the host information from headers
headers = dict(
(key.lower(), value) for key, value in prepared_request.headers.items()
)
location = headers.get("host") or url.netloc

# construct the url and return
return url.scheme + "://" + location + path + querystring


class AWSV4SignerAuth(requests.auth.AuthBase):
"""
AWS V4 Request Signer for Requests.
"""

def __init__(self, credentials, region, service="es"): # type: ignore
def __init__(self, credentials, region: str, service: str = "es") -> Any: # type: ignore
if not credentials:
raise ValueError("Credentials cannot be empty")
self.credentials = credentials
Expand All @@ -61,27 +36,20 @@ def __init__(self, credentials, region, service="es"): # type: ignore
raise ValueError("Service name cannot be empty")
self.service = service

def __call__(self, request): # type: ignore
return self._sign_request(request) # type: ignore

def _sign_request(self, prepared_request): # type: ignore
def sign(self, method: str, url: str, body: Any) -> Dict[str, str]:
"""
This method helps in signing the request by injecting the required headers.
:param prepared_request: unsigned request
:return: signed request
This method signs the request and returns headers.
:param method: HTTP method
:param url: url
:param body: body
:return: headers
"""

from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

url = fetch_url(prepared_request) # type: ignore

# create an AWS request object and sign it using SigV4Auth
aws_request = AWSRequest(
method=prepared_request.method.upper(),
url=url,
data=prepared_request.body,
)
aws_request = AWSRequest(method=method.upper(), url=url, data=body)

# credentials objects expose access_key, secret_key and token attributes
# via @property annotations that call _refresh() on every access,
Expand All @@ -101,9 +69,74 @@ def _sign_request(self, prepared_request): # type: ignore
sig_v4_auth.add_auth(aws_request)

# copy the headers from AWS request object into the prepared_request
prepared_request.headers.update(dict(aws_request.headers.items()))
prepared_request.headers["X-Amz-Content-SHA256"] = sig_v4_auth.payload(
aws_request
headers = dict(aws_request.headers.items())
headers["X-Amz-Content-SHA256"] = sig_v4_auth.payload(aws_request)

return headers


class RequestsAWSV4SignerAuth(requests.auth.AuthBase):
"""
AWS V4 Request Signer for Requests.
"""

def __init__(self, credentials, region, service="es"): # type: ignore
self.signer = AWSV4Signer(credentials, region, service)

def __call__(self, request): # type: ignore
return self._sign_request(request) # type: ignore

def _sign_request(self, prepared_request): # type: ignore
"""
This method helps in signing the request by injecting the required headers.
:param prepared_request: unsigned request
:return: signed request
"""

prepared_request.headers.update(
self.signer.sign(
prepared_request.method,
self._fetch_url(prepared_request), # type: ignore
prepared_request.body,
)
)

return prepared_request

def _fetch_url(self, prepared_request): # type: ignore
"""
This is a util method that helps in reconstructing the request url.
:param prepared_request: unsigned request
:return: reconstructed url
"""
url = urlparse(prepared_request.url)
path = url.path or "/"

# fetch the query string if present in the request
querystring = ""
if url.query:
querystring = "?" + urlencode(
parse_qs(url.query, keep_blank_values=True), doseq=True
)

# fetch the host information from headers
headers = dict(
(key.lower(), value) for key, value in prepared_request.headers.items()
)
location = headers.get("host") or url.netloc

# construct the url and return
return url.scheme + "://" + location + path + querystring


# Deprecated: use RequestsAWSV4SignerAuth
class AWSV4SignerAuth(RequestsAWSV4SignerAuth):
pass


class Urllib3AWSV4SignerAuth(Callable): # type: ignore
def __init__(self, credentials, region, service="es"): # type: ignore
self.signer = AWSV4Signer(credentials, region, service)

def __call__(self, method: str, url: str, body: Any) -> Dict[str, str]:
return self.signer.sign(method, url, body)
22 changes: 22 additions & 0 deletions samples/aws/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
## AWS SigV4 Samples

Create an OpenSearch domain in (AWS) which support IAM based AuthN/AuthZ.

```
export AWS_ACCESS_KEY_ID=
export AWS_SECRET_ACCESS_KEY=
export AWS_SESSION_TOKEN=
export AWS_REGION=us-west-2
export SERVICE=es # use "aoss" for OpenSearch Serverless.
export ENDPOINT=https://....us-west-2.es.amazonaws.com
poetry run aws/search-urllib.py
```

This will output the version of OpenSearch and a search result.

```
opensearch: 2.3.0
{'director': 'Bennett Miller', 'title': 'Moneyball', 'year': 2011}
```
Loading

0 comments on commit 93b4698

Please sign in to comment.