Tuning S3_File doc comments. (#10832)
- Review and update the doc comments of public functions in the AWS library.
- Reorder the functions to improve their order in the component browser (and online docs).
- Align some error handling.
- Fix bug with `list` on root S3.
- Hide `S3.get_object`, as its single-read stream makes it a poor fit for GUI use.
jdunkerley authored Aug 16, 2024
1 parent b442a38 commit 2dbdcb2
Showing 7 changed files with 448 additions and 227 deletions.
@@ -38,7 +38,8 @@ type AWS_Credential
With_Configuration (base_credential : AWS_Credential) (default_region : AWS_Region)

## ICON cloud
Get a list of the available profiles

Returns a vector of the available profile names.
profile_names : Vector Text
profile_names = Vector.from_polyglot_array <|
ProfileReader.INSTANCE.getProfiles
20 changes: 11 additions & 9 deletions distribution/lib/Standard/AWS/0.0.0-dev/src/Internal/S3_Path.enso
@@ -17,18 +17,20 @@ type S3_Path
## PRIVATE
parse (uri : Text) -> S3_Path ! Illegal_Argument =
if uri.starts_with S3.uri_prefix . not then Error.throw (Illegal_Argument.Error "An S3 path must start with `"+S3.uri_prefix+"`.") else
without_prefix = uri.drop S3.uri_prefix.length
first_slash_index = without_prefix.index_of S3_Path.delimiter
if first_slash_index == 0 then Error.throw (Illegal_Argument.Error "Invalid S3 path: empty bucket name.") else
if first_slash_index.is_nothing then S3_Path.Value without_prefix "" else
bucket = (without_prefix.take first_slash_index)
key = (without_prefix.drop first_slash_index+1)
normalized = Decomposed_S3_Path.parse key . normalize . key
S3_Path.Value bucket normalized
if uri.length == S3.uri_prefix.length then S3_Path.Value "" "" else
without_prefix = uri.drop S3.uri_prefix.length
first_slash_index = without_prefix.index_of S3_Path.delimiter
if first_slash_index == 0 then Error.throw (Illegal_Argument.Error "Invalid S3 path: empty bucket name.") else
if first_slash_index.is_nothing then S3_Path.Value without_prefix "" else
bucket = (without_prefix.take first_slash_index)
if bucket == "" then Error.throw (Illegal_Argument.Error "Invalid S3 path: empty bucket name with key name.") else
key = (without_prefix.drop first_slash_index+1)
normalized = Decomposed_S3_Path.parse key . normalize . key
S3_Path.Value bucket normalized

## PRIVATE
to_text self -> Text =
S3.uri_prefix + self.bucket + S3_Path.delimiter + self.key
S3.uri_prefix + (if self.bucket == "" then "" else self.bucket + S3_Path.delimiter + self.key)

## PRIVATE
to_display_text self -> Text = self.to_text.to_display_text
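
The root-path handling in this hunk can be modelled outside Enso. The following is a minimal Python sketch, not the library's code: it assumes `S3.uri_prefix` is `"s3://"` and `S3_Path.delimiter` is `"/"`, uses hypothetical function names, and leaves out the key normalization done by `Decomposed_S3_Path`.

```python
# Assumed values; the diff only refers to them as S3.uri_prefix and S3_Path.delimiter.
S3_URI_PREFIX = "s3://"
DELIMITER = "/"


def parse_s3_path(uri: str) -> tuple[str, str]:
    """Split an S3 URI into (bucket, key), following the branches in the hunk above.

    Key normalization (Decomposed_S3_Path in the Enso source) is deliberately omitted.
    """
    if not uri.startswith(S3_URI_PREFIX):
        raise ValueError(f"An S3 path must start with `{S3_URI_PREFIX}`.")
    # New branch in this commit: the bare prefix is the S3 root (no bucket, no key).
    if len(uri) == len(S3_URI_PREFIX):
        return ("", "")
    without_prefix = uri[len(S3_URI_PREFIX):]
    first_slash_index = without_prefix.find(DELIMITER)
    if first_slash_index == 0:
        raise ValueError("Invalid S3 path: empty bucket name.")
    if first_slash_index == -1:
        # A bucket with no key, e.g. "s3://my-bucket".
        return (without_prefix, "")
    bucket = without_prefix[:first_slash_index]
    key = without_prefix[first_slash_index + 1:]
    return (bucket, key)


def s3_path_to_text(bucket: str, key: str) -> str:
    """Render (bucket, key) back to a URI; the root renders as the bare prefix."""
    if bucket == "":
        return S3_URI_PREFIX
    return S3_URI_PREFIX + bucket + DELIMITER + key
```

In this model the bare prefix round-trips (`parse_s3_path("s3://")` gives an empty bucket and key, which `s3_path_to_text` renders as `"s3://"` again), while a URI such as `"s3:///key"` is still rejected for its empty bucket name.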
148 changes: 89 additions & 59 deletions distribution/lib/Standard/AWS/0.0.0-dev/src/S3/S3.enso
@@ -29,73 +29,98 @@ polyglot java import software.amazon.awssdk.services.s3.model.S3Exception
polyglot java import software.amazon.awssdk.services.s3.S3Client

## ICON data_input

Gets the list of the S3 bucket names.

Arguments:
- credentials: AWS credentials. If not provided, the default credentials will
be used.
- credentials: The credentials to use to access S3. If not specified, the
default credentials are used.

Returns:
- A vector of bucket names (as Text).

! Error Conditions
- If the credentials are invalid or access to S3 is denied, then an
`AWS_SDK_Error` will be raised.
list_buckets : AWS_Credential -> Vector Text ! S3_Error
list_buckets credentials:AWS_Credential=AWS_Credential.Default = handle_s3_errors <|
list_buckets credentials:AWS_Credential=..Default = handle_s3_errors <|
client = make_client credentials
buckets = client.listBuckets.buckets
buckets.map b->b.name

## GROUP Standard.Base.Input
ICON data_input
## ICON data_input

Gets the list of the items inside a bucket.

Arguments:
- bucket: the name of the bucket.
- prefix: the prefix of keys to match.
- max_count: the maximum number of items to return. The default is 1000.
- credentials: AWS credentials. If not provided, the default credentials will
be used.
- credentials: The credentials to use to access the S3 bucket. If not
specified, the default credentials are used.

Returns:
- A vector of object keys (as Text) (including the prefix).

! Error Conditions
- If the credentials are invalid or access to S3 is denied, then an
`AWS_SDK_Error` will be raised.
- If the bucket does not exist, an `S3_Bucket_Not_Found` error is thrown.
- If more items are available than the `max_count` parameter, a
`More_Records_Available` warning is attached to the result.
list_objects : Text -> Text -> AWS_Credential -> Integer -> Vector Text ! S3_Error
list_objects bucket prefix="" credentials:AWS_Credential=AWS_Credential.Default max_count=1000 =
list_objects bucket prefix="" credentials:AWS_Credential=..Default max_count:Integer=1000 =
read_bucket bucket prefix credentials delimiter="" max_count=max_count . second

## PRIVATE
Reads an S3 bucket returning a pair of vectors, one with common prefixes and
one with object keys.

Arguments:
- bucket: The name of the bucket.
- prefix: The prefix to use when searching for keys to return.
- credentials: The credentials for the AWS resource.
- delimiter: The delimiter used to deduce common prefixes.
read_bucket : Text -> Text -> AWS_Credential -> Integer -> Text -> Pair Vector Vector ! S3_Error
read_bucket bucket prefix="" credentials:AWS_Credential=AWS_Credential.Default delimiter="/" max_count=1000 = handle_s3_errors bucket=bucket <|
client = make_client_for_bucket bucket credentials
ADVANCED
ICON data_input

per_request = Math.min max_count 1000
request = ListObjectsV2Request.builder.bucket bucket . maxKeys per_request . delimiter delimiter . prefix prefix . build
Gets an object from an S3 bucket.
Returns a raw stream which can be read once.

iterator request count current prefixes first =
response = client.listObjectsV2 request
Arguments:
- bucket: the name of the bucket.
- key: the key of the object.
- credentials: AWS credentials. If not provided, the default credentials will
be used.
- delimiter: The delimiter to use for deducing the filename from the path.
get_object : Text -> Text -> AWS_Credential -> Text -> Response_Body ! S3_Error
get_object bucket key credentials:AWS_Credential=AWS_Credential.Default delimiter="/" = handle_s3_errors bucket=bucket key=key <|
request = GetObjectRequest.builder.bucket bucket . key key . build

if response.is_error then response else
## Note the AWS API does not limit the count of common prefixes.
common_prefixes = if first then response.commonPrefixes.map _.prefix else prefixes
result = current + (response.contents.map _.key)
client = make_client_for_bucket bucket credentials
response = client.getObject request

if response.isTruncated.not then Pair.new common_prefixes result else
new_count = count + result.length
if new_count >= max_count then (Warning.attach (More_Records_Available.Warning "Not all keys returned. Additional objects found.") (Pair.new common_prefixes result)) else
new_items = Math.min (Math.max 0 max_count-new_count) 1000
new_request = request.toBuilder.continuationToken response.nextContinuationToken . maxKeys new_items . build
@Tail_Call iterator new_request new_count result common_prefixes False
inner_response = response.response
s3_uri = URI.parse (uri_prefix + bucket + "/") / key
content_type = inner_response.contentType
name = filename_from_content_disposition inner_response.contentDisposition . if_nothing <|
key.split delimiter . last
metadata = File_Format_Metadata.Value path=key name=name content_type=content_type

iterator request 0 [] [] True
input_stream = Input_Stream.new response (handle_io_errors s3_uri)
Response_Body.Raw_Stream input_stream metadata s3_uri

## ADVANCED
ICON data_input

Gets the metadata of a bucket or object.

Arguments:
- bucket: the name of the bucket.
- key: the key of the object.
- credentials: AWS credentials. If not provided, the default credentials will
be used.
- prefix: the prefix of keys to match.
- credentials: The credentials to use to access the S3 bucket. If not
specified, the default credentials are used.

Returns:
- A Dictionary of the associated metadata of a bucket or object.

! Error Conditions
- If the credentials are invalid or access to S3 is denied, then an
`AWS_SDK_Error` will be raised.
- If the bucket does not exist, an `S3_Bucket_Not_Found` error is thrown.
- If the object does not exist, an `S3_Key_Not_Found` error is thrown.
head : Text -> Text -> AWS_Credential -> Dictionary Text Any ! S3_Error
head bucket key="" credentials:AWS_Credential=AWS_Credential.Default =
response = raw_head bucket key credentials
@@ -120,33 +145,38 @@ raw_head bucket key credentials =
request = HeadObjectRequest.builder.bucket bucket . key key . build
handle_s3_errors bucket=bucket key=key <| client.headObject request

## ADVANCED
ICON data_input
Gets an object from an S3 bucket.
Returns a raw stream which can be read once.
## PRIVATE
Reads an S3 bucket returning a pair of vectors, one with common prefixes and
one with object keys.

Arguments:
- bucket: the name of the bucket.
- key: the key of the object.
- credentials: AWS credentials. If not provided, the default credentials will
be used.
- delimiter: The delimiter to use for deducing the filename from the path.
get_object : Text -> Text -> AWS_Credential -> Text -> Response_Body ! S3_Error
get_object bucket key credentials:AWS_Credential=AWS_Credential.Default delimiter="/" = handle_s3_errors bucket=bucket key=key <|
request = GetObjectRequest.builder.bucket bucket . key key . build

- bucket: The name of the bucket.
- prefix: The prefix to use when searching for keys to return.
- credentials: The credentials for the AWS resource.
- delimiter: The delimiter used to deduce common prefixes.
read_bucket : Text -> Text -> AWS_Credential -> Integer -> Text -> Pair Vector Vector ! S3_Error
read_bucket bucket prefix="" credentials:AWS_Credential=AWS_Credential.Default delimiter="/" max_count=1000 = handle_s3_errors bucket=bucket <|
client = make_client_for_bucket bucket credentials
response = client.getObject request

inner_response = response.response
s3_uri = URI.parse (uri_prefix + bucket + "/") / key
content_type = inner_response.contentType
name = filename_from_content_disposition inner_response.contentDisposition . if_nothing <|
key.split delimiter . last
metadata = File_Format_Metadata.Value path=key name=name content_type=content_type
per_request = Math.min max_count 1000
request = ListObjectsV2Request.builder.bucket bucket . maxKeys per_request . delimiter delimiter . prefix prefix . build

input_stream = Input_Stream.new response (handle_io_errors s3_uri)
Response_Body.Raw_Stream input_stream metadata s3_uri
iterator request count current prefixes first =
response = client.listObjectsV2 request

if response.is_error then response else
## Note the AWS API does not limit the count of common prefixes.
common_prefixes = if first then response.commonPrefixes.map _.prefix else prefixes
result = current + (response.contents.map _.key)

if response.isTruncated.not then Pair.new common_prefixes result else
new_count = count + result.length
if new_count >= max_count then (Warning.attach (More_Records_Available.Warning "Not all keys returned. Additional objects found.") (Pair.new common_prefixes result)) else
new_items = Math.min (Math.max 0 max_count-new_count) 1000
new_request = request.toBuilder.continuationToken response.nextContinuationToken . maxKeys new_items . build
@Tail_Call iterator new_request new_count result common_prefixes False

iterator request 0 [] [] True
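
The paging pattern used by `read_bucket` (request at most 1000 keys at a time, follow the continuation token, stop once `max_count` is reached and flag that more records exist) can be illustrated with a simplified Python model. This is a sketch under stated assumptions rather than a transcription of the Enso code: `Page`, `list_keys`, and `make_fake_bucket` are hypothetical names, the fake bucket stands in for `client.listObjectsV2`, common prefixes are ignored, and a returned boolean stands in for the `More_Records_Available` warning.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

PAGE_LIMIT = 1000  # per-request cap used in the hunk above


@dataclass
class Page:
    """One page of results; next_token is None when the listing is not truncated."""
    keys: List[str]
    next_token: Optional[int] = None


def list_keys(fetch_page: Callable[[Optional[int], int], Page],
              max_count: int = 1000) -> Tuple[List[str], bool]:
    """Collect up to max_count keys; the bool is True when more records were available."""
    keys: List[str] = []
    token: Optional[int] = None
    while True:
        remaining = max_count - len(keys)
        page = fetch_page(token, min(remaining, PAGE_LIMIT))
        keys.extend(page.keys)
        if page.next_token is None:
            return keys, False   # listing exhausted, nothing left behind
        if len(keys) >= max_count:
            return keys, True    # limit reached while the listing is still truncated
        token = page.next_token


def make_fake_bucket(all_keys: List[str]) -> Callable[[Optional[int], int], Page]:
    """Hypothetical in-memory stand-in for a paged listing API."""
    def fetch_page(token: Optional[int], max_keys: int) -> Page:
        start = token or 0
        end = start + max_keys
        next_token = end if end < len(all_keys) else None
        return Page(all_keys[start:end], next_token)
    return fetch_page
```

For example, listing a fake bucket of 2500 keys with `max_count=1500` returns the first 1500 keys with the flag set, while `max_count=3000` returns all 2500 keys with the flag cleared.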

## PRIVATE
put_object (bucket : Text) (key : Text) credentials:AWS_Credential=AWS_Credential.Default request_body = handle_s3_errors bucket=bucket key=key <|