The DataMeshProducer.py class provides functions that help data Producers create and manage Data Products.
DataMeshProducer(
data_mesh_account_id: str,
region_name: str = 'us-east-1',
log_level: str = "INFO",
use_credentials=None
)
data_mesh_account_id
: The AWS Account ID to use as the central Data Mesh Account in the region
region_name
: The short AWS Region Name in which you want to execute Producer functions
log_level
: The level of information you want to see when executing. Based upon Python logging; values include INFO, DEBUG, ERROR, etc.
use_credentials
: Credentials to use to set up the instance. This can be provided as a boto3 Credentials object or as a dict with the structure below. If None is provided, the boto3 environment will be accessed.
{
"AccountId": "The Consumer AWS Account ID",
"AccessKeyId": "Your access key",
"SecretAccessKey": "Your secret key",
"SessionToken": "Optional - a session token, if you are using an IAM Role & temporary credentials"
}
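As a rough sketch of how the dict form of use_credentials might be assembled and passed to the constructor: the helper names build_credentials and make_producer are hypothetical, and the class is passed in as an argument because the import path is not stated here.

```python
from typing import Optional


def build_credentials(account_id: str, access_key_id: str,
                      secret_access_key: str,
                      session_token: Optional[str] = None) -> dict:
    """Assemble the dict form of use_credentials (hypothetical helper)."""
    credentials = {
        "AccountId": account_id,
        "AccessKeyId": access_key_id,
        "SecretAccessKey": secret_access_key,
    }
    # SessionToken is optional; include it only for temporary credentials.
    if session_token is not None:
        credentials["SessionToken"] = session_token
    return credentials


def make_producer(producer_cls, data_mesh_account_id: str, credentials: dict,
                  region_name: str = "us-east-1"):
    """Instantiate the producer class with the documented keyword arguments."""
    return producer_cls(
        data_mesh_account_id=data_mesh_account_id,
        region_name=region_name,
        log_level="INFO",
        use_credentials=credentials,
    )
```

With the real class imported, a call would look like make_producer(DataMeshProducer, "<mesh account id>", build_credentials(...)).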
The following methods are available:
create_data_products
list_pending_access_requests
approve_access_request
deny_access_request
update_subscription_permissions
delete_subscription
get_data_product
Creates a new data product offering of one or more tables. When creating a set of data products, the object metadata is copied into the Lake Formation catalog of the data mesh account, and appropriate grants are created to enable the producer to administer the central metadata.
create_data_products(
source_database_name: str,
create_public_metadata: bool = True,
table_name_regex: str = None,
domain: str = None,
data_product_name: str = None,
sync_mesh_catalog_schedule: str = None,
sync_mesh_crawler_role_arn: str = None,
expose_data_mesh_db_name: str = None,
expose_table_references_with_suffix: str = "_link"
)
source_database_name
(String) - The name of the Source Database. Only one Database at a time may be used to create a set of data products.
table_name_regex
(String) - A table name or regular expression matching a set of tables to be offered. Optional.
domain
(String) - A domain name to be associated with the data product.
data_product_name
(String) - The data product name to be used for the resolved objects. If not provided, then only direct sharing grants will be possible.
create_public_metadata
(Boolean) - True or False, indicating whether the read-only role should be granted DESCRIBE on metadata.
sync_mesh_catalog_schedule
(String) - CRON expression indicating how often the data mesh catalog should be synced with the source. Optional. If not provided, metadata will be updated every 4 hours if a sync_mesh_crawler_role_arn is provided.
sync_mesh_crawler_role_arn
(String) - IAM Role ARN to be used to create a Glue Crawler which will update the structure of the data mesh metadata based upon changes to the source. Optional. If not provided, metadata will not be updated from source.
expose_data_mesh_db_name
(String) - Overrides the name of the database in the Data Mesh account with the provided value. If not provided, then the database name will be set to <original name>-<account id>.
expose_table_references_with_suffix
(String) - Overrides the suffix to be set on all resource links shared back to the Producer. Default is <original name>_link.
dict
{
'DatabaseName': str,
'Tables': [
{
'SourceTable': str,
'LinkTable': str
},
...
]
}
- (dict)
DatabaseName
: The name of the database created
Tables
: List of Tables created in the mesh account
SourceTable
: The table that was shared to the data mesh
LinkTable
: The resource link that is shared back to the Producer Account
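A minimal sketch of calling create_data_products and reading the response. It assumes that Tables is a list of dicts, each with SourceTable and LinkTable keys; the helper name publish_tables is hypothetical, and producer is any already constructed DataMeshProducer instance.

```python
def publish_tables(producer, source_database_name: str, table_name_regex: str,
                   domain: str, data_product_name: str) -> dict:
    """Create data products and map each source table to its resource link."""
    response = producer.create_data_products(
        source_database_name=source_database_name,
        table_name_regex=table_name_regex,
        domain=domain,
        data_product_name=data_product_name,
    )
    # Assumption: 'Tables' is a list of dicts with 'SourceTable' and 'LinkTable'.
    return {
        table["SourceTable"]: table["LinkTable"]
        for table in response.get("Tables", [])
    }
```

The remaining keyword arguments are left at their documented defaults, so no Glue Crawler sync is configured in this sketch.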
Returns a list of requests made by Consumers to access products owned by the calling principal that have not yet been approved, denied, or deleted.
list_pending_access_requests()
None
dict
{
'Subscriptions': [
{
"SubscriptionId": str,
"DatabaseName": str,
"TableName": list<string>,
"RequestedGrants": list<string>,
"SubscriberPrincipal": str,
"CreationDate": str,
"CreatedBy": str
},
...
]
}
- (dict)
Subscriptions
: List of pending subscriptions
SubscriptionId
: The ID assigned to the Subscription request
DatabaseName
: The name of the database containing shared objects
TableName
: List of table names being requested
RequestedGrants
: Grants requested by the Consumer
SubscriberPrincipal
: The AWS Account Number of the requesting Consumer
CreationDate
: Date the request was made, in YYYY-MM-DD HH:MI:SS format
CreatedBy
: The Identity of the Principal who requested access.
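The response can be post-processed with plain Python. The sketch below sorts pending requests by CreationDate and filters them by database; the helper names oldest_first and for_database are hypothetical, and the strptime pattern is an interpretation of the YYYY-MM-DD HH:MI:SS format described above.

```python
from datetime import datetime

# Interpretation of the documented YYYY-MM-DD HH:MI:SS format.
CREATION_DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def oldest_first(response: dict) -> list:
    """Return pending subscriptions sorted by CreationDate, oldest first."""
    subscriptions = response.get("Subscriptions", [])
    return sorted(
        subscriptions,
        key=lambda s: datetime.strptime(s["CreationDate"], CREATION_DATE_FORMAT),
    )


def for_database(response: dict, database_name: str) -> list:
    """Return only the pending subscriptions that target one database."""
    return [
        s for s in response.get("Subscriptions", [])
        if s["DatabaseName"] == database_name
    ]
```

Both helpers take the dict returned by list_pending_access_requests() and leave it unmodified.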
Approves a subscription request raised by a Consumer. The permissions granted can match what was requested, or they can be overridden.
approve_access_request(
request_id: str,
grant_permissions: list = None,
grantable_permissions: list = None,
decision_notes: str = None
)
request_id
: The Subscription Request that is being approved
grant_permissions
: The permissions to be granted to the Consumer. If None, then all requested permissions will be granted.
grantable_permissions
: The permissions which the Consumer can grant to other principals within their AWS Account. If None, then all grant_permissions will be grantable.
decision_notes
: String value attached to the Subscription containing information about the approval.
None
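One way to override the requested permissions is to grant only the subset that the Producer is willing to allow. The following is a sketch under assumptions: approve_with_limit is a hypothetical helper, request is one entry from the Subscriptions list returned by list_pending_access_requests, and the permission names come from the caller.

```python
from typing import Optional


def approve_with_limit(producer, request: dict, allowed: set, notes: str,
                       grantable: Optional[list] = None) -> list:
    """Approve a pending request, granting only requested permissions in `allowed`."""
    requested = request.get("RequestedGrants", [])
    # Keep the Consumer's ordering and drop anything outside the allowed set.
    to_grant = [p for p in requested if p in allowed]
    if not to_grant:
        # Nothing acceptable was requested; the caller may prefer to deny instead.
        raise ValueError("No requested permission is allowed for this request")
    producer.approve_access_request(
        request_id=request["SubscriptionId"],
        grant_permissions=to_grant,
        # None means all grant_permissions become grantable, per the parameter list.
        grantable_permissions=grantable,
        decision_notes=notes,
    )
    return to_grant
```

Raising when the intersection is empty avoids passing an empty list, whose handling is not described here.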
Marks a requested subscription from a Consumer as deleted. No grants are made and no objects are shared.
deny_access_request(
request_id: str,
decision_notes: str = None
)
request_id
: The ID of the Subscription being denied
decision_notes
: String value indicating why the Subscription was denied.
None
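As an illustration, pending requests could be screened against a list of trusted accounts and denied in bulk. This is a sketch only: deny_unknown_principals and trusted_accounts are hypothetical names, and the wording of the decision note is a placeholder.

```python
def deny_unknown_principals(producer, trusted_accounts: set) -> list:
    """Deny every pending request whose SubscriberPrincipal is not trusted."""
    pending = producer.list_pending_access_requests().get("Subscriptions", [])
    denied = []
    for request in pending:
        principal = request["SubscriberPrincipal"]
        if principal not in trusted_accounts:
            producer.deny_access_request(
                request_id=request["SubscriptionId"],
                decision_notes=f"Account {principal} is not on the approved list",
            )
            denied.append(request["SubscriptionId"])
    # IDs of the requests that were denied, in the order they were listed.
    return denied
```

Requests from trusted accounts are left pending so that they can be reviewed and approved individually.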
Method allowing a Producer to change the permissions granted to a Consumer.
update_subscription_permissions(
subscription_id: str,
grant_permissions: list,
notes: str
)
subscription_id
: The ID of the Subscription being modified
grant_permissions
: The permissions that will be set on the shared objects after update
notes
: String value associated with the permissions modification
None
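Because grant_permissions describes the permissions that will be set after the update, removing a single permission means passing the remaining set. A minimal sketch, assuming the caller already knows the Consumer's current permissions; revoke_permission is a hypothetical helper.

```python
def revoke_permission(producer, subscription_id: str, current_permissions: list,
                      revoke: str, notes: str) -> list:
    """Update a subscription so that `revoke` is no longer granted."""
    remaining = [p for p in current_permissions if p != revoke]
    if not remaining:
        # Removing the last permission is a teardown, not an update.
        raise ValueError("No permissions would remain; use delete_subscription instead")
    producer.update_subscription_permissions(
        subscription_id=subscription_id,
        grant_permissions=remaining,
        notes=notes,
    )
    return remaining
```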
Tears down a granted Subscription so it can no longer be used by the Consumer. The record of the Subscription is retained for future auditing.
delete_subscription(
subscription_id: str,
reason: str
)
subscription_id
: The ID of the Subscription being deleted
reason
: String value associated with the deletion.
None
Fetches information from the system about a set of tables from a given database in the Data Mesh.
get_data_product(
database_name: str,
table_name_regex: str
)
database_name
: The Database Name in the mesh to retrieve tables from
table_name_regex
: String value or regular expression matching one or more tables to retrieve
list
[
{
'Database': str,
'TableName': str,
'Location': str
}
]
- (list)
- (dict)
DatabaseName
: The name of the database matched
TableName
: The name of the table matched
Location
: S3 Location of the Table
- (dict)
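The returned list can be turned into a lookup from table name to S3 location. A small sketch, relying only on the TableName and Location keys; table_locations is a hypothetical helper and producer is an already constructed instance.

```python
def table_locations(producer, database_name: str, table_name_regex: str) -> dict:
    """Map each matched table name to its S3 location."""
    products = producer.get_data_product(
        database_name=database_name,
        table_name_regex=table_name_regex,
    )
    # Each entry is expected to be a dict with 'TableName' and 'Location'.
    return {p["TableName"]: p["Location"] for p in products}
```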