Adds a sample demonstrating how to implement an incremental refresh based on the Hyper API and Hyper Update REST API. (#64)
* Adds an incremental refresh sample based on the Hyper Update REST API. Adds a sample demonstrating how to implement an incremental refresh based on the Hyper API and the Hyper Update REST API. The sample is based on the content the Hyper team presented in the Hands on Training session "Hands-on: Leverage the Hyper Update API and Hyper API to Keep Your Data Fresh on Tableau Server" at Tableau Conference 2022.
* Minor: Added argparser to pass in arguments and minor rephrasing of the README file.
* Added the OpenSkyAPI to the requirements.txt file and removed the instructions to manually install it.
1 parent ecf38e2, commit 9ad9c0d
Showing 5 changed files with 181 additions and 1 deletion.
51 changes: 51 additions & 0 deletions
Community-Supported/flights-data-incremental-refresh/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
# flights-data-incremental-refresh | ||
## __Incremental Refresh using the OpenSkyApi__ | ||
|
||
![Community Supported](https://img.shields.io/badge/Support%20Level-Community%20Supported-53bd92.svg) | ||
|
||
This sample is based on the content the Hyper team presented in the Hands on Training session "Hands-on: Leverage the Hyper Update API and Hyper API to Keep Your Data Fresh on Tableau Server" at Tableau Conference 2022 ([slides available here](https://mkt.tableau.com/tc22/sessions/live/430-HOT-D1_Hands-onLeverageTheHyperUpdate.pdf)). | ||
|
||
This script pulls down flights data from the [OpenSkyAPI](https://github.com/openskynetwork/opensky-api), creates a Hyper database with this data, and uses the [Hyper Update API](https://help.tableau.com/current/api/rest_api/en-us/REST/rest_api_how_to_update_data_to_hyper.htm) to implement an incremental refresh on your Tableau Server/Cloud. The first time this script is executed, the database file is simply published as a new datasource. | ||
|
||
# Get started | ||
|
||
## __Prerequisites__ | ||
To run the script, you will need: | ||
- Windows, Linux, or Mac | ||
- Python 3 | ||
- Run `pip install -r requirements.txt` | ||
- Tableau Server Credentials, see below. | ||
|
||
## Tableau Server Credentials | ||
To run this sample with your Tableau Server/Cloud, you first need to get the following information: | ||
- Tableau Server URL, e.g., 'https://us-west-2a.online.tableau.com' | ||
- Site name, e.g., use 'default' for your default site (note that you cannot use 'default' in Tableau Cloud but must use the site name) | ||
- Project name, e.g., use an empty string ('') for your default project | ||
- [Token Name and Token Value](https://help.tableau.com/current/server/en-us/security_personal_access_tokens.htm) | ||
|
||
Ensure that you have installed the requirements and then just run the sample Python file with the information from above. The syntax for running the script is: | ||
|
||
**python flights-data-incremental-refresh.py [-h] server_url site_name project_name token_name token_value** | ||
|
||
# Incremental Refresh using the OpenSkyApi | ||
The script consists of two parts: first it creates a Hyper database with flights data and then publishes the database to Tableau Server/Cloud. | ||
|
||
## Create a database with flights data | ||
The `create_hyper_database_with_flights_data` function creates an instance of `OpenSkyApi` and then pulls down the states within a fixed bounding box. This example uses only a subset of the available data because it relies on the free version of the OpenSkyApi. | ||
|
||
Then, a Hyper database is created with a table with name `TableName("public", "flights")`. Finally, an inserter is used to insert the flights data. | ||
|
||
## Publish the hyper database to Tableau Server / Cloud | ||
The `publish_to_server` function first signs in to Tableau Server / Cloud. Then it looks up the project to which the database should be published. | ||
|
||
There are two cases for publishing the database to Server: | ||
- No datasource with name `datasource_name_on_server` exists on Tableau Server. In this case, the script simply creates the initial datasource on Tableau Server. This datasource is needed for the subsequent incremental refreshes, as the data will be added to this datasource. | ||
- The datasource with name `datasource_name_on_server` already exists on Tableau Server. In this case, the script uses the Hyper Update REST API to insert the data from the database into the respective table in the datasource on Tableau Server/Cloud. | ||
|
||
## __Resources__ | ||
Check out these resources to learn more: | ||
- [Hyper API documentation](https://help.tableau.com/current/api/hyper_api/en-us/index.html) | ||
- [Hyper Update API documentation](https://help.tableau.com/current/api/rest_api/en-us/REST/rest_api_how_to_update_data_to_hyper.htm) | ||
- [Tableau Server Client Docs](https://tableau.github.io/server-client-python/docs/) | ||
- [REST API documentation](https://help.tableau.com/current/api/rest_api/en-us/REST/rest_api.htm) | ||
- [Tableau Tools](https://github.com/bryantbhowell/tableau_tools) |
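
The two publishing cases described in the README can be sketched in plain Python, without a Tableau Server connection. This is a minimal illustration under stated assumptions: `plan_publish` and `build_insert_actions` are hypothetical helper names that do not appear in the sample, and the list of existing datasource names stands in for the lookup the script performs on the server. The action dictionary mirrors the one used in `flights-data-incremental-refresh.py`.

```python
import uuid


def build_insert_actions(schema="public", table="flights"):
    """Build one 'insert' action that copies rows from the uploaded
    table into the table of the same name in the published datasource."""
    return [
        {
            "action": "insert",
            "source-schema": schema,
            "source-table": table,
            "target-schema": schema,
            "target-table": table,
        }
    ]


def plan_publish(existing_datasource_names, datasource_name):
    """Decide which of the two publishing cases applies (hypothetical helper)."""
    if datasource_name not in existing_datasource_names:
        # Case 1: nothing published yet, so the Hyper file becomes the initial datasource.
        return {"step": "create"}
    # Case 2: the datasource exists, so only the new rows are appended.
    return {
        "step": "update",
        "request_id": str(uuid.uuid4()),  # fresh random id for each update request
        "actions": build_insert_actions(),
    }


if __name__ == "__main__":
    print(plan_publish([], "flights_data_set"))
    print(plan_publish(["flights_data_set"], "flights_data_set"))
```

In the actual script, the "create" case corresponds to `server.datasources.publish` and the "update" case to `server.datasources.update_hyper_data`; the sketch only returns a plan so that the decision can be read and tested in isolation.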
123 changes: 123 additions & 0 deletions
Community-Supported/flights-data-incremental-refresh/flights-data-incremental-refresh.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,123 @@ | ||
from tableauhyperapi import HyperProcess, Connection, Telemetry, TableDefinition, TableName, CreateMode, SqlType, Nullability, Inserter | ||
from opensky_api import OpenSkyApi | ||
import tableauserverclient as TSC | ||
import uuid | ||
import argparse | ||
|
||
def create_hyper_database_with_flights_data(database_path): | ||
    """ | ||
    Leverages the OpenSkyAPI (https://github.com/openskynetwork/opensky-api) to create a | ||
    Hyper database with flights data. | ||
    """ | ||
    # Create an instance of the opensky api to retrieve data from OpenSky via HTTP. | ||
    opensky = OpenSkyApi() | ||
    # Get the most recent state vector. Note that we can only call this method every | ||
    # 10 seconds as we are using the free version of the API. | ||
    states = opensky.get_states(bbox=(45.8389, 47.8229, 5.9962, 10.5226)) | ||
    if states is None: | ||
        # get_states returns None when the request fails (e.g. due to rate limiting). | ||
        raise RuntimeError("Could not retrieve flights data from the OpenSky API.") | ||
|
||
    # Start up a local Hyper process. | ||
    with HyperProcess(telemetry=Telemetry.SEND_USAGE_DATA_TO_TABLEAU) as hyper: | ||
        # Create a connection to the Hyper process and connect to a hyper file | ||
        # (create the file and replace if it exists). | ||
        with Connection(endpoint=hyper.endpoint, database=database_path, create_mode=CreateMode.CREATE_AND_REPLACE) as connection: | ||
            # Create a table definition with table name "flights" in the "public" schema | ||
            # and columns for flights data. | ||
            table_definition = TableDefinition( | ||
                table_name=TableName("public", "flights"), | ||
                columns=[ | ||
                    TableDefinition.Column('baro_altitude', SqlType.double(), Nullability.NULLABLE), | ||
                    TableDefinition.Column('callsign', SqlType.text(), Nullability.NOT_NULLABLE), | ||
                    TableDefinition.Column('latitude', SqlType.double(), Nullability.NULLABLE), | ||
                    TableDefinition.Column('longitude', SqlType.double(), Nullability.NULLABLE), | ||
                    TableDefinition.Column('on_ground', SqlType.bool(), Nullability.NOT_NULLABLE), | ||
                    TableDefinition.Column('origin_country', SqlType.text(), Nullability.NOT_NULLABLE), | ||
                    TableDefinition.Column('time_position', SqlType.int(), Nullability.NULLABLE), | ||
                    TableDefinition.Column('velocity', SqlType.double(), Nullability.NULLABLE), | ||
                ]) | ||
            # Create the flights table. | ||
            connection.catalog.create_table(table_definition) | ||
|
||
            # Insert each of the states into the table. | ||
            with Inserter(connection, table_definition) as inserter: | ||
                for s in states.states: | ||
                    inserter.add_row([s.baro_altitude, s.callsign, s.latitude, s.longitude, s.on_ground, s.origin_country, s.time_position, s.velocity]) | ||
                inserter.execute() | ||
|
||
            num_flights = connection.execute_scalar_query(query=f"SELECT COUNT(*) from {table_definition.table_name}") | ||
            print(f"Inserted {num_flights} flights into {database_path}.") | ||
|
||
def publish_to_server(server_url, tableau_auth, project_name, database_path, datasource_name_on_server): | ||
    """ | ||
    Creates the datasource on Tableau Server if it has not yet been created. Otherwise, uses the | ||
    Hyper Update REST API (https://help.tableau.com/current/api/rest_api/en-us/REST/rest_api_how_to_update_data_to_hyper.htm) to append the data to the datasource. | ||
    """ | ||
    # Create a tableauserverclient object to interact with Tableau Server. | ||
    server = TSC.Server(server_url, use_server_version=True) | ||
    # Sign into Tableau Server with the above authentication information. | ||
    with server.auth.sign_in(tableau_auth): | ||
        # Get project_id from project_name. | ||
        matching_projects = server.projects.filter(name=project_name) | ||
        project_id = next((project.id for project in matching_projects if project.name == project_name), None) | ||
        if project_id is None: | ||
            print(f"Publish failed. The specified project '{project_name}' does not exist.") | ||
            exit() | ||
|
||
        # Get the datasource from Server (if it exists). | ||
        matching_datasources = server.datasources.filter(name=datasource_name_on_server) | ||
        datasource = next((ds for ds in matching_datasources), None) | ||
|
||
        if datasource is None: | ||
            # If the datasource does not exist on server, publish the datasource. | ||
            publish_mode = TSC.Server.PublishMode.CreateNew | ||
            datasource = TSC.DatasourceItem(project_id) | ||
            # Set the name of the datasource such that it can be easily identified. | ||
            datasource.name = datasource_name_on_server | ||
            datasource = server.datasources.publish(datasource, database_path, publish_mode) | ||
            print(f"New datasource published: (id : {datasource.id}, name: {datasource.name}).") | ||
        else: | ||
            # If the datasource already exists on Tableau Server, use the Hyper Update REST API | ||
            # to send the delta to Tableau Server and insert the data into the respective table | ||
            # in the datasource. | ||
|
||
            # Create a new random request id. | ||
            request_id = str(uuid.uuid4()) | ||
|
||
            # Create one action that inserts from the new table into the existing table. | ||
            # For more details, see https://help.tableau.com/current/api/rest_api/en-us/REST/rest_api_how_to_update_data_to_hyper.htm#action-batch-descriptions | ||
            actions = [ | ||
                { | ||
                    "action": "insert", | ||
                    "source-schema": "public", | ||
                    "source-table": "flights", | ||
                    "target-schema": "public", | ||
                    "target-table": "flights", | ||
                } | ||
            ] | ||
|
||
            # Start the update job on Server. | ||
            job = server.datasources.update_hyper_data(datasource.id, request_id=request_id, actions=actions, payload=database_path) | ||
            print(f"Update job posted (ID: {job.id}). Waiting for the job to complete...") | ||
|
||
            # Wait for the job to finish. | ||
            job = server.jobs.wait_for_job(job) | ||
            print("Job finished successfully") | ||
|
||
|
||
if __name__ == '__main__': | ||
    argparser = argparse.ArgumentParser(description="Incremental refresh with flights data.") | ||
    argparser.add_argument("server_url", help="The url of Tableau Server / Cloud, e.g. 'https://us-west-2a.online.tableau.com'") | ||
    argparser.add_argument("site_name", help="The name of your site, e.g., use 'default' for your default site. Note that you cannot use 'default' in Tableau Cloud but must use the site name.", default='default') | ||
    argparser.add_argument("project_name", help="The name of your project, e.g., use an empty string ('') for your default project.", default="") | ||
    argparser.add_argument("token_name", help="The name of your authentication token for Tableau Server/Cloud. See this url for more details: https://help.tableau.com/current/server/en-us/security_personal_access_tokens.htm") | ||
    argparser.add_argument("token_value", help="The value of your authentication token for Tableau Server/Cloud. See this url for more details: https://help.tableau.com/current/server/en-us/security_personal_access_tokens.htm") | ||
    args = argparser.parse_args() | ||
|
||
    # First create the hyper database with flights data. | ||
    database_path = "flights.hyper" | ||
    create_hyper_database_with_flights_data(database_path) | ||
|
||
    # Then publish the data to server. | ||
    datasource_name_on_server = 'flights_data_set' | ||
    # Create credentials to sign into Tableau Server. | ||
    tableau_auth = TSC.PersonalAccessTokenAuth(args.token_name, args.token_value, args.site_name) | ||
    publish_to_server(args.server_url, tableau_auth, args.project_name, database_path, datasource_name_on_server) |
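
One detail of the argument parsing in this file is worth a note: `site_name` and `project_name` are declared as plain positional arguments with a `default`. In `argparse`, a plain positional is always required, so that default never takes effect; it only applies when the argument is declared with `nargs="?"`. The sketch below illustrates the difference on a reduced parser. `build_parser` and `parse_or_none` are hypothetical helpers written for this illustration, and the URL is a placeholder.

```python
import argparse


def build_parser(optional_site: bool) -> argparse.ArgumentParser:
    """Build a reduced parser with only the arguments needed for the demo."""
    parser = argparse.ArgumentParser(description="Incremental refresh with flights data.")
    parser.add_argument("server_url")
    if optional_site:
        # nargs="?" makes the positional optional, so the default can apply.
        parser.add_argument("site_name", nargs="?", default="default")
    else:
        # Same declaration style as the script: the default is never used,
        # because a plain positional argument is always required.
        parser.add_argument("site_name", default="default")
    return parser


def parse_or_none(parser, argv):
    """Return the parsed namespace, or None if argparse rejects argv."""
    try:
        return parser.parse_args(argv)
    except SystemExit:
        # argparse prints a usage message to stderr and exits on invalid input.
        return None


if __name__ == "__main__":
    url = "https://example.test"  # placeholder URL for the demo
    print(parse_or_none(build_parser(optional_site=False), [url]))
    print(parse_or_none(build_parser(optional_site=True), [url]))
```

Because the script has two further required positionals after `project_name`, making the middle ones optional would be ambiguous. If defaults are wanted, turning them into named options such as `--site-name` would be one way to do it; that is a design suggestion, not something this commit does.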
3 changes: 3 additions & 0 deletions
Community-Supported/flights-data-incremental-refresh/requirements.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
tableauhyperapi>=0.0.14946 | ||
tableauserverclient>=0.19.0 | ||
https://github.com/openskynetwork/opensky-api/archive/master.zip#subdirectory=python |