Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report database name as schema instead of catalog to resolve compatibility issues with Tableau #22

Merged
merged 15 commits into from
Dec 21, 2022

Conversation

alinaliBQ
Copy link
Contributor

@alinaliBQ alinaliBQ commented Dec 12, 2022

report database name as schema instead of catalog

Description

Summary

  • report database name as schema instead of catalog in getTables and getColumns
  • implement getSchemas() and getSchemas(string, string) to return list of database names
  • implement getCatalogs to return empty list
  • update supportsSchemasInDataManipulation to return true, since schemas are now supported
  • tests update
  • bump JDBC version from 1.0.2 to 2.0.0, the next major version
  • documentation for setting up the driver with Tableau and other BI tools will be in separate PRs

Note that as a result of this change, applications using the Timestream JDBC driver programmatically may break, so the version of the driver is bumped to the next major version, 2.0.0.

Previously, databases are reported as catalogs. This PR changes the driver to report databases as schemas instead. This change allows Tableau to generate correct SQL queries to the driver.

How getSchemas and getCatalogs function before changes in this PR

graph TD
    A[getSchemas] --> |returns| G
    G[empty result set]
    G --> H[TABLE_SCHEM -> null]
    G --> I[TABLE_CATALOG -> null]

    B[getCatalogs] --> |returns| E[database result set]
    E --> J[TABLE_CAT database name]
Loading

How getSchemas and getCatalogs function after changes in this PR

graph TD
    A[getSchemas] --> |returns| G
    G[database result set]
    G --> H[TABLE_SCHEM -> database name]
    G --> I[TABLE_CATALOG -> null]

    B[getCatalogs] --> |returns| E[empty result set]
    E --> J[TABLE_CAT -> null]
Loading

How table metadata is changed

Before:

graph TD
    A[getTables] --> |returns| G
    G[tables result set]
    G --> H[TABLE_CAT -> database name]
    G --> I[TABLE_SCHEM -> null]
    G --> K[other return values unchanged]
Loading

After:

graph TD
    A[getTables] --> |returns| G
    G[tables result set]
    G --> H[TABLE_CAT -> null]
    G --> I[TABLE_SCHEM -> database name]
    G --> K[other return values unchanged]
Loading

How column metadata is changed

Before:

graph TD
    A[getColumns] --> |returns| G
    G[columns result set]
    G --> H[TABLE_CAT -> database name]
    G --> I[TABLE_SCHEM -> null]
    G --> K[other return values unchanged]
Loading

After:

graph TD
    A[getColumns] --> |returns| G
    G[columns result set]
    G --> H[TABLE_CAT -> null]
    G --> I[TABLE_SCHEM -> database name]
    G --> K[other return values unchanged]
Loading

Test Results

With changes in this PR, tdvt had a 10 times increase of success rate, and Tableau is able to generate correct SQL queries to the driver.

Results from running existing JDBC driver:

Test Count: 863 tests
        Passed tests: 10
        Failed tests: 853
        Tests run: 863
        Disabled tests: 0
        Skipped tests: 0

Other information:
        Smoke test time: 37.8 seconds
        Main test time: 1175.82 seconds
        Total time: 1213.62 seconds

Results from running JDBC driver on this branch:

Test Count: 863 tests
        Passed tests: 100
        Failed tests: 763
        Tests run: 863
        Disabled tests: 0
        Skipped tests: 0

Other information:
        Smoke test time: 18.45 seconds
        Main test time: 432.35 seconds
        Total time: 450.8 seconds

Summary of Tableau manual testing results

Test Case Status / Comments / Cause of Failure Generated SQL Query (if known)
Test Case 1. Validate Schema is displayed properly (all database names are displayed) PASSED
Database names containing “_”, digit, lower case and upper case can display correctly
 
Test Case 2. Tables are displayed correctly when its database is chosen as Schema PASSED
Tables with names including digits, dash, period, or underscore are displayed correctly. No table is shown for the empty database, which is correct.
 
Test Case 3. Autogenerated SQL queries are correct CONDITIONALLY PASSED. Colons (":") are not supported by Timestream in AS statements. The workaround is to manually correct the generated SQL query to remove the colons (":") in AS statements. Auto-generated query for retrieving meta_queries_test_db.DevOpsMulti:
SELECT "DevOpsMulti"."az" AS "az", "DevOpsMulti"."cpu_utilization" AS "cpu_utilization","DevOpsMulti"."hostname" AS "hostname","DevOpsMulti"."measure_name" AS "measure_name","DevOpsMulti"."memory_utilization" AS "memory_utilization", "DevOpsMulti"."region" AS "region", "DevOpsMulti"."time" AS "time" FROM "meta_queries_test_db"."DevOpsMulti" "DevOpsMulti" LIMIT 10000
This query works and runs successfully.
Auto-generated query for retrieving data_queries_test_db.TestComplexTypes:
SELECT "TestComplexTypes"."az" AS "az", "TestComplexTypes"."instance_id" AS "instance_id", "TestComplexTypes"."measure_name" AS "measure_name", "TestComplexTypes"."measure_value::double" AS "measure_value::double", "TestComplexTypes"."region" AS "region", "TestComplexTypes"."time" AS "time", "TestComplexTypes"."vpc" AS "vpc" FROM "data_queries_test_db"."TestComplexTypes" "TestComplexTypes" LIMIT 10000
This query does not work because Timestream server does not support queries with colons(:) in AS statements, it is not due to the driver, and server-side change is needed to make it work.
Test Case 4. Join 2 tables from the same database PASSED - Able to join meta_queries_test_db.TestColumnsMetadata1 with meta_queries_test_db.Test.ColumnsMetadata  
Test Case 5. Join 2 tables with the different names from different databases PASSED - Able to join tables with names including digits, dash, period, or underscore.
Test Case 6 a. Join 2 large tables with the same name from different databases FAILED. Tableau may generate colons (":") in the AS statement, causing query to fail on the server-side, especially when loading large tables. The initial query that Tableau generated, which works:
SELECT "DevOpsMulti"."az" AS "az", "DevOpsMulti1"."az" AS "az (DevOpsMulti1)", "DevOpsMulti"."cpu_utilization" AS "cpu_utilization", "DevOpsMulti1"."cpu_utilization" AS "cpu_utilization (DevOpsMulti1)", "DevOpsMulti"."hostname" AS "hostname", "DevOpsMulti1"."hostname" AS "hostname (DevOpsMulti1)", "DevOpsMulti"."measure_name" AS "measure_name", "DevOpsMulti1"."measure_name" AS "measure_name (DevOpsMulti1)", "DevOpsMulti"."memory_utilization" AS "memory_utilization", "DevOpsMulti1"."memory_utilization" AS "memory_utilization (DevOpsMulti1)", "DevOpsMulti"."region" AS "region", "DevOpsMulti1"."region" AS "region (DevOpsMulti1)", "DevOpsMulti"."time" AS "time", "DevOpsMulti1"."time" AS "time (DevOpsMulti1)" FROM "sampleDB_2"."DevOpsMulti" "DevOpsMulti" INNER JOIN "meta_queries_test_db"."DevOpsMulti" "DevOpsMulti1" ON (("DevOpsMulti"."az" = "DevOpsMulti1"."az") AND ("DevOpsMulti"."hostname" = "DevOpsMulti1"."hostname") AND ("DevOpsMulti"."region" = "DevOpsMulti1"."region") AND ("DevOpsMulti"."measure_name" = "DevOpsMulti1"."measure_name")) LIMIT 100
The subsequent query that Tableau generated and sent to the JDBC driver, which does not work:
SELECT SUM(1) AS "cnt:DevOpsMulti_E34B84972C4C458CB4D4BB86AD06DE0B:ok" FROM "sampleDB_2"."DevOpsMulti" "DevOpsMulti" INNER JOIN "meta_queries_test_db"."DevOpsMulti" "DevOpsMulti1" ON (("DevOpsMulti"."az" = "DevOpsMulti1"."az") AND ("DevOpsMulti"."hostname" = "DevOpsMulti1"."hostname") AND ("DevOpsMulti"."region" = "DevOpsMulti1"."region") AND ("DevOpsMulti"."measure_name" = "DevOpsMulti1"."measure_name")) HAVING (COUNT(1) > 0)
Tableau tried several times to re-run the query above, but all met with failure due to server not supporting queries including colons (“:“) in the AS statement
Test Case 6 b. Join 2 tables with the same name from different databases PASSED - Able to join data_queries_test_db.TestScalarTypes and meta_queries_test_db.TestScalarTypes. This test case worked because the tables contain only 4 rows and Tableau did not generate queries including colons in the AS statement. SELECT "TestScalarTypes"."device_id" AS "device_id", "TestScalarTypes1"."device_id" AS "device_id (TestScalarTypes1)", "TestScalarTypes"."device_type" AS "device_type", "TestScalarTypes1"."device_type" AS "device_type (TestScalarTypes1)", "TestScalarTypes"."flag" AS "flag", "TestScalarTypes1"."flag" AS "flag (TestScalarTypes1)", "TestScalarTypes"."measure_name" AS "measure_name", "TestScalarTypes1"."measure_name" AS "measure_name (TestScalarTypes1)", "TestScalarTypes"."os_version" AS "os_version", "TestScalarTypes1"."os_version" AS "os_version (TestScalarTypes1)", "TestScalarTypes"."rebuffering_ratio" AS "rebuffering_ratio", "TestScalarTypes1"."rebuffering_ratio" AS "rebuffering_ratio (TestScalarTypes1)", "TestScalarTypes"."region" AS "region", "TestScalarTypes1"."region" AS "region (TestScalarTypes1)", "TestScalarTypes"."time" AS "time", "TestScalarTypes1"."time" AS "time (TestScalarTypes1)", "TestScalarTypes"."video_startup_time" AS "video_startup_time", "TestScalarTypes1"."video_startup_time" AS "video_startup_time (TestScalarTypes1)" FROM "meta_queries_test_db"."TestScalarTypes" "TestScalarTypes" INNER JOIN "data_queries_test_db"."TestScalarTypes" "TestScalarTypes1" ON ("TestScalarTypes"."device_id" = "TestScalarTypes1"."device_id") LIMIT 100
Test Case 7. Columns names and data types are displayed properly for each table PASSED - Unicode column name displays correctly. Data types varchar, timestamp, boolean, double, and bigint display correctly.  
Test Case 8. Filter tables using table name sub-strings PASSED - Able to filter table names using “.”, “-”, “_”, and text (“Multi“).  

For details of Tableau manual testing results, please refer to Tableau Manual Testing Documentation

Additional Reviewers

@affonsoBQ
@alexey-temnikov
@alinaliBQ
@andiem-bq
@birschick-bq
@mitchell-elholm
@RoyZhang2022

* [AT-1169] Make getCatalog return empty result set and make getSchems return all database names

* [AT-1169] temporarily comment out GPG signining

* [AT-1169] add schemaPattern parameter to TimestreamDatabasesResultSet constructor

* [AT-1169] add schemaPattern parameter to populateCurrentRows(connection) function

* [AT-1169] filter based on schemaPattern
* update spotbugs-exclude.xml

* [AT-1169] filter table names based on schemaPattern
* return Null for catalog and database name for schema. Tableau now reporting null.TableName when it's trying to execute a query. Need to investigate where Tableau gets the null from.

* [AT-1169] remove comments

* [AT-1169] retrieve data normally by getting database name with correct column index

* [AT-1169] change order of data in column metadata: make database name be under TABLE_SCHEM instead of TABLE_CAT
* remove comments

* [AT-1169] implement getSchemas and getCatalogs correctly

* [AT-1169] remove TODO

* [AT-1169] fix bug of catalog used instead of schemaPattern

* [AT-1169] refactor spotbugs-exclude

* [AT-1169] refactor code

* [AT-1169] fix current tests for `getColumns()` and `getTables()`. This is due to database name is being returned as schema instead of catalog in `getColumns()` and `getTables()`.

* [AT-1169] add null check for ResultSet in `populateCurrentRows`

* [AT-1169] add tests for getSchema()

* [AT-1169] add test for getCatalogs(). This test checks that the resultSet from getCatalogs is empty

* [AT-1169] add test for getSchemas(String, String)

* [AT-1169] separate test for getSchemas(String, String)

* [AT-1169] documentation updates

* [AT-1169] update getCatalogs() test

* [AT-1169] update getCatalogs() test

* [AT-1169] implement testGetSchemasWithSchemaPattern and add mock test results

* [AT-1169] add test testGetSchemasWithSchemaPattern

* [AT-1169] fix read me

* [AT-1169] refactoring - remove comments and commented out code

* [AT-1169] refactoring - remove comments and commented out code

* [AT-1169] refactoring - change order of tests

* [AT-1169] uncomment out gpg signing

* [AT-1169] remove comment

* [AT-1169] remove uneeded null check. I believe `rs` is null only when returning from Mockito (the mock test suite), not in production code

* [AT-1169] resolve code review comments

* [AT-1169] resolve code review comments

* [AT-1169] add JavaDoc for integration tests

* [AT-1169] add explanation for adding exclude in spotbugs-exclude.xml

* [AT-1169] add documentation for integeration tests.

* [AT-1169] bump JDBC version from 1.0.2 to 2.0.0

* [AT-1169] uncomment out gpg-plugin

* [AT-1169] add java doc for metdata unit tests
@@ -134,7 +134,7 @@ protected boolean doNextPage() throws SQLException {
}

// Get the columns for the next table.
curDatabase = tablesResult.getString(1);
curDatabase = tablesResult.getString(2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can ordinals be defined as constants and referred in line137-138 instead of relying on magic numbers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure. The code change has been pushed.

public TimestreamSchemasResultSet() {
super(COLUMNS);
TimestreamSchemasResultSet(TimestreamConnection connection, String schemaPattern) throws SQLException {
super(null, 20);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to set the fetchSize as 20? - Reference to base class code:

TimestreamBaseResultSet(final TimestreamStatement statement, final int fetchSize) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fetchSize value is set to 20 for database retrieval. We used the same value that was used before. Reference to previous code:


We suggest keeping the same behavior in this PR.

@Override
public boolean isLast() throws SQLException {
verifyOpen();
LOGGER.debug("Checking whether the last row of this TimestreamSchemasResultSet has been reached.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reference documentation on how does calling application (like this sample: https://github.com/awslabs/amazon-timestream-tools/tree/mainline/integrations/jdbc ) can enable printing debug logs to console?

Copy link
Contributor

@alexey-temnikov alexey-temnikov Dec 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This driver is using slf4j facade for logging. This means the application (in this case the provided sample) should provide the implementation for logging. There are multiple slf4j implementations, one of the common is logback. Example of how to configure it is available here - https://mkyong.com/logging/slf4j-logback-tutorial/.

In order to enable debug logs this application should be adjusted in the following way -

  1. Add Dependency with slf4j implementation, e.g.
       <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-classic</artifactId>
            <version>1.4.5</version>
        </dependency>
  1. Add logback.xml configuration file to resources folder:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
        <layout class="ch.qos.logback.classic.PatternLayout">
            <Pattern>
                %d{HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n
            </Pattern>
        </layout>
    </appender>
    <root level="debug">
        <appender-ref ref="CONSOLE"/>
    </root>
</configuration>

This is one of the possible options to configure slf4j. Application may choose to use other logging implementations, like log4j, jul, etc.

* Input values tested for schemaPattern: {"%", "testDB", "%testDB%"}
*/
@ParameterizedTest
@ValueSource(strings = {"%", "testDB", "%testDB%"})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be good to add negative test cases with invalid schema pattern that is expected to fail and the test can ensure that the right exception is received from JDBC driver

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The invalid schema pattern test is added.

* @throws SQLException If an error occurs while retrieving the value.
*/
private void testGetSchemasResult(ResultSet resultSet) throws SQLException {
final String[] string1 = {"", "testDB", null};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a test that expects > 1 row?. Let us say SHOW DATABASES returns 2 rows when there are 2 databases creates in the account in the given region.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test with two rows returned has been added.

Mockito.when(mockStatement.executeQuery("SHOW DATABASES LIKE '%'")).thenReturn(dbResultSet);
Mockito.when(mockStatement.executeQuery("SHOW DATABASES LIKE '%test%'")).thenReturn(dbResultSet);
Mockito.when(mockStatement.executeQuery("SHOW DATABASES LIKE 'testDB'")).thenReturn(dbResultSet);
Mockito.when(mockStatement.executeQuery("SHOW DATABASES LIKE '%testDB%'")).thenReturn(dbResultSet);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1/ Add a test case that returns empty Resultset
2/ Add a test case that throw could exception "Access denied" to simulate user who does not have Timestream Read permissions
3/ Add a test case that could return multiple rows and may require pagination

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For test 1, we will add a which checks that an empty Resultset is returned when an invalid schema pattern is provided. While Tests 2 and 3 might be very valuable addition for the future, our changes do not impact these behaviors, and it may take time to do the additional setup to properly test these cases. Suggest raising a separate issue for them, so these enhancements can be added without dependencies on these changes (and do not block this change).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test 2 for "Access denied" is added.

@@ -52,6 +52,59 @@ void init() throws SQLException {
dbMetaData = new TimestreamDatabaseMetaData(mockConnection);
}

/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there new tests added to test the code changes done in TimestreamTablesResultSet.java ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old tests for TimestreamTablesResultSet.java are modified based on new changes, so the new changes are covered

@@ -326,7 +404,7 @@ private void testGetColumnsResult(ResultSet resultSet) throws SQLException {
* @throws SQLException If an error occurs while retrieving the value.
*/
private void testGetTableResult(ResultSet resultSet) throws SQLException {
final String[] strings = {"", "testDB", null, "testTable", "TABLE", null, null, null, null,
final String[] strings = {"", null, "testDB", "testTable", "TABLE", null, null, null, null,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add test cases that covers scenarios where
1/ rows returned is > 1
2/ Expect empty rows when there are no tables in specified database.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. They have been added. I’ve changed testGetTableResult to check scenario where rows returned > 1. I also added test testGetTablesWithEmptyDatabase for the second case.

/**
* Validate resultSet MetaData returned from getColumns.
*
* @param resultSet ResultSet need to be validated.
* @throws SQLException If an error occurs while retrieving the value.
*/
private void testGetColumnsResult(ResultSet resultSet) throws SQLException {
final String[] string1 = {"", "testDB", null, "testTable", "ColName", "12", "UNKNOWN",
final String[] string1 = {"", null, "testDB", "testTable", "ColName", "12", "UNKNOWN",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add test cases for
1/ Empty ResultSet
2/ expecting only one row in resultset
3/ multiple rows to simulate paginated results when metadata query is run against Timestream

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While Tests 1, 2 and 3 might be very valuable addition for the future, our changes do not impact these behaviors (getColumns function), and it may take time to do the additional setup to properly test these cases. Suggest raising a separate issue for them, so these enhancements can be added without dependencies on these changes (and do not block this change).

Copy link
Contributor

@sethusrinivasan sethusrinivasan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please review comments in this PR + add test cases that handles scenarios in a given account/region
1/ no databases
2/ One database; no tables
3/ One database ; one table
3/ One database; multiple tables with different naming pattern to test 'LIKE'
4/ Multiple databases ; multiple tables in each

TimestreamTablesResultSet in this file.-->
<Bug pattern="SQL_NONCONSTANT_STRING_PASSED_TO_EXECUTE"/>
<Class name="software.amazon.timestream.jdbc.TimestreamSchemasResultSet"/>
<Method name="populateCurrentRows" params="software.amazon.timestream.jdbc.TimestreamConnection,java.lang.String" returns="void" />
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this exclusion is needed and wondering how it is related to Prepared statement?.
Will marking schemaPattern parameter as "final" resolve this issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting schemaPattern as “final“ will not resolve the issue. The cause of spotbugs reporting this as a bug is because "SHOW DATABASES LIKE 'pattern'" string is dynamically constructed when a pattern is provided. The proper solution would be to use Prepared Statement to construct a pre-compiled SQL query. However, Prepared Statements are not supported, so this exclusion would be needed. The exclusion comment has been updated to make it clearer on how prepared statement is related.

/**
* Checks that an empty result set is returned for getCatalogs
*/
@Test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these newly added test run as part of the GitHub CI Action - https://github.com/awslabs/amazon-timestream-driver-jdbc/actions/runs/3707649792/jobs/6344038925. If these tests were already run, wondering how I can check the test logs reported from the CI action

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Download the test log zip file. Find the file “4_Build with Maven.txt” It has test result of TimestreamDatabaseMetaDataTest.

2022-12-20T06:47:52.1273391Z [INFO] Running software.amazon.timestream.jdbc.TimestreamDatabaseMetaDataTest
2022-12-20T06:47:53.0289341Z [INFO] Tests run: 34, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.899 s - in software.amazon.timestream.jdbc.TimestreamDatabaseMetaDataTest

RoyZhang2022 and others added 6 commits December 20, 2022 14:34
* use constants instead of magic numbers
* clarify spot bugs exclude
* add test case that returns empty ResultSet
* add test testGetSchemasWithInvalidSchemaPattern
* testGetTableResult - Add test cases that covers scenarios where rows returned is > 1
* Add test cases that covers scenarios where there are empty rows when there are no tables in specified database.
@@ -444,16 +444,15 @@ To use the JAR with no dependencies in a Java application, the following require
Javadoc JAR builds with `mvn install` alongside the other JAR files. To extract the Javadoc HTML files, use the following command: `jar -xvf amazon-timestream-jdbc-1.0.0-javadoc.jar`

### Known Issues
1. Timestream does not support fully qualified table names. Tools like Tableau may not work as expected.
1. Timestream does not support fully qualified table names.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This statement could be little confusing for customers, especially when reading one of the caveats Timestream JDBC driver only works with queries with fully qualified table names.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is confusing. After this change we could give fully qualified table names for Tableau. Timestream JDBC driver only works with queries with fully qualified table names. is removed.

import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

/**
* Result set to return an empty list of schemas in Timestream.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment needs an update?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is updated.

databases.add(new Row().withData(
new Datum().withScalarValue(rs.getString(1)),
NULL_DATUM
));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why NULL_DATUM here? This is a behavior change from getCatalogs()?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe, it is for the catalog that gets returned as null? just ensuring.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is for catalog.

* @throws SQLException the exception thrown
*/
@ParameterizedTest
@ValueSource(strings = {"JDBC_%", "%_Integration%", "%Test_DB"})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In SQL standard, the _ is also used as single character wildcard pattern, we should add specific test cases for it as well. Currently _ just happens to be included in the database name itself.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SQL standard also allows customers to specify the escape character in patterns with ESCAPE clause. Can we evaluate if it fits passing down to schema-pattern in a JDBC driver, in which case, some tests will need to be added.

Example:

SHOW DATABASES LIKE 'A3!_showDatabaseTestWithNoWildcard_' ESCAPE '!'"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added testcase for "_" and escape character using escape clause.

*/
@ParameterizedTest
@CsvSource(value = {
"%, 2",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add patterns with _ as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test has been updated.

@sethusrinivasan sethusrinivasan merged commit 88fac7a into awslabs:main Dec 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants