Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Change type in Power BI fails due to cast to unsupported keyword type #2459

Open
andreaskulicke opened this issue Dec 18, 2023 · 0 comments
Labels
bug Something isn't working

Comments

@andreaskulicke
Copy link
Contributor

What is the bug?
Change type to text in Power BI is not working, an error is shown.

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Create an empty index

PUT /test

  1. Add a document with a number column

POST /test/_doc?refresh=true
{
"column1": 1
}

  1. Use Power BI with OpenSearch data source connector
  2. Connect to OpenSearch
  3. Select "test" table/index
  4. Click on "Transform Data" button, then "Import" to open Power Query Editor
  5. Right click on column1, select Change Type -> Text
  6. Click "Close & Apply"

An error dialog pops up:

OLE DB or ODBC error: [DataSource.Error] ODBC: ERROR [42000] [OpenSearch][SQL ODBC Driver][SQL Plugin] Invalid SQL query: Unsupported cast type keyword.

What is the expected behavior?
No error dialog, data type is just changed.

What is your host/environment?

  • OS: Windows
  • Version 2.11.1.0
  • Plugins SQL

Do you have any screenshots?
No

Do you have any additional context?
The OpenSearch Power BI connector reports its SQL conformance level as INTERMEDIATE:

https://github.com/opensearch-project/sql-odbc/blob/main/bi-connectors/PowerBIConnector/src/OpenSearchProject.pq#L88C1-L91C12

        SQLGetInfo = Diagnostics.LogValue("SQLGetInfo_Options", [
            SQL_AGGREGATE_FUNCTIONS = ODBC[SQL_AF][All],
            SQL_SQL_CONFORMANCE = ODBC[SQL_SC][SQL_SC_SQL92_INTERMEDIATE]
        ]),

Looks like this (also) means that the CAST operator is supported:
https://learn.microsoft.com/en-us/sql/odbc/reference/appendixes/sql-92-cast-function?view=sql-server-ver16

SQLGetTypeInfo of the OpenSearch ODBC driver returns the supported types like so (from enabled tracing, snipped):

SQLGetTypeInfo boolean: [ TYPE_NAME = \"boolean\", DATA_TYPE = -7, ...
SQLGetTypeInfo boolean: [ TYPE_NAME = \"boolean\", DATA_TYPE = -7, ...
SQLGetTypeInfo byte: [ TYPE_NAME = \"byte\", DATA_TYPE = -6, ...
SQLGetTypeInfo short: [ TYPE_NAME = \"short\", DATA_TYPE = 5, ...
SQLGetTypeInfo keyword: [ TYPE_NAME = \"keyword\", DATA_TYPE = -9, ...
SQLGetTypeInfo text: [ TYPE_NAME = \"text\", DATA_TYPE = -9, ...
SQLGetTypeInfo nested: [ TYPE_NAME = \"nested\", DATA_TYPE = -9, ...
SQLGetTypeInfo object: [ TYPE_NAME = \"object\", DATA_TYPE = -9, ...
SQLGetTypeInfo integer: [ TYPE_NAME = \"integer\", DATA_TYPE = 4, ...
SQLGetTypeInfo double: [ TYPE_NAME = \"double\", DATA_TYPE = 8, ...
SQLGetTypeInfo scaled_float: [ TYPE_NAME = \"scaled_float\", DATA_TYPE = 8, ...
SQLGetTypeInfo long: [ TYPE_NAME = \"long\", DATA_TYPE = -5, ...
SQLGetTypeInfo half_float: [ TYPE_NAME = \"half_float\", DATA_TYPE = 7, ...
SQLGetTypeInfo float: [ TYPE_NAME = \"float\", DATA_TYPE = 7, ...
SQLGetTypeInfo date: [ TYPE_NAME = \"date\", DATA_TYPE = 93, ...
SQLGetTypeInfo timestamp: [ TYPE_NAME = \"timestamp\", DATA_TYPE = 93, ...

As far as I understood the table in:
https://opensearch.org/docs/latest/search-plugins/sql/datatypes/

That should be a subset of the "OpenSearach SQL Type" column. That would be fine.

So it looks like when Power BI tries to convert to "text":

select cast(`column1` as keyword) as `C1`

But keyword is not recognized by the SQL parser, just the following:

https://github.com/opensearch-project/sql/blob/main/sql/src/main/antlr/OpenSearchSQLParser.g4#L452

...

specificFunction
   : CASE expression caseFuncAlternative+ (ELSE elseArg = functionArg)? END     # caseFunctionCall
   | CASE caseFuncAlternative+ (ELSE elseArg = functionArg)? END                # caseFunctionCall
   | CAST '(' expression AS convertedDataType ')'                               # dataTypeFunctionCall
   ;

...

convertedDataType
   : typeName = DATE
   | typeName = TIME
   | typeName = TIMESTAMP
   | typeName = INT
   | typeName = INTEGER
   | typeName = DOUBLE
   | typeName = LONG
   | typeName = FLOAT
   | typeName = STRING
   | typeName = BOOLEAN
   ;

So it looks like there is an inconsistency in the documented OpenSearch SQL types, the types reported by the ODBC drivers SQLGetTypeInfo and the SQL parser.

Adding keyword support here and in OpenSearchSQLLexer.g4, Cast.java seems to fix the issue, but might not be the best approach as there seems to be already some discussions about types, following the issues linked below.

Related issues that I found, but not quite the same:

Remark:
Using the plain ODBC Power BI data source connector works fine as no query folding happens (no SQL92 intermediate support).

@andreaskulicke andreaskulicke added bug Something isn't working untriaged labels Dec 18, 2023
@Swiddis Swiddis removed the untriaged label Mar 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants