This repository has been archived by the owner on Aug 31, 2021. It is now read-only.

Incomplete schema inference while reading from DynamoDB table #90

Open
siah210 opened this issue Jan 21, 2021 · 3 comments

Comments

@siah210

siah210 commented Jan 21, 2021

DynamoDB Table:
[Screenshot of the DynamoDB table, which contains the attributes s_id, created_on, p_id, and num]

I am reading the above table using the following code:

val df = spark.read
        .option("tableName", config.tableName)
        .option("region", config.ddbConfig.region)
        .format("dynamodb")
        .load()
df.show()

Result:
+----+-------------------+----+
|s_id|         created_on|p_id|
+----+-------------------+----+
| 002|2018-11-20 12:01:19|   2|
| 001|2018-11-19 12:01:19|   1|
| 006|2018-11-20 12:01:19|   6|
| 005|2018-11-19 12:01:20|   5|
| 004|2018-12-19 12:01:19|   4|
| 003|2019-11-19 12:01:19|   3|
+----+-------------------+----+

The "num" column was missing from the df. Why did this happen? Is there any flag which I need to set to ensure complete schema inference?

@Aniruddha-2016

Aniruddha-2016 commented Feb 8, 2021

You can pass the userSchema option together with your schema; otherwise, the connector builds the schema from the data on the first page of the DynamoDB table.

@siah210
Author

siah210 commented Feb 22, 2021

Thanks! This helps.

This library returned an empty DataFrame when I tried to read a DDB table with both a range key and a hash key. Is this known behaviour?

@phitotient

@siah210 You should pass the schema with the .schema() parameter, just as you would for a normal DataFrame. That should work.
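
For reference, a minimal sketch of that approach. The table name, region, and the column types for p_id and num are placeholder assumptions (not from the original post), and how strictly the connector honours a user-supplied schema may depend on the connector version:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

val spark = SparkSession.builder().appName("ddb-read").getOrCreate()

// Explicit schema listing every attribute, including "num", so nothing
// is dropped by inference from the first scanned page of the table.
// The column types below are assumptions based on the sample output above.
val ddbSchema = StructType(Seq(
  StructField("s_id", StringType, nullable = true),
  StructField("created_on", StringType, nullable = true),
  StructField("p_id", LongType, nullable = true),
  StructField("num", LongType, nullable = true)
))

val df = spark.read
  .schema(ddbSchema)                   // user-supplied schema instead of inference
  .option("tableName", "my_table")     // placeholder table name
  .option("region", "eu-west-1")       // placeholder region
  .format("dynamodb")
  .load()

df.show()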
