This repository has been archived by the owner on Aug 31, 2021. It is now read-only.

Error when trying to write pyspark dataframe to DynamoDB #92

Open
jcerquozzi opened this issue Feb 11, 2021 · 1 comment


jcerquozzi commented Feb 11, 2021

Hi,

I am trying to write a PySpark DataFrame (read from a Parquet file) to DynamoDB, but I get the following error:

AnalysisException: TableProvider implementation dynamodb cannot be written with ErrorIfExists mode, please use Append or Overwrite modes instead.;

The code I am using is:

df = sqlContext.read.parquet(path)

df.write.option("tableName", "dynamo_test") \
    .format("dynamodb") \
    .save()

I then tried setting the overwrite mode:

df.write.option("tableName", "dynamo_test") \
    .format("dynamodb").mode("overwrite") \
    .save()

and got this error:

AnalysisException: Table dynamo_test does not support truncate in batch mode.;;

@rehevkor5

I believe Append is the appropriate choice; try adding:

.mode(SaveMode.Append)

The example in the README is misleading on this point. See also the method DynamoDBDataFrameWriter#dynamodb(tableName: String) in implicits.scala; you can see that it specifies SaveMode.Append.
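Since the code in question is PySpark rather than Scala, the string form of the save mode applies (SaveMode.Append is the Scala enum). A minimal sketch of the append-mode write, assuming the same table name from the question and that the spark-dynamodb connector is available on the classpath:

df = sqlContext.read.parquet(path)

# "append" is the PySpark string equivalent of SaveMode.Append; it avoids
# the default ErrorIfExists mode that the dynamodb source rejects, and it
# does not require truncate support the way "overwrite" does.
df.write.option("tableName", "dynamo_test") \
    .format("dynamodb") \
    .mode("append") \
    .save()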
