This repository has been archived by the owner on Aug 31, 2021. It is now read-only.

Error when trying to write pyspark dataframe to DynamoDB #92

Open
jcerquozzi opened this issue Feb 11, 2021 · 1 comment


jcerquozzi commented Feb 11, 2021

Hi,

I am trying to write a PySpark DataFrame (read from a Parquet file) to DynamoDB, but I get the following error:

AnalysisException: TableProvider implementation dynamodb cannot be written with ErrorIfExists mode, please use Append or Overwrite modes instead.;

The code I am using is:

df = sqlContext.read.parquet(path)

df.write.option("tableName", "dynamo_test") \
    .format("dynamodb") \
    .save()

I then tried setting the overwrite mode:

df.write.option("tableName", "dynamo_test") \
    .format("dynamodb").mode("overwrite") \
    .save()

and got this error:

AnalysisException: Table dynamo_test does not support truncate in batch mode.;;

@rehevkor5

I believe Append is the appropriate choice; try adding:

.mode(SaveMode.Append)

The example in the README is misleading on this point. See also the method DynamoDBDataFrameWriter#dynamodb(tableName: String) in implicits.scala; you can see that it specifies SaveMode.Append.
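Since the code in question is PySpark rather than Scala, the string form of the save mode applies (SaveMode.Append is the Scala enum). A minimal sketch of the append-mode write, assuming the same table name from the question and that the spark-dynamodb connector is available on the classpath:

df = sqlContext.read.parquet(path)

# "append" is the PySpark string equivalent of SaveMode.Append; it avoids
# the default ErrorIfExists mode that the dynamodb source rejects, and it
# does not require truncate support the way "overwrite" does.
df.write.option("tableName", "dynamo_test") \
    .format("dynamodb") \
    .mode("append") \
    .save()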
