fix(writer): Initialise retry settings with a delay between retries (GoogleCloudDataproc#1030)

- The current retry settings were initialised with max attempts only, which leaves the delay between attempts at 0, meaning the 5 calls happen with 0 millis between them rather than with the exponential retry strategy implied here: https://github.com/GoogleCloudDataproc/spark-bigquery-connector/blob/741bdb33bf5c06fb3fad9defffa65ea1470d2690/bigquery-connector-common/src/main/java/com/google/cloud/bigquery/connector/common/BigQueryDirectDataWriterHelper.java#L133-L142 (a test of the retryCallable method with the current settings confirmed the delay between calls was indeed 0).
- So we initialise it with the same settings as reading (these can be tuned later), since otherwise the API is hammered and all the calls fail; a minimal sketch of the difference is shown below.
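For context (not part of the commit), the sketch below illustrates the problem the notes above describe: building gax RetrySettings with max attempts only leaves the delay fields at their zero defaults, whereas also setting the backoff fields yields an actual exponential retry. The class name and the backoff values are illustrative assumptions, not the connector's real read-path settings.

import com.google.api.gax.retrying.RetrySettings;
import org.threeten.bp.Duration;

// Sketch only: contrasts attempts-only retry settings with settings that also configure backoff.
public class RetryDelaySketch {
  public static void main(String[] args) {
    // Old behaviour: only maxAttempts is set, so the delay fields keep their zero
    // defaults and all five attempts fire back-to-back against the API.
    RetrySettings attemptsOnly = RetrySettings.newBuilder().setMaxAttempts(5).build();
    System.out.println("attempts-only initial retry delay: " + attemptsOnly.getInitialRetryDelay());

    // Fixed behaviour (values are illustrative, not the connector's exact read settings):
    // a real delay between attempts that grows exponentially up to a cap.
    RetrySettings withBackoff =
        RetrySettings.newBuilder()
            .setMaxAttempts(5)
            .setInitialRetryDelay(Duration.ofMillis(1250))
            .setRetryDelayMultiplier(1.6)
            .setMaxRetryDelay(Duration.ofSeconds(5))
            .build();
    System.out.println("with-backoff initial retry delay: " + withBackoff.getInitialRetryDelay());
  }
}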
mina-asham authored Aug 14, 2023
1 parent b3bda85 commit 7287549
Showing 1 changed file with 9 additions and 8 deletions.
@@ -211,9 +211,6 @@ public static WriteMethod from(@Nullable String writeMethod) {
   private CompressionCodec arrowCompressionCodec = DEFAULT_ARROW_COMPRESSION_CODEC;
   private WriteMethod writeMethod = DEFAULT_WRITE_METHOD;
   boolean writeAtLeastOnce = false;
-  // for V2 write with BigQuery Storage Write API
-  RetrySettings bigqueryDataWriteHelperRetrySettings =
-      RetrySettings.newBuilder().setMaxAttempts(5).build();
   private int cacheExpirationTimeInMinutes = DEFAULT_CACHE_EXPIRATION_IN_MINUTES;
   // used to create BigQuery ReadSessions
   private com.google.common.base.Optional<String> traceId;
@@ -947,11 +944,14 @@ public Optional<String> getKmsKeyName() {
 
   @Override
   public RetrySettings getBigQueryClientRetrySettings() {
+    int maxAttempts =
+        sparkBigQueryProxyAndHttpConfig.getHttpMaxRetry().orElse(DEFAULT_BIGQUERY_CLIENT_RETRIES);
+    return getRetrySettings(maxAttempts);
+  }
+
+  private static RetrySettings getRetrySettings(int maxAttempts) {
     return RetrySettings.newBuilder()
-        .setMaxAttempts(
-            sparkBigQueryProxyAndHttpConfig
-                .getHttpMaxRetry()
-                .orElse(DEFAULT_BIGQUERY_CLIENT_RETRIES))
+        .setMaxAttempts(maxAttempts)
         .setTotalTimeout(Duration.ofMinutes(10))
         .setInitialRpcTimeout(Duration.ofSeconds(60))
         .setMaxRpcTimeout(Duration.ofMinutes(5))
@@ -962,8 +962,9 @@ public RetrySettings getBigQueryClientRetrySettings() {
         .build();
   }
 
+  // for V2 write with BigQuery Storage Write API
   public RetrySettings getBigqueryDataWriteHelperRetrySettings() {
-    return bigqueryDataWriteHelperRetrySettings;
+    return getRetrySettings(5);
   }
 
   public WriteMethod getWriteMethod() {
