Skip to content

Commit

Permalink
Enable curl error retries. (#5273)
Browse files Browse the repository at this point in the history
Core-side retries on curl errors have been explicitly disallowed in the
past in #3712 to protect from
double writes in case of errors in a connection teardown phase, so after
a fragment has been written, where a retry would just re-write the same
fragment.

However, we have experienced spurious "Connection reset by peer" type of
errors especially during ingestion (Write queries), which we think could
be mitigating by allowing retries for most of curl errors. (exclude
those that typically happen during connection teardown).


[[sc-54377](https://app.shortcut.com/tiledb-inc/story/54277/enable-retries-for-most-curl-error-codes-in-core)]

---
TYPE: IMPROVEMENT
DESC: Enable curl error retries.

---------

Co-authored-by: Shaun M Reed <[email protected]>
  • Loading branch information
ypatia and shaunrd0 authored Sep 4, 2024
1 parent 7fd6328 commit 0ac064b
Show file tree
Hide file tree
Showing 8 changed files with 359 additions and 139 deletions.
2 changes: 2 additions & 0 deletions test/src/unit-capi-config.cc
Original file line number Diff line number Diff line change
Expand Up @@ -228,6 +228,7 @@ void check_save_to_file() {
ss << "filestore.buffer_size 104857600\n";
ss << "rest.capnp_traversal_limit 2147483648\n";
ss << "rest.curl.buffer_size 524288\n";
ss << "rest.curl.retry_errors true\n";
ss << "rest.curl.verbose false\n";
ss << "rest.http_compressor any\n";
ss << "rest.load_enumerations_on_array_open false\n";
Expand Down Expand Up @@ -609,6 +610,7 @@ TEST_CASE("C API: Test config iter", "[capi][config]") {
all_param_values["rest.retry_http_codes"] = "503";
all_param_values["rest.capnp_traversal_limit"] = "2147483648";
all_param_values["rest.curl.buffer_size"] = "524288";
all_param_values["rest.curl.retry_errors"] = "true";
all_param_values["rest.curl.verbose"] = "false";
all_param_values["rest.load_metadata_on_array_open"] = "false";
all_param_values["rest.load_non_empty_domain_on_array_open"] = "false";
Expand Down
37 changes: 37 additions & 0 deletions test/src/unit-curl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -182,3 +182,40 @@ TEST_CASE(
const auto& extra_headers = resources.rest_client()->extra_headers();
CHECK(extra_headers.at("X-Payer") == "foo");
}

TEST_CASE("Test retry logic", "[rest-client][retries]") {
// set the HTTP codes to retry
tiledb::sm::Config cfg;
REQUIRE(cfg.set("rest.retry_http_codes", "503").ok());

std::unordered_map<std::string, std::string> extra_headers;
std::unordered_map<std::string, std::string> res_headers;
std::mutex res_mtx;
Curl curl(tiledb::test::g_helper_logger());
auto rc = curl.init(&cfg, extra_headers, &res_headers, &res_mtx);

// http error not in retry list
CURLcode curl_code = CURLE_OK;
long http_code = 504;
REQUIRE(curl.should_retry_request(curl_code, http_code) == false);

// http error in retry list
http_code = 503;
REQUIRE(curl.should_retry_request(curl_code, http_code) == true);

// Curl error not in retry list
curl_code = CURLE_SSL_SHUTDOWN_FAILED;
REQUIRE(curl.should_retry_request(curl_code, http_code) == false);

// Curl error in retry list
curl_code = CURLE_RECV_ERROR;
// and http error not in retry list
http_code = 504;
REQUIRE(curl.should_retry_request(curl_code, http_code) == true);

// Curl error in retry list but retries disabled in config
REQUIRE(cfg.set("rest.curl.retry_errors", "false").ok());
rc = curl.init(&cfg, extra_headers, &res_headers, &res_mtx);
curl_code = CURLE_RECV_ERROR;
REQUIRE(curl.should_retry_request(curl_code, http_code) == false);
}
3 changes: 3 additions & 0 deletions tiledb/api/c_api/config/config_api_external.h
Original file line number Diff line number Diff line change
Expand Up @@ -728,6 +728,9 @@ TILEDB_EXPORT void tiledb_config_free(tiledb_config_t** config) TILEDB_NOEXCEPT;
* The delay factor to exponentially wait until further retries of a failed
* REST request <br>
* **Default**: 1.25
* - `rest.curl.retry_errors` <br>
* If true any curl requests that returned an error will be retried <br>
* **Default**: true
* - `rest.curl.verbose` <br>
* Set curl to run in verbose mode for REST requests <br>
* curl will print to stdout with this option
Expand Down
2 changes: 2 additions & 0 deletions tiledb/sm/config/config.cc
Original file line number Diff line number Diff line change
Expand Up @@ -91,6 +91,7 @@ const std::string Config::REST_RETRY_DELAY_FACTOR = "1.25";
const std::string Config::REST_CURL_BUFFER_SIZE = "524288";
const std::string Config::REST_CAPNP_TRAVERSAL_LIMIT = "2147483648";
const std::string Config::REST_CURL_VERBOSE = "false";
const std::string Config::REST_CURL_RETRY_ERRORS = "true";
const std::string Config::REST_LOAD_ENUMERATIONS_ON_ARRAY_OPEN = "false";
const std::string Config::REST_LOAD_METADATA_ON_ARRAY_OPEN = "true";
const std::string Config::REST_LOAD_NON_EMPTY_DOMAIN_ON_ARRAY_OPEN = "true";
Expand Down Expand Up @@ -256,6 +257,7 @@ const std::map<std::string, std::string> default_config_values = {
std::make_pair(
"rest.capnp_traversal_limit", Config::REST_CAPNP_TRAVERSAL_LIMIT),
std::make_pair("rest.curl.verbose", Config::REST_CURL_VERBOSE),
std::make_pair("rest.curl.retry_errors", Config::REST_CURL_RETRY_ERRORS),
std::make_pair(
"rest.load_enumerations_on_array_open",
Config::REST_LOAD_ENUMERATIONS_ON_ARRAY_OPEN),
Expand Down
3 changes: 3 additions & 0 deletions tiledb/sm/config/config.h
Original file line number Diff line number Diff line change
Expand Up @@ -110,6 +110,9 @@ class Config {
/** The default for Curl's verbose mode used by REST. */
static const std::string REST_CURL_VERBOSE;

/** If we should retry Curl errors in requests to REST. */
static const std::string REST_CURL_RETRY_ERRORS;

/** If the array enumerations should be loaded on array open */
static const std::string REST_LOAD_ENUMERATIONS_ON_ARRAY_OPEN;

Expand Down
3 changes: 3 additions & 0 deletions tiledb/sm/cpp_api/config.h
Original file line number Diff line number Diff line change
Expand Up @@ -903,6 +903,9 @@ class Config {
* The delay factor to exponentially wait until further retries of a failed
* REST request <br>
* **Default**: 1.25
* - `rest.curl.retry_errors` <br>
* If true any curl requests that returned an error will be retried <br>
* **Default**: true
* - `rest.curl.verbose` <br>
* Set curl to run in verbose mode for REST requests <br>
* curl will print to stdout with this option
Expand Down
Loading

0 comments on commit 0ac064b

Please sign in to comment.