Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove ? operators with checks, to reduce crashing #5

Open
mikedotexe opened this issue Oct 21, 2022 · 1 comment
Open

Remove ? operators with checks, to reduce crashing #5

mikedotexe opened this issue Oct 21, 2022 · 1 comment
Labels
wontfix This will not be worked on

Comments

@mikedotexe
Copy link

I think we might need to be careful about using the ? operator for long-running tasks like the agent and indexer.

This is a pretty good rundown:
https://doc.rust-lang.org/book/ch09-02-recoverable-errors-with-result.html#a-shortcut-for-propagating-errors-the--operator

It's basically like, "something happened that means we gotta stop"

If the value is an Err, the Err will be returned from the whole function as if we had used the return keyword so the error value gets propagated to the calling code.

In the function below, for instance, here's an example:

let block = block.ok_or_else(|| BlockError::EventWithoutBlock)?;

This is in a while loop, and if a block variable comes in with an error, we're instructing the application to throw an error and exit.

I think we'll need to replace this with a match statement that checks for Ok or Err and on error, perhaps log it (optional) or just continue on with the while loop.

I see in that same function ws_block_stream there are a few more places where we have the pattern to throw an error and crash the program, but I think those also need to be dealt with in a way that doesn't stop execution.

@SeedyROM
Copy link
Collaborator

This is in a while loop, and if a block variable comes in with an error, we're instructing the application to throw an error and exit.

This is simply not the case, we're exiting the current loop and passing an Err enum variant up the callstack. This can be handled by anyone further up the chain who wants to handle it, and we do!

See

Retry::spawn(indexer_retry_strategy, || async {
indexer::system::run(
&indexer_name,
&indexer_chain_id,
&indexer_sources,
&indexer_filters,
)
.await
.map_err(|err| {
error!(
"Indexer {} ({}) crashed!",
indexer_name,
indexer_path.display()
);
error!("Error: {}", err);
error!("Retrying in 5 seconds...");
err
})
})

We handle the error, tell the user we got it and what it was. Then we retry getting more blocks from the chain, if we miss a block or 2 (which can absolutely happen considering the connection goes down 4 times a day) this task picks up the slack:

let historical_indexer_handle = tokio::spawn(async move {
Retry::spawn(historical_retry_strategy, || async {
indexer::system::run_historical(
&historical_name,
&historical_chain_id,
&historical_sources,
&historical_filters,
)
.await
.map_err(|err| {
error!(
"Historical indexer {} ({}) crashed!",
config.name,
path.display()
);
error!("Error: {}", err);
error!("Retrying in 5 seconds...");
err
})
})
.await?;
Ok::<(), Report>(())
});

Which once again, handles intermittent errors and retries if it fails.

@SeedyROM SeedyROM added the wontfix This will not be worked on label Oct 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

2 participants