Derecho and Cascade are intended for settings where people want high availability and where failures of various kinds are inevitable. So, a working Derecho or Cascade server should handle fault detections gracefully and not crash just because a client malfunctioned.
I'm noticing that tcp.cpp is filled with exception throws that nothing catches, which will cause the SERVER to crash if a client it was talking to crashes. In fact, I was able to trigger a case in which a Cascade client test program hung (some unrelated issue), and when I killed it, all four Cascade servers aborted with:
terminate called after throwing an instance of 'tcp::incomplete_read_error'
Read EOF prematurely
Aborted
It seems clear to me that this is an overreaction to a faulty client! In general, Derecho should never throw uncaught exceptions at all, except for the "possible minority partition" one or some sort of extremely fatal startup issue. But once running, the system should ride out anything it encounters.
Probably we have other uncaught throws, but the ones worrying me right now are the half dozen in tcp/tcp.cpp. Could we replace these with logged error messages, and either catch every one of them wherever it can arise or not throw them at all? If a client botches its initialization, the connection to the client should be broken -- nothing more!
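To make the request concrete, here is a rough sketch of the pattern I have in mind. It is not the actual tcp.cpp code: `client_connection`, `read_exact`, and `serve_one` are hypothetical names, and `incomplete_read_error` is a local stand-in for the exception type named in the abort message. The point is only that the catch sits at the per-client boundary, so a bad read logs an error and closes that one connection.

```cpp
#include <cstddef>
#include <functional>
#include <iostream>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for the exception type seen in the abort message.
struct incomplete_read_error : public std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical connection wrapper. `read_fn` simulates a socket read and
// returns false when the peer closed the connection before `len` bytes arrived.
struct client_connection {
    std::function<bool(char*, std::size_t)> read_fn;
    bool open = true;

    explicit client_connection(std::function<bool(char*, std::size_t)> f)
            : read_fn(std::move(f)) {}

    void close() { open = false; }
};

// Low-level read that still throws, as the existing code appears to do.
void read_exact(client_connection& conn, char* buf, std::size_t len) {
    if(!conn.read_fn(buf, len)) {
        throw incomplete_read_error("Read EOF prematurely");
    }
}

// Per-client boundary: a failed read is logged and only this connection is
// closed. Returns true if a message was read, false if the client was dropped.
bool serve_one(client_connection& conn, char* buf, std::size_t len) {
    try {
        read_exact(conn, buf, len);
        return true;
    } catch(const incomplete_read_error& e) {
        std::cerr << "client connection error: " << e.what()
                  << "; closing connection" << std::endl;
        conn.close();
        return false;
    }
}
```

With something like this, the server loop would check the return value, forget the dropped client, and keep serving everyone else. The alternative of not throwing at all (returning an error code from the read) would work just as well; the important part is that nothing escapes to `std::terminate`.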