v2.2.2
This release contains further bug fixes, all of which have been tested. Use it in preference to 2.2.1 or 2.2.0, since it is much more stable.
Bugs Fixed
- View changes could get "stuck" for a variety of reasons if many nodes joined and left in a short period of time, as documented in issue #213. Fixed in #216.
- Nodes that issued several concurrent multicasts could become deadlocked in RemoteInvoker::receive_response because all calls to the same function's receive_response would share the same receive_response_mutex. This bug was actually introduced in #211 when we changed the way responses were delivered to PendingResults objects in order to fix another bug; previously, there was no mutex in receive_response. Also fixed in #216.
- The report_failure callback in RPCManager, called by P2PConnectionManager, could deadlock trying to acquire view_mutex while holding a p2p_connection_mutex. Fixed by making RPCManager keep track of external connections on its own, so it doesn't need to acquire view_mutex at all (also in #216).
- Group members that handle P2P messages from external clients could crash if they attempted to send a reply to an external client after it disconnected, as documented in #214. Fixed in ee9a622.
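
For readers curious about the report_failure fix, the general idea can be sketched as follows. This is only an illustration, not the actual Derecho code: the class name RpcManagerSketch, the member names external_mutex and external_ids, and the use of a plain integer node ID are all assumptions made for the example. The point it shows is that a callback invoked while a connection-level lock is held can consult its own small, privately locked record of external clients instead of reaching for a second, widely shared mutex.

```cpp
#include <cstdint>
#include <mutex>
#include <set>

// Hypothetical stand-in for RPCManager; all names here are illustrative.
class RpcManagerSketch {
    // Guards only external_ids. No other lock is ever acquired while this
    // one is held, so it cannot take part in a lock-order cycle.
    std::mutex external_mutex;
    // IDs of external clients, tracked locally instead of read from the view.
    std::set<std::uint32_t> external_ids;

public:
    // Record a newly connected external client.
    void add_external(std::uint32_t id) {
        std::lock_guard<std::mutex> lock(external_mutex);
        external_ids.insert(id);
    }

    // May be called while the caller holds a connection-level lock.
    // Returns true if the failed node was a known external client and has
    // been forgotten; no view-level mutex is needed to decide this.
    bool report_failure(std::uint32_t id) {
        std::lock_guard<std::mutex> lock(external_mutex);
        return external_ids.erase(id) > 0;
    }
};
```

In this sketch, calling add_external(7) and then report_failure(7) returns true, while a second report_failure(7) returns false because the ID has already been removed.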
Other Improvements
- CMakeLists.txt now declares a more recent CMake version, specifically 3.15.4 rather than 2.8.1. This reflects the version of CMake we've actually been using, and avoids generating warnings on newer systems (CMake 3.21 has started emitting warnings if the version required in CMakeLists.txt is older than 2.8.12).
- CMakeLists.txt now specifies that we require the C++17 standard to compile.
- Nodes produce fewer warnings and errors when shutting down "cleanly." A node that marks itself as failed will no longer attempt to freeze its own SST row (which causes a segmentation fault), and a leader that marks itself as failed will no longer throw an exception or warn about a potential partitioning event. (Fixed in a3443bb and 64c0396)