-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: remove mDNS #2630
base: main
Are you sure you want to change the base?
refactor: remove mDNS #2630
Conversation
This also uses a different cache path for local mode.
af8b2b2
to
b4c26e6
Compare
# - name: Verify the routing tables of the nodes | ||
# run: cargo test --release -p ant-node --test verify_routing_table -- --nocapture | ||
# env: | ||
# CARGO_TARGET_DIR: ${{ matrix.os == 'windows-latest' && './test-target' || '.' }} | ||
# timeout-minutes: 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How bad is the error here? Maybe would it be good to add some tolerance to the test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I've seen it's not too bad. Worst I've seen is one node having 21 nodes in their RT. And usually about 3-6 nodes that are missing one or two (23/24) in their RT.
ant-networking/src/event/swarm.rs
Outdated
@@ -330,6 +306,22 @@ impl SwarmDriver { | |||
// all addresses are effectively external here... | |||
// this is needed for Kad Mode::Server | |||
self.swarm.add_external_address(address.clone()); | |||
|
|||
// If we are local and the first node, add our own address(es) to cache | |||
// if self.first { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we uncomment the if statement here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or do we write the listen addr for all local nodes
, if so the code comment is misleading ig.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we gotta clone the cache before add_address()
, I've pasted the code from the main node handling loop. The ergonomic is really bad here and that is my fault, I had to modify the cache entirely like 3 times and there are a lot of artifacts from previous versions.
let config = bootstrap_cache.config().clone();
let mut old_cache = bootstrap_cache.clone();
let new = match BootstrapCacheStore::new(config) {
Ok(new) => new,
Err(err) => {
error!("Failed to create a new empty cache: {err}");
continue;
}
};
*bootstrap_cache = new;
// save the cache to disk
spawn(async move {
if let Err(err) = old_cache.sync_and_flush_to_disk(true) {
error!("Failed to save bootstrap cache: {err}");
}
});
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I first added a first
field, but later removed it again and just thought why not just write all nodes addresses. In theory it might be more realistic to let only the genesis write its address though.
if opt.peers.first { | ||
bootstrap_cache.write()?; | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if a node is running as a service
it could restart and it would still retain the --first
flag.
Thus, the entire cache would be wiped.
This is only really a problem for our genesis VM
. But a wiped cache there is not really a big deal. If there is a better way to solve this, then great, else no issues tbh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I edited the RPC restart to remove the --first
flag in ff94c00
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I mean, a --first
node started using antnodemanager would run as a service
.
And services preserve their args in the service definition file.
Builds upon #2595. (Merge that first.)