
RGS v2: NodeAnnouncement Delta Serialization #76

Merged · 5 commits · May 29, 2024

Conversation

@arik-so (Contributor) commented Mar 19, 2024:

This PR aims to start serializing changes to nodes' socket addresses and feature sets. Doing so comprises two main components: calculating the delta, and encoding it in a new version of the RGS protocol, while also maintaining the ability to serialize old versions.

The lookup of new node details is primarily accomplished in lookup.rs, and the main details of the new serialization format are in lib.rs.
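(For orientation, a minimal sketch of the delta-calculation half, using simplified stand-in types: the real code works with lightning's NodeAnnouncement data, and NodeInfo and has_address_set_changed here are illustrative assumptions, while has_feature_set_changed appears in the diff below.)

/// Simplified stand-in for the mutable parts of a node announcement.
struct NodeInfo {
    feature_bytes: Vec<u8>,
    addresses: Vec<Vec<u8>>,
}

/// Which parts of a node's announcement changed since the client's last sync.
struct NodeDelta {
    has_feature_set_changed: bool,
    has_address_set_changed: bool,
}

fn compute_node_delta(old: &NodeInfo, new: &NodeInfo) -> NodeDelta {
    NodeDelta {
        // only serialize feature bits if they differ since the last sync
        has_feature_set_changed: old.feature_bytes != new.feature_bytes,
        // likewise, only serialize addresses that actually changed
        has_address_set_changed: old.addresses != new.addresses,
    }
}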

src/config.rs (outdated; resolved)
src/config.rs (outdated; resolved)
src/lib.rs (outdated; resolved)
src/lib.rs (outdated):
let mut versioned_output = GOSSIP_PREFIX.to_vec();
versioned_output.append(&mut unversioned_output.clone());

let mut compressed_unversioned_output = rle::encode(&unversioned_output);
Contributor:

Do you have any data on how much savings the RLE gives us vs gzip? I wonder whether, if we RLE first, gzip ends up not working as well, leaving us with larger gzipped data than we would have had originally.

Contributor Author:

So I just ran some tests. RLE saves about 10% over not having it, without gzip. Post-gzip, not having RLE beforehand actually results in a smaller file size. So I think we should do away with RLE altogether and let gzip do its job.
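(The rle module under discussion isn't shown in this thread; as an assumption, a zero-run scheme along these lines is the sort of encoding whose pre-gzip savings were measured, since gossip snapshots contain long runs of zero bytes that gzip also compresses well:)

// Zero-run RLE sketch: a 0x00 byte is followed by a run-length byte; all
// other bytes pass through unchanged.
fn rle_encode(input: &[u8]) -> Vec<u8> {
    let mut output = Vec::new();
    let mut i = 0;
    while i < input.len() {
        if input[i] == 0 {
            let mut run = 1usize;
            while i + run < input.len() && input[i + run] == 0 && run < 255 {
                run += 1;
            }
            output.push(0);
            output.push(run as u8);
            i += run;
        } else {
            output.push(input[i]);
            i += 1;
        }
    }
    output
}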

Contributor:

Yea, that's what I was kinda figuring would happen... I think I agree we should do away with our internal RLE here. Saving just 10% isn't huge, at least compared to gzip, so we'd still be reliant on gzip either way.

Contributor Author:

All right, I removed RLE. Still gotta massage the commit history a bit, but it's gone.

Commenter:

Pity, I liked the RLE idea, but if gzip is better and the new default, all the better!

Contributor:

We're ultimately making a tradeoff: we always do gzip on the HTTP end, even pre-gzip -9'ing on our reference server. We assume most clients will do gzip in their HTTP client (which is part of the reason the API pushes the user to fetch the blob, rather than us bundling a lightweight HTTP client ourselves), so we don't want to slow that down. On the flip side, some users may use a lightweight HTTP client that doesn't do gzip, and we'd end up slowing them down.
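(To illustrate the client side of that tradeoff, a hedged sketch of fetching a snapshot with a gzip-capable HTTP client; reqwest with its gzip feature is an assumption, and the URL is a placeholder:)

// Cargo.toml assumption: reqwest = { version = "0.11", features = ["gzip"] }
// plus tokio for the async runtime. With gzip enabled, reqwest sends
// Accept-Encoding: gzip and transparently decompresses the response, so the
// server's pre-gzipped blob stays small on the wire.

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    let client = reqwest::Client::builder().gzip(true).build()?;
    let snapshot = client
        .get("https://rgs.example.com/snapshot/0") // placeholder URL
        .send()
        .await?
        .bytes()
        .await?;
    println!("fetched {} snapshot bytes", snapshot.len());
    Ok(())
}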

@arik-so changed the title from "Persist NodeAnnouncement gossip" to "RGS v2: NodeAnnouncement Delta Serialization", May 14, 2024
@arik-so force-pushed the arik/2024/02/node-gossip branch 3 times, most recently from e336464 to a95e06d, May 15, 2024 16:45
@TheBlueMatt (Contributor):

We need the ability to generate both formats for some time.

@arik-so force-pushed the arik/2024/02/node-gossip branch 2 times, most recently from ca193d1 to c58c88c, May 15, 2024 22:15
@arik-so marked this pull request as ready for review May 15, 2024 22:16
src/serialization.rs (resolved)
src/lookup.rs (resolved)
src/lib.rs (outdated):
current_node_id.write(&mut current_node_delta_serialization).unwrap();

if let Some(node_delta) = node_delta_set.get(&current_node_id) {
if node_delta.has_feature_set_changed {
Contributor:

It looks like this commit is changing the v1 format only for a later commit to change the format back. Can you move this commit later and squash it into the format-change commit so that we don't have fixups in the same PR?

src/lib.rs (outdated; resolved)
@arik-so force-pushed the arik/2024/02/node-gossip branch 2 times, most recently from 088f6b0 to 5245c27, May 16, 2024 16:42
src/lookup.rs (outdated; resolved)
src/lookup.rs (outdated; resolved)
src/lookup.rs (outdated; resolved)
src/lookup.rs (resolved)
src/serialization.rs (outdated; resolved)
src/lib.rs (resolved)
@arik-so force-pushed the arik/2024/02/node-gossip branch 2 times, most recently from a413e34 to db3fbac, May 17, 2024 18:10
@TheBlueMatt (Contributor) left a comment:

A few nits and a few real comments.

src/serialization.rs (resolved)
receiver.send(GossipMessage::NodeAnnouncement(announcement.clone(), Some(timestamp - 10))).await.unwrap();
receiver.send(GossipMessage::NodeAnnouncement(announcement.clone(), Some(timestamp - 8))).await.unwrap();

{
let mut current_announcement = generate_node_announcement(Some(SecretKey::from_slice(&[2; 32]).unwrap()));
Contributor:

Please don't include commits that are effectively fixups for earlier commits; if we swap the ordering of this commit and the one adding this test, it would be less total diff :)

Contributor Author:

What's the effective fixup here? Am I understanding correctly that the secret key parameter should be in the same commit as the generate_node_announcement method?

Contributor Author:

I rearranged and squashed some. Hope it's sufficiently non-effectively-fixuppy.

src/lib.rs (outdated; resolved)
src/lib.rs (resolved)
src/lib.rs (outdated; resolved)
src/lib.rs (outdated):
}

if has_update {
let extension_length = current_node_delta_extension.len() as u16;
Contributor:

Why always use this? We set other bits which indicate we have addresses/features, which are all length-prefixed AFAICT, so we should always be leaving the top bit clear because we don't have any "new" data.

Contributor Author:

This is for future extensibility. It will allow us to push custom data absent any features or addresses that need reading.

Contributor:

Right, my point is that we don't have to use this currently; rather, we can use the fact that the data we're writing already has lengths in it, and just have this be a future-extensibility thing without actually using it at all today.

Contributor Author:

But the data we're writing right now is typed. We know to read it based on specific flags set beforehand, and then we read it according to the SocketAddress or NodeFeatures spec. For the client to be able to read arbitrary data, or not to error when the SocketAddress or feature data is invalid, we need some higher-level length encoding. Basically, it's technically doable from the server's perspective, but decoding is where I think it would become challenging.

Contributor:

> For the client to be able to read arbitrary data

Right, if we add new data in the future we should use the length bytes here.

> not to error when the SocketAddress or Feature is invalid

So don't write invalid data? I don't see how this is challenging.

Contributor Author:

Ah gotcha, I see what you meant.

Contributor:

Note that we will have to handle new gossip address types; we don't know those lengths in advance, so we'll probably want to write the total address data length rather than the address count.

Contributor Author:

I think we wanna do a per-address byte count prefix such that future address types don't break the deserialization of older clients if they're followed by already supported address types.

Contributor Author:

Alternatively we could sort the socket addresses by when the enum value they assume got introduced, and have the client break out of the loop as soon as it encounters an address it doesn't understand.
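(A sketch of the per-address byte-count idea, with a hypothetical helper; the real code serializes lightning SocketAddress values rather than raw byte blobs:)

// Each address gets its own byte-length prefix, so a client that doesn't
// recognize an address type can skip exactly that many bytes and continue
// parsing any already-supported addresses that follow.
fn serialize_addresses(addresses: &[Vec<u8>], output: &mut Vec<u8>) {
    // single-byte address count (assumes fewer than 256 addresses per node)
    output.push(addresses.len() as u8);
    for address in addresses {
        // per-address byte-count prefix (assumes each address fits in 255 bytes)
        output.push(address.len() as u8);
        output.extend_from_slice(address);
    }
}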

@jkczyz self-requested a review May 20, 2024 18:42
src/lookup.rs (resolved)
src/lookup.rs (resolved)
src/lookup.rs (outdated; resolved)
src/lookup.rs (outdated; resolved)
let unsigned_node_announcement = NodeAnnouncement::read(&mut readable).unwrap().contents;

let node_id = unsigned_node_announcement.node_id;
let is_previously_processed_node_id = Some(node_id) == previous_node_id;
@jkczyz commented May 21, 2024:

Can't we skip the rest of the loop iteration if this is true (other than updating previous_node_id) since we are iterating by descending timestamp? Could we also use SELECT DISTINCT ON then?

Contributor Author:

Nope, we need to find all pairwise mutation possibilities.

Commenter:

Could you add a comment why this is the case? It isn't readily obvious.

Contributor Author:

Hmm, I think such a comment probably belongs in the README.

src/lookup.rs (outdated; resolved)
src/lookup.rs (outdated; resolved)
src/lookup.rs (outdated; resolved)
src/lookup.rs (outdated; resolved)
src/lookup.rs (outdated; resolved)
@arik-so force-pushed the arik/2024/02/node-gossip branch 2 times, most recently from 1248538 to a5fab3f, May 22, 2024 21:20
src/lib.rs (outdated; resolved)
src/lib.rs (outdated; resolved)
@TheBlueMatt (Contributor):

Otherwise I think LGTM.

if let Some(node_delta) = serialization_details.node_mutations.get(&current_node_id) {
/*
Bitmap:
7: expect extra data after the pubkey (a u16 for the count, and then that number of bytes)
Contributor:

So bit 6 is now unused? Should we clearly define it to mean something? Not sure what it would mean, but worth mentioning.

Contributor Author:

We could use it for more feature defaults? Frankly, I think we can leave it unused on the server and decide its interpretation on the client.

Contributor:

Sure, we definitely don't have to interpret it here before merging this PR, I'm just suggesting we think on it.
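(Per the bitmap above, bit 7 signals a length-prefixed extension. A client-side sketch of the skip logic, assuming a big-endian u16 count:)

use std::io::{self, Read};

// If bit 7 of the node's flags byte is set, a u16 byte count follows; a client
// that understands none of the extra data can skip exactly that many bytes and
// keep parsing the rest of the snapshot.
fn skip_unknown_extension<R: Read>(reader: &mut R, flags: u8) -> io::Result<()> {
    if flags & (1 << 7) != 0 {
        let mut length_bytes = [0u8; 2];
        reader.read_exact(&mut length_bytes)?;
        let length = u16::from_be_bytes(length_bytes) as usize;
        let mut extension_data = vec![0u8; length];
        reader.read_exact(&mut extension_data)?;
        // current clients ignore the contents; future versions may parse them
    }
    Ok(())
}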

@arik-so force-pushed the arik/2024/02/node-gossip branch 2 times, most recently from ae0e7f2 to 567e986, May 28, 2024 21:06
src/lib.rs (outdated; resolved)
@TheBlueMatt (Contributor) left a comment:

LGTM, I think; I'll do a full 'nother pass tomorrow. LMK if you had any other feedback @jkczyz, but I'll merge it either way; we can handle any comments you had later.

src/lib.rs (outdated):
let mut address_serialization = Vec::new();

// we don't know a priori how many are <= 255 bytes
let mut total_address_count = 0;
Contributor:

nit

Suggested change
let mut total_address_count = 0;
let mut total_address_count: u8 = 0;

@TheBlueMatt (Contributor) left a comment:

Ship it

@TheBlueMatt merged commit b13e988 into lightningdevkit:main May 29, 2024
4 checks passed
@jkczyz left a comment:

Just a couple nits for a follow-up.

Comment on lines +189 to +198
let mut delta_set = DeltaSet::new();
lookup::fetch_channel_announcements(&mut delta_set, network_graph, &client, last_sync_timestamp, snapshot_reference_timestamp, logger.clone()).await;
log_info!(logger, "announcement channel count: {}", delta_set.len());
lookup::fetch_channel_updates(&mut delta_set, &client, last_sync_timestamp, logger.clone()).await;
log_info!(logger, "update-fetched channel count: {}", delta_set.len());
let node_delta_set = lookup::fetch_node_updates(&client, last_sync_timestamp, logger.clone()).await;
log_info!(logger, "update-fetched node count: {}", node_delta_set.len());
lookup::filter_delta_set(&mut delta_set, logger.clone());
log_info!(logger, "update-filtered channel count: {}", delta_set.len());
serialization::serialize_delta_set(delta_set, node_delta_set, last_sync_timestamp)
Commenter:

nit: Probably best to return the sets in all of these instead of passing in a &mut, but can be done in a follow-up.
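(A sketch of the suggested signature change; simplified, since the real function is async and takes a DB client, sync timestamps, and a logger, and the DeltaSet alias here is a stand-in:)

use std::collections::HashMap;

// Stand-in for the real DeltaSet, keyed by short channel id.
type DeltaSet = HashMap<u64, ()>;

// Returning the set makes ownership explicit instead of filling a &mut
// out-parameter passed in by the caller.
fn fetch_channel_announcements() -> DeltaSet {
    let mut delta_set = DeltaSet::new();
    // populate delta_set from the stored channel announcements here
    delta_set
}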

pub(super) fn find_leading_histogram_entries(histogram: HashMap<&NodeFeatures, usize>, count: usize) -> Vec<NodeFeatures> {
let mut entry_counts: Vec<_> = histogram.iter().filter(|&(_, &count)| count > 1).collect();
entry_counts.sort_by(|a, b| b.1.cmp(&a.1));
entry_counts.into_iter().take(count).map(|(&features, _count)| features.clone()).collect()
Commenter:

Shouldn't need a clone if you don't use a reference?

Contributor:

But the reference avoids a clone to insert every key.

Commenter:

Oh, I just noticed the originating HashMap contains references as keys, so the clone is indeed needed. I was actually referring to &features when I said reference. Anyhow, the code could use fewer explicit references as:

let mut entry_counts: Vec<_> = histogram.into_iter().filter(|&(_, count)| count > 1).collect();
entry_counts.sort_by(|a, b| b.1.cmp(&a.1));
entry_counts.into_iter().take(count).map(|(features, _count)| features.clone()).collect()

But that's more a stylistic nit, so feel free to ignore.
