Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

0.0.101 NodeAddressBook contains duplicate entries for each distinct IP #750

Closed
steven-sheehy opened this issue Nov 6, 2020 · 17 comments · Fixed by #1216
Closed

0.0.101 NodeAddressBook contains duplicate entries for each distinct IP #750

steven-sheehy opened this issue Nov 6, 2020 · 17 comments · Fixed by #1216
Assignees
Labels
Impact Potentially impacts SDK, clients, docs and/or users.
Milestone

Comments

@steven-sheehy
Copy link
Member

steven-sheehy commented Nov 6, 2020

Summary
As a Hedera Services client, I need the address book to contain one entry per node so it's easier to consume and doesn't duplicate information. Currently, for the 0.0.101 address book that contains IP address information it duplicates each NodeAddress for each distinct IP that that node has. This is confusing since the node count shows as 39 in mainnet (via NodeAddressBook.getNodeAddressCount()) despite there only being 13ish nodes. It's also wasteful to duplicate information as every client who wants to communicate with the network will be pulling this file to get up to date IP information.

The proto currently looks like this: Version 0.12.0

message NodeAddress {
    bytes ipAddress = 1; // The ip address of the Node with separator & octets
    int32 portno = 2; // The port number of the grpc server for the node
    bytes memo = 3; // The memo field of the node (usage to store account ID is deprecated)
    string RSA_PubKey = 4; // The RSA public key of the node
    int64 nodeId = 5; // A non-sequential identifier for the node
    AccountID nodeAccountId = 6; // The account to be paid for queries and transactions sent to this node
    bytes nodeCertHash = 7; // A hash of the X509 cert used for gRPC traffic to this node
}

message NodeAddressBook {
    repeated NodeAddress nodeAddress = 1; // Contains multiple Node Address for the network
}

Possible resolution

  • Change ipAddress to be repeated, but that's probably not backwards compatible.
  • Add a repeated bytes ipAddresses = 8; field and mark the other field as deprecated.
  • Keep the fields as is and populate ipAddress with a comma separate concatenation of addresses.
  • Add a stake field to each node.

File 0.0.101 will be NodeAddressBookAbbreviated
File 0.0.102 will be NodeAddressBook

Version 0.13.0

message NodeEndpoint {
    string ipAddress = 1; // The IP address of the node
    string port = 2; // The port of the node
}

message NodeAddressAbbreviated { // All the information required to connect to the network
    int64 nodeId = 5; // A non-sequential identifier for the node. This value is the key between the full and abbreviated address books 
    AccountID nodeAccountId = 6; // The account to be paid for queries and transactions sent to this node
    bytes nodeCertHash = 7; // A hash of the X509 cert used for gRPC traffic to this node
    repeated NodeEndpoint = 8; // A node's IP address and port
}

message NodeAddressBookAbbreviated {
    repeated NodeAddressAbbreviated nodeAddressAbbreviated = 1; // Contains minimal node details for the network
}
message NodeAddress { // All the address book information
    bytes ipAddress = 1 [deprecated=true]; // The IP address of the Node with separator & octets
    int32 portno = 2 [deprecated=true]; // The port number of the grpc server for the node
    bytes memo = 3 [deprecated=true]; // The memo field of the node (usage to store account ID is deprecated)
    string RSA_PubKey = 4; // The RSA public key of the node
    int64 nodeId = 5; // A non-sequential identifier for the node. This value is the key between the full and abbreviated address books 
    AccountID nodeAccountId = 6; // The account to be paid for queries and transactions sent to this node
    bytes nodeCertHash = 7; // A hash of the X509 cert used for gRPC traffic to this node
    repeated NodeEndpoint = 8; // A node's IP address and port
    string description = 9; // A description of the node. Max 100 bytes.
    int64 stake = 10; // The amount of tinybars staked to this node 
}

message NodeAddressBook {
    repeated NodeAddress nodeAddress = 1; // Contains all details of the nodes for the network
}
@donaldthibeau donaldthibeau added this to the Hedera 0.10.0 milestone Nov 13, 2020
@donaldthibeau donaldthibeau added the Impact Potentially impacts SDK, clients, docs and/or users. label Nov 16, 2020
@noshmody noshmody modified the milestones: Hedera 0.10.0, Hedera 0.11.0 Dec 7, 2020
@noshmody noshmody removed this from the Hedera 0.11.0 milestone Jan 19, 2021
@noshmody noshmody added this to the Hedera 0.13.0 milestone Feb 12, 2021
@noshmody
Copy link

noshmody commented Feb 12, 2021

The above changes are expected to be backward compatible, until the deprecated fields are actually removed from the protobuf messages.

  • Add above protobuf fields with issue in protobuf repo.
  • Update hedera services code to set stake from address book provided by platform
  • When loading from existing state, Do NOT update address book file in the ledger at 0.0.101 or 0.0.102.
  • Update yahcli with new protobuf messages
  • Tests to update address book files in ledger at 0.0.101 and 0.0.102
  • Create document describing how stake should be used in validation of stream files - Created an issue Create document describing how stake should be used in validation of stream files #1202

Tests:

  • Test combinations using yahcli
  • Add unit tests for the changes
  • A fileDownload to 101/ 102 using old AddressBook protobuf should be able to successfully parse the files.
  • Similarly a fileUpload using old AddressBook protobuf should be successful.
  • Test with Public testnet and Mainnet state - This will be discussed on Monday, on how to test
  • Move validations to utils

@steven-sheehy
Copy link
Member Author

steven-sheehy commented Mar 6, 2021

@noshmody

  1. Can a node cert hash vary per ip and port combo?
  2. I think NodeAddressAbbreviated should have deprecated fields removed. The benefit of splitting them into two address books is that the client can continue to use NodeAddress to parse 101 and 102 for a deprecation period. If they update their code to use the new NodeAddressAbbreviated they should also be forced to not use the new fields as well.
  3. Perhaps we can add a comment to NodeAddress explaining that its use to parse 101 is deprecated and to use the new message?
  4. Still not a fan of the name NodeAddressBookAbbreviated. Would prefer something oriented around its intended audience (ClientNodeAddressBook since it contains info for clients to connect) or its contents (NodeEndpoints, NodeEndpointAddressBook, NodeConnectionAddressBook). Or if those aren't sufficient, perhaps just something a little easier to say like NodeAddressBookSummary or SimpleNodeAddressBook
    5. Should we make stake int64 since Java doesn't have unsigned?

@anighanta
Copy link
Contributor

issues created in protobufs repo hashgraph/hedera-protobufs-java#60 and
hashgraph/hedera-protobufs#21

@noshmody
Copy link

@steven-sheehy

  1. Each node has only 1 RSA Cert (each node has multiple proxies and therefore multiple TLS certs). Therefore assuming each proxy has single IP/Port exposed, each IP/Port combination has a TLS cert. Multiple IP/Port combinations have a single RSA certificate.
  2. If fields have been marked deprecated previously then we can remove them if Donald approves.
  3. @anighanta please would you add a comment as requested
  4. @steven-sheehy @donaldthibeau would you please pick a set of names and let us know. @anighanta will implement accordingly.

@anighanta
Copy link
Contributor

None of the fields were marked deprecated previously for NodeAddress

@steven-sheehy
Copy link
Member Author

steven-sheehy commented Mar 23, 2021

If fields have been marked deprecated previously then we can remove them if Donald approves.

None of the fields were marked deprecated previously for NodeAddress

NodeAddressAbbreviated is a new message type, so fields within it don't need to be marked deprecated and can be removed. Clients won't break since if they don't change they would continue to use the previous NodeAddress message to parse both 101 and 102 and have access to those fields.

@noshmody
Copy link

My (likely incorrect recollection of the deprecated fields was the following...Since the 2 Address Book files would contain the 2 messages listed below.
File 0.0.101 will be NodeAddressBookAbbreviated
File 0.0.102 will be NodeAddressBook

If we remove the fields to be marked with [deprecated=true] from the message and therefore the AB file 0.0.101 then we will break clients that parse the file today.

@steven-sheehy would you please confirm. Thanks

@steven-sheehy
Copy link
Member Author

steven-sheehy commented Mar 23, 2021

I don't think that is true, but would be good if you guys can verify with a test. Someone parsing 0.0.101 after these changes are live on mainnet can change nothing and continue to use their previous protos (e.g. NodeAddressBook only) to parse both 101 and 102. Even if they upgrade their protos, if their code continues to use NodeAddress to parse 101 then they won't break. They will only break if they update their code to use NodeAddressBookAbbreviated, but by doing so they would be aware that it's a new message type and should adjust things as necessary to work.

@noshmody
Copy link

noshmody commented Mar 23, 2021

Address Book messages have been updated to reflect @steven-sheehy & @lbaird proposals.

Engineering @Neeharika-Sompalli @anighanta will test enhance YAHCLI tool with options to read and write existing and new versions of NodeAddressBook and NodeAddressBookAbbreviated

YAHCLI Operations on Files 0.0.101 or 0.0.102

Nodes on (version 0.13.0)

  • Read / Write - NodeAddress (version 0.12.0)

Client is still on (version 0.12.0)

  1. Download addressBook [ 101 ] works.
  2. Upload addressBook [ 101 ] in (version 0.12.0) works.
  3. Download nodeDetails [ 102 ] works.
  4. Upload nodeDetails [ 102 ] in (version 0.12.0) works.
  • Read / Write - NodeAddress (version 0.13.0) Read as per JSON

Client moved on to (version 0.13.0)

Test case 1

  1. Download addressBook [ 101 ] works.
  2. Upload addressBook [ 101 ] with Duplicates works_.
  3. Download nodeDetails [ 102 ] works.
  4. Upload nodeDetails [ 102 ] in with Duplicates works.

Test case 2

Client moved on to (version 0.13.0)

  1. Download addressBook [ 101 ] works.
  2. Upload addressBook [ 101 ] with multiple NodeEndpoints for each NodeAddress [No Duplicates] _works.
  3. Download nodeDetails [ 102 ] works.
  4. Upload nodeDetails [ 102 ] with multiple NodeEndpoints for each NodeAddress [No Duplicates] works.

AddressBook formats test cases validated:

testCasesCovered.zip

@Neeharika-Sompalli
Copy link
Member

Neeharika-Sompalli commented Mar 24, 2021

As per review comments in hashgraph/hedera-protobufs#22 , changes forNodeAddressAbbreviated to be renamed to ClientNodeAddress and NodeAddressBookAbbreviated to be renamed to ClientNodeAddressBook are yet to be confirmed

@anighanta
Copy link
Contributor

anighanta commented Mar 25, 2021

@steven-sheehy @noshmody @lbaird @Neeharika-Sompalli
On Genesis, as we have no 101 and 102 files .. we create them using platform's address book. By using the latest suggested protobuf formats, I used NodeAddressBookAbbreviated to generate file 101 and NodeAddressBook to generate file 102. But when we do a fileDownload on 101 and try to parse it with old protobuf version NodeAddressBook it fails as expected fields ipaddress/port/memo are not present in the downloaded file.

@noshmody suggested to use the new NodeAddressBook to generate both 101 and 102 files on GENESIS, which fixed the issue [we can now download and parse successfully the new format using old protobufs].

@steven-sheehy
Copy link
Member Author

Right, we had said on the call both files should be generated by NodeAddressBook.

@donaldthibeau
Copy link
Contributor

We have aligned to the following changes after discussion with @lbaird and @steven-sheehy the two file names will be:

  • AddressBookForClients
  • AddressBook

The nested objects will be:

  • NodeAddressForClients
  • NodeAddress

@anighanta
Copy link
Contributor

We have aligned to the following changes after discussion with @lbaird and @steven-sheehy the two file names will be:

  • AddressBookForClients
  • AddressBook

The nested objects will be:

  • NodeAddressForClients
  • NodeAddress

@donaldthibeau @steven-sheehy as Platform also contains AddressBook -> com.swirlds.common.AddressBook A lot of explicit imports for com.hederahashgraph.api.proto.java.AddressBook are needed in the services code. can we have a different name for addressBook from protobuf to avoid this ambiguity.

Sample Error whilemvn clean install without explicit imports :

ERROR] hedera-services/hedera-node/src/main/java/com/hedera/services/state/initialization/HfsSystemFilesManager.java:[74,23] reference to AddressBook is ambiguous
[ERROR]   both class com.hederahashgraph.api.proto.java.AddressBook in com.hederahashgraph.api.proto.java and class com.swirlds.common.AddressBook in com.swirlds.common match

@SimiHunjan
Copy link
Contributor

@noshmody can you please provide the final naming convention and update the issue.

@anighanta
Copy link
Contributor

anighanta commented Mar 30, 2021

@noshmody can you please provide the final naming convention and update the issue.

@SimiHunjan These are the new message names and structures.

Version 0.13.0

message NodeEndpoint {
    string ipAddress = 1; // The IP address of the node
    string port = 2; // The port of the node
}

message NodeAddressForClients { // All the information required to connect to the network
    int64 nodeId = 5; // A non-sequential identifier for the node. This value is the key between the full and abbreviated address books 
    AccountID nodeAccountId = 6; // The account to be paid for queries and transactions sent to this node
    bytes nodeCertHash = 7; // A hash of the X509 cert used for gRPC traffic to this node
    repeated NodeEndpoint = 8; // A node's IP address and port
}

message AddressBookForClients {
    repeated NodeAddressForClients nodeAddressForClients = 1; // Contains minimal node details for the network
}
message NodeAddress { // All the address book information
    bytes ipAddress = 1 [deprecated=true]; // The IP address of the Node with separator & octets
    int32 portno = 2 [deprecated=true]; // The port number of the grpc server for the node
    bytes memo = 3 [deprecated=true]; // The memo field of the node (usage to store account ID is deprecated)
    string RSA_PubKey = 4; // The RSA public key of the node
    int64 nodeId = 5; // A non-sequential identifier for the node. This value is the key between the full and abbreviated address books 
    AccountID nodeAccountId = 6; // The account to be paid for queries and transactions sent to this node
    bytes nodeCertHash = 7; // A hash of the X509 cert used for gRPC traffic to this node
    repeated NodeEndpoint = 8; // A node's IP address and port
    string description = 9; // A description of the node. Max 100 bytes.
    int64 stake = 10; // The amount of tinybars staked to this node 
}

message AddressBook {
    repeated NodeAddress nodeAddress = 1; // Contains all details of the nodes for the network
}

@steven-sheehy
Copy link
Member Author

steven-sheehy commented Apr 8, 2021

Rationale

After further discussions, the above design has been noted to causes some pain points:

  • It makes it very difficult for mirror nodes to parse current and historical address books
    • A mirror node can start up now or any point in the future and needs to be able to parse all address books back to genesis
    • Applications can only have one version of the protobuf available on the classpath at a time
  • Removing deprecated fields in NodeAddressForClients makes it difficult for mirror and non-mirror clients alike to transition since v0.13 might be released but the address books may or may not be populating nodeEndpoint yet.

Proposal

Keep the same message structure for both 0.0.101 and 0.0.102. Never remove the deprecated fields as they will be needed for mirror nodes parsing historical address books. The assumption is any client who understands v0.13 will update their code to use serviceEndpoint if it's populated, otherwise they will fallback to using ipAddress and portno fields. If they don't upgrade to v0.13 or upgrade to v0.13 without adding this logic, then they will break regardless of server format after the deprecation period.

/*
Contains the IP address and the port representing a service endpoint of a Node in a network. Used to reach the Hedera API and submit transactions to the network.
*/
message ServiceEndpoint {
    bytes ipAddressV4 = 1; // The 32-bit IP address of the node encoded in left to right order (e.g. 127.0.0.1 has 127 as its first byte)
    int32 port = 2; // The port of the node
}

/*
The data about a Node – including IP Address, and the crypto account associated with the Node.

All fields are populated in the 0.0.102 address book file while only fields that start with # are populated in the 0.0.101 address book file.
*/
message NodeAddress {
    bytes ipAddress = 1 [deprecated=true]; // The IP address of the Node with separator & octets encoded in UTF-8. Usage is deprecated, ServiceEndpoint is preferred to retrieve a node's list of IP addresses and ports.
    int32 portno = 2 [deprecated=true]; // The port number of the grpc server for the node.  Usage is deprecated, ServiceEndpoint is preferred to retrieve a node's list of IP addresses and ports.
    bytes memo = 3 [deprecated=true]; // Usage is deprecated, nodeAccountId is preferred to retrieve a node's account ID.
    string RSA_PubKey = 4; // The x509 RSA public key of the node encoded in hex format.
    int64 nodeId = 5; // # A non-sequential identifier for the node.
    AccountID nodeAccountId = 6; // # The account to be paid for queries and transactions sent to this node.
    bytes nodeCertHash = 7; // # A hash of the X509 cert used for gRPC traffic to this node.
    repeated ServiceEndpoint serviceEndpoint = 8; // # A node's service IP addresses and ports.
    string description = 9; // A description of the node. Max 100 bytes.
    int64 stake = 10; // The amount of tinybars staked to this node.
}

/*
A list of nodes and their metadata that contains all details of the nodes for the network
*/
message NodeAddressBook {
    repeated NodeAddress nodeAddress = 1; // Metadata of nodes on the network
}

Compatibility Matrix

Client Server Deprecated fields populated Breaking
v0.12 v0.12 Yes No
v0.12 v0.13 Yes No
v0.12 v0.13 No Yes
v0.13 v0.12 Yes No
v0.13 v0.13 Yes No
v0.13 v0.13 No No

Note: Deprecated fields populated column means ipAddress and portno are still populated in the address book files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Impact Potentially impacts SDK, clients, docs and/or users.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants