Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIMD-0125: Incremental Accounts Hash #125

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

0x0ece
Copy link
Contributor

@0x0ece 0x0ece commented Mar 14, 2024

Note: this work is temp paused until Breakpoint.

TODOs

  • rename, as Incremental Accounts Hash already exists
    • Homomorphic Accounts Hash (since LtHash is called a Homomorphic Hash Function)?
    • Accumulator Accounts Hash (previous internal name, though "accumulator" might be confusing)
  • rewrite WIP sections
  • describe where/how to replace existing hash with this hash
  • describe impact on light clients - assess state of LC before resuming work

@0x0ece 0x0ece force-pushed the incremental-accounts-hash branch from 3122884 to 041727e Compare March 14, 2024 15:35
@0x0ece 0x0ece force-pushed the incremental-accounts-hash branch from 041727e to e09a335 Compare March 14, 2024 15:38

### Perfomance

Some performance result - rust code from Solana runtime.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a placeholder comment?

need two different hashes.

Incremental hash functions allow to compute the hash of a set (e.g., Solana
accounts) by *incrementally* accumulate the hash of each element, thus removing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
accounts) by *incrementally* accumulate the hash of each element, thus removing
accounts) by *incrementally* accumulating the hash of each element, thus removing

- Cons: won't scale.
- Verkle Trees.
- Pros: scale better, support inclusion proofs.
- Cons: not fast enough for Solana.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate why they are not fast enough? A typical Solana block updates ~3500 accounts today. Is it not possible to accumulate the updates within slot time for a set of 450 million accounts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From mnb today, I see ~1,000 accounts written per slot. Note, if an account is updated multiple times within a single slot, I'm only counting it once, since we only write the final version of an account per slot.


## Detailed Design

### LtHash (WIP)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolve this comment when WIP is removed


### (Single) Account Hash (WIP)

The hash of a single account is defined as:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is worth mentioning that the first 32 bytes of the lthash are equal to the current account hash.
And this is an important property for compatibility.

A validator can compute the initial value for the Incremental Accounts Hash at
boot, while loading the accounts state from snapshot.

At slot S, a list of txs is executed, making changes on a set of writeable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
At slot S, a list of txs is executed, making changes on a set of writeable
At slot S, a list of transactions is executed, making changes on a set of writeable

Comment on lines +150 to +151
- sub all the hashes before
- add all the hashes after
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide references to the precise definitions of add and subtract

Comment on lines +153 to +164
Notes:

- Before hashes could be pre-computed and stored in the account metadata db. Or
not. We consider it an implementation choice.
- The order in which account hashes are add/sub is irrelevant, again this is
left as an implementation choice
- If an account is modified by multiple txs, it's possible to either compute its
hashes before/after each individual txs or the entire slot. The final result
is the same, so again this is left as an implementation choice
- Accounts that are created have before hash = 0. Analogously accounts that are
close have after hash = 0. Add/sub will work as expected in these edge cases,
or they could be optimized. Again this is left as an implementation choice
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every one of these bullet points end in "this is left as an implenentation choice". Consider moving this into a "list of possible optimizations" or similar

Comment on lines +361 to +362
This change only impacts validators. It'll be a transparent change for any other
user.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth mentioning potential effects on light client and ZK rollup projects (none of which have shipped, AFAIK)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, removing the accounts delta root makes it impossible to watch for changes in an account's state without a trusted 3rd party. This breaks the existing light client design that Tinydancer describes here, and would also prevent rollups from having secure bridges with the base layer. Such a bridge is being actively developed by Zeta Markets, so this would force them to scrap their plans for an L2 on Solana.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, this is currently on hold, unless surprises we'll resume work after Breakpoint and re-evaluate.

As for TinyDancer and (I learn now) Zeta Markets, I'd pay extra attention to edge cases, for example accounts with 0 lamports that disappear from bank hash (in the current Solana implementation, independently from this proposal).
Unlike Ethereum that was designed with ZK & light clients in mind, Solana was not. So I'd be really surprise to see that "everything works fine", even in obscure edge cases.

When we originally discussed this work we agreed that scalability was a priority wrt light clients, and that if we want light clients in Solana we should have a proper spec, not just assume that what exists is working. But as I said, this work is currently paused and when we'll resume it we'll definitely re-evaluate the situation (and, of course, will alert in this PR). Thanks for the new info!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any update on this now that Breakpoint is around the corner?

individual account hash, and 2) the Incremental Accounts Hash (sum of the
individual account hash, for all accounts).

## Detailed Design
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the upgrade mechanism and effects on snapshot hashing also be described in this SIMD, or is that a follow-up proposal?

return lthash.fini()
```

### Incremental Accounts Hash (WIP)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's add pseudocode for the LTHASH operations, as well as test vectors

Copy link
Contributor

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished a first read through. Mostly grammar nits. I will review again too.


## Motivation

Our main goal is to scale Solana to billions accounts, and compute a "hash of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
Our main goal is to scale Solana to billions accounts, and compute a "hash of
Our main goal is to scale Solana to billions of accounts, and compute a "hash of

## Motivation

Our main goal is to scale Solana to billions accounts, and compute a "hash of
all accounts" in a practical time and memory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
all accounts" in a practical time and memory.
all accounts" in practical time and memory.


The current solution is to calculate the root of a Merkle tree of all accounts,
sorted by public key. Scaling to billions of accounts will make the Merkle tree
hard to manage (TBs of data), and sorting all the accounts extremely challenging
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
hard to manage (TBs of data), and sorting all the accounts extremely challenging
hard to manage (TBs of data), and sorting all the accounts is extremely challenging

hard to manage (TBs of data), and sorting all the accounts extremely challenging
(overall, an n*log(n) complexity).

Moreover, the current solution requires to distinguish between the **Epoch
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
Moreover, the current solution requires to distinguish between the **Epoch
Moreover, the current solution requires distinguishing between the **Epoch

accounts) by *incrementally* accumulate the hash of each element, thus removing
the need to sort and maintain large intermediate state.

This work proposes a specific incremental hash suitable for Solana needs, and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
This work proposes a specific incremental hash suitable for Solana needs, and
This work proposes a specific incremental hash suitable for Solana's needs, and

Comment on lines +82 to +84
**LtHash** is the specific incremental hash chosen to compute both: 1) the
individual account hash, and 2) the Incremental Accounts Hash (sum of the
individual account hash, for all accounts).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me, (1) here reads as we're changing the hash function used to compute the hash of a single account. Is that accurate?

Edit: Oh, I see down below we appear to be using the current implementation/definition for the hash of an account.

of the (single) account hash for all accounts.

A validator can compute the initial value for the Incremental Accounts Hash at
boot, while loading the accounts state from snapshot.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
boot, while loading the accounts state from snapshot.
boot, while loading the accounts state from snapshots.

or

Suggested change
boot, while loading the accounts state from snapshot.
boot, while loading the accounts state from a snapshot.

Comment on lines +162 to +163
- Accounts that are created have before hash = 0. Analogously accounts that are
close have after hash = 0. Add/sub will work as expected in these edge cases,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
- Accounts that are created have before hash = 0. Analogously accounts that are
close have after hash = 0. Add/sub will work as expected in these edge cases,
- Accounts that are created have before hash = 0. Analogously, accounts that are
closed have after hash = 0. Add/sub will work as expected in these edge cases,

1. Validator starts up at a given rooted slot
1. Needs every account
1. An account: pubkey + account state: primarily data, lamports, owner.
1. A hash can be created from the account data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"A hash can be created from the account data" reads ambiguously to me. The hash of an account is defined here:

The hash of a single account is defined as:
```
hash(account) :=
if account.lamports == 0:
return 00..00
else:
lthash.init()
lthash.append( account.lamports )
lthash.append( account.data )
lthash.append( account.is_executable )
lthash.append( account.owner )
lthash.append( account.pubkey )
return lthash.fini()
```

and is more than just the account data.

Comment on lines +264 to +268
4. We also replace the Epoch Accounts Hash with the new Incremental Accounts
Hash
1. The actual Incremental Accounts Hash is 2KiB long. For simplicity we use
a 32-byte long BLAKE3(Incremental Accounts Hash) as a replacement of the
Epoch Accounts Hash.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove the EAH, as it won't provide any different information when compared to the "Incremental Accounts Hash" from the same slot.

Maybe we leave the EAH for now, so we only change one thing at a time though?

Edit: Hah, literally the next section talks about this.

@brooksprumo brooksprumo self-requested a review April 1, 2024 16:47
@0x0ece 0x0ece marked this pull request as draft June 27, 2024 23:59
Copy link
Contributor

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI I've written up a new SIMD for the accounts lattice hash here:
#215

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants