Conditional CU metering #1
@@ -0,0 +1,78 @@
---
simd: 'XXXX'
title: Conditional CU metering
authors:
  - Tao Zhu (Anza)
category: Standard
type: Core
status: Draft
created: 2024-MM-DD
feature:
supersedes:
superseded-by:
extends:
---

## Summary

Adjust how CU consumption is measured based on how transaction execution ends: a transaction that completes successfully consumes the CUs it actually used, while certain irregular failures cause the transaction to automatically consume all of its requested CUs.

Check failure on line 18 in proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md (GitHub Actions / Markdown Linter): Line length [Expected: 80; Actual: 242]

## Motivation

### Background:

In the Solana protocol, tracking transaction Compute Unit (CU) consumption is critical to maintaining consensus. Block costs are part of consensus, so all clients must agree on the execution cost of each transaction, including transactions that error out during execution. Consistent CU tracking across clients is therefore essential to protocol integrity.

### Proposed Change:

To improve performance, Solana programs are often compiled with a JIT that works at the level of Basic Blocks: linear sequences of sBPF instructions with a single entry point and a single exit point, and no loops or branches in between. Basic Blocks allow efficient execution by removing the overhead of tracking CU consumption for each individual BPF instruction.

Other than in the rare, exceptional situations discussed below, the total CU consumption of a Basic Block is deterministic, so CU accounting can be done once per Basic Block instead of at each instruction. A transaction that completes successfully, or that fails with most kinds of error, has exited each Basic Block at its single exit point, and thus the total CU consumption of the execution equals the sum of the CU costs of the Basic Blocks executed.

However, when an exception is thrown in the middle of a Basic Block (e.g., a null memory dereference or another fault), determining the exact number of CUs consumed up to the point of failure requires additional effort. For instance, the Agave client implements a mechanism that tracks the Instruction Pointer (IP) or Program Counter (PC) to backtrack and estimate the CUs consumed when an exception occurs. More details on this mechanism can be found [here](https://github.com/solana-labs/rbpf/blob/57139e9e1fca4f01155f7d99bc55cdcc25b0bc04/src/jit.rs#L267).

While this approach is effective, it introduces additional work and complexity. Such mechanisms are often implementation-specific, and requiring every client to track the exact number of executed BPF instructions for consensus is costly and unnecessary. That level of precision is not essential for protocol-level consensus, especially since these cases are rare.

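The per-block accounting described above can be sketched as follows. This is a minimal illustration, not the Agave or rbpf implementation: `BasicBlock`, `MeterOutcome`, and `meter_blocks` are hypothetical names, and charging each block at entry is an assumption made for simplicity. Because the meter only knows whole-block costs, a fault raised partway through a block leaves it with no exact figure for the instructions executed inside that block.

```rust
/// Hypothetical basic block: only its precomputed CU cost matters here.
#[derive(Debug, Clone, Copy)]
pub struct BasicBlock {
    /// Deterministic cost of the whole block, known before it runs.
    pub cu_cost: u64,
}

/// Outcome of metering a straight-line trace of basic blocks.
#[derive(Debug, PartialEq, Eq)]
pub enum MeterOutcome {
    /// Every block ran to its exit; the total is the sum of the block costs.
    Completed { consumed: u64 },
    /// The remaining budget could not cover the block at this index.
    BudgetExceeded { block_index: usize },
}

/// Charges each block once (assumed here: at block entry) instead of
/// decrementing the meter once per instruction.
pub fn meter_blocks(trace: &[BasicBlock], budget: u64) -> MeterOutcome {
    let mut remaining = budget;
    for (block_index, block) in trace.iter().enumerate() {
        match remaining.checked_sub(block.cu_cost) {
            Some(rest) => remaining = rest,
            None => return MeterOutcome::BudgetExceeded { block_index },
        }
    }
    MeterOutcome::Completed {
        consumed: budget - remaining,
    }
}
```

For a trace of three blocks costing 5, 7, and 3 CUs, a budget of 100 yields `Completed { consumed: 15 }`, while a budget of 10 stops at the second block.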
### Clarified Protocol Behavior:

Instead of mandating implementation-specific work to handle exceptions, we propose the following clarification in the protocol:

- For successful execution of a Basic Block (i.e., the block exits at its last BPF instruction), the deterministic CU cost of the block is charged to the transaction’s CU meter. This ensures that CU consumption for successful transactions is accurately accounted for.
- If an exception occurs during Basic Block execution, so that the block does not exit normally, the requested CUs for the transaction are charged to the CU meter. This is a simple and efficient fallback that avoids tracking the exact number of instructions executed up to the point of failure.

By adopting this approach, the protocol avoids requiring precise instruction-level CU tracking for transactions that fail in this way. Instead, the transaction's requested CU limit is used, which simplifies the handling of such failures while still maintaining consensus.

### Conclusion:

This proposal enhances performance and simplifies CU tracking by formalizing the use of Basic Blocks for efficient execution. It eliminates the need for costly, implementation-specific work to track CU consumption during execution failures, and it provides a clear and consistent approach to handling exceptions. This change allows clients to maintain consensus without sacrificing performance, so the protocol remains both efficient and robust.

## Alternatives Considered

None

## New Terminology

- [Basic Block](https://en.wikipedia.org/wiki/Basic_block): In the context of JIT execution and BPF processing, a Basic Block is a sequence of BPF instructions that forms a single, linear flow of control, with no loops or conditional branches between its entry and exit points. Execution starts at the first instruction and proceeds sequentially through to the last instruction without deviation. Because its execution path is predictable, a Basic Block allows efficient budget checks and optimizations: its Compute Unit (CU) cost can be determined before execution and verified at the end of the block.

## Detailed Design

In the banking stage ([here](https://github.com/anza-xyz/agave/blob/master/core/src/banking_stage/committer.rs#L99)) and the replay stage ([here](https://github.com/anza-xyz/agave/blob/master/ledger/src/blockstore_processor.rs#L239)), where the transaction's `executed_units` is checked, implement the following logic:

```
let execution_cu = match transaction.execution_results {
    // Success and program-defined (custom) errors charge the CUs actually executed.
    Ok(_) | Err(TransactionError::CustomError(_)) => committed_tx.executed_cu,
    // Every other failure charges the full requested CU limit.
    _ => transaction.requested_cu,
};
... ...
```

alternatively, instead of charging different CUs based on

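A self-contained version of the selection logic above might look like the following sketch. The types here are hypothetical stand-ins, not Agave's real `TransactionError` or `InstructionError`, and the two-variant error enum is purely illustrative: which errors count as irregular is still being discussed in this review.

```rust
/// Hypothetical error classification; these are not Agave's real error types.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ExecutionError {
    /// Program-defined error: execution left every basic block at its exit.
    Custom(u32),
    /// Fault raised in the middle of a basic block (e.g. an access violation).
    IrregularFault,
}

/// Hypothetical per-transaction metering inputs.
#[derive(Debug, Clone)]
pub struct ExecutedTransaction {
    pub result: Result<(), ExecutionError>,
    /// CUs measured by per-block accounting.
    pub executed_cu: u64,
    /// CU limit requested by the transaction.
    pub requested_cu: u64,
}

/// Returns the CUs to charge under the conditional metering rule.
pub fn charged_cu(tx: &ExecutedTransaction) -> u64 {
    match &tx.result {
        // Regular outcomes: the measured consumption is exact, so charge it.
        Ok(()) | Err(ExecutionError::Custom(_)) => tx.executed_cu,
        // Irregular outcomes: fall back to the full requested limit.
        Err(ExecutionError::IrregularFault) => tx.requested_cu,
    }
}
```

Matching on the variants exhaustively, rather than with a `_` catch-all, makes the compiler flag any newly added error variant until it has been classified explicitly.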
## Impact

None

## Security Considerations

One potential issue with charging requested CUs for failed transactions is that transactions with grossly oversized CU requests could consume an excessive portion of the block's CU limit. This could amount to a denial of service by preventing legitimate transactions from being included in the block. To mitigate this risk, it is recommended that this proposal be implemented after SIMD-172 is deployed, which removes the possibility of accidentally requesting an excessively large number of CUs.

By ensuring that CU requests are reasonable and controlled, the risk of failed transactions taking up disproportionate block space is minimized, allowing the proposed solution to work effectively without compromising block utilization.

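To make the block-space concern concrete, the sketch below computes the share of a block's CU limit taken by a batch of failed transactions under each charging rule. The function name `block_share_bps` is hypothetical, and the figures in the example that follows are illustrative parameters rather than values taken from this proposal.

```rust
/// Share of a block's CU limit, in basis points (1/100 of a percent), taken by
/// `count` failed transactions that are each charged `charged_cu`.
/// The total is capped at the block limit; a zero limit yields zero.
pub fn block_share_bps(count: u64, charged_cu: u64, block_cu_limit: u64) -> u64 {
    if block_cu_limit == 0 {
        return 0;
    }
    // Widen to u128 so the intermediate products cannot overflow.
    let total = (count as u128 * charged_cu as u128).min(block_cu_limit as u128);
    (total * 10_000 / block_cu_limit as u128) as u64
}
```

With an assumed block limit of 48,000,000 CUs, 30 failed transactions that each requested 1,400,000 CUs but executed only 1,000 would take 8,750 basis points (87.5%) of the block when charged their requested CUs, versus 6 basis points when charged their executed CUs.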
Not too familiar with `InstructionError`, wondering if correct to use `Requested_cu` for any error but `TransactionError::CustomError(_)`

This is the most important part to get right. In Agave's VM, I think it's
I'm not sure exactly what these translate to as `TransactionError`s though.
In firedancer, it's maybe something like

indeed, what about `AccessViolation` in agave's VM, should it also be considered as "irregular failure"?
I failed to find where `EbpfError` converted to `InstructionError`, there are some casting at `bpf_loader`, is it the right place to look at @Lichtso ?

It happens here: https://github.com/anza-xyz/agave/blob/7741b250a6e76afc9e7385ceb64c341f4bc21622/programs/bpf_loader/src/lib.rs#L1473