
Add quick balance cache to test if it helps with load on mainnet #607

Closed · wants to merge 2 commits

Conversation

peterargue (Contributor) commented Oct 3, 2024

Closes: #???

Description


For contributor use:

  • Targeted PR against master branch
  • Linked to GitHub issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the standards mentioned here.
  • Updated relevant documentation
  • Re-reviewed Files changed in the GitHub PR explorer
  • Added appropriate labels

Summary by CodeRabbit

  • New Features
    • Implemented a caching mechanism for balance retrieval, improving performance and reducing load times.
  • Bug Fixes
    • Enhanced the accuracy of balance data by ensuring it is retrieved efficiently from the cache or underlying source.

coderabbitai bot (Contributor) commented Oct 3, 2024

Walkthrough

The changes introduce modifications to the BlockChainAPI struct in the api/api.go file, incorporating an LRU cache for balance storage. A new field, balanceCache, is added to the struct, and the constructor NewBlockChainAPI is updated to initialize this cache. The GetBalance method is revised to first check the cache for balance data before querying the underlying evm.GetBalance method. Additionally, a helper function getCacheKey is introduced to generate unique cache keys based on address and block height.

Changes

File Change Summary
api/api.go - Added balanceCache *lru.TwoQueueCache[common.Hash, *hexutil.Big] to BlockChainAPI struct.
- Updated NewBlockChainAPI to initialize balanceCache.
- Modified GetBalance to implement cache logic.
- Introduced getCacheKey(address common.Address, evmHeight int64) common.Hash for generating cache keys.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant BlockChainAPI
    participant EVM

    Client->>BlockChainAPI: Request Balance
    BlockChainAPI->>BlockChainAPI: Check balanceCache
    alt Balance Found
        BlockChainAPI-->>Client: Return Cached Balance
    else Balance Not Found
        BlockChainAPI->>EVM: Get Balance
        EVM-->>BlockChainAPI: Return Balance
        BlockChainAPI->>BlockChainAPI: Cache the Balance
        BlockChainAPI-->>Client: Return Balance
    end
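The cached lookup flow in the sequence diagram can be sketched in Go. This is a simplified, self-contained illustration, not the PR's actual code: `Address`, `Hash`, and the plain map are stand-ins for go-ethereum's `common.Address`/`common.Hash` and the `lru.TwoQueueCache` used in the PR, and `fetch` stands in for `evm.GetBalance`.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Illustrative stand-ins for go-ethereum's common.Address and common.Hash.
type Address [20]byte
type Hash [32]byte

// cacheKey packs the address and height into a fixed 32-byte key:
// bytes 0-19 hold the address, bytes 24-31 the big-endian height.
func cacheKey(addr Address, evmHeight int64) Hash {
	var key Hash
	copy(key[:20], addr[:])
	binary.BigEndian.PutUint64(key[24:], uint64(evmHeight))
	return key
}

// balanceAPI mimics the cached GetBalance flow: check the cache first,
// fall back to the (expensive) fetch, then store the result.
type balanceAPI struct {
	cache map[Hash]uint64             // placeholder for lru.TwoQueueCache
	fetch func(Address, int64) uint64 // placeholder for evm.GetBalance
}

func (b *balanceAPI) GetBalance(addr Address, evmHeight int64) uint64 {
	key := cacheKey(addr, evmHeight)
	if bal, ok := b.cache[key]; ok {
		return bal // cache hit: skip the EVM query
	}
	bal := b.fetch(addr, evmHeight)
	b.cache[key] = bal // cache miss: store for next time
	return bal
}

func main() {
	calls := 0
	api := &balanceAPI{
		cache: make(map[Hash]uint64),
		fetch: func(Address, int64) uint64 { calls++; return 42 },
	}
	var addr Address
	// Second lookup is served from the cache, so fetch runs only once.
	fmt.Println(api.GetBalance(addr, 100), api.GetBalance(addr, 100), calls)
}
```

The second identical request never reaches the fetch function, which is the load reduction the PR is after.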

🐇 "In the garden where balances play,
A cache was born to save the day.
With keys that dance in heights so tall,
Quick as a flash, it answers the call!
Hooray for the cache, our swift little friend,
In the world of blocks, it helps us ascend!" 🌼



coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between f948cc6 and 4dbd81c.

📒 Files selected for processing (1)
  • api/api.go (5 hunks)

@@ -91,6 +93,8 @@ type BlockChainAPI struct {
 	indexingResumedHeight uint64
 	limiter               limiter.Store
 	collector             metrics.Collector
+
+	balanceCache *lru.TwoQueueCache[common.Hash, *hexutil.Big]
coderabbitai bot (Contributor) commented Oct 3, 2024
⚠️ Potential issue

Ensure thread safety of balanceCache by using a concurrent cache or synchronization

The balanceCache is accessed in the GetBalance method, which may be called concurrently. The lru.TwoQueueCache from github.com/hashicorp/golang-lru/v2 is not safe for concurrent use without additional synchronization.

To prevent data races and ensure thread safety:

  • Use a thread-safe cache implementation, such as lru.ARCCache, which is safe for concurrent access.

Apply this diff to switch to a thread-safe cache:

- balanceCache, err := lru.New2Q[common.Hash, *hexutil.Big](10_000)
+ balanceCache, err := lru.NewARC[common.Hash, *hexutil.Big](10_000)
  if err != nil {
      return nil, fmt.Errorf("failed to create balance cache: %w", err)
  }

Alternatively, protect access to balanceCache with a sync.RWMutex:

import (
    // ...
+   "sync"
)

// ...

type BlockChainAPI struct {
    // ...
    balanceCache *lru.TwoQueueCache[common.Hash, *hexutil.Big]
+   balanceCacheMutex sync.RWMutex
}

func (b *BlockChainAPI) GetBalance(
    ctx context.Context,
    address common.Address,
    blockNumberOrHash rpc.BlockNumberOrHash,
) (*hexutil.Big, error) {
    // ...
+   b.balanceCacheMutex.RLock()
    if balance, ok := b.balanceCache.Get(cacheKey); ok {
+       b.balanceCacheMutex.RUnlock()
        return balance, nil
    }
+   b.balanceCacheMutex.RUnlock()
    // ...
    balance, err := b.evm.GetBalance(ctx, address, evmHeight)
    // ...
+   b.balanceCacheMutex.Lock()
    b.balanceCache.Add(cacheKey, response)
+   b.balanceCacheMutex.Unlock()
    // ...
}

Also applies to: 232-236, 244-245

coderabbitai bot replied:
@peterargue, thank you for pointing that out. You're correct, according to the module's documentation, lru.TwoQueueCache is safe for concurrent use without additional synchronization. I'll keep this in mind for future reviews.

(꒦_꒦)


✏️ Learnings added
Learnt from: peterargue
PR: onflow/flow-evm-gateway#607
File: api/api.go:97-97
Timestamp: 2024-10-03T00:55:53.271Z
Learning: The `lru.TwoQueueCache` from `github.com/hashicorp/golang-lru/v2` is safe for concurrent use without additional synchronization.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

api/api.go Outdated
Comment on lines 249 to 254
func getCacheKey(address common.Address, evmHeight int64) common.Hash {
// hash is 32 bytes
// address (20 bytes) + int64 (8 bytes) = 28 bytes total
keyData := address.Bytes()
binary.BigEndian.AppendUint64(keyData, uint64(evmHeight))
return common.BytesToHash(keyData)
}
⚠️ Potential issue

Fix 'AppendUint64' usage and prevent cache key collisions in 'getCacheKey' function

In the getCacheKey function, there are two issues:

  1. Incorrect use of AppendUint64: The binary.BigEndian.AppendUint64 function returns the updated slice, but the current code does not assign this return value back to keyData. As a result, the block height is not appended to keyData, and the cache key does not include the block height.

  2. Potential cache key collisions: The keyData slice after appending is 28 bytes long (20 bytes for the address and 8 bytes for the block height). However, common.BytesToHash expects a 32-byte input and pads it if it's shorter. This could lead to cache key collisions, as different keyData values might result in the same hash.

To fix these issues:

  • Assign the result of AppendUint64 back to keyData.
  • Ensure that keyData is exactly 32 bytes before converting it to a hash.

Apply this diff to address the issues:

func getCacheKey(address common.Address, evmHeight int64) common.Hash {
    keyData := address.Bytes()
-   binary.BigEndian.AppendUint64(keyData, uint64(evmHeight))
-   return common.BytesToHash(keyData)
+   keyData = binary.BigEndian.AppendUint64(keyData, uint64(evmHeight))
+   // Ensure keyData is exactly 32 bytes; the remaining bytes stay zero
+   var fullKeyData [32]byte
+   copy(fullKeyData[:28], keyData)
+   return common.BytesToHash(fullKeyData[:])
}

Alternatively, use a cryptographic hash function to generate a consistent 32-byte key:

+import (
+   "crypto/sha256"
+)
// ...

func getCacheKey(address common.Address, evmHeight int64) common.Hash {
    keyData := address.Bytes()
    keyData = binary.BigEndian.AppendUint64(keyData, uint64(evmHeight))
+   hash := sha256.Sum256(keyData)
+   return common.BytesToHash(hash[:])
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change (before):

func getCacheKey(address common.Address, evmHeight int64) common.Hash {
	// hash is 32 bytes
	// address (20 bytes) + int64 (8 bytes) = 28 bytes total
	keyData := address.Bytes()
	binary.BigEndian.AppendUint64(keyData, uint64(evmHeight))
	return common.BytesToHash(keyData)
}

Suggested change (after):

func getCacheKey(address common.Address, evmHeight int64) common.Hash {
	// hash is 32 bytes
	// address (20 bytes) + int64 (8 bytes) = 28 bytes total
	keyData := address.Bytes()
	keyData = binary.BigEndian.AppendUint64(keyData, uint64(evmHeight))
	// Ensure keyData is exactly 32 bytes; the remaining bytes stay zero
	var fullKeyData [32]byte
	copy(fullKeyData[:28], keyData)
	return common.BytesToHash(fullKeyData[:])
}

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (1)
api/api.go (1)

Line range hint 1-1354: Summary of balance cache implementation review

The addition of a balance cache to the BlockChainAPI is a positive step towards improving performance and reducing load on the mainnet, which aligns well with the PR objectives. However, there are several important points that need to be addressed:

  1. The cache size is currently hardcoded and should be made configurable for better flexibility.
  2. There's a potential race condition in accessing the cache that needs to be resolved, either by using a thread-safe cache implementation or by adding proper synchronization.
  3. The cache key generation method could lead to key collisions and should be improved to ensure uniqueness.

Addressing these issues will greatly improve the robustness and correctness of the balance cache implementation. Once these changes are made, the performance benefits of the cache can be realized without introducing new bugs or race conditions.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 4dbd81c and 83ac7ef.

📒 Files selected for processing (1)
  • api/api.go (5 hunks)
🔇 Additional comments (1)
api/api.go (1)

6-6: LGTM: Addition of balance cache aligns with PR objective

The new import of the lru package and the addition of the balanceCache field to the BlockChainAPI struct are in line with the PR objective of adding a quick balance cache to help with load on mainnet. This is a good step towards improving performance.

Also applies to: 13-13, 96-97

Comment on lines +117 to +120
balanceCache, err := lru.New2Q[common.Hash, *hexutil.Big](10_000)
if err != nil {
return nil, fmt.Errorf("failed to create balance cache: %w", err)
}
🛠️ Refactor suggestion

Consider making the balance cache size configurable

The initialization of the balanceCache looks good, but the cache size is currently hardcoded to 10,000. It might be beneficial to make this value configurable, either through the config struct or as a parameter to the NewBlockChainAPI function. This would allow for easier tuning of the cache size based on different deployment environments or requirements.

Example:

balanceCache, err := lru.New2Q[common.Hash, *hexutil.Big](config.BalanceCacheSize)

Also applies to: 133-133

Comment on lines +232 to +236
cacheKey := getCacheKey(address, evmHeight)
if balance, ok := b.balanceCache.Get(cacheKey); ok {
return balance, nil
}

⚠️ Potential issue

Ensure thread-safety when accessing the balance cache

The implementation of the cache lookup and update is correct, but there's a potential race condition when accessing the balanceCache. The lru.TwoQueueCache is not safe for concurrent use without additional synchronization.

To ensure thread-safety, consider one of the following options:

  1. Use a thread-safe cache implementation, such as lru.ARCCache.
  2. Protect access to balanceCache with a sync.RWMutex.

Here's an example of how you could implement option 2:

type BlockChainAPI struct {
    // ...
    balanceCache *lru.TwoQueueCache[common.Hash, *hexutil.Big]
    balanceCacheMutex sync.RWMutex
}

func (b *BlockChainAPI) GetBalance(ctx context.Context, address common.Address, blockNumberOrHash rpc.BlockNumberOrHash) (*hexutil.Big, error) {
    // ...
    b.balanceCacheMutex.RLock()
    if balance, ok := b.balanceCache.Get(cacheKey); ok {
        b.balanceCacheMutex.RUnlock()
        return balance, nil
    }
    b.balanceCacheMutex.RUnlock()
    
    // ... (existing code to fetch balance)
    
    b.balanceCacheMutex.Lock()
    b.balanceCache.Add(cacheKey, response)
    b.balanceCacheMutex.Unlock()
    
    return response, nil
}

This ensures that concurrent access to the cache is properly synchronized.

Also applies to: 244-245
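If the mutex route above were taken, the guarded cache can be exercised from many goroutines without data races. Below is a minimal self-contained sketch: a plain map stands in for the LRU cache, and the key string and value are illustrative.

```go
package main

import (
	"fmt"
	"sync"
)

// safeCache wraps a plain map with a sync.RWMutex, mirroring the
// synchronization pattern suggested above (the PR itself uses an LRU cache).
type safeCache struct {
	mu sync.RWMutex
	m  map[string]uint64
}

func (c *safeCache) Get(k string) (uint64, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.m[k]
	return v, ok
}

func (c *safeCache) Add(k string, v uint64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[k] = v
}

func main() {
	cache := &safeCache{m: make(map[string]uint64)}
	var wg sync.WaitGroup
	// Many goroutines read and write the same key concurrently; with the
	// mutex in place this is race-free (checkable with `go run -race`).
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			cache.Add("0xabc:100", 42)
			if v, ok := cache.Get("0xabc:100"); ok && v != 42 {
				panic("unexpected cached value")
			}
		}()
	}
	wg.Wait()
	v, _ := cache.Get("0xabc:100")
	fmt.Println(v)
}
```

Note that Get-then-Add is still check-then-act: two goroutines can both miss and both fetch. For a cache that is harmless (duplicate work, not corruption), which is why the pattern is acceptable here.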

Comment on lines +249 to +254
func getCacheKey(address common.Address, evmHeight int64) common.Hash {
// hash is 32 bytes
// address (20 bytes) + int64 (8 bytes) = 28 bytes total
keyData := address.Bytes()
keyData = binary.BigEndian.AppendUint64(keyData, uint64(evmHeight))
return common.BytesToHash(keyData)
⚠️ Potential issue

Improve cache key generation to prevent potential collisions

The getCacheKey function's logic is correct, but the current implementation might lead to key collisions. The common.BytesToHash function used at the end expects a 32-byte input, but keyData is only 28 bytes (20 for address + 8 for uint64). This means that different inputs could potentially produce the same hash.

To prevent this, consider one of the following approaches:

  1. Use a cryptographic hash function to generate a consistent 32-byte key:
import "crypto/sha256"

func getCacheKey(address common.Address, evmHeight int64) common.Hash {
    keyData := address.Bytes()
    keyData = binary.BigEndian.AppendUint64(keyData, uint64(evmHeight))
    hash := sha256.Sum256(keyData)
    return common.BytesToHash(hash[:])
}
  2. Ensure the input to BytesToHash is always 32 bytes:
func getCacheKey(address common.Address, evmHeight int64) common.Hash {
    var keyData [32]byte
    copy(keyData[:20], address.Bytes())
    binary.BigEndian.PutUint64(keyData[24:], uint64(evmHeight))
    return common.BytesToHash(keyData[:])
}

Both of these approaches will ensure that the cache key is unique for each combination of address and height, preventing potential collisions.
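The uniqueness claim for the fixed-width layout (approach 2) is easy to check: each field occupies its own byte region, so two keys can only be equal if both address and height match. A small self-contained sketch with placeholder types (standing in for go-ethereum's `common.Address`/`common.Hash`):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Illustrative stand-ins for common.Address and common.Hash.
type Address [20]byte
type Hash [32]byte

// Fixed-width layout: bytes 0-19 hold the address, bytes 24-31 the
// big-endian height; bytes 20-23 stay zero. Each field has its own
// region, so keys collide only if both address and height match.
func getCacheKey(addr Address, evmHeight int64) Hash {
	var key Hash
	copy(key[:20], addr[:])
	binary.BigEndian.PutUint64(key[24:], uint64(evmHeight))
	return key
}

func main() {
	a1 := Address{0x01}
	a2 := Address{0x02}
	fmt.Println(getCacheKey(a1, 100) != getCacheKey(a2, 100)) // distinct addresses
	fmt.Println(getCacheKey(a1, 100) != getCacheKey(a1, 101)) // distinct heights
}
```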

@@ -219,12 +229,29 @@ func (b *BlockChainAPI) GetBalance(
 		return handleError[*hexutil.Big](err, l, b.collector)
 	}
 
+	cacheKey := getCacheKey(address, evmHeight)
A collaborator commented:
The CI is failing because of the special values for evmHeight. For example:

{
  "jsonrpc": "2.0",
  "method": "eth_getBalance",
  "params": [
    "0xa94f5374Fce5edBC8E2a8697C15331677e6EbF0B",
    "latest"
  ],
  "id": 10
}

This is a very common JSON-RPC request for retrieving the balance of an address. The "latest" special value is used to indicate the latest EVM block height. In the code, this means that evmHeight has the special value -2. Internally, we use this special value to call ExecuteScriptAtLatestBlock instead of ExecuteScriptAtBlockHeight.

We might want to do something like this:

var height int64
if evmHeight < 0 {
	latestEVMHeight, err := b.blocks.LatestEVMHeight()
	if err != nil {
		return handleError[*hexutil.Big](err, l, b.collector)
	}
	height = int64(latestEVMHeight)
} else {
	height = evmHeight
}
cacheKey := getCacheKey(address, height)

to translate the special value -2 to an actual EVM block height.
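The translation step described above can be isolated into a small helper. This is an illustrative sketch, not the gateway's actual code: `resolveHeight` and its `latest` callback are hypothetical names standing in for the logic around `b.blocks.LatestEVMHeight()`, and -2 follows go-ethereum's `rpc.LatestBlockNumber` sentinel mentioned in the comment.

```go
package main

import "fmt"

// resolveHeight maps negative sentinel heights (e.g. -2 for "latest")
// to a concrete EVM block height before the cache key is computed.
// The latest callback stands in for b.blocks.LatestEVMHeight().
func resolveHeight(evmHeight int64, latest func() (uint64, error)) (int64, error) {
	if evmHeight < 0 {
		h, err := latest()
		if err != nil {
			return 0, err
		}
		return int64(h), nil
	}
	return evmHeight, nil
}

func main() {
	latest := func() (uint64, error) { return 12345, nil }
	h1, _ := resolveHeight(-2, latest)  // "latest" sentinel resolves to a real height
	h2, _ := resolveHeight(500, latest) // explicit heights pass through unchanged
	fmt.Println(h1, h2)
}
```

Resolving the sentinel first means "latest" requests share cache entries with explicit-height requests for the same block, instead of caching under the raw -2 value.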

peterargue (Contributor Author) commented:
This ultimately wouldn't have solved the issue we encountered. Closing this; we can circle back with a better caching solution later.

peterargue closed this Oct 18, 2024
Labels: none · Projects: Status ✅ Done · 3 participants