
Add quick balance cache to test if it helps with load on mainnet #607

Closed · wants to merge 2 commits

Conversation

peterargue (Contributor) commented Oct 3, 2024

Closes: #???

Description


For contributor use:

  • Targeted PR against master branch
  • Linked to GitHub issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the standards mentioned here.
  • Updated relevant documentation
  • Re-reviewed Files changed in the GitHub PR explorer
  • Added appropriate labels

Summary by CodeRabbit

  • New Features
    • Implemented a caching mechanism for balance retrieval, improving performance and reducing load times.
  • Bug Fixes
    • Enhanced the accuracy of balance data by ensuring it is retrieved efficiently from the cache or underlying source.

coderabbitai bot (Contributor) commented Oct 3, 2024

Walkthrough

The changes introduce modifications to the BlockChainAPI struct in the api/api.go file, incorporating an LRU cache for balance storage. A new field, balanceCache, is added to the struct, and the constructor NewBlockChainAPI is updated to initialize this cache. The GetBalance method is revised to first check the cache for balance data before querying the underlying evm.GetBalance method. Additionally, a helper function getCacheKey is introduced to generate unique cache keys based on address and block height.

Changes

File Change Summary
api/api.go - Added balanceCache *lru.TwoQueueCache[common.Hash, *hexutil.Big] to BlockChainAPI struct.
- Updated NewBlockChainAPI to initialize balanceCache.
- Modified GetBalance to implement cache logic.
- Introduced getCacheKey(address common.Address, evmHeight int64) common.Hash for generating cache keys.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant BlockChainAPI
    participant EVM

    Client->>BlockChainAPI: Request Balance
    BlockChainAPI->>BlockChainAPI: Check balanceCache
    alt Balance Found
        BlockChainAPI-->>Client: Return Cached Balance
    else Balance Not Found
        BlockChainAPI->>EVM: Get Balance
        EVM-->>BlockChainAPI: Return Balance
        BlockChainAPI->>BlockChainAPI: Cache the Balance
        BlockChainAPI-->>Client: Return Balance
    end
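The cached lookup flow in the sequence diagram can be sketched in Go. This is a simplified, self-contained illustration, not the PR's actual code: `Address`, `Hash`, and the plain map are stand-ins for go-ethereum's `common.Address`/`common.Hash` and the `lru.TwoQueueCache` used in the PR, and `fetch` stands in for `evm.GetBalance`.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Illustrative stand-ins for go-ethereum's common.Address and common.Hash.
type Address [20]byte
type Hash [32]byte

// cacheKey packs the address and height into a fixed 32-byte key:
// bytes 0-19 hold the address, bytes 24-31 the big-endian height.
func cacheKey(addr Address, evmHeight int64) Hash {
	var key Hash
	copy(key[:20], addr[:])
	binary.BigEndian.PutUint64(key[24:], uint64(evmHeight))
	return key
}

// balanceAPI mimics the cached GetBalance flow: check the cache first,
// fall back to the (expensive) fetch, then store the result.
type balanceAPI struct {
	cache map[Hash]uint64             // placeholder for lru.TwoQueueCache
	fetch func(Address, int64) uint64 // placeholder for evm.GetBalance
}

func (b *balanceAPI) GetBalance(addr Address, evmHeight int64) uint64 {
	key := cacheKey(addr, evmHeight)
	if bal, ok := b.cache[key]; ok {
		return bal // cache hit: skip the EVM query
	}
	bal := b.fetch(addr, evmHeight)
	b.cache[key] = bal // cache miss: store for next time
	return bal
}

func main() {
	calls := 0
	api := &balanceAPI{
		cache: make(map[Hash]uint64),
		fetch: func(Address, int64) uint64 { calls++; return 42 },
	}
	var addr Address
	// Second lookup is served from the cache, so fetch runs only once.
	fmt.Println(api.GetBalance(addr, 100), api.GetBalance(addr, 100), calls)
}
```

The second identical request never reaches the fetch function, which is the load reduction the PR is after.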

🐇 "In the garden where balances play,
A cache was born to save the day.
With keys that dance in heights so tall,
Quick as a flash, it answers the call!
Hooray for the cache, our swift little friend,
In the world of blocks, it helps us ascend!" 🌼



coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between f948cc6 and 4dbd81c.

📒 Files selected for processing (1)
  • api/api.go (5 hunks)

@@ -91,6 +93,8 @@ type BlockChainAPI struct {
 	indexingResumedHeight uint64
 	limiter               limiter.Store
 	collector             metrics.Collector
+
+	balanceCache *lru.TwoQueueCache[common.Hash, *hexutil.Big]
coderabbitai bot (Contributor) commented Oct 3, 2024
⚠️ Potential issue

Ensure thread safety of balanceCache by using a concurrent cache or synchronization

The balanceCache is accessed in the GetBalance method, which may be called concurrently. The lru.TwoQueueCache from github.com/hashicorp/golang-lru/v2 is not safe for concurrent use without additional synchronization.

To prevent data races and ensure thread safety:

  • Use a thread-safe cache implementation, such as lru.ARCCache, which is safe for concurrent access.

Apply this diff to switch to a thread-safe cache:

- balanceCache, err := lru.New2Q[common.Hash, *hexutil.Big](10_000)
+ balanceCache, err := lru.NewARC[common.Hash, *hexutil.Big](10_000)
  if err != nil {
      return nil, fmt.Errorf("failed to create balance cache: %w", err)
  }

Alternatively, protect access to balanceCache with a sync.RWMutex:

import (
    // ...
+   "sync"
)

// ...

type BlockChainAPI struct {
    // ...
    balanceCache *lru.TwoQueueCache[common.Hash, *hexutil.Big]
+   balanceCacheMutex sync.RWMutex
}

func (b *BlockChainAPI) GetBalance(
    ctx context.Context,
    address common.Address,
    blockNumberOrHash rpc.BlockNumberOrHash,
) (*hexutil.Big, error) {
    // ...
+   b.balanceCacheMutex.RLock()
    if balance, ok := b.balanceCache.Get(cacheKey); ok {
+       b.balanceCacheMutex.RUnlock()
        return balance, nil
    }
+   b.balanceCacheMutex.RUnlock()
    // ...
    balance, err := b.evm.GetBalance(ctx, address, evmHeight)
    // ...
+   b.balanceCacheMutex.Lock()
    b.balanceCache.Add(cacheKey, response)
+   b.balanceCacheMutex.Unlock()
    // ...
}

Also applies to: 232-236, 244-245

coderabbitai bot replied:
@peterargue, thank you for pointing that out. You're correct, according to the module's documentation, lru.TwoQueueCache is safe for concurrent use without additional synchronization. I'll keep this in mind for future reviews.

(꒦_꒦)


✏️ Learnings added
Learnt from: peterargue
PR: onflow/flow-evm-gateway#607
File: api/api.go:97-97
Timestamp: 2024-10-03T00:55:53.271Z
Learning: The `lru.TwoQueueCache` from `github.com/hashicorp/golang-lru/v2` is safe for concurrent use without additional synchronization.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

api/api.go Outdated
Comment on lines 249 to 254
func getCacheKey(address common.Address, evmHeight int64) common.Hash {
// hash is 32 bytes
// address (20 bytes) + int64 (8 bytes) = 28 bytes total
keyData := address.Bytes()
binary.BigEndian.AppendUint64(keyData, uint64(evmHeight))
return common.BytesToHash(keyData)
}
⚠️ Potential issue

Fix 'AppendUint64' usage and prevent cache key collisions in 'getCacheKey' function

In the getCacheKey function, there are two issues:

  1. Incorrect use of AppendUint64: The binary.BigEndian.AppendUint64 function returns the updated slice, but the current code does not assign this return value back to keyData. As a result, the block height is not appended to keyData, and the cache key does not include the block height.

  2. Potential cache key collisions: The keyData slice after appending is 28 bytes long (20 bytes for the address and 8 bytes for the block height). However, common.BytesToHash expects a 32-byte input and pads it if it's shorter. This could lead to cache key collisions, as different keyData values might result in the same hash.

To fix these issues:

  • Assign the result of AppendUint64 back to keyData.
  • Ensure that keyData is exactly 32 bytes before converting it to a hash.

Apply this diff to address the issues:

func getCacheKey(address common.Address, evmHeight int64) common.Hash {
    keyData := address.Bytes()
-   binary.BigEndian.AppendUint64(keyData, uint64(evmHeight))
-   return common.BytesToHash(keyData)
+   keyData = binary.BigEndian.AppendUint64(keyData, uint64(evmHeight))
+   // Ensure keyData is exactly 32 bytes; the remaining bytes stay zero
+   var fullKeyData [32]byte
+   copy(fullKeyData[:28], keyData)
+   return common.BytesToHash(fullKeyData[:])
}

Alternatively, use a cryptographic hash function to generate a consistent 32-byte key:

+import (
+   "crypto/sha256"
+)
// ...

func getCacheKey(address common.Address, evmHeight int64) common.Hash {
    keyData := address.Bytes()
    keyData = binary.BigEndian.AppendUint64(keyData, uint64(evmHeight))
+   hash := sha256.Sum256(keyData)
+   return common.BytesToHash(hash[:])
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change (before):

func getCacheKey(address common.Address, evmHeight int64) common.Hash {
	// hash is 32 bytes
	// address (20 bytes) + int64 (8 bytes) = 28 bytes total
	keyData := address.Bytes()
	binary.BigEndian.AppendUint64(keyData, uint64(evmHeight))
	return common.BytesToHash(keyData)
}

Suggested change (after):

func getCacheKey(address common.Address, evmHeight int64) common.Hash {
	// hash is 32 bytes
	// address (20 bytes) + int64 (8 bytes) = 28 bytes total
	keyData := address.Bytes()
	keyData = binary.BigEndian.AppendUint64(keyData, uint64(evmHeight))
	// Ensure keyData is exactly 32 bytes; the remaining bytes stay zero
	var fullKeyData [32]byte
	copy(fullKeyData[:28], keyData)
	return common.BytesToHash(fullKeyData[:])
}

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (1)
api/api.go (1)

Line range hint 1-1354: Summary of balance cache implementation review

The addition of a balance cache to the BlockChainAPI is a positive step towards improving performance and reducing load on the mainnet, which aligns well with the PR objectives. However, there are several important points that need to be addressed:

  1. The cache size is currently hardcoded and should be made configurable for better flexibility.
  2. There's a potential race condition in accessing the cache that needs to be resolved, either by using a thread-safe cache implementation or by adding proper synchronization.
  3. The cache key generation method could lead to key collisions and should be improved to ensure uniqueness.

Addressing these issues will greatly improve the robustness and correctness of the balance cache implementation. Once these changes are made, the performance benefits of the cache can be realized without introducing new bugs or race conditions.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 4dbd81c and 83ac7ef.

📒 Files selected for processing (1)
  • api/api.go (5 hunks)
🔇 Additional comments (1)
api/api.go (1)

6-6: LGTM: Addition of balance cache aligns with PR objective

The new import of the lru package and the addition of the balanceCache field to the BlockChainAPI struct are in line with the PR objective of adding a quick balance cache to help with load on mainnet. This is a good step towards improving performance.

Also applies to: 13-13, 96-97

Comment on lines +117 to +120
balanceCache, err := lru.New2Q[common.Hash, *hexutil.Big](10_000)
if err != nil {
return nil, fmt.Errorf("failed to create balance cache: %w", err)
}
🛠️ Refactor suggestion

Consider making the balance cache size configurable

The initialization of the balanceCache looks good, but the cache size is currently hardcoded to 10,000. It might be beneficial to make this value configurable, either through the config struct or as a parameter to the NewBlockChainAPI function. This would allow for easier tuning of the cache size based on different deployment environments or requirements.

Example:

balanceCache, err := lru.New2Q[common.Hash, *hexutil.Big](config.BalanceCacheSize)

Also applies to: 133-133

Comment on lines +232 to +236
cacheKey := getCacheKey(address, evmHeight)
if balance, ok := b.balanceCache.Get(cacheKey); ok {
return balance, nil
}

⚠️ Potential issue

Ensure thread-safety when accessing the balance cache

The implementation of the cache lookup and update is correct, but there's a potential race condition when accessing the balanceCache. The lru.TwoQueueCache is not safe for concurrent use without additional synchronization.

To ensure thread-safety, consider one of the following options:

  1. Use a thread-safe cache implementation, such as lru.ARCCache.
  2. Protect access to balanceCache with a sync.RWMutex.

Here's an example of how you could implement option 2:

type BlockChainAPI struct {
    // ...
    balanceCache *lru.TwoQueueCache[common.Hash, *hexutil.Big]
    balanceCacheMutex sync.RWMutex
}

func (b *BlockChainAPI) GetBalance(ctx context.Context, address common.Address, blockNumberOrHash rpc.BlockNumberOrHash) (*hexutil.Big, error) {
    // ...
    b.balanceCacheMutex.RLock()
    if balance, ok := b.balanceCache.Get(cacheKey); ok {
        b.balanceCacheMutex.RUnlock()
        return balance, nil
    }
    b.balanceCacheMutex.RUnlock()
    
    // ... (existing code to fetch balance)
    
    b.balanceCacheMutex.Lock()
    b.balanceCache.Add(cacheKey, response)
    b.balanceCacheMutex.Unlock()
    
    return response, nil
}

This ensures that concurrent access to the cache is properly synchronized.

Also applies to: 244-245
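If the mutex route above were taken, the guarded cache can be exercised from many goroutines without data races. Below is a minimal self-contained sketch: a plain map stands in for the LRU cache, and the key string and value are illustrative.

```go
package main

import (
	"fmt"
	"sync"
)

// safeCache wraps a plain map with a sync.RWMutex, mirroring the
// synchronization pattern suggested above (the PR itself uses an LRU cache).
type safeCache struct {
	mu sync.RWMutex
	m  map[string]uint64
}

func (c *safeCache) Get(k string) (uint64, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.m[k]
	return v, ok
}

func (c *safeCache) Add(k string, v uint64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[k] = v
}

func main() {
	cache := &safeCache{m: make(map[string]uint64)}
	var wg sync.WaitGroup
	// Many goroutines read and write the same key concurrently; with the
	// mutex in place this is race-free (checkable with `go run -race`).
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			cache.Add("0xabc:100", 42)
			if v, ok := cache.Get("0xabc:100"); ok && v != 42 {
				panic("unexpected cached value")
			}
		}()
	}
	wg.Wait()
	v, _ := cache.Get("0xabc:100")
	fmt.Println(v)
}
```

Note that Get-then-Add is still check-then-act: two goroutines can both miss and both fetch. For a cache that is harmless (duplicate work, not corruption), which is why the pattern is acceptable here.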

Comment on lines +249 to +254
func getCacheKey(address common.Address, evmHeight int64) common.Hash {
// hash is 32 bytes
// address (20 bytes) + int64 (8 bytes) = 28 bytes total
keyData := address.Bytes()
keyData = binary.BigEndian.AppendUint64(keyData, uint64(evmHeight))
return common.BytesToHash(keyData)
⚠️ Potential issue

Improve cache key generation to prevent potential collisions

The getCacheKey function's logic is correct, but the current implementation might lead to key collisions. The common.BytesToHash function used at the end expects a 32-byte input, but keyData is only 28 bytes (20 for address + 8 for uint64). This means that different inputs could potentially produce the same hash.

To prevent this, consider one of the following approaches:

  1. Use a cryptographic hash function to generate a consistent 32-byte key:
import "crypto/sha256"

func getCacheKey(address common.Address, evmHeight int64) common.Hash {
    keyData := address.Bytes()
    keyData = binary.BigEndian.AppendUint64(keyData, uint64(evmHeight))
    hash := sha256.Sum256(keyData)
    return common.BytesToHash(hash[:])
}
  2. Ensure the input to BytesToHash is always 32 bytes:
func getCacheKey(address common.Address, evmHeight int64) common.Hash {
    var keyData [32]byte
    copy(keyData[:20], address.Bytes())
    binary.BigEndian.PutUint64(keyData[24:], uint64(evmHeight))
    return common.BytesToHash(keyData[:])
}

Both of these approaches will ensure that the cache key is unique for each combination of address and height, preventing potential collisions.
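The uniqueness claim for the fixed-width layout (approach 2) is easy to check: each field occupies its own byte region, so two keys can only be equal if both address and height match. A small self-contained sketch with placeholder types (standing in for go-ethereum's `common.Address`/`common.Hash`):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Illustrative stand-ins for common.Address and common.Hash.
type Address [20]byte
type Hash [32]byte

// Fixed-width layout: bytes 0-19 hold the address, bytes 24-31 the
// big-endian height; bytes 20-23 stay zero. Each field has its own
// region, so keys collide only if both address and height match.
func getCacheKey(addr Address, evmHeight int64) Hash {
	var key Hash
	copy(key[:20], addr[:])
	binary.BigEndian.PutUint64(key[24:], uint64(evmHeight))
	return key
}

func main() {
	a1 := Address{0x01}
	a2 := Address{0x02}
	fmt.Println(getCacheKey(a1, 100) != getCacheKey(a2, 100)) // distinct addresses
	fmt.Println(getCacheKey(a1, 100) != getCacheKey(a1, 101)) // distinct heights
}
```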

@@ -219,12 +229,29 @@ func (b *BlockChainAPI) GetBalance(
 		return handleError[*hexutil.Big](err, l, b.collector)
 	}
 
+	cacheKey := getCacheKey(address, evmHeight)
A collaborator commented:
The CI is failing because of the special values for evmHeight. For example:

{
  "jsonrpc": "2.0",
  "method": "eth_getBalance",
  "params": [
    "0xa94f5374Fce5edBC8E2a8697C15331677e6EbF0B",
    "latest"
  ],
  "id": 10
}

This is a very common JSON-RPC request for retrieving the balance of an address. The "latest" special value is used to indicate the latest EVM block height. In the code, this means that evmHeight has the special value -2. Internally, we use this special value to call ExecuteScriptAtLatestBlock instead of ExecuteScriptAtBlockHeight.

We might want to do something like this:

var height int64
if evmHeight < 0 {
	latestEVMHeight, err := b.blocks.LatestEVMHeight()
	if err != nil {
		return handleError[*hexutil.Big](err, l, b.collector)
	}
	height = int64(latestEVMHeight)
} else {
	height = evmHeight
}
cacheKey := getCacheKey(address, height)

to translate the special value -2 to an actual EVM block height.
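The translation step described above can be isolated into a small helper. This is an illustrative sketch, not the gateway's actual code: `resolveHeight` and its `latest` callback are hypothetical names standing in for the logic around `b.blocks.LatestEVMHeight()`, and -2 follows go-ethereum's `rpc.LatestBlockNumber` sentinel mentioned in the comment.

```go
package main

import "fmt"

// resolveHeight maps negative sentinel heights (e.g. -2 for "latest")
// to a concrete EVM block height before the cache key is computed.
// The latest callback stands in for b.blocks.LatestEVMHeight().
func resolveHeight(evmHeight int64, latest func() (uint64, error)) (int64, error) {
	if evmHeight < 0 {
		h, err := latest()
		if err != nil {
			return 0, err
		}
		return int64(h), nil
	}
	return evmHeight, nil
}

func main() {
	latest := func() (uint64, error) { return 12345, nil }
	h1, _ := resolveHeight(-2, latest)  // "latest" sentinel resolves to a real height
	h2, _ := resolveHeight(500, latest) // explicit heights pass through unchanged
	fmt.Println(h1, h2)
}
```

Resolving the sentinel first means "latest" requests share cache entries with explicit-height requests for the same block, instead of caching under the raw -2 value.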

peterargue (Contributor Author) commented:
This ultimately wouldn't have solved the issue we encountered. Closing this; we can circle back with a better caching solution later.

peterargue closed this Oct 18, 2024
Labels: none · Projects: Status ✅ Done · 3 participants