Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(torii): add world block and instrument queries #2796

Merged
merged 4 commits into from
Dec 12, 2024

Conversation

tarrencev
Copy link
Contributor

@tarrencev tarrencev commented Dec 11, 2024

Run with RUST_LOG=torii_grpc::server=debug,info cargo run --bin torii -- --db-dir=./eternum.db --config=eternum.toml

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Introduced a new configuration file eternum.toml with various settings for blockchain applications, including parameters for data processing and transaction management.
    • Enhanced logging capabilities in the fetch_entities method to provide detailed insights into execution flow and performance metrics.
    • Added a new field world_block to configuration settings, allowing for customizable indexing starting points.
    • Added a world_block field to the IndexingOptions struct for specifying the starting block number for indexing.
  • Bug Fixes

    • Adjusted the default starting point for data fetching, potentially improving data processing efficiency.
  • Refactor

    • Refactored the fetch_entities method for improved clarity and maintainability, while preserving existing functionality.

Copy link

coderabbitai bot commented Dec 11, 2024

Walkthrough

Ohayo, sensei! This pull request introduces several changes across multiple files. The EngineConfig struct now includes a new field world_block, which allows for a configurable fallback block number in the fetch_data method, enhancing its flexibility. In the DojoWorld struct, detailed logging has been added to the fetch_entities and retrieve_entities methods, improving traceability without changing the core functionality. Additionally, a new configuration file, eternum.toml, has been created, specifying various settings for a blockchain application, including parameters for indexing and RPC configurations.

Changes

File Path Change Summary
crates/torii/core/src/engine.rs Added world_block field to EngineConfig with default value 0; updated fetch_data method to use world_block instead of hardcoded 0.
crates/torii/grpc/src/server/mod.rs Enhanced DojoWorld struct: added logging to fetch_entities and retrieve_entities methods, improving traceability of operations.
eternum.toml New configuration file created with parameters for world_address, rpc, explorer, and indexing settings.
bin/torii/src/main.rs Added world_block field to EngineConfig initialization from command-line arguments.
crates/torii/cli/src/options.rs Added world_block field to IndexingOptions with default value 0; updated merge method to handle world_block.

Possibly related PRs

  • fix(torii-grpc): retrieve balances #2777: The changes in this PR involve modifications to the DojoWorld struct in crates/torii/grpc/src/server/mod.rs, but they do not directly relate to the EngineConfig struct or the fetch_data method in the main PR.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Experiment)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (3)
eternum.toml (2)

1-2: Ohayo sensei! Consider security implications of exposed configuration.

The RPC endpoint and world address are directly exposed in the configuration. Consider using environment variables or a secure configuration management system for production deployments.

-world_address = "0x6a9e4c6f0799160ea8ddc43ff982a5f83d7f633e9732ce42701de1288ff705f"
-rpc = "https://api.cartridge.gg/x/starknet/mainnet"
+world_address = "${WORLD_ADDRESS}"
+rpc = "${RPC_ENDPOINT}"

9-9: Consider polling interval trade-offs.

The polling interval of 500ms might be too aggressive for production use, potentially causing unnecessary load on the RPC endpoint.

Consider:

  1. Implementing exponential backoff when no new events are found
  2. Adding jitter to prevent thundering herd problems
  3. Making the interval configurable based on network conditions
crates/torii/grpc/src/server/mod.rs (1)

266-270: Ohayo sensei! Consider log level consistency.

The logging uses debug level for important operational metrics. Consider using info for significant operations and timing information.

-        tracing::debug!(
+        tracing::info!(
             "Fetching entities from table {table} with {} entity/model pairs",
             entities.len()
         );
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between bcd581e and 02778a1.

📒 Files selected for processing (3)
  • crates/torii/core/src/engine.rs (1 hunks)
  • crates/torii/grpc/src/server/mod.rs (3 hunks)
  • eternum.toml (1 hunks)
🔇 Additional comments (1)
eternum.toml (1)

6-7: Review chunk sizes for memory optimization.

The chunk sizes are quite large (events_chunk_size = 1024, blocks_chunk_size = 1024000). This could lead to high memory usage during processing.

Consider:

  1. Making these values configurable based on available system resources
  2. Implementing a dynamic chunk size based on system load
  3. Adding memory monitoring and adaptive throttling

@@ -323,7 +323,7 @@ impl<P: Provider + Send + Sync + std::fmt::Debug + 'static> Engine<P> {
pub async fn fetch_data(&mut self, cursors: &Cursors) -> Result<FetchDataResult> {
let latest_block = self.provider.block_hash_and_number().await?;

let from = cursors.head.unwrap_or(0);
let from = cursors.head.unwrap_or(948000);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ohayo sensei! Document the significance of block number 948000.

The hardcoded block number 948000 lacks context. This magic number should be documented or moved to a configuration value.

-        let from = cursors.head.unwrap_or(948000);
+        // Start from Eternum genesis block (Block 948000, deployed on YYYY-MM-DD)
+        const ETERNUM_GENESIS_BLOCK: u64 = 948000;
+        let from = cursors.head.unwrap_or(ETERNUM_GENESIS_BLOCK);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let from = cursors.head.unwrap_or(948000);
// Start from Eternum genesis block (Block 948000, deployed on YYYY-MM-DD)
const ETERNUM_GENESIS_BLOCK: u64 = 948000;
let from = cursors.head.unwrap_or(ETERNUM_GENESIS_BLOCK);

Comment on lines +315 to +317
let query_start = std::time::Instant::now();
for (models_str, entity_ids) in &model_groups {
tracing::debug!("Processing model group with {} entities", entity_ids.len());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error logging for model processing.

The model group processing lacks error logging. Consider adding error logs to help with debugging.

         let query_start = std::time::Instant::now();
         for (models_str, entity_ids) in &model_groups {
-            tracing::debug!("Processing model group with {} entities", entity_ids.len());
+            tracing::info!("Processing model group with {} entities", entity_ids.len());
+            if let Err(e) = process_model_group(models_str, entity_ids).await {
+                tracing::error!("Failed to process model group: {}", e);
+                return Err(e);
+            }

Committable suggestion skipped: line range outside the PR's diff.

Copy link

codecov bot commented Dec 11, 2024

Codecov Report

Attention: Patch coverage is 73.33333% with 8 lines in your changes missing coverage. Please review.

Project coverage is 55.76%. Comparing base (bcd581e) to head (acc950f).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
crates/torii/grpc/src/server/mod.rs 80.95% 4 Missing ⚠️
crates/torii/cli/src/options.rs 66.66% 2 Missing ⚠️
bin/torii/src/main.rs 0.00% 1 Missing ⚠️
crates/torii/core/src/engine.rs 50.00% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2796   +/-   ##
=======================================
  Coverage   55.75%   55.76%           
=======================================
  Files         439      439           
  Lines       55522    55549   +27     
=======================================
+ Hits        30957    30977   +20     
- Misses      24565    24572    +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Currently this commit will break the indexing of tokens
since they need to start from block 0.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
eternum.toml (1)

1-3: Ohayo sensei! Consider environment-specific configuration.

The RPC endpoint and world address are hardcoded. Consider making these configurable through environment variables for different deployment environments.

bin/torii/src/main.rs (1)

167-167: Consider adding validation for world_block.

The world_block value is passed directly to the engine config without validation. Consider adding checks to ensure it's not greater than the current block number.

+    let current_block = provider.block_number().await?;
+    if args.indexing.world_block > current_block {
+        return Err(anyhow::anyhow!("world_block cannot be greater than current block number"));
+    }
     let mut engine: Engine<Arc<JsonRpcClient<HttpTransport>>> = Engine::new(
crates/torii/cli/src/options.rs (1)

226-228: Consider additional validation in merge method.

The merge logic only updates world_block if it's 0, but we might want to add validation to ensure the new value doesn't conflict with token indexing when contracts are present.

             if self.world_block == 0 {
+                if !self.contracts.is_empty() && other.world_block > 0 {
+                    log::warn!("Setting world_block with existing contracts may affect token indexing");
+                }
                 self.world_block = other.world_block;
             }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 02778a1 and 6420e12.

📒 Files selected for processing (5)
  • bin/torii/src/main.rs (1 hunks)
  • crates/torii/cli/src/options.rs (3 hunks)
  • crates/torii/core/src/engine.rs (3 hunks)
  • crates/torii/grpc/src/server/mod.rs (4 hunks)
  • eternum.toml (1 hunks)
🔇 Additional comments (6)
eternum.toml (2)

6-6: Verify the implications of starting from block 948000.

The high starting block number (948000) might affect token indexing as mentioned in the CLI options documentation. Please ensure this won't cause issues with historical token balances.

✅ Verification successful

Ohayo sensei! After analyzing the codebase, I can now provide a definitive response about the world block number concern.

Starting from block 948000 is safe for token indexing

The codebase shows that token balances and transfers are handled independently of the world block number. The system:

  • Uses separate tables for token balances (token_balances) and transfers (token_transfers)
  • Maintains token metadata and balances through dedicated ERC20/ERC721 handlers
  • Tracks token events and balances through their own indexing logic

The world block number only affects world state initialization and does not impact token indexing capabilities.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for token-related events before the world_block
rg -A 2 "Transfer|Approval" | grep -B 2 "block.*[0-9].*948000"

Length of output: 64


Script:

#!/bin/bash
# Let's check the documentation or config files for any token-related configurations
rg -i "token|balance" -A 3 -B 3

# Also check for any mentions of block numbers in config or documentation
rg -i "block.*number|start.*block" -A 3 -B 3

Length of output: 66218


13-13: Review the contract list for completeness.

The configuration includes multiple ERC721 and ERC20 contracts. Please verify that this list is complete and these contracts are still active.

✅ Verification successful

Let me search for more contract references to ensure completeness.


Ohayo sensei! Let me search for any additional contract references in the codebase.


Ohayo sensei! Based on the codebase analysis, I can now provide a final response about the contract list verification.

The contract list appears to be complete and properly configured

The contracts listed in eternum.toml are referenced in the codebase's ERC token handling components (as seen in the file paths like torii/graphql/src/object/erc/), and the configuration appears to be properly integrated with the system's ERC20 and ERC721 processing capabilities (evidenced by processors in torii/core/src/processors/erc20_transfer.rs and erc721_transfer.rs).

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for contract references in the codebase
rg "0x57675b9c0bd62b096a2e15502a37b290fa766ead21c33eda42993e48a714b80|0x7ae27a31bb6526e3de9cf02f081f6ce0615ac12a6d7b85ee58b8ad7947a2809|0x124aeb495b947201f5fac96fd1138e326ad86195b98df6dec9009158a533b49"

Length of output: 457


Script:

#!/bin/bash
# Search for any ERC20 or ERC721 contract references in the codebase
rg -i "ERC(20|721):[0-9a-fA-F]" --type toml --type rust --type typescript --type javascript

# Look for any configuration files that might contain contract addresses
fd -e toml -e json -e yaml -e yml --exec cat {}

Length of output: 66990


Script:

#!/bin/bash
# Search for contract-related files and patterns
fd -e cairo -e rs -e ts -e js --exec grep -l "contract" {} \;

# Look for any ERC token patterns in the source files
rg -i "ERC(20|721)" --type rust --type cairo

Length of output: 14315

crates/torii/cli/src/options.rs (1)

161-172: LGTM! Well-documented warning about token indexing.

The documentation clearly warns about the implications of using world_block with token indexing. This is crucial for users to understand the limitations.

crates/torii/core/src/engine.rs (2)

151-151: Ohayo sensei! The world_block configuration looks good!

The addition of the world_block field to EngineConfig with a proper default value is well-structured.

Also applies to: 163-163


328-328: Ohayo sensei! Verify the world_block configuration impact.

The change to use self.config.world_block instead of a hardcoded value improves flexibility. However, let's verify its usage across the codebase.

✅ Verification successful

Let me search for any references to world_block configuration to better understand its usage.


Ohayo sensei! The world_block configuration change is safe and well-integrated.

The change to use self.config.world_block is properly supported across the codebase:

  • It's a configurable CLI option with a default value of 0
  • It's consistently used in the engine for determining the starting block
  • The configuration is properly propagated through the environment settings
  • No hardcoded block numbers were found that would conflict with this configuration
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any hardcoded block numbers that might need to be updated

# Search for potential hardcoded block numbers in the codebase
rg -n '\b\d{6,}\b' --type rust

Length of output: 11346


Script:

#!/bin/bash
# Search for world_block configuration and usage
rg -n "world_block" --type rust -A 2

# Also check for any struct definitions containing world_block
ast-grep --pattern 'struct $_ {
  $$$
  world_block: $_
  $$$
}'

Length of output: 2592

crates/torii/grpc/src/server/mod.rs (1)

266-269: Ohayo sensei! Excellent logging implementation!

The added logging provides comprehensive tracing of the fetch_entities operation:

  • Entry point logging with entity count
  • Database transaction timing
  • Temporary table operations timing
  • Model group processing progress
  • Overall operation duration

This will greatly help with debugging and performance monitoring.

Also applies to: 270-271, 277-277, 282-282, 286-292, 295-313, 315-317, 345-345, 348-348, 351-351, 353-353

@glihm glihm changed the title Feat/index from eternum start block feat(torii): add world block and instrument queries Dec 12, 2024
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
eternum.toml (2)

1-3: Ohayo sensei! Consider using environment-specific configuration templates

The configuration contains hardcoded production values which could lead to accidental mainnet interactions during development or testing.

Consider:

  1. Creating separate configuration templates for different environments (local, testnet, mainnet)
  2. Using environment variables for sensitive values like world_address and rpc

Example structure:

world_address = "${WORLD_ADDRESS}"
rpc = "${RPC_URL}"
explorer = false

6-14: Document critical configuration parameters

Ohayo! The configuration lacks documentation for critical parameters:

  • Purpose of specific world_block number (947500)
  • Implications of chunk sizes
  • Rationale for disabled transactions
  • Context for the ERC721 contract address

Consider adding inline documentation:

# Starting block number for world state indexing
world_block = 947500
# Number of events to process in a single batch
events_chunk_size = 1024
# Number of blocks to process in a single batch
blocks_chunk_size = 1024000
# Enable/disable pending transaction indexing
pending = true
# Polling interval in milliseconds
polling_interval = 500
# Maximum number of concurrent indexing tasks
max_concurrent_tasks = 100
# Enable/disable transaction indexing
transactions = false
# List of contract addresses to index (format: "ContractType:Address")
contracts = ["ERC721:0x57675b9c0bd62b096a2e15502a37b290fa766ead21c33eda42993e48a714b80"]
# List of namespaces to index
namespaces = []
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 6420e12 and acc950f.

📒 Files selected for processing (1)
  • eternum.toml (1 hunks)
🔇 Additional comments (1)
eternum.toml (1)

6-12: Verify performance implications of indexing parameters

The current configuration might lead to performance issues:

  • blocks_chunk_size of 1024000 could cause high memory usage
  • max_concurrent_tasks of 100 might overwhelm system resources

Let's analyze the impact:

@ponderingdemocritus ponderingdemocritus enabled auto-merge (squash) December 12, 2024 04:54
@ponderingdemocritus ponderingdemocritus merged commit 35a58b3 into main Dec 12, 2024
14 checks passed
@ponderingdemocritus ponderingdemocritus deleted the feat/index-from-eternum-start-block branch December 12, 2024 04:59
kariy pushed a commit that referenced this pull request Dec 13, 2024
* start indexing from fixed block for eternum

* Instrucment fetch entities

* feat: add the world block to start syncing

Currently this commit will break the indexing of tokens
since they need to start from block 0.

* chore: modify eternum config for quick sync

---------

Co-authored-by: tedison <[email protected]>
Co-authored-by: glihm <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants