545 strategy backtests add financial metrics #548
Conversation
Walkthrough
The pull request introduces significant modifications across multiple files, primarily enhancing the functionality of the betting and market interaction systems. Key updates include the addition of financial metrics calculations, such as the Sharpe ratio, and improvements to the handling of Ethereum transaction data through caching mechanisms. New classes and methods have been introduced to facilitate these enhancements, including the `SharpeRatioCalculator` and the `TransactionBlockCache`.
```python
def prepare_wallet_daily_balance_df(self, timestamp_col_name: str,
                                    profit_col_name: str) -> pd.DataFrame:
```
Simply generate a df that entails the daily wallet balance:

```
Day | balance (xDAI)
1   | 50
2   | 52
...
```
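For context, a minimal pandas sketch of that idea (column names and the 50 xDAI starting balance are illustrative, mirroring the PR rather than quoting it):

```python
import pandas as pd

# Illustrative bet-level data: several profits can land on the same day.
df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(["2024-01-01", "2024-01-01", "2024-01-03"]),
        "sim_profit": [2.0, -1.0, 3.0],
    }
)
df["balance"] = df["sim_profit"].cumsum() + 50  # assume a 50 xDAI starting balance
df = df.drop_duplicates(subset="timestamp", keep="last")  # keep last balance per day
daily = df.set_index("timestamp")[["balance"]].resample("D").ffill()
print(daily)
# 2024-01-01    51.0
# 2024-01-02    51.0  <- forward-filled from the previous day
# 2024-01-03    54.0
```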
```python
def calculate_annual_sharpe_ratio(self, timestamp_col_name: str = "timestamp",
                                  profit_col_name: str = "sim_profit") -> SharpeOutput:
```
I added `profit_col_name` as arg to allow this to be used also for the original bets Sharpe calculation (column `bet_profit`), see usage https://github.com/gnosis/prediction-market-agent-tooling/pull/548/files#diff-8df61548bdb13e9d38216f5c88637ff50c852babd5d5a488bff2b5c43b7a1081R294
Actionable comments posted: 11
🧹 Outside diff range and nitpick comments (14)
examples/monitor/data_models.py (1)
`1-24`: Consider testing and validation requirements. Given that these models handle financial calculations:
- Ensure comprehensive unit tests with edge cases for financial metrics
- Consider adding property-based tests for mathematical invariants
- Add validation for mathematical relationships (e.g., Sharpe ratio calculation consistency)
- Consider adding a custom validator to ensure timestamp timezone consistency
Would you like assistance in generating test cases or validation code?
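For instance, the timezone check could be a Pydantic v2 validator along these lines (a sketch assuming a `timestamp` field; not the PR's actual model):

```python
from datetime import datetime, timezone

from pydantic import BaseModel, field_validator


class SimulationDetailSketch(BaseModel):
    timestamp: datetime

    @field_validator("timestamp")
    @classmethod
    def ensure_utc(cls, v: datetime) -> datetime:
        # Reject naive datetimes and normalize aware ones to UTC.
        if v.tzinfo is None:
            raise ValueError("timestamp must be timezone-aware")
        return v.astimezone(timezone.utc)
```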
examples/monitor/transaction_cache.py (3)
`12-15`: Add type hints and use more specific cache directory names. Consider making these improvements for better maintainability and type safety:
- Add type hints for cache instance variables
- Use more specific cache directory names to avoid potential conflicts
```diff
 def __init__(self, web3: Web3):
-    self.block_number_cache = dc.Cache("block_cache_dir")
-    self.block_timestamp_cache = dc.Cache("timestamp_cache_dir")
+    self.block_number_cache: dc.Cache = dc.Cache("eth_block_number_cache")
+    self.block_timestamp_cache: dc.Cache = dc.Cache("eth_block_timestamp_cache")
     self.web3 = web3
```
`17-25`: Add docstring to document the function's behavior and retry mechanism. The implementation is solid, but documentation would help users understand the retry behavior and potential exceptions.
```diff
 @tenacity.retry(
     wait=wait_exponential(multiplier=1, min=1, max=4),
     stop=tenacity.stop_after_attempt(7),
     after=lambda x: logger.debug(f"fetch tx failed, {x.attempt_number=}."),
 )
 def fetch_block_number(self, transaction_hash: str) -> int:
+    """Fetch the block number for a given transaction hash with retry logic.
+
+    Args:
+        transaction_hash: The hash of the transaction
+
+    Returns:
+        The block number containing the transaction
+
+    Raises:
+        tenacity.RetryError: If all retry attempts fail
+    """
     tx = self.web3.eth.get_transaction(HexStr(transaction_hash))
     return tx['blockNumber']
```
`1-50`: Consider implementing cache size limits and cleanup strategy. The current implementation lacks cache size management, which could lead to unbounded disk usage over time. Consider:
- Setting a maximum cache size using `diskcache.Cache(size_limit=1e9)` (e.g., 1 GB)
- Implementing a periodic cache cleanup job
- Monitoring cache hit/miss rates for optimization
This is particularly important in a prediction market context where the number of transactions could grow significantly.
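A minimal sketch of the size-capped cache (directory name and limit are illustrative; `size_limit` is in bytes, and diskcache culls old entries once it is exceeded):

```python
import diskcache as dc

# Cap the on-disk cache at ~1 GB instead of letting it grow unbounded.
block_number_cache = dc.Cache("eth_block_number_cache", size_limit=int(1e9))
block_number_cache["0xabc..."] = 12345  # tx hash -> block number (illustrative)
```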
prediction_market_agent_tooling/tools/langfuse_client_utils.py (2)
`73-73`: LGTM! Consider enhancing the debug log with item count. The addition of debug logging for pagination is helpful for observability.
Consider including the total number of items fetched so far to better track the pagination progress:
```diff
- logger.debug(f"fetching page {page}")
+ logger.debug(f"fetching page {page} (total traces so far: {len(all_agent_traces)})")
```
Line range hint `186-195`: Consider improving the timeout configuration and error message. The hardcoded 20-minute timeout (1200 seconds) for matching bets with traces could be made more maintainable and informative.
Consider these improvements:
- Make the timeout configurable:
```diff
+MAX_BET_TRACE_MATCH_TIMEOUT_SECONDS = 1200  # 20 minutes
+
 def get_trace_for_bet(
     bet: ResolvedBet, traces: list[ProcessMarketTrace]
 ) -> ProcessMarketTrace | None:
     # ... existing code ...
+    time_diff_seconds = abs(candidate_trace.timestamp_datetime - bet.created_time).total_seconds()
     if (
-        abs(candidate_trace.timestamp_datetime - bet.created_time).total_seconds()
-        > 1200
+        time_diff_seconds > MAX_BET_TRACE_MATCH_TIMEOUT_SECONDS
     ):
         logger.info(
-            f"Closest trace to bet has timestamp {candidate_trace.timestamp}, "
-            f"but bet was created at {bet.created_time}. Not matching"
+            f"Time difference ({time_diff_seconds:.2f}s) exceeds maximum allowed "
+            f"({MAX_BET_TRACE_MATCH_TIMEOUT_SECONDS}s). Trace timestamp: "
+            f"{candidate_trace.timestamp}, bet creation time: {bet.created_time}"
         )
```

prediction_market_agent_tooling/markets/omen/omen_subgraph_handler.py (1)
`438-443`: LGTM! Consider adding docstring for the block_number parameter. The changes improve code organization and add useful functionality for querying historical market states. Consider adding a docstring to document the purpose of the `block_number` parameter.

```diff
 def get_omen_market_by_market_id(self, market_id: HexAddress, block_number: int | None = None) -> OmenMarket:
+    """
+    Fetch an Omen market by its ID.
+
+    Args:
+        market_id: The hex address of the market
+        block_number: Optional block number to query historical market state
+    """
```

examples/monitor/financial_metrics.py (4)
`47-47`: Consider using 252 trading days instead of 365 for annualization. In financial calculations, it's common to use 252 trading days to annualize metrics like volatility and Sharpe ratio, as it reflects the number of trading days in a year.
Apply this diff to adjust the annualization factor:
```diff
+ annualization_factor = np.sqrt(252)
- annualized_volatility = daily_volatility * np.sqrt(365)
+ annualized_volatility = daily_volatility * annualization_factor
  mean_daily_return = wallet_daily_balance_df["returns"].mean()
  if daily_volatility != 0:
      daily_sharpe_ratio = (mean_daily_return - self.risk_free_rate) / daily_volatility
  else:
      daily_sharpe_ratio = 0.0  # You may choose to set it to None or raise an exception
- annualized_sharpe_ratio = daily_sharpe_ratio * np.sqrt(365)
+ annualized_sharpe_ratio = daily_sharpe_ratio * annualization_factor
```

Also applies to: 50-50
`19-19`: Enhance the error message with missing columns. Including the specific missing columns in the error message can aid in debugging.
Apply this diff to improve the error message:
```diff
 if not set(required_columns).issubset(self.df.columns):
+    missing_columns = set(required_columns) - set(self.df.columns)
-    raise ValueError("Dataframe doesn't contain all the required columns.")
+    raise ValueError(f"DataFrame is missing required columns: {missing_columns}")
```
`15-15`: Rename the method to follow Python naming conventions. In Python, methods intended to be private are typically prefixed with a single underscore.
Apply this diff to rename the method:
```diff
- def __has_df_valid_columns_else_exception(self,
+ def _has_df_valid_columns_else_exception(self,
```

And update the method calls accordingly:
```diff
- self.__has_df_valid_columns_else_exception([timestamp_col_name, profit_col_name])
+ self._has_df_valid_columns_else_exception([timestamp_col_name, profit_col_name])
```
`40-40`: Reminder: Address the TODO comment for writing tests. There's a `# ToDo - Write tests` comment indicating that tests are needed for this class. Implementing unit tests will help ensure the reliability of your calculations. Do you need assistance in creating unit tests for this class? I can help draft some test cases to get you started.
examples/monitor/match_bets_with_langfuse_traces.py (3)
`99-103`: Simplify profit calculation in the 'correct' branch. Since `correct` is `True` in this branch, the ternary conditional `if correct else -buy_trade.amount.amount` is unnecessary. You can simplify the profit calculation by removing the redundant conditional. Apply this diff to simplify the profit calculation:
```diff
 profit = (
-    received_outcome_tokens - buy_trade.amount.amount
-    if correct
-    else -buy_trade.amount.amount
+    received_outcome_tokens - buy_trade.amount.amount
 )
```
`253-254`: Address the TODO: Implement unredeemed tokens value tracking. The TODO comment indicates an intention to add tracking for the value of tokens that weren't redeemed yet, possibly for portfolio tracking. Implementing this feature would enhance the simulation's accuracy by accounting for all holdings.
Would you like assistance in implementing this functionality or creating a GitHub issue to track this task?
`141-175`: Remove commented-out strategies to improve code readability. The large block of commented-out strategies can make the code harder to read and maintain. Consider removing unused strategies or managing them through configuration files or command-line arguments for better code organization.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (2)
- poetry.lock is excluded by !**/*.lock
- pyproject.toml is excluded by !**/*.toml
📒 Files selected for processing (7)
- examples/monitor/data_models.py (1 hunks)
- examples/monitor/financial_metrics.py (1 hunks)
- examples/monitor/match_bets_with_langfuse_traces.py (11 hunks)
- examples/monitor/transaction_cache.py (1 hunks)
- prediction_market_agent_tooling/markets/omen/data_models.py (1 hunks)
- prediction_market_agent_tooling/markets/omen/omen_subgraph_handler.py (21 hunks)
- prediction_market_agent_tooling/tools/langfuse_client_utils.py (1 hunks)
🔇 Additional comments (3)
examples/monitor/data_models.py (1)
`1-4`: LGTM! Appropriate imports for Pydantic models with UTC timestamp support.
examples/monitor/transaction_cache.py (1)
`1-10`: LGTM! Well-organized imports and clear class definition.
The imports are properly organized and all necessary dependencies are included.
prediction_market_agent_tooling/markets/omen/omen_subgraph_handler.py (1)
Line range hint `1-916`: Verify error handling consistency across subgraph interactions.
: Verify error handling consistency across subgraph interactions.
The code interacts with multiple subgraphs and handles financial data. We should verify consistent error handling for subgraph failures.
✅ Verification successful
Based on the shell script results, I can now provide a final response since I have enough information about the error handling patterns:
Error handling is properly implemented across subgraph interactions
The codebase shows consistent error handling through:
- Base retry mechanism in `BaseSubgraphHandler`: `query_json` and `load_subgraph` operations have 3 retry attempts with 1 second delay
- All subgraph handlers inherit this behavior
- Proper validation checks:
  - Market fetching validates expected result count
  - Empty result handling for image mappings and predictions
  - Consistent model validation using Pydantic across all response parsing
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Check for error handling patterns in subgraph interactions
# Look for try-except blocks around subgraph queries
echo "Checking error handling patterns..."
rg -A 2 "self\.sg\.query_json|self\.do_query"

# Check for error handling in methods that interact with multiple subgraphs
echo "Checking multi-subgraph interaction patterns..."
rg -B 2 "subgraph.*Query\."
```
Length of output: 10216
```python
@@ -121,7 +121,7 @@ def _get_fields_for_bets(self, bets_field: FieldPath) -> list[FieldPath]:
        return fields_for_bets + fields_for_markets

    def _get_fields_for_reality_questions(
        self, questions_field: FieldPath
```
Critical fix: Added missing `self` parameter to class methods.
The addition of the `self` parameter to these methods is a critical fix. Previously, these methods would have raised a TypeError when called as instance methods since they were missing the required `self` parameter.
Also applies to: 163-163, 205-224, 281-281, 316-328, 368-392, 459-459, 477-478, 494-497, 519-530, 574-581, 595-599, 611-615, 627-637, 695-703, 751-763, 803-816, 849-849, 860-860, 892-892
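For illustration, a standalone example of the failure mode (not the project's code):

```python
class Handler:
    def get_fields(questions_field):  # missing `self`
        return [questions_field]


h = Handler()
h.get_fields("q")
# TypeError: get_fields() takes 1 positional argument but 2 were given,
# because the instance is passed implicitly as the first argument.
```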
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (8)
examples/monitor/match_bets_with_langfuse_traces.py (4)
Line range hint `60-110`: Add docstring documentation for the enhanced function. The function has been significantly updated with new parameters and logic. Adding a docstring would help explain:
- Purpose of the function
- Description of new parameters
- Return value details
- Example usage
Example docstring:
```python
def get_outcome_for_trace(
    strategy: BettingStrategy,
    trace: ProcessMarketTrace,
    market_outcome: bool,
    actual_placed_bet: ResolvedBet,
    tx_block_cache: TransactionBlockCache,
) -> SimulatedOutcome | None:
    """Calculate the simulated outcome for a given trace using historical market state.

    Args:
        strategy: The betting strategy to simulate
        trace: The process market trace containing market and answer data
        market_outcome: The actual market outcome (True/False)
        actual_placed_bet: The actual bet that was placed
        tx_block_cache: Cache for retrieving historical block data

    Returns:
        SimulatedOutcome object containing size, direction, correctness and profit,
        or None if no valid trades could be calculated
    """
```
`112-115`: Remove commented-out code. The old implementation is kept as comments. Since we're using version control, this code can be safely removed to improve readability.
`148-185`: Document strategy selection criteria. Most strategy configurations are commented out. It's unclear which configurations should be used and why. Consider:
- Adding comments explaining the selection criteria
- Moving the strategy configurations to a config file
- Making it configurable via environment variables
`262-263`: Address the TODO comment about token redemption tracking. The comment suggests implementing portfolio tracking for unredeemed tokens. This could be important for accurate profit calculation.
Would you like me to help implement the token redemption tracking or create a GitHub issue to track this enhancement?
prediction_market_agent_tooling/markets/omen/omen_subgraph_handler.py (1)
Line range hint `438-452`: Consider caching frequently accessed historical market data. The addition of block number querying is great for financial metrics calculation. To optimize performance when computing metrics like Sharpe ratio that require historical data:
- Consider implementing a caching layer for frequently accessed historical market states
- Add batch querying capabilities for multiple block numbers to reduce API calls
examples/monitor/financial_metrics.py (3)
`15-19`: Enhance error message to include missing columns. Including the names of the missing columns in the exception message will provide clearer guidance during debugging.
Apply this diff to improve the error message:
```diff
 def __has_df_valid_columns_else_exception(
     self, required_columns: list[str]
 ) -> None:
     if not set(required_columns).issubset(self.df.columns):
-        raise ValueError("Dataframe doesn't contain all the required columns.")
+        missing_columns = set(required_columns) - set(self.df.columns)
+        raise ValueError(f"DataFrame is missing required columns: {missing_columns}")
```
`15-19`: Use single underscore for private method naming. According to PEP 8 conventions, private methods should be prefixed with a single underscore. Renaming `__has_df_valid_columns_else_exception` to `_has_df_valid_columns_else_exception` enhances code readability and consistency. Apply this diff to rename the method:
```diff
- def __has_df_valid_columns_else_exception(
+ def _has_df_valid_columns_else_exception(
```
`7-63`: Add docstrings to the class and methods. Adding docstrings to the `SharpeRatioCalculator` class and its methods will improve documentation and help other developers understand the codebase more easily. Consider applying the following diff to add docstrings:
```diff
 class SharpeRatioCalculator:
+    """
+    Calculator for computing financial metrics such as the annualized Sharpe ratio from simulation details.
+    """
     def __init__(
         self, details: list[SimulationDetail], risk_free_rate: float = 0.0
     ) -> None:
+        """
+        Initializes the SharpeRatioCalculator with simulation details and an optional risk-free rate.
+
+        :param details: List of simulation details.
+        :param risk_free_rate: The risk-free rate used in the Sharpe ratio calculation.
+        """
         self.details = details
         self.df = pd.DataFrame([d.model_dump() for d in self.details])
         self.risk_free_rate = risk_free_rate

     def _has_df_valid_columns_else_exception(
         self, required_columns: list[str]
     ) -> None:
+        """
+        Validates that the DataFrame contains the required columns.
+
+        :param required_columns: List of column names that must be present.
+        :raises ValueError: If any required columns are missing.
+        """
         if not set(required_columns).issubset(self.df.columns):
             missing_columns = set(required_columns) - set(self.df.columns)
             raise ValueError(f"DataFrame is missing required columns: {missing_columns}")

     def prepare_wallet_daily_balance_df(
         self, timestamp_col_name: str, profit_col_name: str
     ) -> pd.DataFrame:
+        """
+        Prepares a DataFrame with daily wallet balances and calculates daily returns.
+
+        :param timestamp_col_name: Name of the timestamp column.
+        :param profit_col_name: Name of the profit column.
+        :return: DataFrame indexed by date with cumulative profits and returns.
+        """
         self._has_df_valid_columns_else_exception(
             [timestamp_col_name, profit_col_name]
         )
         df = self.df.copy()
         df[timestamp_col_name] = pd.to_datetime(df[timestamp_col_name])
         df.sort_values(timestamp_col_name, ascending=True, inplace=True)
         df["profit_cumsum"] = df[profit_col_name].cumsum()
         df["profit_cumsum"] = df["profit_cumsum"] + 50
         df = df.drop_duplicates(subset=timestamp_col_name, keep="last")
         df.set_index(timestamp_col_name, inplace=True)
         # We generate a new Dataframe with daily wallet balances, derived by the final wallet balance
         # from the previous day.
         wallet_balance_daily_df = df[["profit_cumsum"]].resample("D").ffill()
         wallet_balance_daily_df.dropna(inplace=True)
         wallet_balance_daily_df["returns"] = wallet_balance_daily_df[
             "profit_cumsum"
         ].pct_change()
         return wallet_balance_daily_df

     def calculate_annual_sharpe_ratio(
         self, timestamp_col_name: str = "timestamp", profit_col_name: str = "sim_profit"
     ) -> SharpeOutput:
+        """
+        Calculates the annualized Sharpe ratio based on the prepared wallet daily balance DataFrame.
+
+        :param timestamp_col_name: Name of the timestamp column, default is "timestamp".
+        :param profit_col_name: Name of the profit column, default is "sim_profit".
+        :return: SharpeOutput object containing annualized volatility, mean daily return, and annualized Sharpe ratio.
+        """
         wallet_daily_balance_df = self.prepare_wallet_daily_balance_df(
             timestamp_col_name=timestamp_col_name, profit_col_name=profit_col_name
         )
         daily_volatility = wallet_daily_balance_df["returns"].std()
         annualized_volatility = daily_volatility * np.sqrt(365)
         mean_daily_return = wallet_daily_balance_df["returns"].mean()
         daily_sharpe_ratio = (
             mean_daily_return - self.risk_free_rate
         ) / daily_volatility
         annualized_sharpe_ratio = daily_sharpe_ratio * np.sqrt(365)
         return SharpeOutput(
             annualized_volatility=annualized_volatility,
             mean_daily_return=mean_daily_return,
             annualized_sharpe_ratio=annualized_sharpe_ratio,
         )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (2)
- poetry.lock is excluded by !**/*.lock
- pyproject.toml is excluded by !**/*.toml
📒 Files selected for processing (6)
- examples/monitor/financial_metrics.py (1 hunks)
- examples/monitor/match_bets_with_langfuse_traces.py (9 hunks)
- examples/monitor/transaction_cache.py (1 hunks)
- prediction_market_agent_tooling/markets/manifold/data_models.py (0 hunks)
- prediction_market_agent_tooling/markets/omen/data_models.py (2 hunks)
- prediction_market_agent_tooling/markets/omen/omen_subgraph_handler.py (1 hunks)
💤 Files with no reviewable changes (1)
- prediction_market_agent_tooling/markets/manifold/data_models.py
🚧 Files skipped from review as they are similar to previous changes (2)
- examples/monitor/transaction_cache.py
- prediction_market_agent_tooling/markets/omen/data_models.py
🔇 Additional comments (5)
examples/monitor/match_bets_with_langfuse_traces.py (3)
`4-16`: LGTM! Well-organized imports. The new imports are properly organized and align with the PR objectives of adding financial metrics and transaction caching capabilities.
`280-309`: LGTM! Well-structured financial metrics implementation. The implementation:
- Uses a proper data model (SimulationDetail) for structured data
- Calculates Sharpe ratio for both simulated and original strategies
- Exports detailed data for analysis
- Aligns with the PR objectives of adding financial metrics
The code is well-organized and provides good traceability through CSV exports.
`207-207`: Verify the start_time date. The start_time is set to October 2024, which is in the future. This might cause issues with retrieving data. Consider using a past date or making it configurable:
```diff
- start_time = utc_datetime(2024, 10, 28)
+ start_time = utc_datetime(2023, 10, 28)  # or use an environment variable
```
prediction_market_agent_tooling/markets/omen/omen_subgraph_handler.py (2)
`438-444`: LGTM: Block number parameter addition is well implemented. The addition of the optional `block_number` parameter and the query filter construction is clean and follows best practices. This change enables historical market data queries which will be useful for financial metrics calculations.
Line range hint `445-452`: LGTM: Proper validation of query results. The validation ensures that exactly one market is returned, which is crucial for maintaining data integrity when querying by market ID.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
tests_integration_with_local_chain/markets/omen/test_contracts.py (2)
`6-12`: LGTM on Web3 parameter addition, but test documentation could be improved. The addition of the `local_web3` parameter is a good practice for dependency injection. Consider adding a docstring to clarify the test's purpose and expectations.

```diff
+"""
+Test sDai asset balance retrieval functionality.
+
+Args:
+    local_web3 (Web3): Web3 instance for interacting with the blockchain
+
+Verifies that the asset token balance for a specific address is non-negative.
+"""
 def test_sdai_asset_balance_of(local_web3: Web3) -> None:
```
`9-9`: Consider extracting the hardcoded address to a constant or fixture. Using a hardcoded Ethereum address in tests can make maintenance difficult and obscures the test's intent. Consider moving it to a test constant or fixture with a descriptive name.

```diff
+# At the top of the file
+TEST_WALLET_ADDRESS = "0x7d3A0DA18e14CCb63375cdC250E8A8399997816F"
+
 def test_sdai_asset_balance_of(local_web3: Web3) -> None:
     assert (
         sDaiContract().get_asset_token_balance(
-            Web3.to_checksum_address("0x7d3A0DA18e14CCb63375cdC250E8A8399997816F"),
+            Web3.to_checksum_address(TEST_WALLET_ADDRESS),
             web3=local_web3
         )
         >= 0
     )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
- ape-config.yaml is excluded by !**/*.yaml

📒 Files selected for processing (1)
- tests_integration_with_local_chain/markets/omen/test_contracts.py (1 hunks)
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (6)
prediction_market_agent_tooling/markets/data_models.py (2)
`160-163`: Add documentation and validation for SharpeOutput. The class would benefit from documentation and validation to ensure financial metrics are within expected ranges. Consider applying these changes:
```diff
 class SharpeOutput(BaseModel):
+    """
+    Contains the results of Sharpe ratio calculations and related financial metrics.
+
+    Attributes:
+        annualized_volatility: Standard deviation of returns, annualized
+        mean_daily_return: Average daily return
+        annualized_sharpe_ratio: Risk-adjusted return metric, annualized
+    """
-    annualized_volatility: float
-    mean_daily_return: float
-    annualized_sharpe_ratio: float
+    annualized_volatility: Annotated[float, BeforeValidator(lambda x: x if x >= 0 else ValueError("Volatility must be non-negative"))]
+    mean_daily_return: float
+    annualized_sharpe_ratio: float
+
+    @computed_field
+    def risk_adjusted_return(self) -> float:
+        """Returns the absolute risk-adjusted return."""
+        return self.mean_daily_return * (1 - self.annualized_volatility)
```
`143-144`: Maintain consistent class spacing. According to PEP 8, classes should be separated by two blank lines. Currently, there are inconsistent gaps between classes. Apply this formatting:
```diff
-
+
+
 class SimulationDetail(BaseModel):
     # ...
-
+
+
 class SharpeOutput(BaseModel):
```

Also applies to: 158-159
examples/monitor/match_bets_with_langfuse_traces.py (4)
`91-112`: Consider adding error handling for historical data retrieval. The new profit calculation logic using historical market state is a good improvement for accuracy. However, the calls to external services (tx_block_cache and OmenSubgraphHandler) should include error handling. Consider wrapping external calls with try-except:
```diff
-    tx_block_number = tx_block_cache.get_block_number(actual_placed_bet.id)
-    market_at_block = OmenSubgraphHandler().get_omen_market_by_market_id(
-        HexAddress(HexStr(market.id)), block_number=tx_block_number
-    )
+    try:
+        tx_block_number = tx_block_cache.get_block_number(actual_placed_bet.id)
+        market_at_block = OmenSubgraphHandler().get_omen_market_by_market_id(
+            HexAddress(HexStr(market.id)), block_number=tx_block_number
+        )
+    except Exception as e:
+        logger.error(f"Failed to retrieve historical market state: {e}")
+        return None
```
`195-198`: Consider adding cache configuration options. The TransactionBlockCache initialization could benefit from configurable parameters such as cache size and expiration time to optimize memory usage and data freshness.
`277-291`: LGTM! Consider adding data validation. The SimulationDetail implementation provides comprehensive metrics and good data organization. Consider adding validation for numeric fields to ensure they're within expected ranges.
`300-306`: LGTM! Consider adding documentation for financial metrics. The addition of Sharpe ratio calculations enhances the strategy evaluation capabilities. Consider adding documentation that explains:
- The interpretation of these financial metrics
- Their significance in strategy evaluation
- The chosen parameters (e.g., risk-free rate) and their rationale
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- examples/monitor/financial_metrics.py (1 hunks)
- examples/monitor/match_bets_with_langfuse_traces.py (10 hunks)
- prediction_market_agent_tooling/markets/data_models.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- examples/monitor/financial_metrics.py
🔇 Additional comments (2)
examples/monitor/match_bets_with_langfuse_traces.py (2)
`29-32`: LGTM! The addition of the SimulationDetail import aligns with the PR's objective to enhance financial metrics.
Line range hint `129-182`: Verify if limiting active strategies is intentional. Most strategies are commented out, leaving only two KellyBettingStrategy configurations active. While this might be intentional for testing, consider:
- Adding a comment explaining why these specific configurations were chosen
- Whether these configurations provide sufficient coverage for testing the new financial metrics
```python
    profit = -buy_trade.amount.amount
else:
    # We use a historical state (by passing in a block_number as arg) to get the correct outcome token balances.
    tx_block_number = tx_block_cache.get_block_number(actual_placed_bet.id)
```
Oooh, so if we used Market at state of latest block, `get_buy_token_amount` produced rubbish? 😱
`get_buy_token_amount` is block-dependent, since the price of the token is dictated by the size of the `yes_outcome_pool_size` and `no_outcome_pool_size`.
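To make that concrete, a hedged sketch of fee-free constant-product pricing (the FPMM mechanics in simplified binary form; the function and argument names are illustrative, not the library's API):

```python
def buy_yes_tokens(bet: float, yes_pool: float, no_pool: float) -> float:
    # The bet enters the market, then the invariant yes_pool * no_pool = k
    # determines how many YES tokens leave the pool.
    k = yes_pool * no_pool
    return yes_pool + bet - k / (no_pool + bet)


# The same 10 xDAI bet at two different pool states (i.e. two different blocks):
print(buy_yes_tokens(10, yes_pool=100, no_pool=100))  # ~19.09 YES tokens
print(buy_yes_tokens(10, yes_pool=40, no_pool=250))   # ~11.54 YES tokens
```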
```python
    ).amount
    profit = (
        received_outcome_tokens - buy_trade.amount.amount
        if correct
```
This is duplicate of if-else above, right?
`if correct`
Actually this was correct, but readability was horrible - please see updated version
```python
    profit = (
        received_outcome_tokens - buy_trade.amount.amount
        if correct
        else -buy_trade.amount.amount
```
`else -buy_trade.amount.amount`
This is not needed - see refactoring I mentioned earlier
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
examples/monitor/match_bets_with_langfuse_traces.py (2)
Line range hint `125-178`: Clean up commented strategy configurations. Consider either:
- Removing the commented strategies if they're no longer needed
- Moving them to a configuration file if they're different testing scenarios
- Adding a comment explaining why certain strategies are commented out
`255-256`: Address the TODO comment about unredeemed tokens. The TODO suggests implementing portfolio tracking for unredeemed tokens. This could be important for accurate performance measurement.
Would you like me to help implement the portfolio tracking for unredeemed tokens or create a GitHub issue to track this enhancement?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- examples/monitor/match_bets_with_langfuse_traces.py (10 hunks)
🔇 Additional comments (5)
examples/monitor/match_bets_with_langfuse_traces.py (5)
`4-16`: Move environment variable handling to APIKeys as previously discussed. As noted in previous discussions, environment variables should be handled through the APIKeys system instead of direct dotenv loading. This includes both the `GNOSIS_RPC_URL` and Google API credentials.
`29-32`: LGTM! Good architectural improvement. Moving SimulationDetail to data_models promotes better code organization and reusability.
`91-108`: LGTM! Improved accuracy with historical market state. The new implementation correctly:
- Uses early returns for incorrect predictions
- Retrieves historical market state for accurate calculations
- Implements block caching for better performance
Line range hint `84-88`: Replace assert statements with explicit exception handling. Using assert statements for critical checks in production code can be risky, as they can be disabled when Python is run with optimization flags (-O).
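For example (illustrative, not the file's exact code):

```python
# Silently skipped under `python -O`:
assert trace is not None, "trace must exist"

# Survives optimization flags and raises a typed, catchable error:
if trace is None:
    raise ValueError("trace must exist")
```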
`296-301`: LGTM! Well-structured financial metrics implementation. The addition of Sharpe ratio calculations for both simulated and original strategies provides valuable performance comparison metrics.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (7)
examples/monitor/match_bets_with_langfuse_traces.py (2)
`255-256`: Address the TODO comment regarding unredeemed tokens. The TODO comment indicates a missing feature for tracking unredeemed tokens in the portfolio. This could lead to underestimating the actual portfolio value.
Would you like me to help implement portfolio tracking that includes unredeemed tokens? I can:
- Add a token balance tracker
- Calculate the current value of unredeemed tokens
- Include this in the portfolio valuation
`273-285`: Enhance simulation details with additional financial metrics. Consider adding these valuable metrics to the simulation details (a drawdown sketch follows the list):
- Maximum drawdown
- Win/Loss ratio
- Risk-adjusted return
- Daily/Monthly volatility
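As a rough sketch of the first of these, max drawdown could be derived from the daily balance series like so (`calculate_max_drawdown` is a hypothetical helper, like the `calc.*` methods in the diff below):

```python
import pandas as pd


def calculate_max_drawdown(balance: pd.Series) -> float:
    # Largest peak-to-trough drop as a fraction of the running peak.
    running_peak = balance.cummax()
    drawdowns = (balance - running_peak) / running_peak
    return float(drawdowns.min())


# calculate_max_drawdown(pd.Series([50, 60, 45, 70])) -> -0.25 (a 25% drawdown)
```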
```diff
 simulation_detail = SimulationDetail(
     strategy=repr(strategy),
     url=trace.market.url,
     market_p_yes=round(trace.market.current_p_yes, 4),
     agent_p_yes=round(trace.answer.p_yes, 4),
     agent_conf=round(trace.answer.confidence, 4),
     org_bet=round(bet.amount.amount, 4),
     sim_bet=round(simulated_outcome.size, 4),
     org_dir=bet.outcome,
     sim_dir=simulated_outcome.direction,
     org_profit=round(bet.profit.amount, 4),
     sim_profit=round(simulated_outcome.profit, 4),
+    max_drawdown=calc.calculate_max_drawdown(),
+    win_loss_ratio=calc.calculate_win_loss_ratio(),
+    risk_adjusted_return=calc.calculate_risk_adjusted_return(),
+    daily_volatility=calc.calculate_daily_volatility(),
     timestamp=bet_with_trace.trace.timestamp_datetime,
 )
```

prediction_market_agent_tooling/monitor/financial_metrics/financial_metrics.py (5)
`15-15`: Add validation when creating DataFrame from SimulationDetails. When converting `SimulationDetail` objects into a DataFrame, consider adding validation to ensure that each object contains the expected data, and handle any exceptions that might occur during `d.model_dump()`.
`18-23`: Consider using a single underscore for private method names instead of double underscores. The method `__has_df_valid_columns_else_exception` uses a double underscore prefix, which triggers name mangling in Python. Unless name mangling is specifically required, it's conventional to use a single underscore prefix for private methods. Apply this diff to rename the method:
```diff
-def __has_df_valid_columns_else_exception(
+def _has_df_valid_columns_else_exception(
```

And update all calls to this method accordingly.
`34-35`: Clarify the addition of 50 to cumulative profit. Line 35 adds 50 to the cumulative profit: `df["profit_cumsum"] = df["profit_cumsum"] + 50`. It's unclear why this specific value is added. Consider parameterizing this initial balance or documenting its purpose for better clarity.
`39-40`: Review necessity and clarity of comments. The comments on lines 39-40 describe the code that follows. Ensure they provide additional value and accurately explain the logic. If they do not add clarity, consider refining or removing them to improve code readability.
`56-56`: Verify the annualization factor in volatility and Sharpe ratio calculations. You're using `np.sqrt(365)` to annualize daily volatility and the Sharpe ratio. Verify whether 365 is the appropriate number of periods, considering whether your data includes weekends or only trading days. Using `np.sqrt(252)` (the approximate number of trading days in a year) may be more appropriate for financial calculations. Also applies to: 61-61
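For reference, the relationship under discussion, with $N$ the number of return periods per year (252 trading days vs. 365 calendar days):

$$\sigma_{\text{annual}} = \sigma_{\text{daily}}\sqrt{N}, \qquad \text{Sharpe}_{\text{annual}} = \sqrt{N}\,\frac{\bar r_{\text{daily}} - r_{f,\text{daily}}}{\sigma_{\text{daily}}}$$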
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
- examples/monitor/match_bets_with_langfuse_traces.py (10 hunks)
- prediction_market_agent_tooling/monitor/financial_metrics/financial_metrics.py (1 hunks)
🔇 Additional comments (3)
examples/monitor/match_bets_with_langfuse_traces.py (2)
`4-4`: Move environment variable handling to APIKeys as previously discussed. As per previous discussions, environment variables should be handled through the APIKeys system instead of direct dotenv loading. This includes both the `GNOSIS_RPC_URL` and Google API credentials.
Also applies to: 16-16
`191-193`: Verify memory usage of TransactionBlockCache during long simulations. The transaction block cache stores block numbers for all transactions, which could consume significant memory during long simulations with many transactions.
prediction_market_agent_tooling/monitor/financial_metrics/financial_metrics.py (1)
`42-45`: Ensure correct handling of NaN values in returns calculation. After computing percentage change with `pct_change()`, the first value in `wallet_balance_daily_df["returns"]` will be `NaN`. The subsequent `dropna(inplace=True)` removes this first row, potentially reducing your data points. Consider whether dropping the first `NaN` return is acceptable or if it needs to be handled differently, especially with small datasets.
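A quick standalone illustration of that first-row `NaN`:

```python
import pandas as pd

balances = pd.Series([50.0, 51.0, 54.0])
print(balances.pct_change().tolist())  # [nan, 0.02, 0.0588...]
# dropna() then discards that first row, shrinking an already small sample.
```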
```python
    mean_daily_return - self.risk_free_rate
) / daily_volatility
```
Adjust risk-free rate to match the frequency of returns.
Ensure that the `risk_free_rate` used in calculating `daily_sharpe_ratio` is expressed on a daily basis to match `mean_daily_return`. If `risk_free_rate` is an annual rate, consider dividing it by the appropriate number of periods (e.g., 365) to convert it to a daily rate.
Apply this diff:
```diff
- daily_sharpe_ratio = (
-     mean_daily_return - self.risk_free_rate
- ) / daily_volatility
+ daily_risk_free_rate = self.risk_free_rate / 365
+ daily_sharpe_ratio = (
+     mean_daily_return - daily_risk_free_rate
+ ) / daily_volatility
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
daily_risk_free_rate = self.risk_free_rate / 365
daily_sharpe_ratio = (
    mean_daily_return - daily_risk_free_rate
) / daily_volatility
```
(#552)
* Lower amount of DB connections (#547)
* Single shared engine instance
* fixes
* revert accidental bump
* fix pickle in github ci
* fiiiix
* fix loguru
* revert
* ah
* ah2
* revert
* Added gnosis_rpc_config, google_credentials_filename and chain_id to BaseSettings
* Fixed black
* Fixed black (2) | changed rpc
* Raise OutOfFunds in withdraw_wxdai_to_xdai_to_keep_balance (#553)
* Implemented PR comments
* Fixed CI
* Merged
---------
Co-authored-by: Peter Jung <[email protected]>
…tegy-backtests-add-financial-metrics
Actionable comments posted: 7
🧹 Outside diff range and nitpick comments (14)
prediction_market_agent_tooling/tools/transaction_cache.py (1)
`1-48`: Consider production readiness aspects. A few architectural considerations for production deployment:
Cache persistence:
- Consider implementing cache warmup strategies
- Add cache statistics monitoring
- Implement cache cleanup policies
Memory management:
- Monitor cache size growth
- Implement cache eviction policies
- Consider using shared cache for multiple instances
Concurrent access:
- Consider using Redis or similar for distributed caching
- Implement proper locking mechanisms
- Add metrics for cache hit/miss rates
Would you like assistance in implementing any of these production-ready features?
scripts/create_safe_for_agent.py (1)
`39-40`: Consider improving error handling and performance. The current implementation could be enhanced in several ways:
- Add error handling for RPCConfig instantiation
- Validate the RPC URL format
- Consider moving RPCConfig instantiation outside the function to avoid repeated instantiation
Consider applying this improvement:
```diff
+from urllib.parse import urlparse
+
+def validate_rpc_url(url: str) -> bool:
+    try:
+        result = urlparse(url)
+        return all([result.scheme, result.netloc])
+    except ValueError:
+        return False
+
+# Get config once at module level
+_rpc_config = RPCConfig()
+
 def create_safe_for_agent(
     from_private_key: str = typer.Option(),
     rpc_url: str | None = None,
     salt_nonce: int | None = None,
     fund_safe: bool = True,
     fund_amount_xdai: int = 1,
 ) -> None:
     salt_nonce = salt_nonce or secrets.randbits(256)
-    rpc_url = rpc_url if rpc_url else RPCConfig().gnosis_rpc_url
+    try:
+        rpc_url = rpc_url if rpc_url else _rpc_config.gnosis_rpc_url
+        if not validate_rpc_url(rpc_url):
+            raise ValueError(f"Invalid RPC URL format: {rpc_url}")
+    except Exception as e:
+        logger.error(f"Failed to configure RPC URL: {e}")
+        return
     ethereum_client = EthereumClient(URI(rpc_url))
```

prediction_market_agent_tooling/tools/utils.py (2)
Line range hint `236-252`: Consider enhancing numerical stability of `calculate_sell_amount_in_collateral`. The Newton's method implementation could benefit from additional parameters to control convergence and handle edge cases:
- The hardcoded multiplier (0.999999) for rounding errors could be made configurable
- Adding max iterations and tolerance parameters would improve robustness
Consider refactoring like this:
```diff
 def calculate_sell_amount_in_collateral(
     shares_to_sell: float,
     holdings: float,
     other_holdings: float,
     fees: MarketFees,
+    rounding_factor: float = 0.999999,
+    max_iterations: int = 100,
+    tolerance: float = 1e-10,
 ) -> float:
     """
     Computes the amount of collateral that needs to be sold to get `shares`
     amount of shares. Returns None if the amount can't be computed.

     Taken from https://github.com/protofire/omen-exchange/blob/29d0ab16bdafa5cc0d37933c1c7608a055400c73/app/src/util/tools/fpmm/trading/index.ts#L99
     Simplified for binary markets.
+
+    Args:
+        shares_to_sell: Number of shares to sell
+        holdings: Current holdings
+        other_holdings: Other holdings
+        fees: Market fees configuration
+        rounding_factor: Factor to avoid rounding errors (default: 0.999999)
+        max_iterations: Maximum number of iterations for Newton's method
+        tolerance: Convergence tolerance for Newton's method
     """
     for v in [shares_to_sell, holdings, other_holdings]:
         if v <= 0:
             raise ValueError("All share args must be greater than 0")

     def f(r: float) -> float:
         R = (r + fees.absolute) / (1 - fees.bet_proportion)
         first_term = other_holdings - R
         second_term = holdings + shares_to_sell - R
         third_term = holdings * other_holdings
         return (first_term * second_term) - third_term

-    amount_to_sell = newton(f, 0)
-    return float(amount_to_sell) * 0.999999  # Avoid rounding errors
+    amount_to_sell = newton(f, 0, maxiter=max_iterations, tol=tolerance)
+    return float(amount_to_sell) * rounding_factor
```
Line range hint `1-252`: Consider adding financial metric utilities. Given the PR's objective to add financial metrics (Sharpe ratio, mean daily return, annual volatility), this utility file could be a good place to add these calculations, promoting code reuse. Would you like me to help implement these financial metric utility functions? They could include:
- Sharpe ratio calculation
- Mean daily return calculation
- Annual volatility calculation
prediction_market_agent_tooling/config.py (2)
`210-214`: Add class documentation. Please add docstring documentation to explain:
- The purpose of this configuration class
- How it integrates with the rest of the system
- The significance of the default values
Example:
```python
class RPCConfig(BaseSettings):
    """Configuration for Gnosis Chain RPC connection.

    This class manages RPC URL and chain ID configuration for connecting
    to the Gnosis Chain. Default values are set for the public RPC endpoint,
    but can be overridden via environment variables for private/custom endpoints.
    """
```
`229-234`: Enhance CloudCredentials implementation.
- Add class documentation to explain its purpose and usage:
```python
class CloudCredentials(BaseSettings):
    """Configuration for cloud service credentials.

    Currently supports Google Cloud Platform authentication via application
    credentials file. The credentials file path can be specified via the
    GOOGLE_APPLICATION_CREDENTIALS environment variable.
    """
```
- Consider adding path validation for the credentials file when it's provided:
```python
@property
def google_credentials_path(self) -> t.Optional[str]:
    if self.GOOGLE_APPLICATION_CREDENTIALS:
        path = Path(self.GOOGLE_APPLICATION_CREDENTIALS)
        if not path.exists():
            raise ValueError(f"Credentials file not found: {path}")
        return str(path)
    return None
```

examples/monitor/match_bets_with_langfuse_traces.py (3)
Line range hint `125-178`: Document the rationale for commented-out strategies. A large number of strategies are commented out. While this might be intentional for testing, it would be helpful to document why only specific strategies are enabled. Add a comment explaining the strategy selection:
```diff
+ # Note: Only using selected Kelly strategies for initial testing.
+ # Other strategies are commented out to reduce testing time and focus on core functionality.
  strategies = [
```
`255-256`: Consider implementing portfolio tracking. The TODO comment suggests a valuable enhancement for tracking unredeemed tokens. Would you like me to help implement portfolio tracking or create a GitHub issue to track this enhancement? This would involve:
- Tracking unredeemed tokens
- Calculating their current value
- Including them in the total portfolio value
`289-293`: Consider adding data validation before CSV export. The CSV export could benefit from data validation to ensure all required fields are present and in the correct format. Add validation before export:
```diff
+    # Validate data before export
+    if details_df.empty:
+        print(f"Warning: No data to export for {agent_name} with strategy {strategy}")
+        continue
+
+    # Ensure all required columns are present
+    required_columns = {'strategy', 'url', 'market_p_yes', 'sim_profit'}
+    missing_columns = required_columns - set(details_df.columns)
+    if missing_columns:
+        raise ValueError(f"Missing required columns: {missing_columns}")
+
     details_df.to_csv(
         output_directory / f"{agent_name} - {strategy} - all bets.csv",
         index=False,
     )
```

tests_integration_with_local_chain/markets/omen/test_omen.py (1)
Line range hint `1-600`: Consider improving test suite maintainability. The test file shows several areas that could benefit from improvements:
- Multiple skipped tests with TODOs indicate technical debt
- Hardcoded addresses and mock data could make tests brittle
- Some tests lack proper isolation from external dependencies
Consider:
- Creating a comprehensive test fixture for local chain setup
- Using dynamic test data generation instead of hardcoded values
- Implementing proper test isolation patterns
Would you like help creating:
- A reusable fixture for local chain setup?
- Helper functions for dynamic test data generation?
- Patterns for better test isolation?
prediction_market_agent_tooling/tools/contract.py (1)
`390-391`: Consider lazy initialization of RPCConfig. While moving to configurable chain settings is good, instantiating `RPCConfig` at class variable definition time could have implications:
- Performance impact due to early initialization
- Potential issues in testing scenarios where you might want to override these values
- No error handling for RPCConfig initialization failures
Consider using a property decorator or lazy initialization pattern:
```diff
- CHAIN_ID = RPCConfig().chain_id
- CHAIN_RPC_URL = RPCConfig().gnosis_rpc_url
+ _rpc_config = None
+
+ @classmethod
+ def _get_rpc_config(cls):
+     if cls._rpc_config is None:
+         cls._rpc_config = RPCConfig()
+     return cls._rpc_config
+
+ @classmethod
+ @property
+ def CHAIN_ID(cls):
+     return cls._get_rpc_config().chain_id
+
+ @classmethod
+ @property
+ def CHAIN_RPC_URL(cls):
+     return cls._get_rpc_config().gnosis_rpc_url
```

This approach provides:
- Lazy initialization
- Better error handling opportunities
- Easier testing through mocking
- Single instance reuse
prediction_market_agent_tooling/tools/google_utils.py (3)
`62-62`: Avoid hardcoding the default `project_id`. For better maintainability and flexibility, consider retrieving the default `project_id` from a configuration file or environment variable instead of hardcoding it. Apply this change:
```diff
-def get_private_key_from_gcp_secret(
-    secret_id: str,
-    project_id: str = "582587111398",  # Gnosis AI default project_id
-    version_id: str = "latest",
-) -> PrivateKey:
+def get_private_key_from_gcp_secret(
+    secret_id: str,
+    project_id: str = CloudCredentials().GCP_PROJECT_ID,  # Retrieve from configuration
+    version_id: str = "latest",
+) -> PrivateKey:
```

Ensure that `GCP_PROJECT_ID` is defined in your `CloudCredentials` configuration.
`71-71`: Remove unnecessary type ignore comment. With the corrected method name, the mypy type-checking issue should be resolved. You can remove the `# type: ignore [call-arg]` comment. Apply this change:
```diff
-client = secretmanager.SecretManagerServiceClient.from_service_account_file(google_application_credentials_filename)  # type: ignore [call-arg]
+client = secretmanager.SecretManagerServiceClient.from_service_account_file(google_application_credentials_filename)
```
`60-80`: Consider adding unit tests for `get_private_key_from_gcp_secret`. Adding unit tests will help verify the correctness of the new function and ensure it behaves as expected in different scenarios.
Would you like assistance in creating unit tests for this function?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
- pyproject.toml is excluded by !**/*.toml

📒 Files selected for processing (11)
- examples/monitor/match_bets_with_langfuse_traces.py (10 hunks)
- prediction_market_agent_tooling/config.py (2 hunks)
- prediction_market_agent_tooling/markets/omen/data_models.py (2 hunks)
- prediction_market_agent_tooling/markets/polymarket/utils.py (1 hunks)
- prediction_market_agent_tooling/tools/contract.py (2 hunks)
- prediction_market_agent_tooling/tools/gnosis_rpc.py (0 hunks)
- prediction_market_agent_tooling/tools/google_utils.py (2 hunks)
- prediction_market_agent_tooling/tools/transaction_cache.py (1 hunks)
- prediction_market_agent_tooling/tools/utils.py (1 hunks)
- scripts/create_safe_for_agent.py (2 hunks)
- tests_integration_with_local_chain/markets/omen/test_omen.py (1 hunks)
💤 Files with no reviewable changes (1)
- prediction_market_agent_tooling/tools/gnosis_rpc.py
✅ Files skipped from review due to trivial changes (1)
- prediction_market_agent_tooling/markets/polymarket/utils.py
🚧 Files skipped from review as they are similar to previous changes (1)
- prediction_market_agent_tooling/markets/omen/data_models.py
🔇 Additional comments (12)
scripts/create_safe_for_agent.py (2)
`10-10`: LGTM! Good refactoring of configuration management. The change from hardcoded RPC URL to a centralized configuration class improves maintainability and flexibility.
Line range hint `42-58`: Enhance transaction safety checks and error handling.
The code should include additional safety checks for financial operations:
- Verify sufficient balance before sending funds
- Add timeout handling for Safe deployment
- Verify transaction success for fund transfer
Let's check if similar safety measures exist in other files:
Consider implementing these safety checks:
```python
def verify_balance(web3: Web3, from_address: str, amount: int) -> bool:
    balance = web3.eth.get_balance(from_address)
    return balance >= amount

# In create_safe_for_agent:
if fund_safe:
    value = xdai_to_wei(xDai(fund_amount_xdai))
    if not verify_balance(ethereum_client.w3, account.address, value):
        logger.error("Insufficient balance for funding. Aborting.")
        return
    tx_hash = send_xdai_to(...)
    # Wait for transaction confirmation
    try:
        receipt = ethereum_client.w3.eth.wait_for_transaction_receipt(
            tx_hash, timeout=120
        )
        if not receipt.status:
            logger.error("Fund transfer failed. Aborting.")
            return
    except Exception as e:
        logger.error(f"Transaction failed: {e}")
        return
```
prediction_market_agent_tooling/tools/utils.py (1)
`12-14`: LGTM! Verify GCP secret management migration. The import changes are well-organized and align with the remaining functionality. The removal of GCP-related imports is consistent with the migration of secret management functionality to `google_utils.py`.
Let's verify that all GCP secret management code has been properly migrated:
✅ Verification successful
Let me gather more context about the usage of these GCP functions to ensure complete verification:
Based on the gathered information, I can now provide a final response:
GCP secret management migration verified with some concerns
The GCP secret management code has been properly migrated from `utils.py` to `google_utils.py`, but there are two additional files that still use GCP secret manager functionality:
- `deploy/gcp/utils.py`: Contains GCP-specific deployment utilities and appropriately uses SecretManagerServiceClient
- `examples/monitor/match_bets_with_langfuse_traces.py`: Uses the migrated `get_private_key_from_gcp_secret` from `google_utils.py`

The changes are consistent with the codebase organization - GCP-related functionality is now properly isolated in dedicated modules rather than in general utilities.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Description: Verify GCP secret management migration
# Check for any remaining GCP secret manager references
rg -l "google\.cloud\.secretmanager|get_private_key_from_gcp_secret"

# Check if the functionality exists in google_utils.py
fd -g "google_utils.py" -x cat {} \; | grep -A 10 "get_private_key_from_gcp_secret"
```
Length of output: 846
Script:
```bash
#!/bin/bash
# Check for any imports or usage of google_utils in the codebase
rg -l "from prediction_market_agent_tooling.tools.google_utils import|import prediction_market_agent_tooling.tools.google_utils"

# Check the content of the files that still reference GCP secret manager
cat prediction_market_agent_tooling/deploy/gcp/utils.py
cat examples/monitor/match_bets_with_langfuse_traces.py
```
Length of output: 22223
prediction_market_agent_tooling/config.py (2)
`3-3`: LGTM: Import changes are well-organized. The new imports for `Field` and `ChainID` are properly organized and support the new configuration classes.
Also applies to: 11-11
`210-234`: Verify necessity of configuration changes. These configuration classes (`RPCConfig` and `CloudCredentials`) seem peripheral to the PR's main objective of adding financial metrics for strategy backtesting. Please clarify:
- How these configuration changes support the financial metrics implementation
- Whether these changes could be split into a separate infrastructure PR
examples/monitor/match_bets_with_langfuse_traces.py (5)
`5-5`: LGTM! Well-structured imports for new functionality. The new imports are logically grouped and support the enhanced functionality for:
- Ethereum address handling (`HexAddress`, `HexStr`)
- Financial metrics (`SharpeRatioCalculator`)
- Transaction caching (`TransactionBlockCache`)
Also applies to: 17-33, 41-42
63-64
: LGTM! Function signature enhancement for historical data.
The addition of actual_placed_bet
and tx_block_cache
parameters enables accurate profit calculation using historical market states.
191-193: LGTM! Efficient transaction block caching.
The TransactionBlockCache initialization is well-placed and will help improve performance by caching block numbers.
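For orientation, a hypothetical usage sketch, assuming the TransactionBlockCache interface quoted in the excerpts below; the RPC endpoint and transaction hash are placeholders:

from web3 import Web3

from examples.monitor.transaction_cache import TransactionBlockCache  # assumed module path

web3 = Web3(Web3.HTTPProvider("https://rpc.gnosischain.com"))  # placeholder endpoint
tx_block_cache = TransactionBlockCache(web3=web3)

# The first call hits the chain; repeated lookups are served from the on-disk cache.
block_number = tx_block_cache.get_block_number("0x...")  # bet transaction hash
block_timestamp = tx_block_cache.get_block_timestamp(block_number)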
273-301: LGTM! Comprehensive financial metrics calculation.
The addition of SimulationDetail and the Sharpe ratio calculations provides valuable insight into strategy performance. The implementation correctly handles both simulated and original strategy metrics.
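For intuition, a minimal sketch of the aggregation step such metrics rest on, assuming a simulation-detail DataFrame with timestamp and sim_profit columns; the starting balance is illustrative:

import pandas as pd

def wallet_daily_balance(df: pd.DataFrame, start_balance: float = 50.0) -> pd.Series:
    # Sum per-bet simulated profits into calendar days, then cumulate into a balance curve.
    profits = df.assign(timestamp=pd.to_datetime(df["timestamp"])).set_index("timestamp")
    daily_profit = profits["sim_profit"].resample("D").sum()
    return start_balance + daily_profit.cumsum()

The metrics sketch at the end of this thread builds on a daily series like this one.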
91-108: Consider adding error handling for market state retrieval.
The profit calculation relies on retrieving historical market state, but there's no error handling for potential failures.
Apply this diff to add error handling:
if not correct:
profit = -buy_trade.amount.amount
else:
- tx_block_number = tx_block_cache.get_block_number(actual_placed_bet.id)
- market_at_block = OmenSubgraphHandler().get_omen_market_by_market_id(
- HexAddress(HexStr(market.id)), block_number=tx_block_number
- )
- omen_agent_market_at_block = OmenAgentMarket.from_data_model(market_at_block)
+ try:
+ tx_block_number = tx_block_cache.get_block_number(actual_placed_bet.id)
+ if tx_block_number is None:
+ raise ValueError(f"Block number not found for transaction {actual_placed_bet.id}")
+
+ market_at_block = OmenSubgraphHandler().get_omen_market_by_market_id(
+ HexAddress(HexStr(market.id)), block_number=tx_block_number
+ )
+ if market_at_block is None:
+ raise ValueError(f"Market state not found at block {tx_block_number}")
+
+ omen_agent_market_at_block = OmenAgentMarket.from_data_model(market_at_block)
+ received_outcome_tokens = omen_agent_market_at_block.get_buy_token_amount(
+ bet_amount=omen_agent_market_at_block.get_bet_amount(
+ buy_trade.amount.amount
+ ),
+ direction=buy_trade.outcome,
+ ).amount
+ profit = received_outcome_tokens - buy_trade.amount.amount
+ except Exception as e:
+ logger.error(f"Failed to calculate profit: {e}")
+ # Fallback to simple loss calculation
+ profit = -buy_trade.amount.amount
Likely invalid or redundant comment.
prediction_market_agent_tooling/tools/contract.py (1)
11-11: LGTM!
The addition of the RPCConfig import alongside APIKeys is clean and follows Python's import style guidelines.
prediction_market_agent_tooling/tools/google_utils.py (1)
6-8: Imports are appropriate for the new functionality
The added imports are necessary for the implementation of get_private_key_from_gcp_secret and enhance the module's capabilities.
def get_block_number(self, tx_hash: str) -> int:
    if tx_hash in self.block_number_cache:
        return int(self.block_number_cache[tx_hash])

    block_number = self.fetch_block_number(tx_hash)
    self.block_number_cache[tx_hash] = block_number
    return block_number

def get_block_timestamp(self, block_number: int) -> int:
    if block_number in self.block_timestamp_cache:
        return int(self.block_timestamp_cache[block_number])

    block_timestamp = self.fetch_block_timestamp(block_number)
    self.block_timestamp_cache[block_number] = block_timestamp
    return block_timestamp
🛠️ Refactor suggestion
Enhance cache operations with validation and error handling
The cache getter methods could benefit from:
- Input validation
- Error handling for cache operations
- Race condition prevention
- Documentation of caching behavior
Here's a suggested improvement:
def get_block_number(self, tx_hash: str) -> int:
+ """Get block number from cache or fetch from chain.
+
+ Args:
+ tx_hash: Ethereum transaction hash (with or without '0x' prefix)
+
+ Returns:
+ Block number for the transaction
+
+ Raises:
+ ValueError: If tx_hash is invalid
+ """
+ # Validate transaction hash
+ if not tx_hash or not isinstance(tx_hash, str):
+ raise ValueError("Invalid transaction hash")
+
+ # Normalize transaction hash
+ tx_hash = tx_hash.lower()
+ if not tx_hash.startswith("0x"):
+ tx_hash = f"0x{tx_hash}"
+
if tx_hash in self.block_number_cache:
- return int(self.block_number_cache[tx_hash])
+ try:
+ return int(self.block_number_cache[tx_hash])
+ except (ValueError, TypeError) as e:
+ logger.warning(f"Invalid cache value for {tx_hash}: {e}")
+ # Fall through to fetch
block_number = self.fetch_block_number(tx_hash)
- self.block_number_cache[tx_hash] = block_number
+ try:
+ self.block_number_cache.set(tx_hash, block_number, retry=True)
+ except Exception as e:
+ logger.error(f"Failed to cache block number for {tx_hash}: {e}")
return block_number
def get_block_timestamp(self, block_number: int) -> int:
+ """Get block timestamp from cache or fetch from chain.
+
+ Args:
+ block_number: Ethereum block number
+
+ Returns:
+ Unix timestamp for the block
+
+ Raises:
+ ValueError: If block_number is invalid
+ """
+ # Validate block number
+ if not isinstance(block_number, int) or block_number < 0:
+ raise ValueError("Invalid block number")
+
if block_number in self.block_timestamp_cache:
- return int(self.block_timestamp_cache[block_number])
+ try:
+ return int(self.block_timestamp_cache[block_number])
+ except (ValueError, TypeError) as e:
+ logger.warning(f"Invalid cache value for block {block_number}: {e}")
+ # Fall through to fetch
block_timestamp = self.fetch_block_timestamp(block_number)
- self.block_timestamp_cache[block_number] = block_timestamp
+ try:
+ self.block_timestamp_cache.set(block_number, block_timestamp, retry=True)
+ except Exception as e:
+ logger.error(f"Failed to cache timestamp for block {block_number}: {e}")
return block_timestamp
Committable suggestion skipped: line range outside the PR's diff.
def __init__(self, web3: Web3):
    self.block_number_cache = dc.Cache("block_cache_dir")
    self.block_timestamp_cache = dc.Cache("timestamp_cache_dir")
    self.web3 = web3
🛠️ Refactor suggestion
Enhance cache configuration and type safety
The cache initialization could be improved in several ways:
- Cache directories should be configurable rather than hardcoded
- Consider adding cache size limits and expiration policies
- Add type hints for better code maintainability
Here's a suggested improvement:
- def __init__(self, web3: Web3):
- self.block_number_cache = dc.Cache("block_cache_dir")
- self.block_timestamp_cache = dc.Cache("timestamp_cache_dir")
- self.web3 = web3
+ def __init__(
+ self,
+ web3: Web3,
+ cache_dir: str = "cache",
+ size_limit: int = int(1e9), # 1GB default
+ ttl: int = 60 * 60 * 24, # 24 hours default
+ ) -> None:
+ """Initialize transaction cache with configurable parameters.
+
+ Args:
+ web3: Web3 instance for blockchain interaction
+ cache_dir: Base directory for cache storage
+ size_limit: Maximum cache size in bytes
+ ttl: Time-to-live for cache entries in seconds
+ """
+ self.block_number_cache = dc.Cache(
+ directory=f"{cache_dir}/block_numbers",
+ size_limit=size_limit,
+ ttl=ttl,
+ )
+ self.block_timestamp_cache = dc.Cache(
+ directory=f"{cache_dir}/timestamps",
+ size_limit=size_limit,
+ ttl=ttl,
+ )
+ self.web3 = web3
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def __init__(
    self,
    web3: Web3,
    cache_dir: str = "cache",
    size_limit: int = int(1e9),  # 1GB default
    ttl: int = 60 * 60 * 24,  # 24 hours default
) -> None:
    """Initialize transaction cache with configurable parameters.

    Args:
        web3: Web3 instance for blockchain interaction
        cache_dir: Base directory for cache storage
        size_limit: Maximum cache size in bytes
        ttl: Time-to-live for cache entries in seconds
    """
    self.block_number_cache = dc.Cache(
        directory=f"{cache_dir}/block_numbers",
        size_limit=size_limit,
        ttl=ttl,
    )
    self.block_timestamp_cache = dc.Cache(
        directory=f"{cache_dir}/timestamps",
        size_limit=size_limit,
        ttl=ttl,
    )
    self.web3 = web3
@tenacity.retry(
    wait=wait_exponential(multiplier=1, min=1, max=4),
    stop=tenacity.stop_after_attempt(7),
    after=lambda x: logger.debug(f"fetch tx failed, {x.attempt_number=}."),
)
def fetch_block_number(self, transaction_hash: str) -> int:
    tx = self.web3.eth.get_transaction(HexStr(transaction_hash))
    return tx["blockNumber"]

@tenacity.retry(
    wait=wait_exponential(multiplier=1, min=1, max=4),
    stop=tenacity.stop_after_attempt(7),
    after=lambda x: logger.debug(f"fetch tx failed, {x.attempt_number=}."),
)
def fetch_block_timestamp(self, block_number: int) -> int:
    block = self.web3.eth.get_block(block_number)
    return block["timestamp"]
🛠️ Refactor suggestion
Improve retry logic and error handling
The retry-decorated methods could benefit from:
- Extracting common retry configuration to avoid duplication
- Adding specific exception handling for Ethereum-related errors
- Enhancing debug logging with error details
- Adding docstrings to explain retry behavior
Here's a suggested improvement:
+ # Define retry configuration once
+ RETRY_CONFIG = dict(
+ wait=wait_exponential(multiplier=1, min=1, max=4),
+ stop=tenacity.stop_after_attempt(7),
+ retry=tenacity.retry_if_exception_type(
+ (web3.exceptions.TimeExhausted, web3.exceptions.TransactionNotFound)
+ ),
+ after=lambda retry_state: logger.debug(
+ f"Fetch failed: {retry_state.outcome.exception()}. "
+ f"Attempt {retry_state.attempt_number}/7"
+ ),
+ )
+
- @tenacity.retry(
- wait=wait_exponential(multiplier=1, min=1, max=4),
- stop=tenacity.stop_after_attempt(7),
- after=lambda x: logger.debug(f"fetch tx failed, {x.attempt_number=}."),
- )
+ @tenacity.retry(**RETRY_CONFIG)
def fetch_block_number(self, transaction_hash: str) -> int:
+ """Fetch block number for transaction with retry logic.
+
+ Args:
+ transaction_hash: Ethereum transaction hash
+
+ Returns:
+ Block number for the transaction
+
+ Raises:
+ web3.exceptions.TransactionNotFound: If transaction doesn't exist
+ tenacity.RetryError: If all retry attempts fail
+ """
tx = self.web3.eth.get_transaction(HexStr(transaction_hash))
return tx["blockNumber"]
- @tenacity.retry(
- wait=wait_exponential(multiplier=1, min=1, max=4),
- stop=tenacity.stop_after_attempt(7),
- after=lambda x: logger.debug(f"fetch tx failed, {x.attempt_number=}."),
- )
+ @tenacity.retry(**RETRY_CONFIG)
def fetch_block_timestamp(self, block_number: int) -> int:
+ """Fetch block timestamp with retry logic.
+
+ Args:
+ block_number: Ethereum block number
+
+ Returns:
+ Unix timestamp for the block
+
+ Raises:
+ web3.exceptions.BlockNotFound: If block doesn't exist
+ tenacity.RetryError: If all retry attempts fail
+ """
block = self.web3.eth.get_block(block_number)
return block["timestamp"]
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# Define retry configuration once
RETRY_CONFIG = dict(
    wait=wait_exponential(multiplier=1, min=1, max=4),
    stop=tenacity.stop_after_attempt(7),
    retry=tenacity.retry_if_exception_type(
        (web3.exceptions.TimeExhausted, web3.exceptions.TransactionNotFound)
    ),
    after=lambda retry_state: logger.debug(
        f"Fetch failed: {retry_state.outcome.exception()}. "
        f"Attempt {retry_state.attempt_number}/7"
    ),
)

@tenacity.retry(**RETRY_CONFIG)
def fetch_block_number(self, transaction_hash: str) -> int:
    """Fetch block number for transaction with retry logic.

    Args:
        transaction_hash: Ethereum transaction hash

    Returns:
        Block number for the transaction

    Raises:
        web3.exceptions.TransactionNotFound: If transaction doesn't exist
        tenacity.RetryError: If all retry attempts fail
    """
    tx = self.web3.eth.get_transaction(HexStr(transaction_hash))
    return tx["blockNumber"]

@tenacity.retry(**RETRY_CONFIG)
def fetch_block_timestamp(self, block_number: int) -> int:
    """Fetch block timestamp with retry logic.

    Args:
        block_number: Ethereum block number

    Returns:
        Unix timestamp for the block

    Raises:
        web3.exceptions.BlockNotFound: If block doesn't exist
        tenacity.RetryError: If all retry attempts fail
    """
    block = self.web3.eth.get_block(block_number)
    return block["timestamp"]
@property
def gnosis_rpc_url(self) -> str:
    return check_not_none(
        self.GNOSIS_RPC_URL, "GNOSIS_RPC_URL missing in the environment."
    )

@property
def chain_id(self) -> ChainID:
    return check_not_none(self.CHAIN_ID, "CHAIN_ID missing in the environment.")
🛠️ Refactor suggestion
Remove redundant null checks in properties
The properties gnosis_rpc_url and chain_id use check_not_none, but the fields already have default values, making these checks redundant. Consider simplifying:
@property
def gnosis_rpc_url(self) -> str:
- return check_not_none(
- self.GNOSIS_RPC_URL, "GNOSIS_RPC_URL missing in the environment."
- )
+ return self.GNOSIS_RPC_URL
@property
def chain_id(self) -> ChainID:
- return check_not_none(self.CHAIN_ID, "CHAIN_ID missing in the environment.")
+ return self.CHAIN_ID
Committable suggestion skipped: line range outside the PR's diff.
@@ -410,7 +410,7 @@ def test_place_bet_with_autodeposit(
         local_web3
     )
     assert (
-        collateral_token_contract.symbol() == expected_symbol
+        collateral_token_contract.symbol(local_web3) == expected_symbol
💡 Codebase verification
Inconsistent usage of symbol() method found in test file
There are two instances of symbol() calls in tests_integration_with_local_chain/markets/omen/test_omen.py:
- Line 413: collateral_token_contract.symbol(local_web3) (with parameter)
- Another instance: collateral_token_contract.symbol() (without parameter)
The contract implementation in prediction_market_agent_tooling/tools/contract.py shows that the symbol() method accepts an optional web3 parameter, so both calls should be consistent and include the local_web3 parameter.
🔗 Analysis chain
Verify all symbol() calls are consistently updated across the codebase
The addition of the local_web3 parameter to the symbol() call is correct. However, we should ensure all other similar calls are updated consistently.
Let's verify other potential instances that might need updating:
Let me search for more context around symbol calls using a different pattern to ensure we catch all variations:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find other symbol() calls that might need the web3 parameter
# Look for potential symbol() calls in test files and contract files
rg -g '*.py' 'symbol\(\s*\)' --context 2
Length of output: 504
Script:
#!/bin/bash
# Search for symbol calls with any parameters
rg -g '*.py' 'symbol\(' --context 2
Length of output: 1859
def get_private_key_from_gcp_secret(
    secret_id: str,
    project_id: str = "582587111398",  # Gnosis AI default project_id
    version_id: str = "latest",
) -> PrivateKey:
    # If credentials filename specified, use that, else read using default credentials path.
    google_application_credentials_filename = (
        CloudCredentials().GOOGLE_APPLICATION_CREDENTIALS
    )
    if google_application_credentials_filename is not None:
        # mypy interprets incorrectly that from_service_account_json requires further args.
        client = secretmanager.SecretManagerServiceClient.from_service_account_json(filename=google_application_credentials_filename)  # type: ignore [call-arg]
    else:
        client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_id}/versions/{version_id}"
    response = client.access_secret_version(request={"name": name})
    secret_payload = response.payload.data.decode("UTF-8")
    secret_json = json.loads(secret_payload)
    if "private_key" not in secret_json:
        raise ValueError(f"Private key not found in gcp secret {secret_id}")
    return PrivateKey(SecretStr(secret_json["private_key"]))
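A hypothetical call site for reference; the secret id below is a placeholder, not a real secret in the project:

private_key = get_private_key_from_gcp_secret(secret_id="my-agent-key")  # placeholder secret id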
🛠️ Refactor suggestion
Add exception handling when accessing secrets
Currently, exceptions that may occur during the call to client.access_secret_version() (e.g., network issues, permission errors) are not handled. To enhance robustness, consider adding exception handling to manage these potential errors gracefully.
Here’s an example of how you might implement exception handling:
from google.api_core.exceptions import GoogleAPICallError, RetryError

try:
    response = client.access_secret_version(request={"name": name})
except (GoogleAPICallError, RetryError) as e:
    logger.error(f"Error accessing secret version {name}: {e}")
    raise
Ensure you have the necessary imports for the exceptions and that logger is properly configured.
    )
    if google_application_credentials_filename is not None:
        # mypy interprets incorrectly that from_service_account_json requires further args.
        client = secretmanager.SecretManagerServiceClient.from_service_account_json(filename=google_application_credentials_filename)  # type: ignore [call-arg]
Use from_service_account_file instead of from_service_account_json
The method from_service_account_json is not available for SecretManagerServiceClient. The correct method to instantiate a client from a service account file is from_service_account_file. Update the method call accordingly.
Apply this diff to fix the issue:
-client = secretmanager.SecretManagerServiceClient.from_service_account_json(filename=google_application_credentials_filename) # type: ignore [call-arg]
+client = secretmanager.SecretManagerServiceClient.from_service_account_file(google_application_credentials_filename)
After this change, the # type: ignore [call-arg] comment is no longer necessary.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
client = secretmanager.SecretManagerServiceClient.from_service_account_file(google_application_credentials_filename)
-> Added Sharpe ratio, mean daily return and annual volatility as extra variables when selecting strategies
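For reference, a hedged sketch of how these three selection metrics can be derived from a daily wallet-balance series, like the one sketched earlier in this thread; a zero risk-free rate and a 365-day year are assumed, and this is not the repository's exact implementation:

import numpy as np
import pandas as pd

def selection_metrics(daily_balance: pd.Series) -> dict[str, float]:
    daily_returns = daily_balance.pct_change().dropna()
    mean_daily_return = float(daily_returns.mean())
    annual_volatility = float(daily_returns.std() * np.sqrt(365))
    annual_sharpe = (
        mean_daily_return * 365 / annual_volatility if annual_volatility else float("nan")
    )
    return {
        "mean_daily_return": mean_daily_return,
        "annual_volatility": annual_volatility,
        "annual_sharpe_ratio": annual_sharpe,
    }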