
Lower amount of DB connections #547

Merged: 10 commits into main on Nov 11, 2024

Conversation

kongzii (Contributor) commented Nov 7, 2024

deploy please

coderabbitai (bot) commented Nov 7, 2024

Walkthrough

The pull request introduces changes to three files: config.py, db_cache.py, and relevant_news_cache.py. In config.py, the default value of the ENABLE_CACHE attribute in the APIKeys class is modified from True to False, disabling caching by default. In db_cache.py, a new parameter pool_size is added to the create_engine function call, and the cache saving logic is simplified. Additionally, engine.dispose() is called to manage database connections. Similar changes are made in relevant_news_cache.py, where the pool_size parameter is also added to the create_engine call.

Changes

| File | Change Summary |
|------|----------------|
| prediction_market_agent_tooling/config.py | Updated `ENABLE_CACHE` from `True` to `False` in the `APIKeys` class. |
| prediction_market_agent_tooling/tools/caches/db_cache.py | Added `pool_size=1` to the `create_engine` call; simplified the cache entry saving logic; added `engine.dispose()` to manage database connections. |
| prediction_market_agent_tooling/tools/relevant_news_analysis/relevant_news_cache.py | Added `pool_size=1` to the `create_engine` call in the `RelevantNewsResponseCache` class. |
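The change summary centers on `pool_size=1` plus `engine.dispose()`. What a bounded pool buys you can be illustrated with a minimal stdlib sketch; the `FakeConnection` and `BoundedPool` classes below are illustrative stand-ins, not SQLAlchemy's actual pool implementation:

```python
import queue

class FakeConnection:
    """Stand-in for a real DB connection."""
    def __init__(self, conn_id: int) -> None:
        self.conn_id = conn_id
        self.closed = False

class BoundedPool:
    """Hands out at most `pool_size` connections, reusing returned ones."""
    def __init__(self, pool_size: int) -> None:
        self._idle: queue.Queue[FakeConnection] = queue.Queue()
        self._created = 0
        self._pool_size = pool_size

    def checkout(self) -> FakeConnection:
        try:
            return self._idle.get_nowait()  # reuse an idle connection if any
        except queue.Empty:
            if self._created >= self._pool_size:
                raise RuntimeError("pool exhausted")
            self._created += 1
            return FakeConnection(self._created)

    def checkin(self, conn: FakeConnection) -> None:
        self._idle.put(conn)

    def dispose(self) -> None:
        """Close all idle connections, analogous to engine.dispose()."""
        while not self._idle.empty():
            self._idle.get_nowait().closed = True

pool = BoundedPool(pool_size=1)
c1 = pool.checkout()
pool.checkin(c1)
c2 = pool.checkout()  # the same connection is reused, not a new one
assert c1 is c2
```

Note that in SQLAlchemy the pool can temporarily exceed `pool_size` via `max_overflow` (default 10); `pool_size` bounds the connections kept open persistently, which is what this PR is reducing.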

Possibly related PRs

  • Initialize db_cache engine in wrapper #546: That PR also touches the ENABLE_CACHE attribute of the APIKeys class, checking whether caching is enabled before proceeding with caching operations in the db_cache decorator.

Suggested reviewers

  • evangriffiths

coderabbitai (bot) left a comment

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (2)
prediction_market_agent_tooling/config.py (1)

56-57: Document the caching behavior change.

The change in default caching behavior should be documented for users of this library. Consider:

  1. Adding a note to the changelog
  2. Updating relevant documentation about caching configuration
  3. Adding migration instructions for users who might need to explicitly enable caching
prediction_market_agent_tooling/tools/caches/db_cache.py (1)

43-44: Consider thread-safety implications of global engine.

The global DB_CACHE_ENGINE introduces shared mutable state which could lead to race conditions in concurrent environments. While SQLAlchemy engines are thread-safe for connection pooling, the initialization check-then-set pattern could still race.

Consider:

  1. Adding documentation explaining the purpose and thread-safety guarantees
  2. Using a thread-safe initialization pattern like:
from threading import Lock

from sqlalchemy import create_engine
from sqlalchemy.engine import Engine

_engine_lock = Lock()
DB_CACHE_ENGINE: Engine | None = None

def get_engine() -> Engine:
    global DB_CACHE_ENGINE
    if DB_CACHE_ENGINE is None:
        with _engine_lock:
            # Double-checked locking: re-test after acquiring the lock.
            if DB_CACHE_ENGINE is None:
                DB_CACHE_ENGINE = create_engine(...)  # pass the DB URL here
    return DB_CACHE_ENGINE
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between dfa90ca and 730be04.

⛔ Files ignored due to path filters (1)
  • pyproject.toml is excluded by !**/*.toml
📒 Files selected for processing (2)
  • prediction_market_agent_tooling/config.py (1 hunks)
  • prediction_market_agent_tooling/tools/caches/db_cache.py (6 hunks)
🔇 Additional comments (2)
prediction_market_agent_tooling/config.py (1)

56-56: Verify the impact of disabling cache by default.

Changing ENABLE_CACHE default from True to False could impact performance in code paths that expect caching to be enabled. Please ensure this change is intentional and aligns with the introduction of the shared engine instance.

Let's verify the usage of this configuration:

Consider adding a comment explaining why caching is disabled by default, especially in relation to the shared engine instance changes. This helps future maintainers understand the design decision:

-    ENABLE_CACHE: bool = False
+    # Disabled by default as we're using a single shared database engine instance
+    # See db_cache.py for the caching implementation
+    ENABLE_CACHE: bool = False
✅ Verification successful

Disabling cache by default is safe and aligned with test expectations

The codebase already handles both cached and non-cached scenarios properly:

  • The db_cache decorator explicitly checks api_keys.ENABLE_CACHE and falls back to non-cached execution
  • Tests explicitly verify the behavior with cache disabled (see test_db_cache.py)
  • The caching system is properly abstracted, making it easy to enable/disable without breaking functionality
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find code that relies on ENABLE_CACHE configuration
# Expected results: Identify potential impact points where caching is assumed

# Search for direct references to ENABLE_CACHE
echo "Direct references to ENABLE_CACHE:"
rg "ENABLE_CACHE" -B 2 -A 2

# Search for cache-related function calls that might be affected
echo -e "\nCache-related function calls:"
rg "\.cache|cache\." -B 2 -A 2

Length of output: 9715

prediction_market_agent_tooling/tools/caches/db_cache.py (1)

Line range hint 43-211: Verify caching usage and performance impact.

The switch to a single shared engine instance could impact application performance and resource usage.

Let's analyze the caching usage:

✅ Verification successful

Let me gather more information about the caching implementation and its usage patterns.


Global DB engine instance is safe and follows best practices

The shared DB_CACHE_ENGINE implementation is correct and safe because:

  1. Each decorated function creates independent database sessions using with Session(DB_CACHE_ENGINE), ensuring thread-safety
  2. The codebase shows no concurrent access patterns that could cause issues with the shared engine
  3. The implementation is similar to other caching patterns in the codebase (e.g., RelevantNewsResponseCache)
  4. Test coverage demonstrates proper functionality across various scenarios including:
    • Multiple data types and Pydantic models
    • Cache invalidation
    • Concurrent access patterns
    • Error handling

The switch to a single shared engine instance is an improvement over creating multiple engines, as it:

  • Reduces resource overhead
  • Follows SQLAlchemy's connection pooling best practices
  • Maintains thread-safety through proper session management
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Analyze cache usage patterns and potential impact

# Check for cache decorator usage
echo "Checking cache decorator usage patterns..."
rg -l "@db_cache" 

# Look for direct engine/session creation that might need consolidation
echo "Checking for other database engine instances..."
rg "create_engine|Session\(" --type py

# Check for potential concurrent access patterns
echo "Checking for concurrent execution patterns..."
rg -l "ThreadPoolExecutor|ProcessPoolExecutor|asyncio|concurrent" --type py

Length of output: 1731


Script:

#!/bin/bash
# Check the implementation details of files using db_cache

# Check test file for usage patterns
echo "=== Test file implementation ==="
cat tests/tools/test_db_cache.py

# Check actual usage in application code
echo -e "\n=== Google implementation ==="
cat prediction_market_agent_tooling/tools/google.py

echo -e "\n=== Tavily implementation ==="
cat prediction_market_agent_tooling/tools/tavily/tavily_search.py

# Check relevant_news_cache implementation for comparison
echo -e "\n=== Relevant news cache implementation ==="
cat prediction_market_agent_tooling/tools/relevant_news_analysis/relevant_news_cache.py

Length of output: 20191

kongzii marked this pull request as draft on November 7, 2024, 12:05
@@ -53,7 +53,7 @@ class APIKeys(BaseSettings):

     SQLALCHEMY_DB_URL: t.Optional[SecretStr] = None

-    ENABLE_CACHE: bool = True
+    ENABLE_CACHE: bool = False
kongzii (Contributor, Author) commented:

Set to False by default so that cached functions don't error out when Postgres access isn't configured.
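Since `APIKeys` is a pydantic `BaseSettings` subclass, users who do have Postgres configured can opt back in through the environment (e.g. `ENABLE_CACHE=true`). A stdlib sketch of the boolean parsing involved — pydantic does the real parsing, and its accepted values may differ slightly:

```python
import os

TRUTHY = {"1", "true", "yes", "on"}

def enable_cache_from_env(default: bool = False) -> bool:
    """Mimic how a BaseSettings bool field reads ENABLE_CACHE from the env."""
    raw = os.environ.get("ENABLE_CACHE")
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY

os.environ["ENABLE_CACHE"] = "true"
assert enable_cache_from_env() is True

del os.environ["ENABLE_CACHE"]
assert enable_cache_from_env() is False  # falls back to the new default
```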

kongzii changed the title from "Single shared engine instance" to "Fix multiprocessing", and then to "Lower amount of DB connections", on Nov 10, 2024.
coderabbitai (bot) left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
prediction_market_agent_tooling/tools/relevant_news_analysis/relevant_news_cache.py (1)

28-33: LGTM! Consider additional connection pool configurations.

The addition of pool_size=1 aligns with the PR objective to reduce DB connections. However, consider adding these configurations for better connection management:

         self.engine = create_engine(
             (
                 sqlalchemy_db_url
                 if sqlalchemy_db_url
                 else APIKeys().sqlalchemy_db_url.get_secret_value()
             ),
-            pool_size=1,
+            pool_size=1,
+            pool_timeout=30,  # Timeout waiting for connections from pool
+            pool_recycle=3600,  # Recycle connections after 1 hour
+            pool_pre_ping=True,  # Verify connection validity before checkout
         )
+        # Add engine cleanup
+        import atexit
+        atexit.register(self.engine.dispose)

This ensures:

  1. Connections don't hang indefinitely waiting for the pool
  2. Stale connections are recycled
  3. Dead connections are detected before use
  4. Engine resources are properly cleaned up on application exit
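What `pool_recycle` and `pool_pre_ping` guard against can be sketched in plain Python; the `Conn` class and `checkout` helper below are illustrative, not SQLAlchemy internals:

```python
import time

class Conn:
    def __init__(self) -> None:
        self.created_at = time.monotonic()
        self.alive = True

def checkout(conn: Conn, pool_recycle: float, pre_ping) -> Conn:
    """Return `conn` if still usable, otherwise replace it with a fresh one
    (a sketch of the checks pool_recycle and pool_pre_ping perform)."""
    age = time.monotonic() - conn.created_at
    if age > pool_recycle:   # recycle: connection too old, open a fresh one
        return Conn()
    if not pre_ping(conn):   # pre-ping: the server silently dropped it
        return Conn()
    return conn

conn = Conn()
same = checkout(conn, pool_recycle=3600.0, pre_ping=lambda c: c.alive)
assert same is conn  # fresh, healthy connection is reused

conn.alive = False   # simulate the server closing the connection
fresh = checkout(conn, pool_recycle=3600.0, pre_ping=lambda c: c.alive)
assert fresh is not conn
```

In real SQLAlchemy, `pool_pre_ping=True` issues a lightweight test query on checkout and transparently replaces dead connections, at the cost of one extra round-trip per checkout.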
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 730be04 and 15477b2.

📒 Files selected for processing (2)
  • prediction_market_agent_tooling/tools/caches/db_cache.py (3 hunks)
  • prediction_market_agent_tooling/tools/relevant_news_analysis/relevant_news_cache.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • prediction_market_agent_tooling/tools/caches/db_cache.py

gabrielfior merged commit 3c7e09a into main on Nov 11, 2024; 16 checks passed.
gabrielfior deleted the peter/fix-db-cache-2 branch on November 11, 2024, 12:25.
gabrielfior added a commit that referenced this pull request Nov 19, 2024
#552)

* Lower amount of DB connections (#547)

* Single shared engine instance

* fixes

* revert accidental bump

* fix pickle in github ci

* fiiiix

* fix loguru

* revert

* ah

* ah2

* revert

* Added gnosis_rpc_config, google_credentials_filename and chain_id to BaseSettings

* Fixed black

* Fixed black (2) | changed rpc

* Raise OutOfFunds  in withdraw_wxdai_to_xdai_to_keep_balance (#553)

* Implemented PR comments

* Fixed CI

* Merged

---------

Co-authored-by: Peter Jung <[email protected]>
gabrielfior added a commit that referenced this pull request Nov 21, 2024
* Added block_time fetching for more accurate outcome tokens received.

* Financial metrics being published

* Reverted caching to diskcache

* Fixing CI

* lock file

* Updated network foundry

* Updated RPC to gateway

* Moved local_chain test to proper folder

* Fixed black

* Applied PR comments

* Small fixes after PR review

* Small refactoring to address PR comments

* Added gnosis_rpc_config, google_credentials_filename and chain_id to … (#552)

* Lower amount of DB connections (#547)

* Single shared engine instance

* fixes

* revert accidental bump

* fix pickle in github ci

* fiiiix

* fix loguru

* revert

* ah

* ah2

* revert

* Added gnosis_rpc_config, google_credentials_filename and chain_id to BaseSettings

* Fixed black

* Fixed black (2) | changed rpc

* Raise OutOfFunds  in withdraw_wxdai_to_xdai_to_keep_balance (#553)

* Implemented PR comments

* Fixed CI

* Merged

---------

Co-authored-by: Peter Jung <[email protected]>

---------

Co-authored-by: Peter Jung <[email protected]>