tapdb: Add Universe indices, optimize SQL queries #1174

Draft
wants to merge 3 commits into
base: main

jbrill commented Nov 5, 2024

PR Description: Optimized Indexing for Universe Data Queries

Summary

This PR adds indexes to several tables in the tapdb database to speed up frequent query patterns in the universe data model. The changes were benchmarked on SQLite (the initial implementation) and PostgreSQL (newly evaluated). The results below include a performance comparison, a database size analysis, and recommendations for further refinement.

Key Changes

  • New Indexes Introduced:
    • Universe Roots: idx_universe_roots_namespace, idx_universe_roots_issuance
    • Universe Leaves: idx_universe_leaves_lookup, idx_universe_leaves_sort
    • MSSMT Nodes: idx_mssmt_nodes_namespace, idx_mssmt_nodes_key_lookup
    • Universe Events: idx_universe_events_stats, idx_universe_events_root_type
    • Federation Proof Sync Log: idx_federation_sync_composite
    • Multiverse Leaves: idx_multiverse_leaves_composite

These indexes aim to reduce query latency for high-volume operations and complex data access patterns.
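One way to check that an index like these is actually picked up is to compare the SQLite query plan before and after creating it. The sketch below is illustrative only: the index name comes from the list above, but the table columns (leaf_node_namespace, minting_point, script_key_bytes) and the lookup query are assumptions, not the schema or queries from this PR.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")

# Hypothetical, simplified stand-in for the universe_leaves table.
conn.execute(
    """
    CREATE TABLE universe_leaves (
        id INTEGER PRIMARY KEY,
        leaf_node_namespace TEXT NOT NULL,
        minting_point BLOB NOT NULL,
        script_key_bytes BLOB NOT NULL
    )
    """
)

# Hypothetical lookup query filtering on namespace and minting point.
lookup = (
    "SELECT id FROM universe_leaves "
    "WHERE leaf_node_namespace = ? AND minting_point = ?"
)

# Without an index, SQLite has to scan the whole table.
plan_before = query_plan(conn, lookup, ("ns", b"\x00"))

# Assumed column order for the composite index; the PR may differ.
conn.execute(
    "CREATE INDEX idx_universe_leaves_lookup "
    "ON universe_leaves (leaf_node_namespace, minting_point, script_key_bytes)"
)

# With the index in place, the plan should mention it by name.
plan_after = query_plan(conn, lookup, ("ns", b"\x00"))

print("before:", plan_before)
print("after: ", plan_after)
```

If the plan after index creation does not mention the index name, the index does not serve that query shape and only adds write and storage cost.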

Performance Results

SQLite Summary

Indexing in SQLite yielded significant performance improvements for most query types:

  • Universe Events by Type: 1.44x improvement
  • Universe Leaves Namespace: 12.57x improvement
  • Universe Root Namespace: 0.86x (a slowdown: the query ran slightly slower with the index in specific cases)

PostgreSQL Summary

In PostgreSQL, indexing yielded smaller performance improvements, and certain queries saw minimal or slightly negative impacts:

  • Universe Events by Type: 1.22x improvement
  • Universe Leaves Namespace: 3.44x improvement
  • Universe Root Namespace: 1.20x improvement
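The factors in both summaries are easier to compare once converted to relative runtimes. The sketch below assumes that "Nx improvement" means baseline runtime divided by indexed runtime; the PR does not state the definition, and the helper names (median_runtime, speedup, relative_runtime) are hypothetical, not part of the benchmark code.

```python
import statistics
import time


def median_runtime(run_query, repeats=25):
    """Median wall-clock time of calling run_query() `repeats` times."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds, indexed_seconds):
    """Assumed definition: values above 1 mean the indexed run is faster."""
    return baseline_seconds / indexed_seconds


def relative_runtime(speedup_factor):
    """Indexed runtime as a fraction of the baseline runtime."""
    return 1.0 / speedup_factor


# Factors reported in the SQLite and PostgreSQL summaries above.
reported = [
    ("SQLite events by type", 1.44),
    ("SQLite leaves namespace", 12.57),
    ("SQLite root namespace", 0.86),
    ("PostgreSQL events by type", 1.22),
    ("PostgreSQL leaves namespace", 3.44),
    ("PostgreSQL root namespace", 1.20),
]
for label, factor in reported:
    print(f"{label}: {relative_runtime(factor):.2%} of baseline runtime")
```

Under that assumption, the 0.86x SQLite result means the indexed query took roughly 16% longer than the baseline, while the 12.57x result means it took about 8% of the baseline time.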

Detailed Database Size Analysis (PostgreSQL)

  • Data Size: 2.5 MB
  • Index Size: 7.0 MB (278.3% index overhead)
  • High-Overhead Tables:
    • Universe Events and Universe Leaves have substantial index overhead due to their high data volumes, making these indexes valuable as data scales.
    • Addrs and Universe Roots show high overhead but yielded minimal performance gains, indicating potential for index refinement or removal.
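The overhead figure can be reproduced as index size divided by data size. The sketch below assumes that definition; the helper name and the SQL string are illustrative and not taken from the PR. The query uses PostgreSQL's pg_table_size and pg_indexes_size functions as one possible way to collect per-table numbers.

```python
# One possible way to gather per-table sizes in PostgreSQL (not executed here).
PER_TABLE_SIZES_SQL = """
SELECT relname,
       pg_table_size(relid)   AS data_bytes,
       pg_indexes_size(relid) AS index_bytes
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_indexes_size(relid) DESC;
"""


def index_overhead_pct(data_size, index_size):
    """Index size as a percentage of data size (assumed definition)."""
    return 100.0 * index_size / data_size


# Rounded totals from the analysis above, in MB.
overhead = index_overhead_pct(data_size=2.5, index_size=7.0)
print(f"{overhead:.1f}%")
```

With the rounded totals this gives about 280%, close to the reported 278.3%. The small gap is presumably because the reported figure was computed from unrounded byte counts.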

Observations and Recommendations

  • Performance Gains: SQLite saw substantial improvements on most queries, most notably namespace filtering on universe leaves (12.57x) and event-type retrieval (1.44x). PostgreSQL showed mixed results: some queries improved only marginally or even ran slightly slower due to index overhead. This suggests that PostgreSQL's query planner may make certain indexes redundant at lower data volumes.

  • Index Optimization: The high index overhead observed in specific tables (e.g., addrs, universe_roots) calls for further examination. Certain indexes may be redundant given PostgreSQL's built-in optimizations, especially with smaller datasets.

  • Next Steps:

    • Run with Larger Datasets: Additional tests with higher data volumes are recommended, particularly on PostgreSQL, to confirm index necessity and further adjust indexing strategies for each database.
    • Selective Index Refinement: Explore removing or optimizing indexes with high overhead but limited performance impact in PostgreSQL. This would reduce storage requirements and potentially streamline query performance.
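A starting point for selective refinement is to list the indexes that no representative query touches. The SQLite sketch below is a minimal illustration: the index names come from this PR, but the universe_roots columns (namespace_root, proof_type) and the workload are assumptions. On PostgreSQL, the idx_scan counter in pg_stat_user_indexes offers a comparable signal from real traffic.

```python
import sqlite3


def unused_indexes(conn, queries):
    """Return user-created indexes that no query plan in `queries` mentions."""
    names = {
        row[0]
        for row in conn.execute(
            # Auto-created indexes have a NULL sql column, so skip them.
            "SELECT name FROM sqlite_master "
            "WHERE type = 'index' AND sql IS NOT NULL"
        )
    }
    used = set()
    for sql in queries:
        for row in conn.execute("EXPLAIN QUERY PLAN " + sql):
            # row[3] is the plan detail text; index names appear as tokens.
            used.update(names & set(row[3].split()))
    return sorted(names - used)


conn = sqlite3.connect(":memory:")

# Hypothetical, simplified stand-in for the universe_roots table.
conn.execute(
    "CREATE TABLE universe_roots ("
    "id INTEGER PRIMARY KEY, namespace_root TEXT, proof_type TEXT)"
)
conn.execute(
    "CREATE INDEX idx_universe_roots_namespace "
    "ON universe_roots (namespace_root)"
)
conn.execute(
    "CREATE INDEX idx_universe_roots_issuance "
    "ON universe_roots (proof_type)"
)

# Hypothetical workload that only ever filters by namespace.
workload = ["SELECT id FROM universe_roots WHERE namespace_root = 'x'"]
candidates = unused_indexes(conn, workload)
print(candidates)
```

An index flagged this way is only a candidate for removal. The result depends entirely on how representative the workload list is, so it should be checked against production query patterns first.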

This PR provides a solid foundation for query optimization in the tapdb database. The results indicate that these indexes are highly beneficial in SQLite and moderately effective in PostgreSQL, with room for future refinement as data scales.

jbrill force-pushed the universe-index-optimizations branch from 8d3ca1f to 1188e50 (November 5, 2024 16:12)
jbrill changed the title from "Universe index optimizations" to "tapdb: add indices for universe fetching" (Nov 5, 2024)
dstadulis changed the title from "tapdb: add indices for universe fetching" to "tapdb/sqlc: Add Universe indices, optimize SQL queries" (Nov 5, 2024)
coveralls commented Nov 5, 2024

Pull Request Test Coverage Report for Build 11690972604

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 31 unchanged lines in 6 files lost coverage.
  • Overall coverage increased (+0.009%) to 40.691%

Files with Coverage Reduction    New Missed Lines    %
commitment/tap.go                1                   83.64%
tapdb/addrs.go                   2                   79.04%
tappsbt/create.go                2                   53.22%
tapchannel/aux_leaf_signer.go    3                   36.33%
tapgarden/caretaker.go           4                   68.87%
universe/interface.go            19                  47.09%

Totals
Change from base Build 11684840814: 0.009%
Covered Lines: 24689
Relevant Lines: 60675

💛 - Coveralls


jbrill commented Nov 5, 2024

note: incorporating postgres functionality now for the test

dstadulis changed the title from "tapdb/sqlc: Add Universe indices, optimize SQL queries" to "tapdb: Add Universe indices, optimize SQL queries" (Nov 5, 2024)
jbrill force-pushed the universe-index-optimizations branch 2 times, most recently from f5209eb to 632ee1f (November 5, 2024 19:06)
jbrill force-pushed the universe-index-optimizations branch from 632ee1f to d3561e5 (November 5, 2024 19:13)
Projects
Status: 🏗 In progress