tapdb: Add Universe indices, optimize SQL queries #1174
+573
−135
PR Description: Optimized Indexing for Universe Data Queries
Summary
This PR introduces strategic indexing improvements across critical tables in the tapdb database, designed to optimize performance for frequent query patterns within the universe data model. The indexing changes were tested across SQLite (the initial implementation) and PostgreSQL (newly evaluated). Below are the results, including a detailed performance comparison, database size analysis, and recommendations for further refinement.
Key Changes
idx_universe_roots_namespace, idx_universe_roots_issuance
idx_universe_leaves_lookup, idx_universe_leaves_sort
idx_mssmt_nodes_namespace, idx_mssmt_nodes_key_lookup
idx_universe_events_stats, idx_universe_events_root_type
idx_federation_sync_composite
idx_multiverse_leaves_composite
These indexes aim to reduce latency for high-volume operations and complex data access patterns, with a focus on improving both query speed and data retrieval efficiency.
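To illustrate the kind of effect such an index has on a namespace-filtered lookup, here is a minimal sketch using Python's built-in sqlite3 module. The table demo_leaves, its columns, and the index idx_demo_leaves_namespace are hypothetical stand-ins, not the actual tapdb schema or the index definitions from this PR; the point is only to show how EXPLAIN QUERY PLAN reveals whether SQLite uses an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified table. The real tapdb tables and the columns
# covered by the PR's indexes are NOT reproduced here.
conn.execute(
    "CREATE TABLE demo_leaves ("
    " id INTEGER PRIMARY KEY,"
    " namespace TEXT NOT NULL,"
    " payload BLOB NOT NULL)"
)
conn.executemany(
    "INSERT INTO demo_leaves (namespace, payload) VALUES (?, ?)",
    [(f"ns-{i % 50}", bytes([i % 256])) for i in range(5000)],
)

QUERY = "SELECT id FROM demo_leaves WHERE namespace = ?"


def query_plan(conn, sql, params):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The fourth column of each row holds the plan description.
    return " | ".join(row[3] for row in rows)


# Without an index on namespace, SQLite has to scan the whole table.
plan_before = query_plan(conn, QUERY, ("ns-7",))

# Hypothetical index on the filtered column.
conn.execute("CREATE INDEX idx_demo_leaves_namespace ON demo_leaves (namespace)")

# With the index in place, the plan should reference it by name.
plan_after = query_plan(conn, QUERY, ("ns-7",))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording varies between SQLite versions, so comparing plans by checking for the index name is more robust than matching the full string.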
Performance Results
SQLite Summary
Indexing in SQLite yielded significant performance improvements for most query types:
PostgreSQL Summary
In PostgreSQL, indexing yielded smaller performance improvements, and certain queries saw minimal or slightly negative impacts:
Detailed Database Size Analysis (PostgreSQL)
Observations and Recommendations
Performance Gains: SQLite saw substantial improvements, with indexing enhancing most queries, particularly those with namespace filtering and event-type retrieval.
Index Optimization: High index overhead observed in specific tables (e.g., addrs, universe_roots) calls for further examination. Certain indexes may be redundant given PostgreSQL's built-in optimizations, especially with smaller datasets.
Next Steps:
This PR provides a solid foundation for query optimization in the tapdb database. The results indicate that these indexes are highly beneficial in SQLite and moderately effective in PostgreSQL, with room for future refinement as data scales.
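The index overhead noted under Observations can be estimated by comparing database size before and after an index is created. The sketch below does this for SQLite using PRAGMA page_count and PRAGMA page_size; the table demo_roots and the index idx_demo_roots_namespace are hypothetical and do not mirror the tapdb schema. On PostgreSQL, the built-in functions pg_relation_size and pg_indexes_size would be the analogous tools.

```python
import os
import sqlite3
import tempfile


def db_size_bytes(conn):
    """Database size in bytes, computed as page_count * page_size."""
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    return page_count * page_size


def measure_index_overhead(num_rows=20000):
    """Return (size_without_index, size_with_index) for a demo table."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        conn = sqlite3.connect(os.path.join(tmp_dir, "demo.db"))
        try:
            # Hypothetical table, not the actual tapdb universe_roots schema.
            conn.execute(
                "CREATE TABLE demo_roots ("
                " id INTEGER PRIMARY KEY,"
                " namespace TEXT NOT NULL,"
                " asset_id BLOB NOT NULL)"
            )
            conn.executemany(
                "INSERT INTO demo_roots (namespace, asset_id) VALUES (?, ?)",
                [
                    (f"namespace-{i}", i.to_bytes(32, "big"))
                    for i in range(num_rows)
                ],
            )
            conn.commit()
            size_without = db_size_bytes(conn)

            # Hypothetical index whose storage cost is being measured.
            conn.execute(
                "CREATE INDEX idx_demo_roots_namespace ON demo_roots (namespace)"
            )
            conn.commit()
            size_with = db_size_bytes(conn)
        finally:
            # Close before the temporary directory is removed.
            conn.close()
    return size_without, size_with


size_without, size_with = measure_index_overhead()
overhead = size_with - size_without
print(f"table only: {size_without} bytes")
print(f"with index: {size_with} bytes")
print(f"index overhead: {overhead} bytes ({overhead / size_without:.0%})")
```

Running the same measurement per index makes it easier to weigh each index's storage cost against the query speedup it delivers, which is the trade-off flagged for tables such as addrs and universe_roots.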