Maybe my assumptions here are wrong, but here we go: there are many cases where graph queries aren't necessary because your dataset is aggregated/denormalized and you don't need to join on anything, but you still want simple pagination over the sets. I know that Datalog queries provide offset and limit for paging over the result tuples returned from the graph index. However, the fact that the whole query has to be realized first seems to be a limiting factor for efficiently displaying larger sets of denormalized data. I'm also not sure it's good for the hardware to spill to disk every time a query with a deep offset runs, since that would eat into the write endurance of the underlying SSDs, no?

Example use case: we want to store synced emails in XTDB. If we were to display these emails directly to the user with pagination, there doesn't seem to be an optimal way to do it, especially with potentially millions of them spilling to disk every time?
Hey @coadan 👋

You're right in that `:offset` requires XT to page over all the results that you want to skip (Postgres etc. too, fwiw). In similar situations, I've tended to filter on attributes in the documents instead ('cursor-based pagination'). So, if you were paging through emails in the order they were received, and the last email on the page was received last Thursday, your client then requests '100 emails starting from last Thursday'. In this case, the query planner can skip straight to the first item on the next page.

HTH!
James
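
For anyone wanting to see the shape of cursor-based pagination, here is a minimal sketch. It is not XTDB code: it uses Python's built-in `sqlite3` as a stand-in store, and the `emails` table, its columns, and the `make_db` / `next_page` helpers are all made up for illustration. The idea carries over, though: the client sends back the sort key of the last row it saw, and the query filters on that key instead of using an offset. The `id` column is included in the cursor as a tie-breaker, on the assumption that several emails can share the same timestamp.

```python
import sqlite3


def make_db(n):
    """Build a throwaway in-memory table of n fake emails (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE emails ("
        "id INTEGER PRIMARY KEY, received_at INTEGER NOT NULL, subject TEXT)"
    )
    # Index on the sort key so each page can start with an index seek.
    conn.execute("CREATE INDEX emails_by_received ON emails (received_at, id)")
    conn.executemany(
        "INSERT INTO emails (id, received_at, subject) VALUES (?, ?, ?)",
        # i // 2 gives pairs of emails the same timestamp, to exercise the tie-breaker.
        [(i, 1_700_000_000 + i // 2, f"email {i}") for i in range(n)],
    )
    return conn


def next_page(conn, cursor=None, page_size=100):
    """Return (rows, new_cursor), newest first.

    cursor is the (received_at, id) pair of the last row on the previous
    page, or None for the first page.
    """
    if cursor is None:
        rows = conn.execute(
            "SELECT received_at, id, subject FROM emails "
            "ORDER BY received_at DESC, id DESC LIMIT ?",
            (page_size,),
        ).fetchall()
    else:
        last_received, last_id = cursor
        rows = conn.execute(
            "SELECT received_at, id, subject FROM emails "
            "WHERE received_at < ? OR (received_at = ? AND id < ?) "
            "ORDER BY received_at DESC, id DESC LIMIT ?",
            (last_received, last_received, last_id, page_size),
        ).fetchall()
    new_cursor = rows[-1][:2] if rows else None
    return rows, new_cursor


if __name__ == "__main__":
    conn = make_db(250)
    page, cursor = next_page(conn)
    while page:
        print(len(page), "emails, next cursor:", cursor)
        page, cursor = next_page(conn, cursor)
```

The trade-off is that you can only move to the next page from a known position, rather than jumping to an arbitrary page number, which is usually fine for an inbox-style list.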