You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi, I'm trying out macrobase on a dataset with ~15 million rows and ~30 columns. I selected three columns to try out first but it's taking forever to run. So I'm wondering if there's any information about time complexity for the algorithm.
The text was updated successfully, but these errors were encountered:
Hi @madcarrot, thanks for using MacroBase! Could you give us some information on:
the version (commit hash) of MacroBase you're running on
the interface you're using (the UI, the SQL shell, or the CLI runner)
the minimum support and minimum ratio metric you're using in your query (you may not know this if you're using the UI)
the number of distinct elements in those 3 columns that you selected initially
MacroBase uses a variant of the APriori algorithm in its query engine. The time complexity is linear in the number of rows; while it can be combinatorial in the number of columns, we do a lot of pruning by ignoring low-frequency combinations of column values during execution of the algorithm.
If you only selected three columns to run on, then I'm surprised that it's taking so long to run; that's why I'm wondering if maybe it's simply an old version of the code. Let us know, and we'll figure out what's going on.
Hi, I'm trying out macrobase on a dataset with ~15 million rows and ~30 columns. I selected three columns to try out first but it's taking forever to run. So I'm wondering if there's any information about time complexity for the algorithm.
The text was updated successfully, but these errors were encountered: