From 0aea7e41b24fb479b2a1bbc71ab72f43e823f3a7 Mon Sep 17 00:00:00 2001
From: Sebastiaan Huber
Date: Thu, 21 Sep 2023 18:26:21 +0200
Subject: [PATCH] Docs: Add important note on using `iterall` and `iterdict` (#6126)

Using the `all` and `dict` equivalents are very inefficient for large
query results and will lead to performance problems.
---
 docs/source/howto/query.rst | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/docs/source/howto/query.rst b/docs/source/howto/query.rst
index 2787d9e1d3..711067ce2a 100644
--- a/docs/source/howto/query.rst
+++ b/docs/source/howto/query.rst
@@ -74,12 +74,15 @@ There are several ways to obtain data from a query:
 
     all_results_l = qb.all()  # Returns a list of lists
 
-In case you are working with a large dataset, you can also return your query as a generator:
+.. tip::
+
+    If your query only has a single projection, use ``flat=True`` in the ``first`` and ``all`` methods to return a single value or a flat list, respectively.
+
+You can also return your query as a generator:
 
 .. code-block:: python
 
     all_res_d_gen = qb.iterdict()  # Return a generator of dictionaries
-                                   # of all results
     all_res_l_gen = qb.iterall()  # Returns a generator of lists
 
 This will retrieve the data in batches, and you can start working with the data before the query has completely finished.
@@ -90,6 +93,12 @@ For example, you can iterate over the results of your query in a for loop:
     for entry in qb.iterall():
         # do something with a single entry in the query result
 
+.. important::
+
+    When looping over the result of a query, use the ``iterall`` (or ``iterdict``) generator instead of ``all`` (or ``dict``).
+    This avoids loading the entire query result into memory, and it also delays committing changes made to AiiDA objects inside the loop until the end of the loop is reached.
+
+
 .. _how-to:query:filters:
 
 Filters
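
Note on the batching behaviour described in the patch: the difference between building the whole result list and iterating in batches can be illustrated with a toy model in plain Python. This is only a sketch under stated assumptions. `fetch_all`, `iter_rows`, and the `log` argument are hypothetical stand-ins invented for illustration. They are not AiiDA functions and do not reflect how `QueryBuilder` is implemented.

```python
from typing import Iterator, List


def fetch_all(rows: List[int]) -> List[List[int]]:
    """Toy analogue of ``qb.all()``: build the full result list up front."""
    return [[row] for row in rows]


def iter_rows(rows: List[int], batch_size: int, log: List[int]) -> Iterator[List[int]]:
    """Toy analogue of ``qb.iterall()``: fetch one batch at a time, lazily."""
    for start in range(0, len(rows), batch_size):
        log.append(start)  # record that the batch starting at this index was "fetched"
        for row in rows[start:start + batch_size]:
            yield [row]


rows = list(range(10))  # hypothetical stand-in for a query result
batch_log: List[int] = []

gen = iter_rows(rows, batch_size=4, log=batch_log)
first = next(gen)  # work can start once the first batch is available
batches_after_first = list(batch_log)  # snapshot of the batches fetched so far
rest = list(gen)  # consuming the generator fetches the remaining batches
```

In this toy model only one batch is held at a time, and the first entry is available before the later batches are requested. That is the property the `important` note relies on when it recommends `iterall` (or `iterdict`) for loops over large results.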