Serious performance degradation due to SQL query nesting #340

Open · saurabhnanda opened this issue Oct 20, 2017 · 46 comments · May be fixed by #347

Comments
@saurabhnanda
Contributor

saurabhnanda commented Oct 20, 2017

Related to #334 & #284

Some comparative numbers from running the same "fetch by PK" type of query in Rails/ActiveRecord and in Opaleye. The query generated by Opaleye is ~3-8x slower. (Compare the query timings from the PG logs for the 3x difference; the 8x difference shows up as soon as a similar query is re-fired -- the parse/bind/execute flow that Rails uses seems to cache better at the PG level than what Opaleye is firing.)

Rails/ActiveRecord

console> Benchmark.measure { Client.find(105) }

Common::Client Load (17.6ms)  SELECT "clients".* FROM "clients" WHERE "clients"."id" = $1 LIMIT 1  [["id", 105]]

Timing from PG Logs: 14.046ms (total of the parse, bind, and execute steps)

2017-10-20 17:51:20 IST [11138-291] b2b@b2b LOG:  duration: 0.617 ms  parse a1: SELECT  "clients".* FROM "clients"  WHERE "clients"."id" = $1 LIMIT 1
2017-10-20 17:51:20 IST [11138-292] b2b@b2b LOG:  duration: 12.040 ms  bind a1: SELECT  "clients".* FROM "clients"  WHERE "clients"."id" = $1 LIMIT 1
2017-10-20 17:51:20 IST [11138-293] b2b@b2b DETAIL:  parameters: $1 = '127'
2017-10-20 17:51:20 IST [11138-294] b2b@b2b LOG:  duration: 1.389 ms  execute a1: SELECT  "clients".* FROM "clients"  WHERE "clients"."id" = $1 LIMIT 1
2017-10-20 17:51:20 IST [11138-295] b2b@b2b DETAIL:  parameters: $1 = '127'

Opaleye

Timing from a wrapper written on top of Opaleye.runQuery: 53.026ms

GHCi> x <- runRepl $ findByPk (PK 105 :: ClientId)

2d6b3eda-1dfe-43c2-b2d6-2639a91959d1 | SQL | SELECT "id0_1" as "result1_2", | 2017-10-20 12:16:14.723184 UTC | 0.053026s (cpu: 1.2e-2) | Affected rows: 1

Timing from PG Logs: 40.622ms (total of all three queries given below)

2017-10-20 17:34:39 IST [9400-1] b2b@b2b LOG:  duration: 0.146 ms  statement: SET datestyle TO ISO;SET client_encoding TO UTF8;SET standard_conforming_strings TO on
2017-10-20 17:34:39 IST [9400-2] b2b@b2b LOG:  duration: 35.547 ms  statement: SELECT "id0_1" as "result1_2",
           "name1_1" as "result2_2",
           "status2_1" as "result3_2",
           -- 
           -- SNIP: Total of 79 cols being selected
           -- 
           "type79_1" as "result80_2"
    FROM (SELECT *
          FROM (SELECT "id" as "id0_1",
                       "name" as "name1_1",
                       "status" as "status2_1",
                       -- 
                       -- SNIP: Total of 79 cols being selected
                       -- 
                       "type" as "type79_1"
                FROM "clients" as "T1") as "T1"
          WHERE (("id0_1") = 105)) as "T1"
2017-10-20 17:34:39 IST [9400-3] b2b@b2b LOG:  duration: 4.929 ms  statement: SELECT oid, typcategory, typdelim, typname, typelem, typrelid FROM pg_type WHERE oid = 16947

Questions to ponder:

  • Why is the following SQL query being fired? Who is firing it - Opaleye or PG-Simple?
SELECT oid, typcategory, typdelim, typname, typelem, typrelid FROM pg_type WHERE oid = 16947
  • The Opaleye query was generated with the following Arrow notation, and every term/transformation/clause of the Arrow adds one level of nesting to the SQL. Is this nesting really required? Can't the basic SQL "fragments" of SELECT - FROM - WHERE - ORDER - LIMIT/OFFSET be applied in a single SQL query without nesting? The same question extends to JOINs (see #284, "Opaleye generated left join queries seems to slow down the Postgresql's query planner").
proc () -> do

  -- The following corresponds to the innermost SQL query which is basically selecting all columns from the table
  r <- queryTable tableForClient -< ()

  -- The following corresponds to the "middle" SQL query, which is simply applying the single WHERE clause on the `id` field
  restrict -< (r ^. Prelude.id) .== (constant (PK 105 :: ClientId))

  -- The following corresponds to the outermost SQL query, which again selects all columns from the inner/middle query
  returnA -< r

Conclusion

For an RDBMS-backed webapp making 4-10 queries in every request/response cycle, this performance degradation adds up really quickly. Similar endpoints written in Rails are responding in under 80ms, while the average response time in Haskell is about 250-300ms, most of which can be attributed to the queries generated by Opaleye.

If this isn't fixed, Opaleye's killer feature (query composition) ends up being a great PoC but unusable in production. What can we do to fix this?

@saurabhnanda
Contributor Author

Hang on... I pressed submit by mistake.

@tomjaguarpaw
Owner

Regarding the pg_type query, see:

It seems that this should be a once-per-connection query issued by postgresql-simple. Have you tried issuing the Opaleye query multiple times and taking the incremental time as your benchmark figure, as opposed to the one-off time?

@saurabhnanda
Contributor Author

Okay... I'm done with my monster post!

@tomjaguarpaw
Owner

I should give you notice up front that unless you can convince me that issuing hundreds of thousands of consecutive individual queries with one-row results is something you really need to do then I'm not going to be very inclined to work on this performance issue!

@tomjaguarpaw
Owner

Here are a few things you should check.

  • The Opaleye query is run through postgresql-simple. The latter seems to inject some additional bookkeeping queries and you should strike that out of the time taken by Opaleye. We're looking at 14ms for Rails versus 36ms for Opaleye.

  • You should carefully benchmark the time that Rails and Opaleye take to generate the query string. Optimizing the query will take additional time and I don't want it to make the problem worse.

  • The single thing that will most encourage me to spend time optimizing Opaleye's query generation is a robust benchmark that I can run each iteration against. The benchmark should break down run time into "query preparation" (converting a Query ... into a String) and "query execution" (sending the String to the database) steps.

@tomjaguarpaw
Owner

I should give you notice up front that unless you can convince me that issuing hundreds of thousands of consecutive individual queries with one-row results is something you really need to do then I'm not going to be very inclined to work on this performance issue!

Thank you for providing the context about your request-response cycle which does indeed convince me that low latency is important for your application.

At this point the most helpful thing you can do is write the benchmark alluded to in my previous comment.

@saurabhnanda
Contributor Author

I should give you notice up front that unless you can convince me that issuing hundreds of thousands of consecutive individual queries with one-row results is something you really need to do then I'm not going to be very inclined to work on this performance issue!

We moved away from using JOINs via Opaleye because the nesting is more severe with them. However, even with simple lookups by PK, the queries are way slower. Here's the log from a single request/response cycle, along with the reason why each query is being made:

-- Fetch the currently signed-in user (along with the tenant, in a multi-tenant app). This is a two table INNER JOIN.
5d857097-9c34-4970-950e-d8c02c706b2d | SQL | SELECT "id0_1" as "result1_3", | 2017-10-20 12:49:43.2178 UTC | 0.014055s (cpu: 7.663e-3) | Affected rows: 1

-- Fetch the top-most object required to construct the JSON response (essentially a findByPk)
5d857097-9c34-4970-950e-d8c02c706b2d | SQL | SELECT "id0_1" as "result1_2", | 2017-10-20 12:49:43.233332 UTC | 0.004322s (cpu: 1.313e-3) | Affected rows: 1

-- Fetch additional master data required to construct the JSON response. Again, a 2-table INNER JOIN.
5d857097-9c34-4970-950e-d8c02c706b2d | SQL | SELECT "booking_ref7_1" as "re | 2017-10-20 12:49:43.237991 UTC | 0.017731s (cpu: 6.5179999999999995e-3) | Affected rows: 1

-- Find by PK on the Customer contacts table (for an existing order/booking)
5d857097-9c34-4970-950e-d8c02c706b2d | SQL | SELECT "id0_1" as "result1_2", | 2017-10-20 12:49:43.256715 UTC | 0.007778s (cpu: 5.362e-3) | Affected rows: 1

-- Find the last payment made for this particular order. Conceptually -- SELECT MAX(created_at)... FROM payments WHERE payment.order_id=?
5d857097-9c34-4970-950e-d8c02c706b2d | SQL | SELECT "result0_2" as "result1 | 2017-10-20 12:49:43.264833 UTC | 0.003222s (cpu: 0.0) | Affected rows: 1

-- Find by PK for product-variant of this order (SKU)
5d857097-9c34-4970-950e-d8c02c706b2d | SQL | SELECT "id0_1" as "result1_2", | 2017-10-20 12:49:43.268322 UTC | 0.003107s (cpu: 1.961e-3) | Affected rows: 1

-- Fetch all the pricing-items (invoice line-items) for this order
5d857097-9c34-4970-950e-d8c02c706b2d | SQL | SELECT "id0_1" as "result1_2", | 2017-10-20 12:49:43.271888 UTC | 0.002005s (cpu: 1.051e-3) | Affected rows: 2

-- Fetch the top-most object required to construct the JSON response. This is needlessly being fetched again -- I just realised while typing this out.
5d857097-9c34-4970-950e-d8c02c706b2d | SQL | SELECT "id0_1" as "result1_2", | 2017-10-20 12:49:43.274129 UTC | 0.001993s (cpu: 1.486e-3) | Affected rows: 1

-- Fetch a few records associated with the top-most object
5d857097-9c34-4970-950e-d8c02c706b2d | SQL | SELECT "id0_1" as "result1_2", | 2017-10-20 12:49:43.276375 UTC | 0.002043s (cpu: 9.49e-4) | Affected rows: 4

Does this query pattern seem reasonable for a typical request/response cycle? If the query times are added up they come to 56ms, and the response was sent out in 61ms. So, querying the DB efficiently is critical to the kind of webapps we're writing.

A different endpoint written in Rails, which is making 10 SQL queries, is responding in 47.8ms, of which ActiveRecord is taking only 11.7ms! The DB layer is essentially 5x faster.

@saurabhnanda
Contributor Author

You should carefully benchmark the time that Rails and Opaleye take to generate the query string.

Isn't this a foregone conclusion? Query strings generated by Opaleye are easily 40-50x longer than the ones generated by Rails, due to enumerating all column names across multiple levels of nesting.

The single thing that will most encourage me to spend time optimizing Opaleye's query generation is a robust benchmark that I can run each iteration against. The benchmark should break down run time into "query preparation" (converting a Query ... into a String) and "query execution" (sending the String to the database) steps.

Is nesting the query a necessity? Can't multiple levels of nesting be collapsed in the internal Opaleye AST itself, or while converting the AST to the query-string? Your thoughts?

@tomjaguarpaw
Owner

You should carefully benchmark the time that Rails and Opaleye take to generate the query string.

Isn't this a foregone conclusion? Query strings generated by Opaleye are easily 40-50x longer than the ones generated by Rails, due to enumerating all column names across multiple levels of nesting.

I'm not saying you should benchmark them to see who's faster. I'm saying you should benchmark them to see what proportion of Opaleye's time is spent in query generation as opposed to query execution.

Is nesting the query a necessity?

No

Can't multiple levels of nesting be collapsed in the internal Opaleye AST itself, or while converting the AST to the query-string?

Yes

Your thoughts?

My thought is I'm not going to implement that until I have a robust benchmark.

@tomjaguarpaw
Owner

My thought is I'm not going to implement that until I have a robust benchmark.

If I'm going to spend my time doing something, I want to be able to very carefully assess its impact.

@saurabhnanda
Contributor Author

The benchmark should break down run time into "query preparation" (converting a Query ... into a String) and "query execution" (sending the String to the database) steps.

Am I missing something here? Doesn't the data shared above demonstrate that query execution (on the PG side) is a perf bottleneck? Do you have another hypothesis about query generation (on the Opaleye side) also being a perf bottleneck?

@tomjaguarpaw
Owner

Do you have another hypothesis about query generation (on the Opaleye side) also being a perf bottleneck?

Yes, I suspect query generation is also a bottleneck. It manipulates Haskell Strings which are known to perform poorly!

In order to fix this performance issue I need a robust way of getting performance numbers out for both the generation and execution stages.

@saurabhnanda
Contributor Author

In order to fix this performance issue I need a robust way of getting performance numbers out for both the generation and execution stages.

Execution stage would basically be an instrumented version of Opaleye.runQuery, right? And generation would be instrumentation over prepareQuery, correct?

@tomjaguarpaw
Owner

Yes, I would say split runQueryExplicit into two bits and time them separately.

  1. The prepareQuery bit (be careful to make sure the query string is fully evaluated, maybe with deepSeq)
  2. The queryWith_ bit
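For example, something roughly like this (an untested sketch: it approximates the split using the public showSql/runQuery pair rather than reaching into runQueryExplicit, so the "execution" figure still pays for a second round of SQL generation inside runQuery; names vary a bit between Opaleye versions, e.g. showSql is spelled showSqlForPostgres on older ones):

import Control.DeepSeq (deepseq)
import Data.Maybe (fromMaybe)
import Data.Time.Clock (diffUTCTime, getCurrentTime)
import Opaleye

-- Time "preparation" (Query -> SQL string) and "execution" (round trip to
-- Postgres) separately, forcing the string with deepseq so that lazy
-- generation work isn't silently shifted into the execution measurement.
timeQuery conn q = do
  t0   <- getCurrentTime
  let sql = fromMaybe "" (showSql q)
  t1   <- sql `deepseq` getCurrentTime
  rows <- runQuery conn q
  t2   <- getCurrentTime
  putStrLn ("preparation: " ++ show (diffUTCTime t1 t0))
  putStrLn ("execution:   " ++ show (diffUTCTime t2 t1))
  pure rows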

@tomjaguarpaw
Owner

Where are we on this?

@saurabhnanda
Contributor Author

saurabhnanda commented Oct 30, 2017 via email

@saurabhnanda
Contributor Author

We are wrapping up all docs/fixes that we came across while building our Haskell module. This will be picked up once all of that is over. A few weeks.

I'm planning to pick this up now. How would you like the benchmarking tooling to work? What would you like your dev workflow to be?

@tomjaguarpaw
Owner

I need some function that I can run that will produce performance statistics for the cases you care about. I would like statistics about the run time of query generation and also the run time of query execution. I don't think I mind much more than that.

@saurabhnanda
Contributor Author

I need some function that I can run that will produce performance statistics for the cases you care about. I would like statistics about the run time of query generation and also the run time of query execution. I don't think I mind much more than that.

So, just a function (or bunch of functions) that can be run via GHCi? Will running it via GHCi cause the perf numbers to vary wildly? Do you use byte-code or object-code in GHCi? Will that have any impact?

@tomjaguarpaw
Owner

You can also have a main function that wraps those functions into an executable if you like.

@saurabhnanda
Contributor Author

So, let's benchmark the following use-cases:

  • Fetch a complete row by PK
  • Fetch a complete row by a bunch of WHERE clauses
  • Fetch few columns by PK
  • Fetch few columns by a bunch of WHERE clauses
  • LeftJoin
  • LeftJoinF
  • InnerJoin
  • Limit / Offset
  • Group by aggregation
  • Max/min aggregation
  • Distinct aggregation

Anything else you (or anyone else?) can think of?
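To give a feel for the shape I have in mind, here's a rough sketch of the harness using criterion, covering just two of the cases above, split into a "generation" group (Query -> SQL string) and an "execution" group (full round trip). The table, fields and connection string are placeholders, and the names follow the newer Opaleye spelling (selectTable/runSelect/Field); older versions would use queryTable/runQuery/Column instead:

{-# LANGUAGE Arrows #-}

module Main (main) where

import Control.Arrow (returnA)
import Criterion.Main (bench, bgroup, defaultMain, nf, nfIO)
import Data.Maybe (fromMaybe)
import Data.Profunctor.Product (p2)
import Data.Text (Text)
import qualified Database.PostgreSQL.Simple as PGS
import Opaleye

-- A hypothetical two-field table standing in for a real schema.
clients :: Table (Field SqlInt4, Field SqlText) (Field SqlInt4, Field SqlText)
clients = table "clients" (p2 (requiredTableField "id", requiredTableField "name"))

-- "Fetch a complete row by PK"
byPk :: Int -> Select (Field SqlInt4, Field SqlText)
byPk k = proc () -> do
  (rowId, name) <- selectTable clients -< ()
  restrict -< rowId .== sqlInt4 k
  returnA -< (rowId, name)

-- "Limit / Offset"
page :: Select (Field SqlInt4, Field SqlText)
page = limit 20 (offset 40 (selectTable clients))

-- "Query preparation": Select -> SQL string, nothing sent to the database.
prepare :: Select (Field SqlInt4, Field SqlText) -> String
prepare = fromMaybe "" . showSql

main :: IO ()
main = do
  conn <- PGS.connectPostgreSQL "dbname=opaleye_bench"  -- placeholder DSN
  defaultMain
    [ bgroup "generation"
        [ bench "byPk"        (nf prepare (byPk 105))
        , bench "limitOffset" (nf prepare page)
        ]
    , bgroup "execution"  -- full round trip; includes generation time too
        [ bench "byPk"        (nfIO (runSelect conn (byPk 105) :: IO [(Int, Text)]))
        , bench "limitOffset" (nfIO (runSelect conn page       :: IO [(Int, Text)]))
        ]
    ]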

@tomjaguarpaw
Owner

Seems fine to me. We can always add new use-cases later.

@saurabhnanda
Contributor Author

Why has prepareQuery been marked as deprecated? Which function is superseding it?

@tomjaguarpaw
Owner

It's been deprecated because it's going to be moved into an internal module. It's fine for you to use it in this benchmark harness.

saurabhnanda linked a pull request (#347) Nov 8, 2017 that will close this issue
@saurabhnanda
Contributor Author

I think in view of #476 this issue should be re-opened. We should probably prepare it for Hacktoberfest so that we can tackle this as a community and get it fixed once and for all.

@tomjaguarpaw
Owner

Yes, it would be great to have a solid benchmark suite so that we can see what difference any improvements make and ensure that we don't introduce performance regressions.

@tomjaguarpaw
Owner

I will look again at the benchmarks in #347

@tomjaguarpaw
Owner

My idea is that instead of generating unique column names each time we use a select, we generate unique table names. That means that instead of

SELECT
"col10_1" as "result1_3",
"col21_1" as "result2_3",
"col10_2" as "result3_3",
"col21_2" as "result4_3"
FROM (SELECT
      *
      FROM (SELECT
            "col1" as "col10_1",
            "col2" as "col21_1"
            FROM "foo" as "T1") as "T1",
           (SELECT
            "col1" as "col10_2",
            "col2" as "col21_2"
            FROM "foo" as "T1") as "T2") as "T1"

we could generate

SELECT
T1."col1" as "result1_3",
T1."col2" as "result2_3",
T2."col1" as "result3_3",
T2."col2" as "result4_3"
FROM  "foo" as "T1", "foo" as "T2"

An orthogonal simplification would be to combine clauses where valid. For example, instead of

SELECT
"col10_1" as "result1_2",
"col21_1" as "result2_2"
FROM (SELECT
      *
      FROM (SELECT
            *
            FROM (SELECT
                  *
                  FROM (SELECT
                        *
                        FROM (SELECT
                              *
                              FROM (SELECT
                                    "col1" as "col10_1",
                                    "col2" as "col21_1"
                                    FROM "foo" as "T1") as "T1") as "T1"
                        OFFSET 10) as "T1") as "T1"
            LIMIT 20) as "T1") as "T1"

we could generate

SELECT
"col10_1" as "result1_2",
"col21_1" as "result2_2"
FROM (SELECT
      *
      FROM (SELECT
            *
            FROM (SELECT
                  *
                  FROM (SELECT
                        *
                        FROM (SELECT
                            "col1" as "col10_1",
                            "col2" as "col21_1"
                             FROM "foo" as "T1") as "T1") as "T1"
                OFFSET 10 LIMIT 20) as "T1") as "T1") as "T1"

With transformations like this we could collapse a lot of the subquery nesting.
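As a toy illustration of the second kind of transformation, here is roughly what such a rewrite rule looks like on a deliberately simplified AST (a made-up mini-AST for the sake of example, not Opaleye's actual internal representation):

-- A base query plus at most one combined LIMIT/OFFSET wrapper.
data Sql
  = BaseTable String
  | LimitOffset (Maybe Int) Int Sql   -- optional LIMIT, OFFSET, inner query
  deriving Show

-- Smart constructors that merge into an existing LIMIT/OFFSET wrapper
-- instead of nesting yet another subquery around it.
limit :: Int -> Sql -> Sql
limit m (LimitOffset l o q) = LimitOffset (Just (maybe m (min m) l)) o q
limit m q                   = LimitOffset (Just m) 0 q

offset :: Int -> Sql -> Sql
offset n (LimitOffset l o q) = LimitOffset (fmap (\k -> max 0 (k - n)) l) (o + n) q
offset n q                   = LimitOffset Nothing n q

-- limit 30 (offset 10 (offset 10 (limit 100 (BaseTable "mytable"))))
--   ==> LimitOffset (Just 30) 20 (BaseTable "mytable")
-- i.e. a single LIMIT 30 OFFSET 20.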

@tomjaguarpaw
Owner

There's a nice improvement on the branch mergeLimitOffset (cae2950).

Before:

*Main Opaleye Data.Profunctor.Product Data.String> let q = let mytable = table "mytable" (p2 (requiredTableField "f1", requiredTableField "f2")) :: Table (Field SqlInt4, Field SqlInt4) (Field SqlInt4, Field SqlInt4) in limit 30 $ offset 10 $ offset 10 $ limit 100 (selectTable mytable)
*Main Opaleye Data.Profunctor.Product Data.String> mapM_ putStrLn $ showSql q
SELECT
"f10_1" as "result1_2",
"f21_1" as "result2_2"
FROM (SELECT
      *
      FROM (SELECT
            *
            FROM (SELECT
                  *
                  FROM (SELECT
                        *
                        FROM (SELECT
                              "f1" as "f10_1",
                              "f2" as "f21_1"
                              FROM "mytable" as "T1") as "T1"
                        LIMIT 100) as "T1"
                  OFFSET 10) as "T1"
            OFFSET 10) as "T1"
      LIMIT 30) as "T1"

After:

*Main Opaleye Data.Profunctor.Product Data.String> let q = let mytable = table "mytable" (p2 (requiredTableField "f1", requiredTableField "f2")) :: Table (Field SqlInt4, Field SqlInt4) (Field SqlInt4, Field SqlInt4) in limit 30 $ offset 10 $ offset 10 $ limit 100 (selectTable mytable)
*Main Opaleye Data.Profunctor.Product Data.String> mapM_ putStrLn $ showSql q
SELECT
"f10_1" as "result1_2",
"f21_1" as "result2_2"
FROM (SELECT
      *
      FROM (SELECT
            "f1" as "f10_1",
            "f2" as "f21_1"
            FROM "mytable" as "T1") as "T1"
      LIMIT 30
      OFFSET 20) as "T1"

@ocharles
Collaborator

ocharles commented Oct 3, 2020

This certainly looks a lot easier to debug! I'm still sceptical that it brings performance improvements, but it definitely looks worth having.

@ocharles
Collaborator

ocharles commented Oct 3, 2020

Actually, I take that back - if the pretty printer is slow, then printing less is almost definitely going to be faster!

@tomjaguarpaw
Owner

@saurabhnanda wrote some benchmarks that do show performance improvements when cruft is removed from small queries. I haven't specifically targeted those benchmarks with these improvements but I'm hopeful that something good for performance will come out of it.

@nh2

nh2 commented May 21, 2022

Hi, here is another data point where Opaleye's nested queries make Postgres slower than needed.

We have a work queue implemented in Postgres, with the following code.

It does the equivalent of:

SELECT *
FROM myproject.jobs
WHERE "status_code" = 'pending' AND "queue" 'myqueue'
ORDER BY "priority" DESC NULLS FIRST, 
         "attempts" ASC NULLS LAST, 
         "creation_time" ASC NULLS LAST 

In Opaleye:

proc () -> do
  -- Order the whole jobs query first, then filter it; each arrow step adds
  -- another level of nesting in the generated SQL.
  row <- orderBy (desc jrPriority <> asc jrAttempts <> asc jrCreationTime) (jobQuery hdl) -< ()
  restrict -< jrQueue row .== pgStrictText queue
  codeRestrict -< row
  returnA -< row
  where
    -- Optional restriction on status_code, applied only when a code is given.
    codeRestrict = case mbCode of
      Nothing -> proc _row -> returnA -< ()
      Just code -> proc row -> restrict -< jrStatusCode row .== pgStrictText code

Opaleye-generated query:

SELECT
  "id0_1" as "result1_2"
  FROM (SELECT
        *
        FROM (SELECT
              *
              FROM (SELECT
                    *
                    FROM (SELECT
                          *
                          FROM (SELECT
                                *
                                FROM (SELECT
                                      "id" as "id0_1",
                                      "queue" as "queue1_1",
                                      "priority" as "priority2_1",
                                      "creation_time" as "creation_time3_1",
                                      "payload_json" as "payload_json4_1",
                                      "status_code" as "status_code5_1",
                                      "attempts" as "attempts6_1",
                                      "worker" as "worker7_1",
                                      "start_time" as "start_time8_1",
                                      "end_time" as "end_time9_1",
                                      "last_heartbeat" as "last_heartbeat10_1",
                                      "result_json" as "result_json11_1"
                                      FROM "myproject"."jobs" as "T1") as "T1") as "T1"
                          ORDER BY "priority2_1" DESC NULLS FIRST,
                                   "attempts6_1" ASC NULLS LAST,
                                   "creation_time3_1" ASC NULLS LAST) as "T1"
                    WHERE (("status_code5_1") = (CAST(E'pending' AS text))) AND (("queue1_1") = (CAST(E'myqueue' AS text)))) as "T1"
              LIMIT 1) as "T1") as "T1"

Performance:

postgres=# EXPLAIN ANALYZE SELECT "id0_1" as "result1_2" FROM (SELECT * FROM (SELECT * FROM (SELECT * FROM (SELECT * FROM (SELECT * FROM (SELECT "id" as "id0_1", "queue" as "queue1_1", "priority" as "priority2_1", "creation_time" as "creation_time3_1", "payload_json" as "payload_json4_1", "status_code" as "status_code5_1", "attempts" as "attempts6_1", "worker" as "worker7_1", "start_time" as "start_time8_1", "end_time" as "end_time9_1", "last_heartbeat" as "last_heartbeat10_1", "result_json" as "result_json11_1" FROM "myproject"."jobs" as "T1") as "T1") as "T1" ORDER BY "priority2_1" DESC NULLS FIRST, "attempts6_1" ASC NULLS LAST, "creation_time3_1" ASC NULLS LAST) as "T1" WHERE (("status_code5_1") = (CAST(E'pending' AS text))) AND (("queue1_1") = (CAST(E'myqueue' AS text)))) as "T1" LIMIT 1) as "T1") as "T1";

                                                                                  QUERY PLAN                                                                                   
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan on "T1"  (cost=12574.94..12574.96 rows=1 width=16) (actual time=34.448..34.449 rows=1 loops=1)
   ->  Limit  (cost=12574.94..12574.95 rows=1 width=220) (actual time=34.448..34.448 rows=1 loops=1)
         ->  Subquery Scan on "T1_1"  (cost=12574.94..12800.85 rows=18073 width=220) (actual time=34.446..34.446 rows=1 loops=1)
               ->  Sort  (cost=12574.94..12620.12 rows=18073 width=220) (actual time=34.445..34.445 rows=1 loops=1)
                     Sort Key: "T1_2".priority DESC, "T1_2".attempts, "T1_2".creation_time
                     Sort Method: quicksort  Memory: 3201kB
                     ->  Index Scan using jobs_queue_status_code_idx on jobs "T1_2"  (cost=0.42..10173.63 rows=18073 width=220) (actual time=0.031..15.184 rows=31133 loops=1)
                           Index Cond: ((queue = 'myqueue'::text) AND (status_code = 'pending'::text))
 Planning time: 0.240 ms
 Execution time: 34.646 ms

Simplifying the query by removing Opaleye's nesting improves performance 2x (34 ms -> 18 ms) and removes the Subquery Scan:

SELECT 
  "id", 
  "queue", 
  "priority", 
  "creation_time", 
  "payload_json", 
  "status_code", 
  "attempts", 
  "worker", 
  "start_time", 
  "end_time", 
  "last_heartbeat", 
  "result_json" 
FROM "myproject"."jobs" 
WHERE ("status_code" = (CAST(E'pending' AS text))) AND (("queue") = (CAST(E'myqueue' AS text))) 
ORDER BY "priority" DESC NULLS FIRST, 
         "attempts" ASC NULLS LAST, 
         "creation_time" ASC NULLS LAST 
LIMIT 1; 
                                                                         QUERY PLAN                                                                          
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=10264.00..10264.00 rows=1 width=1106) (actual time=18.530..18.531 rows=1 loops=1)
   ->  Sort  (cost=10264.00..10309.18 rows=18073 width=1106) (actual time=18.530..18.530 rows=1 loops=1)
         Sort Key: priority DESC, attempts, creation_time
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Index Scan using jobs_queue_status_code_idx on jobs  (cost=0.42..10173.63 rows=18073 width=1106) (actual time=0.030..12.072 rows=31133 loops=1)
               Index Cond: ((queue = 'myqueue'::text) AND (status_code = 'pending'::text))
 Planning time: 0.099 ms
 Execution time: 18.553 ms

@nh2

nh2 commented May 21, 2022

Also note how the sort changed:

-Sort Method: quicksort       Memory: 3201kB
+Sort Method: top-N heapsort  Memory:   25kB

@ocharles
Collaborator

@nh2 Can you also share the PostgreSQL version this was run on?

@nh2

nh2 commented May 21, 2022

PostgreSQL 9.6 with recommended settings for SSD (random_page_cost changed to '1.1', which can affect query planner results).

@ocharles
Collaborator

Thanks. That's a fair few major versions back now; I wonder what PG 14 would make of the above query. https://dbfiddle.uk and friends might be able to help here, though it's a bit of work. (I'm also not saying "oh just upgrade" - obviously if we can fix this for PG 9.6, we should).

@tomjaguarpaw
Owner

Thanks for the report @nh2. Could you please check that the performance degradation is not due to the relative placement of ORDER BY and WHERE? Specifically, in

orderBy (desc jrPriority <> asc jrAttempts <> asc jrCreationTime) (jobQuery hdl)

could you please put the restrict inside the orderBy, wrapping the (jobQuery hdl)?
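Something like this, perhaps (untested, and reusing the bindings from your snippet):

proc () -> do
  -- Filter first, then order, so the WHERE ends up inside the ORDER BY.
  row <- orderBy (desc jrPriority <> asc jrAttempts <> asc jrCreationTime) restrictedJobs -< ()
  returnA -< row
  where
    restrictedJobs = proc () -> do
      row <- jobQuery hdl -< ()
      restrict -< jrQueue row .== pgStrictText queue
      codeRestrict -< row
      returnA -< row
    codeRestrict = case mbCode of
      Nothing -> proc _row -> returnA -< ()
      Just code -> proc row -> restrict -< jrStatusCode row .== pgStrictText code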

If that doesn't recapture the missing performance then I will investigate how to remove these nested queries to see if that does the job. I'm not sure how I can most effectively proceed though, given that I have neither Postgres 9.6 (Debian stable is on 13.7) nor any data to run the query on. Suggestions welcome.

@tomjaguarpaw
Owner

tomjaguarpaw commented May 21, 2022

Oh, I see the plan does the Limit before the Sort in any case, so perhaps the hope of improvements from that is forlorn. In any case I would be grateful if you could try out my suggestion, just to confirm.

EDIT: Hmm, I don't mean Limit before Sort, I mean Index Scan before Sort.

@nh2

nh2 commented May 21, 2022

@tomjaguarpaw I can try to do that, but it may take some time.

I should also be able to try the same query on a local Postgres 13 or 14.

neither Postgres 9.6

That one is easy with Nix:

$ nix-shell -p postgresql_9_6 --run 'postgres --version' 
postgres (PostgreSQL) 9.6.24

@tomjaguarpaw
Owner

OK, I understand what is happening in the EXPLAIN ANALYZEs. In the handwritten version Postgres knows it can get away with a top-N heapsort, because it knows it is subsequently taking LIMIT 1. In the Opaleye version it uses quicksort and (presumably) sorts the whole filtered result set before discarding almost all of it.

I have to say, I think this is a weakness of Postgres, and I wouldn't be surprised if it were fixed in a version later than 9.6 (the first version in the 9.6 series was released nearly 6 years ago). So please check with

  • a more up-to-date Postgres, and
  • the restrict inside the orderBy (wrapping (jobQuery hdl))

In principle I'm willing to help try to fix this regardless, but I shall certainly be more motivated to do so if it's still bad when you try one, or especially both, of the above.

@ocharles
Collaborator

I believe that upgrading to PG 14 won't fix this problem: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=943a8f083302602b734c290242066432

@ocharles
Collaborator

ocharles commented May 22, 2022

Here's a bit more info: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=4a1eb1d20a03aa3ca0cc8ca1bd17d000

It looks like the important thing is that the LIMIT and the ORDER BY are combined.

@tomjaguarpaw
Owner

Hmm, the "optimal" version there uses quicksort though (and they all have the same predicted "cost", but I'm not if that's meaningful without data in the table).

@ocharles
Collaborator

I was more looking at the overall plan and when a subquery scan happened. I agree that populating the table will probably matter too.

@tomjaguarpaw
Owner

  1. Does anyone understand what a subquery scan actually is? According to etutorials.org

    Each of these operators [subquery scan and subplan] are used for internal bookkeeping purposes and really don't affect the overall query plan -- you can usually ignore them.

    So do they actually have any run time effect?

  2. This thread on the pgsql-performance mailing list seems to refer to a similar thing, and the reply notes some limitations on rewriting queries.
