Remove includes course sites on provider courses page #4782
Context
There is a performance issue for providers that have many courses to display on the courses list page.
The culprit is this line in the controller:
It generates the default Rails query, in which the main nested loop runs over far more rows than expected. The query takes more than 500ms on average and sometimes more than x seconds (it varies).
Explain
Taking a look at the query that the includes generates:
If you run the query above you will find some interesting info.
My interpretation, summarised in one statement:
For reasons that need deeper investigation, the query that Rails generates for the default includes does not use the composite index on course_site.course_id and course_site.site_id.
The details of the explain
For each row in this result, the system performs an inner scan (the inner
loop) to match records from site_statuses and PK_ucas_campus. This can
result in a high number of rows if the tables being joined are large or
if there's a large number of qualifying rows in the tables.
The row counts in the actual result suggest that the query planner was
likely underestimating the number of rows returned from the join.
This can occur if there is high cardinality (many different combinations of records) between the tables
involved in the join.
The high row count suggests that each course_site row is joining with multiple rows in site_statuses.
If each course_site record matches many site_statuses rows, the number of output rows is multiplied by the number of matches per record.
For every row returned by the first index scan (course_site), there are
5775 iterations (loops) over the site data and site_statuses,
which increases the overall row count.
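To make the loop and row counts above easier to picture, here is a minimal sketch of a nested loop join in plain Ruby. It is only a toy model, not the query planner's actual algorithm and not code from this app: the `CourseSite` and `SiteStatus` structs, the `nested_loop_join` helper, the status values and the table sizes are all made up for illustration.

```ruby
# Toy nested loop join. Every struct, value and size here is hypothetical.
CourseSite = Struct.new(:course_id, :site_id)
SiteStatus = Struct.new(:course_id, :site_id, :status)

# Joins on (course_id, site_id). The inner side is visited once per outer
# row, which is what a "loops" figure on an inner plan node counts.
def nested_loop_join(outer_rows, inner_rows)
  inner_loops = 0
  joined = []
  outer_rows.each do |outer|
    inner_loops += 1 # one pass over the inner side for this outer row
    inner_rows.each do |inner|
      next unless inner.course_id == outer.course_id &&
                  inner.site_id == outer.site_id

      joined << [outer, inner]
    end
  end
  { inner_loops: inner_loops, output_rows: joined.size }
end

# 3 courses x 4 sites = 12 course_site rows (made-up numbers).
course_sites = (1..3).flat_map do |course_id|
  (1..4).map { |site_id| CourseSite.new(course_id, site_id) }
end

# Two matching site_statuses rows per course_site row = 24 rows.
site_statuses = course_sites.flat_map do |cs|
  %w[running suspended].map do |status|
    SiteStatus.new(cs.course_id, cs.site_id, status)
  end
end

RESULT = nested_loop_join(course_sites, site_statuses)
puts RESULT
```

With these made-up inputs the inner side is visited 12 times and the join emits 24 rows: the loop count follows the number of outer rows, and the output grows with the number of matches per outer row. Without a usable index, each of those inner passes also has to walk every inner row.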
Changes proposed in this pull request
Remove the includes for now. Don't merge without testing on the review app.
Guidance to review