-
-
Notifications
You must be signed in to change notification settings - Fork 886
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Combine action tables #4459
Combine action tables #4459
Conversation
Before you go forward and spend too much time on this, it needs a lot of discussion, because we could lose a lot of data integrity solely for the sake of post_view query speed. An update to a There are a lot of inside-postgres things we could do before getting rid of the |
@dessalines Would that problem be fixed by using a composite type for each action that stores multiple values? Edit: or multi-column constraints, like |
I'm not sure I like that option either, at least for source data. The only thing I can think of rn, that would also help with the linked issue below, is to do what you're doing with the We desperately need some SQL experts that could help us with this one, as well as #2444 which is a similar problem. |
I dont think this implementation would create any problems with data integrity, as you have mandatory columns for person_id, post_id etc and then optional columns for each action. In effect its the same integrity we have with existing table definitions. There is a risk to read or write the wrong column, but that seems unlikely as we can keep using existing wrapper methods such as On the other hand storing the data in another table and using triggers will definitely give us consistency bugs, as happened with comment counts. So I would say go ahead with this approach. |
I've posted this to ![email protected] to see if any SQL experts can chime in on a correct way of doing this. |
I changed the implementation of the existing post functions to use the post_actions table. The only remotely scary thing is automatically deleting rows after all actions are unset. I will do that with a trigger that runs |
conflicts are resolved now |
We still gotta get more ppl than me looking at this. Its been on our PR list for too long, and it'll give a lot of potential performance benefits. |
My comments are not adressed yet. |
…ead_comments_amount
Did you actually compare the query plans eg for PostView before and after these changes to verify that there is a major benefit? These changes are very complex and can cause strange bugs from AssumeNotNull, as well as making future code changes much more difficult. So if there is only a minor benefit I would rather skip it and keep the current implementation. It may not be the most efficient, but at least its easy to understand and maintain. If we merge this then you definitely need to add tests for uplete.rs. In case there is a weird failure in api tests it would be very hard to track it down to a specific part of that file otherwise. |
Now there's tests in the uplete module. I don't remember checking the query plans and durations. I will do that soon. Or you could do it if you have enough time in the next few days, which should be super easy with scripts/db_perf.sh. If you do, remember to merge from main right before checking. I don't completely agree about the maintainability tradeoff. I think the current action-related code is completely the opposite of "easy to understand and maintain". There's already much simpler joins now with the combined tables, and maybe overall more ease in adding more actions. In the future there can be less maintainability problems by not using separate structs, or separate fields in views, for each individual action type. Let me know if you want me to reduce the assume_not_null risk before this PR is merged, at the expense of this PR taking a much longer time. |
Alright Ive approved it. |
@dullbananas once you work out the conflicts, we can get this merged. This is one of the oldest prs so it should take merging order priority so you don't have to keep maintaining it. |
With this PR merged there are some errors in scheduled tasks again:
|
I'll make an issue for that. |
Just saw this - I think you're missing some index(es). For listing saved posts with The existing |
This should cause a huge improvement in query plans, especially for queries that previously reached the from/join collapse limits. For example, getting saved posts might now start with an index scan of the post_actions table, which avoids scanning posts that the user didn't do anything with (or all non-saved posts if I add partial indexes, but I don't know if I should do that).
This will also make the code much cleaner and reduce the size of the database. (Edit: it may or may not reduce size)
Indexes for the new action tables will use
INCLUDE
WHERE
withIS NULL
for each action column to keep index-only scans possible.In the new joins, person_id will not use a bind parameter if it's
None
, so there can still be separate generic query plans for users that are not logged in.