Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sliding Sync: Lazy-loading room members on incremental sync (remember memberships) #17809

Open
wants to merge 31 commits into
base: develop
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 20 commits
Commits
Show all changes
31 commits
Select commit Hold shift + click to select a range
af9c72d
Correctly changes to required state config
erikjohnston Oct 1, 2024
2fd3a5a
Add tests
erikjohnston Oct 3, 2024
5e66e98
Newsfile
erikjohnston Oct 3, 2024
596581a
Sliding Sync: Parameterize and add more tests for tracking changes to…
MadLittleMods Oct 9, 2024
4b631a5
Apply suggestions from code review
erikjohnston Oct 9, 2024
e913037
Fixup
erikjohnston Oct 9, 2024
35532d6
comment __bool__
erikjohnston Oct 9, 2024
859d4a8
Fix handling of your own user ID
erikjohnston Oct 9, 2024
56565dc
Remove unused var
erikjohnston Oct 9, 2024
0d0207e
Remove TODOs
erikjohnston Oct 9, 2024
fafb2fe
Move comment down
erikjohnston Oct 9, 2024
c23d326
Clarify None comment
erikjohnston Oct 9, 2024
6391b0f
LAZY only applies to memberships
erikjohnston Oct 9, 2024
fbe9894
Lazy-loading room members on incremental sync (remember memberships)
MadLittleMods Oct 9, 2024
b9a8780
Add changelog
MadLittleMods Oct 9, 2024
760810f
Re-use `invalidated_state_keys`
MadLittleMods Oct 9, 2024
ddc9769
Better clarify `$ME` translation
MadLittleMods Oct 9, 2024
8b906c4
Fill in TODO
MadLittleMods Oct 9, 2024
2080cc5
Fix grammar
MadLittleMods Oct 9, 2024
0087ad2
Move `prev_room_sync_config` up
MadLittleMods Oct 9, 2024
f8feef5
Move `prev_room_sync_config` up
MadLittleMods Oct 9, 2024
a566347
Merge branch 'develop' into erikj/ss_required_state
erikjohnston Oct 10, 2024
4e662da
Comment why we're making new sets
MadLittleMods Oct 10, 2024
fb5bca4
Merge branch 'erikj/ss_required_state' into madlittlemods/sliding-syn…
MadLittleMods Oct 10, 2024
7e667fe
Merge branch 'develop' into madlittlemods/sliding-sync-lazy-load-memb…
MadLittleMods Oct 14, 2024
aec7455
Limit the number of state_keys that we remember
MadLittleMods Oct 16, 2024
87497c8
Update comments
MadLittleMods Oct 16, 2024
525222b
Fix bug when requesting more than we can remember
MadLittleMods Oct 16, 2024
90f0741
Merge branch 'develop' into madlittlemods/sliding-sync-lazy-load-memb…
MadLittleMods Oct 16, 2024
96c9a91
Fix bug remembering too much when over limit
MadLittleMods Oct 16, 2024
cf13ca0
Merge branch 'develop' into madlittlemods/sliding-sync-lazy-load-memb…
MadLittleMods Oct 17, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/17785.bugfix
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Fix bug with sliding sync where the server would not return state that was added to the `required_state` config.
1 change: 1 addition & 0 deletions changelog.d/17805.bugfix
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Fix bug with sliding sync where the server would not return state that was added to the `required_state` config.
1 change: 1 addition & 0 deletions changelog.d/17809.bugfix
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Fix bug with sliding sync where `$LAZY`-loading room members would not return `required_state` membership in incremental syncs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am slightly worried about the increase in the size of the tables if we keep adding more and more members in there, but I'll have a little think about how we might make that more efficient.

More concretely: there's a few concerns I have:

  1. Over the lifetime of a connection the size of the store required state could grow very large (think Matrix HQ).
  2. Currently, we pull out everything when we get a request, so the size of the data matters
  3. Every time we change the state we end up copying all the rows from the previous connection position to the new position, so the amount of data again matters.

On point 3. I think we can change the DB tables up a bit so that we have one table that holds the base values that apply to all live positions in the connection (there will be max two), and then another table that are the delta from the base to that position. This means that we'd only need to copy over the deltas to the base and insert new deltas whenever we persist a new position (which should be a lot less data).

I don't really know what we do about not pulling lots of data from the DB. We could only pull out that data for the rooms we're sending down? Or maybe we we don't pull it out and instead query the DB for "which of these users have we previously sent membership down for?", though that feels like a bigger change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could only pull out that data for the rooms we're sending down?

This would probably be good to do in any case.

But it seems we're also concerned with how big an individual (room_id, user_id) could get.

I don't really know what we do about not pulling lots of data from the DB. We could only pull out that data for the rooms we're sending down? Or maybe we we don't pull it out and instead query the DB for "which of these users have we previously sent membership down for?", though that feels like a bigger change.

Are we set on wanting to track it? The dumb simple solution is to not track it at all which means returning a membership event for any timeline events.

Another alternative is that we could throw away a type when the state_key's grow too large. Or just throw the whole required_state_map when it grows too large. When I say throw-away, I just mean reset to whatever the requested required_state_map has. This is a decent middle ground between tracking nothing and everything. Ideally, we could kick out entries by recency but the complexity is probably not needed. It also doesn't require us to change our database schema.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do want to track the sent memberships, otherwise you are often just doubling the number of events you have to pull out and send to the clients.

Another alternative is that we could throw away a type when the state_key's grow too large. Or just throw the whole required_state_map when it grows too large. When I say throw-away, I just mean reset to whatever the requested required_state_map has. This is a decent middle ground between tracking nothing and everything. Ideally, we could kick out entries by recency but the complexity is probably not needed. It also doesn't require us to change our database schema.

This is probably a decent idea that should be fairly easy to do? Even something like "if we have $LAZY and the set of members we've sent down is bigger than 100, reset", or something. It's not pretty but a) means for most rooms we'll never reset, and b) most of the time we won't send down redundant memberships.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to limit the state_keys we remember from previous requests to 100 for any type (also added tests).

MadLittleMods marked this conversation as resolved.
Show resolved Hide resolved
290 changes: 277 additions & 13 deletions synapse/handlers/sliding_sync/__init__.py

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions synapse/storage/databases/main/sliding_sync.py
Original file line number Diff line number Diff line change
Expand Up @@ -386,8 +386,8 @@ def _get_and_clear_connection_positions_txn(
required_state_map: Dict[int, Dict[str, Set[str]]] = {}
for row in rows:
state = required_state_map[row[0]] = {}
for event_type, state_keys in db_to_json(row[1]):
state[event_type] = set(state_keys)
for event_type, state_key in db_to_json(row[1]):
state.setdefault(event_type, set()).add(state_key)

# Get all the room configs, looking up the required state from the map
# above.
Expand Down
7 changes: 7 additions & 0 deletions synapse/types/state.py
Original file line number Diff line number Diff line change
Expand Up @@ -616,6 +616,13 @@ def __contains__(self, key: Any) -> bool:

return False

def __bool__(self) -> bool:
"""Returns true if this state filter will match any state, or false if
this is the empty filter"""
if self.include_others:
return True
return bool(self.types)


_ALL_STATE_FILTER = StateFilter(types=immutabledict(), include_others=True)
_ALL_NON_MEMBER_STATE_FILTER = StateFilter(
Expand Down
924 changes: 922 additions & 2 deletions tests/handlers/test_sliding_sync.py

Large diffs are not rendered by default.

Loading
Loading