Room DAG concepts
-Edges
-The word "edge" comes from graph theory lingo. An edge is just a connection
-between two events. In Synapse, we connect events by specifying their
-prev_events
. A subsequent event points back at a previous event.
A (oldest) <---- B <---- C (most recent)
-
-Depth and stream ordering
-Events are normally sorted by (topological_ordering, stream_ordering)
where
-topological_ordering
is just depth
. In other words, we first sort by depth
-and then tie-break based on stream_ordering
. depth
is incremented as new
-messages are added to the DAG. Normally, stream_ordering
is an auto
-incrementing integer, but backfilled events start with stream_ordering=-1
and decrement.
-
-
-
/sync
returns things in the order they arrive at the server (stream_ordering
).
-/messages
(and/backfill
in the federation API) return them in the order determined by the event graph(topological_ordering, stream_ordering)
.
-
The general idea is that, if you're following a room in real-time (i.e.
-/sync
), you probably want to see the messages as they arrive at your server,
-rather than skipping any that arrived late; whereas if you're looking at a
-historical section of timeline (i.e. /messages
), you want to see the best
-representation of the state of the room as others were seeing it at the time.
Forward extremity
-Most-recent-in-time events in the DAG which are not referenced by any other events' prev_events
yet.
The forward extremities of a room are used as the prev_events
when the next event is sent.
Backwards extremity
-The current marker of where we have backfilled up to and will generally be the -oldest-in-time events we know of in the DAG.
-This is an event where we haven't fetched all of the prev_events
for.
Once we have fetched all of its prev_events
, it's unmarked as a backwards
-extremity (although we may have formed new backwards extremities from the prev
-events during the backfilling process).
Outliers
-We mark an event as an outlier
when we haven't figured out the state for the
-room at that point in the DAG yet.
We won't necessarily have the prev_events
of an outlier
in the database,
-but it's entirely possible that we might. The status of whether we have all of
-the prev_events
is marked as a backwards extremity.
For example, when we fetch the event auth chain or state for a given event, we -mark all of those claimed auth events as outliers because we haven't done the -state calculation ourself.
-State groups
-For every non-outlier event we need to know the state at that event. Instead of
-storing the full state for each event in the DB (i.e. a event_id -> state
-mapping), which is very space inefficient when state doesn't change, we
-instead assign each different set of state a "state group" and then have
-mappings of event_id -> state_group
and state_group -> state
.
Stage group edges
-TODO: state_group_edges
is a further optimization...
-notes from @Azrenbeth, https://pastebin.com/seUGVGeT