Room and User Statistics
========================

Synapse maintains room and user statistics (as well as a cache of room state)
in various tables. These can be used for administrative purposes and are also
used when generating the public room directory.


# Synapse Developer Documentation

## High-Level Concepts

### Definitions

* **subject**: Something we are tracking stats about – currently a room or user.
* **current row**: An entry for a subject in the appropriate current statistics
    table. Each subject can have only one.
* **historical row**: An entry for a subject in the appropriate historical
    statistics table. Each subject can have any number of these.

### Overview

Stats are maintained as time series. There are two kinds of column:

* absolute columns – where the value is correct for the time given by `end_ts`
    in the stats row. (Imagine a line graph for these values.)
    * They can also be thought of as 'gauges' in Prometheus, if you are
      familiar with it.
* per-slice columns – where the value counts the occurrences that fell within
    the time slice given by `(end_ts − bucket_size)…end_ts`
    or `start_ts…end_ts`. (Imagine a histogram for these values.)
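
The difference between the two kinds of column can be illustrated with a
minimal sketch. The `StatsRow` class and the `joined_members` and
`events_sent` fields are hypothetical names chosen for illustration, not
Synapse's actual schema; only `end_ts` and `bucket_size` come from the
description above, and the millisecond unit is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatsRow:
    """Hypothetical stats row; field names are illustrative only."""

    end_ts: int          # end of the time slice (assumed to be milliseconds)
    bucket_size: int     # length of the time slice, same unit as end_ts
    joined_members: int  # absolute column: value as of end_ts
    events_sent: int     # per-slice column: occurrences within the slice

    @property
    def start_ts(self) -> int:
        # The slice covers (end_ts - bucket_size)...end_ts.
        return self.end_ts - self.bucket_size


rows = [
    StatsRow(end_ts=3_600_000, bucket_size=3_600_000, joined_members=10, events_sent=4),
    StatsRow(end_ts=7_200_000, bucket_size=3_600_000, joined_members=12, events_sent=7),
]

# An absolute column is read directly from the row for the time of interest.
members_at_end = rows[-1].joined_members

# A per-slice column is summed to get a total over several slices.
events_over_period = sum(row.events_sent for row in rows)
```

Summing an absolute column across rows, or reading a single per-slice value
as a running total, would give a meaningless figure, which is why the two
kinds are kept distinct.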

Stats are maintained in two tables (for each type): current and historical.

Current stats correspond to the present values. Each subject can only have one
entry.

Historical stats correspond to values in the past. Subjects may have multiple
entries.

## Concepts around the management of stats

### Current rows

Current rows contain the most up-to-date statistics for a subject.
They only contain absolute columns.

### Historical rows

Historical rows can always be considered to be valid for the time slice and
end time specified.

* historical rows will not exist for every time slice – they will be omitted
    if there were no changes. In this case, the following assumptions can be
    made to interpolate/recreate missing rows:
    - absolute fields have the same values as in the preceding row
    - per-slice fields are zero (`0`)
* historical rows will not be retained forever – rows older than a configurable
    time will be purged.
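
The interpolation rules above could be applied by a consumer of the
historical table roughly as follows. This is a sketch under stated
assumptions, not code from Synapse: rows are plain dicts for a single
subject, sorted by `end_ts`, every `end_ts` is aligned to a fixed
`bucket_size`, and the field names in `ABSOLUTE_FIELDS` and
`PER_SLICE_FIELDS` are hypothetical.

```python
from typing import List

# Hypothetical column names, used for illustration only.
ABSOLUTE_FIELDS = ("joined_members",)
PER_SLICE_FIELDS = ("events_sent",)


def fill_missing_rows(rows: List[dict], bucket_size: int) -> List[dict]:
    """Recreate historical rows that were omitted because nothing changed.

    `rows` must belong to one subject and be sorted by `end_ts`.
    """
    filled: List[dict] = []
    for row in rows:
        if filled:
            expected = filled[-1]["end_ts"] + bucket_size
            while expected < row["end_ts"]:
                previous = filled[-1]
                gap = {"end_ts": expected}
                # Absolute fields carry over from the preceding row.
                for field in ABSOLUTE_FIELDS:
                    gap[field] = previous[field]
                # Per-slice fields are zero: nothing happened in this slice.
                for field in PER_SLICE_FIELDS:
                    gap[field] = 0
                filled.append(gap)
                expected += bucket_size
        filled.append(dict(row))
    return filled


stored = [
    {"end_ts": 100, "joined_members": 10, "events_sent": 3},
    {"end_ts": 400, "joined_members": 11, "events_sent": 1},
]
complete = fill_missing_rows(stored, bucket_size=100)
```

Here `complete` contains rows for the slices ending at 100, 200, 300 and
400; the two recreated rows repeat `joined_members` from the row ending at
100 and have `events_sent` set to `0`.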

#### Purge

The purging of historical rows is not yet implemented.