Understanding Synapse through Grafana graphs

It is possible to monitor much of the internal state of Synapse using Prometheus metrics and Grafana. A guide for configuring Synapse to provide metrics is available here, and information on setting up Grafana is here. In this setup, Prometheus will periodically scrape the information Synapse provides and store a record of it over time. Grafana is then used as an interface to query and present this information through a series of pretty graphs.
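
As a rough sketch of that setup, the snippets below show a dedicated metrics listener in Synapse's homeserver.yaml and a matching Prometheus scrape job. The port number and bind address are example values; the guides linked above are the authoritative reference.

```yaml
# homeserver.yaml -- expose Synapse's metrics on a dedicated listener
# (port 9000 and the loopback bind address are example values)
enable_metrics: true
listeners:
  - port: 9000
    type: metrics
    bind_addresses: ['127.0.0.1']
```

```yaml
# prometheus.yml -- scrape the listener above; a metrics listener serves
# its data under /_synapse/metrics
scrape_configs:
  - job_name: "synapse"
    metrics_path: "/_synapse/metrics"
    static_configs:
      - targets: ["127.0.0.1:9000"]
```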

Once you have Grafana set up, and assuming you're using our Grafana dashboard template, look for the following graphs when debugging a slow/overloaded Synapse:

Message Event Send Time

image

This, along with the CPU and Memory graphs, is a good way to check the general health of your Synapse instance. It represents how long it takes for a user on your homeserver to send a message.
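
If you want this number outside of the dashboard, a query along the following lines approximates it from Synapse's HTTP request duration histograms. The metric and label names used here (synapse_http_server_response_time_seconds_bucket and the RoomSendEventRestServlet servlet label) are assumptions and may differ between Synapse versions.

```promql
# Sketch: approximate 95th percentile time (in seconds) to handle a message-send
# request; metric/label names are assumptions, check /_synapse/metrics for yours
histogram_quantile(
  0.95,
  sum by (le) (
    rate(synapse_http_server_response_time_seconds_bucket{servlet="RoomSendEventRestServlet"}[5m])
  )
)
```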


This is quite a useful graph. It shows how many times Synapse attempts to retrieve a piece of data from a cache and finds that the cache does not contain it, resulting in a call to the database. We can see here that the _get_joined_profile_from_event_id cache is being requested a lot, and that the data we're after is often not cached.
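
If you want to watch a single cache outside of the dashboard, you can graph its miss rate as total lookups minus hits. The expression below is only a sketch: the metric names (synapse_util_caches_cache for lookups, synapse_util_caches_cache_hits for hits) are assumptions based on how the bundled dashboard computes misses, and may be named differently on your version.

```promql
# Sketch: misses per second for one cache = lookups minus hits
# (metric names are assumptions; check /_synapse/metrics for the exact names)
  rate(synapse_util_caches_cache{name="_get_joined_profile_from_event_id"}[5m])
- rate(synapse_util_caches_cache_hits{name="_get_joined_profile_from_event_id"}[5m])
```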

Cross-referencing this with the Eviction Rate graph, which shows that entries are being evicted from _get_joined_profile_from_event_id quite often:

image

we should probably consider raising the size of that cache by increasing its cache factor (a multiplier applied to the size of an individual cache). Information on doing so is available here (note that configuring individual cache factors through the configuration file is available in Synapse v1.14.0+, whereas doing so through environment variables has been supported for a very long time). Note that this will increase Synapse's overall memory usage.
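
As a sketch of what that configuration looks like, the homeserver.yaml fragment below raises the factor for this one cache while leaving the global default alone. The factor values are examples, and the exact key for this cache is best confirmed against the configuration documentation linked above; the equivalent environment variables (e.g. SYNAPSE_CACHE_FACTOR for the global value) are the older mechanism mentioned above.

```yaml
# homeserver.yaml -- cache sizing (factor values are examples; confirm the
# exact cache name against the cache configuration docs)
caches:
  global_factor: 0.5
  per_cache_factors:
    _get_joined_profile_from_event_id: 2.0
```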

Forward Extremities

image

Forward extremities are the leaf events at the end of a DAG in a room, aka events that have no children. The more that exist in a room, the more state resolution Synapse needs to perform (hint: it's an expensive operation). While Synapse has code to prevent too many of these existing at one time in a room, bugs can sometimes make them crop up again.
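
If you suspect extremities are building up, one way to see which rooms are affected is to count them per room directly in the database. This is a sketch against the event_forward_extremities table; run it read-only against your homeserver's database.

```sql
-- Rooms with the most forward extremities; the largest counts are the rooms
-- worth investigating
SELECT room_id, count(*) AS extremity_count
FROM event_forward_extremities
GROUP BY room_id
ORDER BY extremity_count DESC
LIMIT 10;
```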
