author: Michael Telatynski <7t3chguy@gmail.com> 2018-07-24 17:17:46 +0100
committer: Michael Telatynski <7t3chguy@gmail.com> 2018-07-24 17:17:46 +0100
commit: 87951d3891efb5bccedf72c12b3da0d6ab482253
tree: de7d997567c66c5a4d8743c1f3b9d6b474f5cfd9 /docs/metrics-howto.rst
parent: if inviter_display_name == ""||None then default to inviter MXID
parent: Merge pull request #3595 from matrix-org/erikj/use_deltas
Merge branch 'develop' of github.com:matrix-org/synapse into t3chguy/default_inviter_display_name_3pid
Diffstat (limited to 'docs/metrics-howto.rst')
-rw-r--r-- docs/metrics-howto.rst 148
1 files changed, 130 insertions, 18 deletions
diff --git a/docs/metrics-howto.rst b/docs/metrics-howto.rst
index 143cd0f42f..5bbb5a4f3a 100644
--- a/docs/metrics-howto.rst
+++ b/docs/metrics-howto.rst
@@ -1,25 +1,47 @@
 How to monitor Synapse metrics using Prometheus
 ===============================================
 
-1. Install prometheus:
+1. Install Prometheus:
 
    Follow instructions at http://prometheus.io/docs/introduction/install/
 
-2. Enable synapse metrics:
+2. Enable Synapse metrics:
 
-   Simply setting a (local) port number will enable it. Pick a port.
-   prometheus itself defaults to 9090, so starting just above that for
-   locally monitored services seems reasonable. E.g. 9092:
+   There are two methods of enabling metrics in Synapse.
 
-   Add to homeserver.yaml::
+   The first serves the metrics as a part of the usual web server and can be
+   enabled by adding the "metrics" resource to the existing listener as such::
 
-     metrics_port: 9092
+     resources:
+       - names:
+         - client
+         - metrics
 
-   Also ensure that ``enable_metrics`` is set to ``True``.
-  
-   Restart synapse.
+   This provides a simple way of adding metrics to your Synapse installation,
+   and serves them under ``/_synapse/metrics``. If you do not want your metrics
+   to be publicly exposed, you will need to either filter this path out at your
+   load balancer, or use the second method.
 
-3. Add a prometheus target for synapse.
+   The second method runs the metrics server on a different port, in a
+   different thread to Synapse. This makes the metrics more likely to remain
+   retrievable under heavy load, and makes it easier to expose them only to
+   internal networks. The served metrics are available over HTTP only, and
+   will be available at ``/``.
+
+   Add a new listener to homeserver.yaml::
+
+     listeners:
+       - type: metrics
+         port: 9000
+         bind_addresses:
+           - '0.0.0.0'
+
+   For both options, you will need to ensure that ``enable_metrics`` is set to
+   ``True``.
+
+   Restart Synapse.
+
+3. Add a Prometheus target for Synapse.
 
    It needs to set the ``metrics_path`` to a non-default value (under ``scrape_configs``)::
 
@@ -28,10 +50,100 @@ How to monitor Synapse metrics using Prometheus
       static_configs:
         - targets: ["my.server.here:9092"]
 
-   If your prometheus is older than 1.5.2, you will need to replace 
+   If your prometheus is older than 1.5.2, you will need to replace
    ``static_configs`` in the above with ``target_groups``.
-   
-   Restart prometheus.
+
+   Restart Prometheus.
+
+
+Removal of deprecated metrics & time based counters becoming histograms in 0.31.0
+---------------------------------------------------------------------------------
+
+The duplicated metrics deprecated in Synapse 0.27.0 have been removed.
+
+All duration-based metrics are now reported in seconds. This affects:
+
++----------------------------------+
+| msec -> sec metrics              |
++==================================+
+| python_gc_time                   |
++----------------------------------+
+| python_twisted_reactor_tick_time |
++----------------------------------+
+| synapse_storage_query_time       |
++----------------------------------+
+| synapse_storage_schedule_time    |
++----------------------------------+
+| synapse_storage_transaction_time |
++----------------------------------+
+
+Several metrics have been changed to be histograms, which sort entries into
+buckets and allow better analysis. The following metrics are now histograms:
+
++-------------------------------------------+
+| Altered metrics                           |
++===========================================+
+| python_gc_time                            |
++-------------------------------------------+
+| python_twisted_reactor_pending_calls      |
++-------------------------------------------+
+| python_twisted_reactor_tick_time          |
++-------------------------------------------+
+| synapse_http_server_response_time_seconds |
++-------------------------------------------+
+| synapse_storage_query_time                |
++-------------------------------------------+
+| synapse_storage_schedule_time             |
++-------------------------------------------+
+| synapse_storage_transaction_time          |
++-------------------------------------------+
+
+
+Block and response metrics renamed for 0.27.0
+---------------------------------------------
+
+Synapse 0.27.0 begins the process of rationalising the duplicate ``*:count``
+metrics reported for the resource tracking for code blocks and HTTP requests.
+
+At the same time, the corresponding ``*:total`` metrics are being renamed, as
+the ``:total`` suffix no longer makes sense in the absence of a corresponding
+``:count`` metric.
+
+To enable a graceful migration path, this release just adds new names for the
+metrics being renamed. A future release will remove the old ones.
+
+The following table shows the new metrics, and the old metrics which they are
+replacing.
+
+==================================================== ===================================================
+New name                                             Old name
+==================================================== ===================================================
+synapse_util_metrics_block_count                     synapse_util_metrics_block_timer:count
+synapse_util_metrics_block_count                     synapse_util_metrics_block_ru_utime:count
+synapse_util_metrics_block_count                     synapse_util_metrics_block_ru_stime:count
+synapse_util_metrics_block_count                     synapse_util_metrics_block_db_txn_count:count
+synapse_util_metrics_block_count                     synapse_util_metrics_block_db_txn_duration:count
+
+synapse_util_metrics_block_time_seconds              synapse_util_metrics_block_timer:total
+synapse_util_metrics_block_ru_utime_seconds          synapse_util_metrics_block_ru_utime:total
+synapse_util_metrics_block_ru_stime_seconds          synapse_util_metrics_block_ru_stime:total
+synapse_util_metrics_block_db_txn_count              synapse_util_metrics_block_db_txn_count:total
+synapse_util_metrics_block_db_txn_duration_seconds   synapse_util_metrics_block_db_txn_duration:total
+
+synapse_http_server_response_count                   synapse_http_server_requests
+synapse_http_server_response_count                   synapse_http_server_response_time:count
+synapse_http_server_response_count                   synapse_http_server_response_ru_utime:count
+synapse_http_server_response_count                   synapse_http_server_response_ru_stime:count
+synapse_http_server_response_count                   synapse_http_server_response_db_txn_count:count
+synapse_http_server_response_count                   synapse_http_server_response_db_txn_duration:count
+
+synapse_http_server_response_time_seconds            synapse_http_server_response_time:total
+synapse_http_server_response_ru_utime_seconds        synapse_http_server_response_ru_utime:total
+synapse_http_server_response_ru_stime_seconds        synapse_http_server_response_ru_stime:total
+synapse_http_server_response_db_txn_count            synapse_http_server_response_db_txn_count:total
+synapse_http_server_response_db_txn_duration_seconds synapse_http_server_response_db_txn_duration:total
+==================================================== ===================================================
+
 
 Standard Metric Names
 ---------------------
@@ -42,7 +154,7 @@ have been changed to seconds, from miliseconds.
 
 ================================== =============================
 New name                           Old name
----------------------------------- -----------------------------
+================================== =============================
 process_cpu_user_seconds_total     process_resource_utime / 1000
 process_cpu_system_seconds_total   process_resource_stime / 1000
 process_open_fds (no 'type' label) process_fds
@@ -52,8 +164,8 @@ The python-specific counts of garbage collector performance have been renamed.
 
 =========================== ======================
 New name                    Old name
---------------------------- ----------------------
-python_gc_time              reactor_gc_time      
+=========================== ======================
+python_gc_time              reactor_gc_time
 python_gc_unreachable_total reactor_gc_unreachable
 python_gc_counts            reactor_gc_counts
 =========================== ======================
@@ -62,7 +174,7 @@ The twisted-specific reactor metrics have been renamed.
 
 ==================================== =====================
 New name                             Old name
------------------------------------- ---------------------
+==================================== =====================
 python_twisted_reactor_pending_calls reactor_pending_calls
 python_twisted_reactor_tick_time     reactor_tick_time
 ==================================== =====================
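
Note on applying the 0.27.0 renames: the old-to-new table added in this diff can be applied mechanically to existing dashboard or alert queries. The following is a minimal sketch, not part of Synapse or Prometheus. The helper name ``rename_metrics`` and the ``RENAMES`` dictionary are hypothetical, and only a few rows of the table are included for illustration. It only rewrites names and does not rescale values, so any unit change should be checked separately.

```python
import re

# Hypothetical subset of the 0.27.0 rename table (old name -> new name).
# Extend it with the remaining rows of the table as needed.
RENAMES = {
    "synapse_util_metrics_block_timer:count": "synapse_util_metrics_block_count",
    "synapse_util_metrics_block_timer:total": "synapse_util_metrics_block_time_seconds",
    "synapse_http_server_requests": "synapse_http_server_response_count",
    "synapse_http_server_response_time:count": "synapse_http_server_response_count",
    "synapse_http_server_response_time:total": "synapse_http_server_response_time_seconds",
}

# Try longer names first so a shorter name cannot match part of a longer one.
_OLD_NAMES = sorted(RENAMES, key=len, reverse=True)
_PATTERN = re.compile("|".join(re.escape(name) for name in _OLD_NAMES))


def rename_metrics(query: str) -> str:
    """Replace every old metric name found in a query string, in a single pass."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(0)], query)


if __name__ == "__main__":
    print(rename_metrics("rate(synapse_http_server_response_time:total[5m])"))
```

Because the substitution is done in one pass, a name that has already been rewritten is never matched again. Queries that mention none of the old names are returned unchanged.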