From 826e6ec3bdfca92a5815f032c3246fab9a2aef88 Mon Sep 17 00:00:00 2001 From: Jorik Schellekens Date: Mon, 22 Jul 2019 11:15:21 +0100 Subject: Opentracing Documentation (#5703) * Opentracing survival guide * Update decorator names in doc * Doc cleanup These are all alterations as a result of comments in #5703, it includes mostly typos and clarifications. The most interesting changes are: - Split developer and user docs into two sections - Add a high level description of OpenTracing * newsfile * Move contributer specific info to docstring. * Sample config. * Trailing whitespace. * Update 5703.misc * Apply suggestions from code review Mostly just rewording parts of the docs for clarity. Co-Authored-By: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> --- docs/opentracing.rst | 100 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 100 insertions(+) create mode 100644 docs/opentracing.rst (limited to 'docs/opentracing.rst') diff --git a/docs/opentracing.rst b/docs/opentracing.rst new file mode 100644 index 0000000000..b91a2208a8 --- /dev/null +++ b/docs/opentracing.rst @@ -0,0 +1,100 @@ +=========== +OpenTracing +=========== + +Background +---------- + +OpenTracing is a semi-standard being adopted by a number of distributed tracing +platforms. It is a common api for facilitating vendor-agnostic tracing +instrumentation. That is, we can use the OpenTracing api and select one of a +number of tracer implementations to do the heavy lifting in the background. +Our current selected implementation is Jaeger. + +OpenTracing is a tool which gives an insight into the causal relationship of +work done in and between servers. The servers each track events and report them +to a centralised server - in Synapse's case: Jaeger. The basic unit used to +represent events is the span. The span roughly represents a single piece of work +that was done and the time at which it occurred. A span can have child spans, +meaning that the work of the child had to be completed for the parent span to +complete, or it can have follow-on spans which represent work that is undertaken +as a result of the parent but is not depended on by the parent to in order to +finish. + +Since this is undertaken in a distributed environment a request to another +server, such as an RPC or a simple GET, can be considered a span (a unit or +work) for the local server. This causal link is what OpenTracing aims to +capture and visualise. In order to do this metadata about the local server's +span, i.e the 'span context', needs to be included with the request to the +remote. + +It is up to the remote server to decide what it does with the spans +it creates. This is called the sampling policy and it can be configured +through Jaeger's settings. + +For OpenTracing concepts see +https://opentracing.io/docs/overview/what-is-tracing/. + +For more information about Jaeger's implementation see +https://www.jaegertracing.io/docs/ + +===================== +Seting up OpenTracing +===================== + +To receive OpenTracing spans, start up a Jaeger server. This can be done +using docker like so: + +.. code-block:: bash + + docker run -d --name jaeger + -p 6831:6831/udp \ + -p 6832:6832/udp \ + -p 5778:5778 \ + -p 16686:16686 \ + -p 14268:14268 \ + jaegertracing/all-in-one:1.13 + +Latest documentation is probably at +https://www.jaegertracing.io/docs/1.13/getting-started/ + + +Enable OpenTracing in Synapse +----------------------------- + +OpenTracing is not enabled by default. It must be enabled in the homeserver +config by uncommenting the config options under ``opentracing`` as shown in +the `sample config <./sample_config.yaml>`_. For example: + +.. code-block:: yaml + + opentracing: + tracer_enabled: true + homeserver_whitelist: + - "mytrustedhomeserver.org" + - "*.myotherhomeservers.com" + +Homeserver whitelisting +----------------------- + +The homeserver whitelist is configured using regular expressions. A list of regular +expressions can be given and their union will be compared when propagating any +spans contexts to another homeserver. + +Though it's mostly safe to send and receive span contexts to and from +untrusted users since span contexts are usually opaque ids it can lead to +two problems, namely: + +- If the span context is marked as sampled by the sending homeserver the receiver will + sample it. Therefore two homeservers with wildly different sampling policies + could incur higher sampling counts than intended. +- Sending servers can attach arbitrary data to spans, known as 'baggage'. For safety this has been disabled in Synapse + but that doesn't prevent another server sending you baggage which will be logged + to OpenTracing's logs. + +================== +Configuring Jaeger +================== + +Sampling strategies can be set as in this document: +https://www.jaegertracing.io/docs/1.13/sampling/ -- cgit 1.4.1