Refactor Grafana dashboard to use `server_name` label (#19337)

MadLittleMods merged 16 commits into `develop`.

Conversation
@@ -195,7 +195,7 @@
      "datasource": {
        "uid": "${DS_PROMETHEUS}"
      },
-     "expr": "sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\"}[$bucket_size])) by (le)",
+     "expr": "sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',server_name=\"$server_name\",code=~\"2..\"}[$bucket_size])) by (le)",
Updating all of the `synapse_xxx` server-level metrics to use `$server_name` instead of `$instance`.
-     index: 2
- - targets: ["my.workerserver.here:port"]
-   labels:
-     instance: "my.server"
No longer needed as the recommendation is to rely on the server_name label
This helps when scraping from the same `instance`. You can see that we already did
this kind of thing for `synapse_storage_events_stale_forward_extremities_persisted_bucket`
and `synapse_http_httppusher_http_pushes_processed_total`.
Example data this helps with:
```
synapse_server_name_to_instance_mapping{instance="host.docker.internal:33074"}
synapse_server_name_to_instance_mapping{index="2", instance="host.docker.internal:33074", job="event_persister", server_name="hs1"} 1
synapse_server_name_to_instance_mapping{index="1", instance="host.docker.internal:33074", job="federation_reader", server_name="hs1"} 1
synapse_server_name_to_instance_mapping{index="1", instance="host.docker.internal:33074", job="user_dir", server_name="hs1"} 1
synapse_server_name_to_instance_mapping{index="1", instance="host.docker.internal:33074", job="background_worker", server_name="hs1"} 1
synapse_server_name_to_instance_mapping{index="1", instance="host.docker.internal:33074", job="event_persister", server_name="hs1"} 1
synapse_server_name_to_instance_mapping{index="1", instance="host.docker.internal:33074", job="media_repository", server_name="hs1"} 1
synapse_server_name_to_instance_mapping{index="1", instance="host.docker.internal:33074", job="main", server_name="hs1"} 1
synapse_server_name_to_instance_mapping{index="1", instance="host.docker.internal:33074", job="federation_inbound", server_name="hs1"} 1
synapse_server_name_to_instance_mapping{index="1", instance="host.docker.internal:33074", job="stream_writers", server_name="hs1"} 1
synapse_server_name_to_instance_mapping{index="1", instance="host.docker.internal:33074", job="client_reader", server_name="hs1"} 1
synapse_server_name_to_instance_mapping{index="1", instance="host.docker.internal:33074", job="pusher", server_name="hs1"} 1
synapse_server_name_to_instance_mapping{index="1", instance="host.docker.internal:33074", job="event_creator", server_name="hs1"} 1
synapse_server_name_to_instance_mapping{index="1", instance="host.docker.internal:33074", job="device_lists", server_name="hs1"} 1
synapse_server_name_to_instance_mapping{index="1", instance="host.docker.internal:33074", job="appservice", server_name="hs1"} 1
synapse_server_name_to_instance_mapping{index="2", instance="host.docker.internal:33074", job="device_lists", server_name="hs1"} 1
synapse_server_name_to_instance_mapping{index="1", instance="host.docker.internal:33074", job="synchrotron", server_name="hs1"} 1
synapse_server_name_to_instance_mapping{index="1", instance="host.docker.internal:33074", job="federation_sender", server_name="hs1"} 1
```
```
process_cpu_seconds_total{instance="host.docker.internal:33074"}
process_cpu_seconds_total{index="2", instance="host.docker.internal:33074", job="event_persister"} 1.68
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="federation_reader"} 1.08
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="user_dir"} 0.97
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="background_worker"} 1.02
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="event_persister"} 1.45
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="media_repository"} 0.69
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="main"} 2.5300000000000002
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="federation_inbound"} 1.11
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="stream_writers"} 1.33
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="client_reader"} 0.94
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="pusher"} 0.67
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="event_creator"} 1.25
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="device_lists"} 0.8099999999999999
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="appservice"} 0.69
process_cpu_seconds_total{index="2", instance="host.docker.internal:33074", job="device_lists"} 0.76
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="synchrotron"} 2.21
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="federation_sender"} 1.3299999999999998
```
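To make the ambiguity concrete, here's a small stdlib-Python sketch (using a few of the `process_cpu_seconds_total` samples from the dump above) showing that grouping by `instance` alone collapses every worker into one bucket, while the `(instance, job, index)` triple identifies each worker uniquely:

```python
import re

# A few of the sample series from the dump above (Prometheus exposition format).
SAMPLES = """\
process_cpu_seconds_total{index="2", instance="host.docker.internal:33074", job="event_persister"} 1.68
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="event_persister"} 1.45
process_cpu_seconds_total{index="1", instance="host.docker.internal:33074", job="main"} 2.53
"""

LINE_RE = re.compile(r'^(?P<name>\w+)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')

def parse_labels(s: str) -> dict:
    """Parse `k="v", k2="v2"` label pairs into a dict."""
    labels = {}
    for pair in s.split(", "):
        key, _, value = pair.partition("=")
        labels[key] = value.strip('"')
    return labels

series = []
for line in SAMPLES.splitlines():
    m = LINE_RE.match(line)
    series.append((m["name"], parse_labels(m["labels"]), float(m["value"])))

# Grouping by `instance` alone cannot tell the workers apart...
by_instance = {labels["instance"] for _, labels, _ in series}
# ...but the (instance, job, index) triple identifies each worker uniquely.
by_full_key = {(labels["instance"], labels["job"], labels["index"])
               for _, labels, _ in series}

assert len(by_instance) == 1   # all three workers share one scrape target
assert len(by_full_key) == 3   # one entry per worker
```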
contrib/grafana/synapse.json
@@ -541,7 +541,7 @@
      "datasource": {
        "uid": "${DS_PROMETHEUS}"
      },
-     "expr": "rate(process_cpu_seconds_total{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])",
+     "expr": "rate(process_cpu_seconds_total{job=~\"$job\",index=~\"$index\"}[$bucket_size]) * on (instance, job, index) group_left(server_name)\nsynapse_server_name_to_instance_mapping{server_name=\"$server_name\"}",
Updating all of the process-level metrics with this pattern:
xxx * on(instance, job, index) group_left(server_name)
synapse_server_name_to_instance_mapping{server_name="$server_name"}
With `synapse_server_name_to_instance_mapping`, we look up the `instance` that the `$server_name` lives on to match against the process-level metric.
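As a rough illustration of what that PromQL join does — a plain-Python sketch of the vector-matching semantics with hypothetical sample data, not Prometheus itself:

```python
# Left-hand side: process-level samples as (labels, value) pairs.
cpu_samples = [
    ({"instance": "host.docker.internal:33074", "job": "main", "index": "1"}, 2.53),
    ({"instance": "host.docker.internal:33074", "job": "pusher", "index": "1"}, 0.67),
]

# Right-hand side: the constant-1 mapping metric; its only job is to carry
# the extra `server_name` label for each (instance, job, index) combination.
mapping = [
    ({"instance": "host.docker.internal:33074", "job": "main", "index": "1",
      "server_name": "hs1"}, 1),
    ({"instance": "host.docker.internal:33074", "job": "pusher", "index": "1",
      "server_name": "hs1"}, 1),
]

def group_left(left, right, on, extra):
    """Rough sketch of `left * on(<on>) group_left(<extra>) right`."""
    index = {tuple(labels[k] for k in on): (labels, value)
             for labels, value in right}
    out = []
    for labels, value in left:
        hit = index.get(tuple(labels[k] for k in on))
        if hit is None:
            continue  # unmatched left-hand samples drop out, as in PromQL
        rlabels, rvalue = hit
        # Copy the `extra` labels over and multiply by the (constant 1) value.
        out.append(({**labels, **{k: rlabels[k] for k in extra}}, value * rvalue))
    return out

joined = group_left(cpu_samples, mapping,
                    on=("instance", "job", "index"), extra=("server_name",))
```

Because the right-hand value is always `1`, the multiplication leaves the left-hand values untouched; the join exists purely to attach `server_name`.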
contrib/grafana/synapse.json
@@ -53,7 +53,7 @@
    "uid": "${DS_PROMETHEUS}"
  },
  "enable": true,
- "expr": "changes(process_start_time_seconds{instance=\"$instance\",job=~\"synapse\"}[$bucket_size]) * on (instance, job) group_left(version) synapse_build_info{instance=\"$instance\",job=\"synapse\"}",
+ "expr": "(\n  changes(process_start_time_seconds{job=\"synapse\"}[$bucket_size]) * on (instance, job, index) group_left(server_name)\n  synapse_server_name_to_instance_mapping{server_name=\"$server_name\"}\n) * on (instance, job, index) group_left(version) synapse_build_info{job=\"synapse\"}",
Skip reviewing the annotation query until the end (because it's the most complicated change)
It's the same basic pattern upgrade as the other process-level metrics, but with an extra `* on ...` layer.
@@ -0,0 +1 @@
+ Refactor Grafana dashboard to use `server_name` label (instead of `instance`).
But the `instance` label actually has a special meaning, and we're abusing it by using it that way:

instance: The `<host>:<port>` part of the target's URL that was scraped.
-- prometheus.io/docs/concepts/jobs_instances#automatically-generated-labels-and-time-series
Not really material to this actual PR, but just so you know: I think that's just a 'sane default', but in my experience it's quite conventional to override it when you know better. E.g. using the blackbox exporter, you typically relabel `instance` to be something more meaningful (e.g. `target_label: instance` on https://prometheus.io/docs/guides/multi-target-exporter/).
Thanks for sharing the link and context!
I think that overriding instance to mean something else is a mistake, with vhosting being the shining example of how this falls apart otherwise. I understand how it can work out fine in simple cases though. It's all conventions anyway.
The 'sane default' outlook on this makes sense, as that is what happens by default 🤔. With the way I'm thinking about this, a more specific definition might be "URL of the server process that the metrics came from". This covers all of the complicated scenarios:

- One server per process -> `instance` points to the process
- Multiple servers per process (vhosting) with one metrics endpoint for the process -> `instance` points to the whole process
- Proxying metrics from other servers -> `instance` points to where the metrics came from
There is one layer of complexity that even this metrics setup doesn't handle yet. For example, with Synapse Pro for small hosts, if we instead wanted to pack multiple workers (for the same homeserver) into the same process (instead of multiple monolith homeservers), we wouldn't have the necessary labels to differentiate them. In addition to `server_name`, we would need to label all of the server-level metrics with `worker_name` (or `job`/`index`). (I understand this use case doesn't make sense for us, but it might for another person's setup.)

I feel like a lot of docs are missing around how to handle metrics with vhosting. I would definitely prefer to lean on conventions, but there don't seem to be any.
reivilibre
left a comment
still need to read the big JSON dump :-))
    labelnames=[SERVER_NAME_LABEL],
)
"""
Maps Synapse `server_name`s to the `instance`s they're hosted on.
So if I'm getting this right, it's not really a map as such (the value is meaningless); it's just a dummy metric we can rely on to always be there that will always have `{instance, server_name}` labels.
It seems this type of metric (labels carry data for the length of the process, value is fixed at 1) conventionally has a _info suffix, e.g. see https://opentelemetry.io/docs/specs/otel/compatibility/prometheus_and_openmetrics/#info
Would it make more sense to just call this synapse_server_names_info or something of that ilk.
I suppose in a vhosting model, when virtual-hosts are deregistered, we would remove their label from here?
so if I'm getting this right, it's not really a map as such (the value is meaningless), just that it's a dummy metric we can rely on to always be there that will always have `{instance, server_name}` labels.
I think you have the correct grasp on it. The labels do allow us to map server_name -> instance though (and "mapping" is the purpose of it) 🤷
It seems this type of metric (labels carry data for the length of the process, value is fixed at 1) conventionally has a _info suffix, e.g. see https://opentelemetry.io/docs/specs/otel/compatibility/prometheus_and_openmetrics/#info
Would it make more sense to just call this `synapse_server_names_info` or something of that ilk.
Sounds like a good practice to follow 👍
I suppose in a vhosting model, when virtual-hosts are deregistered, we would remove their label from here?
Yes. My current thinking is to remove it when we hs.shutdown() but I think this is better as a follow-up where we can describe it directly and test it.
Probably the final piece for https://github.com/element-hq/synapse-small-hosts/issues/106
Would it make more sense to just call this `synapse_server_names_info` or something of that ilk.
Renamed to synapse_server_name_info 👍
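For reference, the `_info` convention discussed above (value pinned to a constant `1`, all information carried in the labels) can be sketched in plain Python — a hypothetical minimal stand-in, not Synapse's actual `prometheus_client`-based implementation:

```python
class InfoMetric:
    """Minimal sketch of an `_info`-style metric: constant value 1, data in labels."""

    def __init__(self, name, labelnames):
        self.name = name
        self.labelnames = labelnames
        self._children = {}

    def labels(self, **labels):
        key = tuple(labels[n] for n in self.labelnames)
        self._children[key] = 1  # the value never changes; only the label set matters
        return self

    def remove(self, **labels):
        # e.g. when a virtual host is deregistered, drop its label set
        self._children.pop(tuple(labels[n] for n in self.labelnames), None)

    def collect(self):
        return [
            (dict(zip(self.labelnames, key)), value)
            for key, value in sorted(self._children.items())
        ]

server_name_info = InfoMetric("synapse_server_name_info", ["server_name"])
server_name_info.labels(server_name="hs1")
server_name_info.labels(server_name="hs2")
server_name_info.remove(server_name="hs2")  # vhost deregistered
# collect() now yields only the hs1 series, with value 1
```

In the real setup, Prometheus typically attaches `instance`, `job`, and `index` as target labels at scrape time, so the exposed series itself only needs to carry `server_name`.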
reivilibre
left a comment
quite an interesting (if a little boilerplatey) approach, multiplying by some other constant-1 metric to match on more labels.
I think I'm getting it now, I'm happy with this.
I leave the decision around naming / following the convention or not up to you — I don't suspect it matters an awful lot and I don't know if this metric feels like a 'conventional _info metric' beyond matching the basic pattern, or not.
Thanks for the review @reivilibre 🐗
These are automatic changes from importing/exporting from Grafana 12.3.1. In order to verify that I'm not sneaking in any changes, you can follow these steps to get the same output.

Reproduction instructions:

1. Start [Grafana](https://hub.docker.com/r/grafana/grafana)
   ```
   docker run -d --name=grafana --add-host host.docker.internal:host-gateway -p 3000:3000 grafana/grafana
   ```
1. Visit the Grafana dashboard, http://localhost:3000/ (Credentials: `admin`/`admin`)
1. Import the Synapse dashboard: `contrib/grafana/synapse.json`
1. Export the Synapse dashboard. On the dashboard page -> **Export** -> **Export as code** -> Using the **Classic** model -> Check **Export for sharing externally** -> Copy
1. Paste into `contrib/grafana/synapse.json`
1. `git status`/`git diff` to check if there is any diff

Sanity checked the dashboard itself by importing it on https://grafana.matrix.org/ (Grafana 10.4.1 according to https://grafana.matrix.org/api/health). The process-level metrics won't work because #19337 just merged and isn't on `matrix.org` yet. Also, generally, this dashboard works for me locally with the [load-tests](element-hq/synapse-rust-apps#397) I've been doing.

### Motivation

There are a few fixes I want to make to the Grafana dashboard, and it sucks having to manually translate everything back over because we have different formatting. Hopefully after this bulk change, future exports will have exactly what we want to change.
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [element-hq/synapse](https://github.com/element-hq/synapse) | minor | `1.145.0` → `1.146.0` |

---

> ⚠️ **Warning**
>
> Some dependencies could not be looked up. Check the Dependency Dashboard for more information.

---

### Release Notes

<details>
<summary>element-hq/synapse (element-hq/synapse)</summary>

### [`v1.146.0`](https://github.com/element-hq/synapse/releases/tag/v1.146.0)

[Compare Source](element-hq/synapse@v1.145.0...v1.146.0rc1)

### Synapse 1.146.0 (2026-01-27)

No significant changes since 1.146.0rc1.

#### Deprecations and Removals

- [MSC2697](matrix-org/matrix-spec-proposals#2697) (Dehydrated devices) has been removed, as the MSC is closed. Developers should migrate to [MSC3814](matrix-org/matrix-spec-proposals#3814). ([#19346](element-hq/synapse#19346))
- Support for Ubuntu 25.04 (Plucky Puffin) has been dropped. Synapse no longer builds debian packages for Ubuntu 25.04.

### Synapse 1.146.0rc1 (2026-01-20)

#### Features

- Add a new config option [`enable_local_media_storage`](https://element-hq.github.io/synapse/latest/usage/configuration/config_documentation.html#enable_local_media_storage) which controls whether media is additionally stored locally when using configured `media_storage_providers`. Setting this to `false` allows off-site media storage without a local cache. Contributed by Patrice Brend'amour [@dr.allgood](https://github.com/dr.allgood). ([#19204](element-hq/synapse#19204))
- Stabilise support for [MSC4312](matrix-org/matrix-spec-proposals#4312): `m.oauth` User-Interactive Auth stage for resetting cross-signing identity with the OAuth 2.0 API. The old, unstable name (`org.matrix.cross_signing_reset`) is now deprecated and will be removed in a future release. ([#19273](element-hq/synapse#19273))
- Refactor Grafana dashboard to use `server_name` label (instead of `instance`). ([#19337](element-hq/synapse#19337))

#### Bugfixes

- Fix joining a restricted v12 room locally when no local room creator is present but local users with sufficient power levels are. Contributed by [@nexy7574](https://github.com/nexy7574). ([#19321](element-hq/synapse#19321))
- Fixed parallel calls to `/_matrix/media/v1/create` being ratelimited for appservices even if `rate_limited: false` was set in the registration. Contributed by [@tulir](https://github.com/tulir) @ Beeper. ([#19335](element-hq/synapse#19335))
- Fix a bug introduced in 1.61.0 where a user's membership in a room was accidentally ignored when considering access to historical state events in rooms with the "shared" history visibility. Contributed by Lukas Tautz. ([#19353](element-hq/synapse#19353))
- [MSC4140](matrix-org/matrix-spec-proposals#4140): Store the JSON content of scheduled delayed events as text instead of a byte array. This fixes the inability to schedule a delayed event with non-ASCII characters in its content. ([#19360](element-hq/synapse#19360))
- Always rollback database transactions when retrying (avoid orphaned connections). ([#19372](element-hq/synapse#19372))
- Fix `InFlightGauge` typing to allow upgrading to `prometheus_client` 0.24. ([#19379](element-hq/synapse#19379))

#### Updates to the Docker image

- Add [Prometheus HTTP service discovery](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config) endpoint for easy discovery of all workers when using the `docker/Dockerfile-workers` image (see the [*Metrics* section of our Docker testing docs](docker/README-testing.md#metrics)). ([#19336](element-hq/synapse#19336))

#### Improved Documentation

- Remove docs on legacy metric names (no longer in the codebase since 2022-12-06). ([#19341](element-hq/synapse#19341))
- Clarify how the estimated value of room complexity is calculated internally. ([#19384](element-hq/synapse#19384))

#### Internal Changes

- Add an internal `cancel_task` API to the task scheduler. ([#19310](element-hq/synapse#19310))
- Tweak docstrings and signatures of `auth_types_for_event` and `get_catchup_room_event_ids`. ([#19320](element-hq/synapse#19320))
- Replace usage of deprecated `assertEquals` with `assertEqual` in unit test code. ([#19345](element-hq/synapse#19345))
- Drop support for Ubuntu 25.04 'Plucky Puffin', add support for Ubuntu 25.10 'Questing Quokka'. ([#19348](element-hq/synapse#19348))
- Revert "Add an Admin API endpoint for listing quarantined media ([#19268](element-hq/synapse#19268))". ([#19351](element-hq/synapse#19351))
- Bump `mdbook` from 0.4.17 to 0.5.2 and remove our custom table-of-contents plugin in favour of the new default functionality. ([#19356](element-hq/synapse#19356))
- Replace deprecated usage of PyGitHub's `GitRelease.title` with `.name` in release script. ([#19358](element-hq/synapse#19358))
- Update the Element logo in Synapse's README to be an absolute URL, allowing it to render on other sites (such as PyPI). ([#19368](element-hq/synapse#19368))
- Apply minor tweaks to v1.145.0 changelog. ([#19376](element-hq/synapse#19376))
- Update Grafana dashboard syntax to use the latest from importing/exporting with Grafana 12.3.1. ([#19381](element-hq/synapse#19381))
- Warn about skipping reactor metrics when using unknown reactor type. ([#19383](element-hq/synapse#19383))
- Add support for reactor metrics with the `ProxiedReactor` used in worker Complement tests. ([#19385](element-hq/synapse#19385))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.
🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

- [ ] If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Reviewed-on: https://gitea.alexlebens.dev/alexlebens/infrastructure/pulls/3533
Co-authored-by: Renovate Bot <renovate-bot@alexlebens.net>
Co-committed-by: Renovate Bot <renovate-bot@alexlebens.net>
# Famedly Synapse Release v1.146.0_1

Depends on: famedly/complement#10

## Famedly additions for v1.146.0_1

- feat: trigger CI actions (that are triggered on PRs) in merge queue (FrenchGithubUser)

### Notes for Famedly

#### Deprecations and Removals

- matrix-org/matrix-spec-proposals#2697 (Dehydrated devices) has been removed, as the MSC is closed. Developers should migrate to matrix-org/matrix-spec-proposals#3814. (element-hq/synapse#19346)
- Support for Ubuntu 25.04 (Plucky Puffin) has been dropped. Synapse no longer builds debian packages for Ubuntu 25.04.

#### Updates to the Docker image

- Add [Prometheus HTTP service discovery](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config) endpoint for easy discovery of all workers when using the `docker/Dockerfile-workers` image (see the [Metrics section of our Docker testing docs](https://github.com/famedly/synapse/pull/docker/README-testing.md#metrics)). (element-hq/synapse#19336)

#### Features

- Add a new config option [`enable_local_media_storage`](https://element-hq.github.io/synapse/latest/usage/configuration/config_documentation.html#enable_local_media_storage) which controls whether media is additionally stored locally when using configured `media_storage_providers`. Setting this to `false` allows off-site media storage without a local cache. Contributed by Patrice Brend'amour @dr.allgood. (element-hq/synapse#19204)
- Stabilise support for matrix-org/matrix-spec-proposals#4312: `m.oauth` User-Interactive Auth stage for resetting cross-signing identity with the OAuth 2.0 API. The old, unstable name (`org.matrix.cross_signing_reset`) is now deprecated and will be removed in a future release. (element-hq/synapse#19273)
- Refactor Grafana dashboard to use `server_name` label (instead of `instance`). (element-hq/synapse#19337)
Prometheus recording rules are no longer necessary (see element-hq/synapse#19133); the `instance` label has been removed in favor of the builtin `server_name` label (element-hq/synapse#19337).

https://github.com/element-hq/synapse/blob/v1.147.1/contrib/grafana/synapse.json

Commits: https://github.com/element-hq/synapse/commits/v1.147.1/contrib/grafana/synapse.json
Refactor Grafana dashboard to use `server_name` label:

- Update the `synapse_xxx` (server-level) metrics to use `server_name="$server_name"` instead of `instance="$instance"`
- Add a `synapse_server_name_info` metric to map Synapse `server_name`s to the `instance`s they're hosted on.
- Update the process-level metrics with the pattern: `xxx * on (instance, job, index) group_left(server_name) synapse_server_name_info{server_name="$server_name"}`

All of the changes here are backwards compatible with whatever people were doing before with their Prometheus/Grafana dashboards.
Previously, the recommendation was to use the `instance` label to group everything under the same server (`synapse/docs/metrics-howto.md`, lines 93 to 147 in 803e4b4).
But the `instance` label actually has a special meaning, and we're abusing it by using it that way. Since #18592 (Synapse v1.139.0), we now have the `server_name` label to use instead.

Additionally, the assumption that a single process is serving a single server is no longer true with Synapse Pro for small hosts.
Part of https://github.com/element-hq/synapse-small-hosts/issues/106
Motivating use case
Although this change also benefits Synapse Pro for small hosts (https://github.com/element-hq/synapse-small-hosts/issues/106), it actually spawned from adding Prometheus metrics to our workerized Docker image (#19324, #19336) with a more correct label setup (without `instance`) and wanting the dashboard to be better.

Testing strategy
- Allow Docker containers to reach the host machine (`host.docker.internal`) so they can access exposed ports of other Docker containers. We want to allow Synapse to access the Prometheus container and Grafana to access the Prometheus container.
  - `sudo ufw allow in on docker0 comment "Allow traffic from the default Docker network to the host machine (host.docker.internal)"`
  - `sudo ufw allow in on br-+ comment "(from Matrix Complement testing) Allow traffic from custom Docker networks to the host machine (host.docker.internal)"`
- Build and run Synapse: `docker build -t matrixdotorg/synapse -f docker/Dockerfile .` (docs)
- Run Prometheus (`prometheus.yml`) and check that it is scraping Synapse (e.g. query `synapse_build_info`)
- Run Grafana (credentials: `admin`/`admin`), add a Prometheus data source pointing at `http://host.docker.internal:9090`, and import `contrib/grafana/synapse.json`
- To test workers, you can use the testing strategy from #19336 (assumes both changes from this PR and the other PR are combined)
Dev notes
How to stress the `deploys` annotation:

- `docker exec -it synapse /bin/bash`
- `vim /usr/local/lib/python3.13/site-packages/synapse/util/__init__.py` and edit `SYNAPSE_VERSION`
- `docker stop synapse`
- `docker start synapse`

Todo
- `server_name` variable to dashboard
- `process_metrics`
- `deploys` annotation

Pull Request Checklist