Skip to content

Fix Complement CI not running against the code from our PRs (remote images being chosen over local)#19475

Merged
MadLittleMods merged 8 commits intodevelopfrom
madlittlemods/use-local-complement-built-images
Feb 18, 2026
Merged

Fix Complement CI not running against the code from our PRs (remote images being chosen over local)#19475
MadLittleMods merged 8 commits intodevelopfrom
madlittlemods/use-local-complement-built-images

Conversation

@MadLittleMods
Copy link
Contributor

@MadLittleMods MadLittleMods commented Feb 17, 2026

Fix remote images being chosen over the local ones we just built with Complement in CI (any Docker environment using the containerd image store). This problem means that Complement jobs in CI don't actually test against the code from the PR (since 2026-02-10).

This PR approaches the problem the same way that @AndrewFerr proposed in #18210. This is better than the alternative listed below as we can just make our code compatible with whatever image store is being used.

Problem

Spawning from #19460 (comment) where we found that our Complement jobs in CI don't actually test against the code from the PR at the moment.

This is caused by a change in Docker Engine 29.0.0:

containerd image store is now the default for fresh installs. This doesn't apply to daemons configured with userns-remap (see moby#47377).

-- 29.0.0 (2025-11-10), https://docs.docker.com/engine/release-notes/29/#2900

And our ubuntu-latest GitHub runner (Current runner version: '2.331.0') points to using Docker client/server 29.1.5 🎯

This Docker version bump happened on actions/runner-images@416418d (2026-02-10) (28.0.4 -> 29.1.5). Specific PR: actions/runner-images#13633


I found this because I reviewed and remembered #18210 was a thing that @AndrewFerr ran into. And then running dockers system prune also revealed the problematic containerd in CI. Checking the Docker changelogs, I found the new default culprit and then could trace down where the GitHub runners made the dependency update.

Alternatives

Use non-published tags

We could use tags that aren't published like matrixdotorg/synapse:testing.

Disable the containerd image store

Looking at the PR's cross-linked from actions/runner-images#13633, I saw NixOS/nix#15252 which disables the containerd image store. Seems like it could work but we might as well update our code to be compatible with whatever image store is being used (especially when someone has already put in the work to figure it out).

      - name: Disable containerd image store
        run: |
          echo '{"features":{"containerd-snapshotter":false}}' | sudo tee /etc/docker/daemon.json > /dev/null
          sudo systemctl restart docker

Dev notes

I've given a heads-up to other developers about the breaking change, actions/runner-images#13633 (comment)

Todo

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct (run the linters)

@MadLittleMods MadLittleMods changed the title Namespace with local dummy registry Fix remote images being chosen over the local ones we just built with Complement Feb 17, 2026
@MadLittleMods MadLittleMods changed the title Fix remote images being chosen over the local ones we just built with Complement Fix remote images being chosen over the local ones we just built with Complement (Complement jobs in CI don't actually test against the code from the PR) Feb 17, 2026
@MadLittleMods MadLittleMods changed the title Fix remote images being chosen over the local ones we just built with Complement (Complement jobs in CI don't actually test against the code from the PR) Fix remote images being chosen over the local ones we just built with Complement in CI Feb 17, 2026
@MadLittleMods MadLittleMods changed the title Fix remote images being chosen over the local ones we just built with Complement in CI Fix Complement CI not running against the code from our PRs (remote images being chosen over local) Feb 17, 2026
That way we don't have to think about future
images being published and conflicting
)

export COMPLEMENT_BASE_IMAGE=complement-synapse
export COMPLEMENT_BASE_IMAGE="$LOCAL_IMAGE_NAMESPACE/$COMPLEMENT_SYNAPSE_IMAGE_PATH"
Copy link
Contributor Author

@MadLittleMods MadLittleMods Feb 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One difference from #18210 is that I've updated everything we build to use the local dummy registry namespace.

This way we don't have to worry about the other images being published or not at some point in the future. It's all just namespaced and pulls from the same place across the board.

@@ -0,0 +1 @@
Fix Complement CI not running against the code from our PRs.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, to detect this kind of problem, we'd have an in-repo Complement test that checks if the current git commit sha matches the one that Synapse is using for Complement (via /version endpoint). I'm tackling this separately as #19476


[guideComplementSh]: https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-integration-tests-complement

## Building and running the images manually
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this docs section doesn't seem that useful. To make this PR whole, I'm updating the docs but I think it's worth ripping this all out (just point to complement.sh and those docs) in a follow-up PR.

Copy link
Contributor Author

@MadLittleMods MadLittleMods Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related discussion below with some favor on removing this section, #19475 (comment)

@MadLittleMods MadLittleMods marked this pull request as ready for review February 18, 2026 03:24
@MadLittleMods MadLittleMods requested a review from a team as a code owner February 18, 2026 03:24
--build-arg TEST_ONLY_IGNORE_POETRY_LOCKFILE \
-f "docker/Dockerfile" .
$CONTAINER_RUNTIME build \
-t "$SYNAPSE_IMAGE_PATH" \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth just dropping this 'global' tag here and keeping only the local tag?

Since it's a build that was never published, it feels almost preferable to not give it the global tag

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated everything to just use localhost/... straight-up.

We do publish the complement-synapse image built from complement.sh but if I squint at the CI code there, I think it still works:

- name: Work out labels for complement image
id: meta
uses: docker/metadata-action@c299e40c65443455700f0fdfc63efafe5b349051 # v5.10.0
with:
images: ghcr.io/${{ github.repository }}/complement-synapse
tags: |
type=schedule,pattern=nightly,enable=${{ github.event_name == 'schedule'}}
type=raw,value=develop,enable=${{ github.event_name == 'schedule' || inputs.branch == 'develop' }}
type=raw,value=latest,enable=${{ github.event_name == 'push' || inputs.branch == 'master' }}
type=sha,format=long
- name: Run scripts-dev/complement.sh to generate complement-synapse:latest image.
run: scripts-dev/complement.sh --build-only
- name: Tag and push generated image
run: |
for TAG in ${{ join(fromJson(steps.meta.outputs.json).tags, ' ') }}; do
echo "tag and push $TAG"
docker tag complement-synapse $TAG
docker push $TAG
done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, would this need to say

docker tag localhost/complement-synapse $TAG 

for that to work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just seeing this now (not sure how I missed).

Good spot! I think the answer is yes based on these CI failures:

https://github.com/element-hq/synapse/actions/runs/22609655282/job/65509315002#step:8:39

Error response from daemon: No such image: complement-synapse:latest

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #19523

Copy link
Contributor

@reivilibre reivilibre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Some minor points but I'm not fussy about them

@MadLittleMods MadLittleMods force-pushed the madlittlemods/use-local-complement-built-images branch from 09ffece to 15b0210 Compare February 18, 2026 16:27
@MadLittleMods MadLittleMods merged commit b2778da into develop Feb 18, 2026
70 of 73 checks passed
@MadLittleMods MadLittleMods deleted the madlittlemods/use-local-complement-built-images branch February 18, 2026 17:37
@MadLittleMods
Copy link
Contributor Author

Thanks for the review @reivilibre 🦜

@MadLittleMods MadLittleMods mentioned this pull request Feb 18, 2026
3 tasks
MadLittleMods added a commit that referenced this pull request Feb 19, 2026
sandhose pushed a commit that referenced this pull request Mar 3, 2026
*This PR was originally only to enable
[MSC4222](matrix-org/matrix-spec-proposals#4222)
Complement tests (`/sync` `state_after`) but after merging the [fix
PR](#19463), we discovered
that while the tests pass locally, [fail in
CI](#19460 (comment)).
To unblock the RC, we decided to revert the fix PR (see
#19474 (comment)
for more info). To better ensure tests actually pass in CI, we're
re-introducing the fix here in the same PR that we enable the tests in.*

---

Fix `/sync` missing membership in `state_after`.

This applies to any scenario where the first membership has a different
`sender` compared to the `state_key` and then the second membership has
the same `sender`/`state_key`. Like someone inviting another person and
then them joining. Or someone being kicked and then they leave.

This bug has been present since the MSC4222 implementation was
introduced into the codebase
(#17888).

---

Fix #19455
Fix element-hq/customer-success#656

I have a feeling, this might also fix these issues (will close and see
how people report back):

Fix #18182
Fix #19478

 ### Testing strategy

Complement tests: matrix-org/complement#842

We will need #19460 to merge
in order to enable the Complement tests in Synapse but this PR should be
merged first so they pass in the first place. I've tested locally that
the Complement tests pass with this fix.




### Dev notes


[MSC4222](matrix-org/matrix-spec-proposals#4222)
has already been merged into the spec and is already part of Matrix
v1.16 but we haven't [stabilized support in Synapse
yet](#19414).

---

In the same ballpark:

 - #19455
 - #17050
 - #17430
 - #16940
 - #18182
 - #18793
 - #19478

---

Docker builds preferring remote image over the local image we just
built,
#19460 (comment)

`containerd` image store (storage driver, driver type)

-> #19475


### Todo

- [x] Wait for #19463 to merge
so the Complement tests all pass
- [x] Wait for #19475 to merge

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))

---------

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Co-authored-by: Andrew Ferrazzutti <andrewf@element.io>
alexlebens pushed a commit to alexlebens/infrastructure that referenced this pull request Mar 10, 2026
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [element-hq/synapse](https://github.com/element-hq/synapse) | minor | `v1.148.0` → `v1.149.0` |

---

### Release Notes

<details>
<summary>element-hq/synapse (element-hq/synapse)</summary>

### [`v1.149.0`](https://github.com/element-hq/synapse/releases/tag/v1.149.0)

[Compare Source](element-hq/synapse@v1.148.0...v1.149.0)

### Synapse 1.149.0 (2026-03-10)

No significant changes since 1.149.0rc1.

### Synapse 1.149.0rc1 (2026-03-03)

#### Features

- Add experimental support for [MSC4388: Secure out-of-band channel for sign in with QR](matrix-org/matrix-spec-proposals#4388). ([#&#8203;19127](element-hq/synapse#19127))
- Add stable support for [MSC4380](matrix-org/matrix-spec-proposals#4380) invite blocking. ([#&#8203;19431](element-hq/synapse#19431))

#### Bugfixes

- Fix the 'Login as a user' Admin API not checking if the user exists before issuing an access token. ([#&#8203;18518](element-hq/synapse#18518))
- Fix `/sync` missing membership event in `state_after` (experimental [MSC4222](matrix-org/matrix-spec-proposals#4222) implementation) in some scenarios. ([#&#8203;19460](element-hq/synapse#19460))

#### Internal Changes

- Add log to explain when and why we freeze objects in the garbage collector. ([#&#8203;19440](element-hq/synapse#19440))
- Better instrument `JoinRoomAliasServlet` with tracing. ([#&#8203;19461](element-hq/synapse#19461))
- Fix Complement CI not running against the code from our PRs. ([#&#8203;19475](element-hq/synapse#19475))
- Log `docker system info` in CI so we have a plain record of how GitHub runners evolve over time. ([#&#8203;19480](element-hq/synapse#19480))
- Rename the `test_disconnect` test helper so that pytest doesn't see it as a test. ([#&#8203;19486](element-hq/synapse#19486))
- Add a log line when we delete devices. Contributed by [@&#8203;bradtgmurray](https://github.com/bradtgmurray) @&#8203; Beeper. ([#&#8203;19496](element-hq/synapse#19496))
- Pre-allocate the buffer based on the expected `Content-Length` with the Rust HTTP client. ([#&#8203;19498](element-hq/synapse#19498))
- Cancel long-running sync requests if the client has gone away. ([#&#8203;19499](element-hq/synapse#19499))
- Try and reduce reactor tick times when under heavy load. ([#&#8203;19507](element-hq/synapse#19507))
- Simplify Rust HTTP client response streaming and limiting. ([#&#8203;19510](element-hq/synapse#19510))
- Replace deprecated collection import locations with current locations. ([#&#8203;19515](element-hq/synapse#19515))
- Bump most locked Python dependencies to their latest versions. ([#&#8203;19519](element-hq/synapse#19519))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My41OS4yIiwidXBkYXRlZEluVmVyIjoiNDMuNTkuMiIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOlsiaW1hZ2UiXX0=-->

Reviewed-on: https://gitea.alexlebens.dev/alexlebens/infrastructure/pulls/4580
Co-authored-by: Renovate Bot <renovate-bot@alexlebens.net>
Co-committed-by: Renovate Bot <renovate-bot@alexlebens.net>
MadLittleMods added a commit that referenced this pull request Mar 11, 2026
…it checkout (#19476)

This way we actually detect problems like
#19475 as they happen instead
of being invisible until something breaks.

Sanity check that Complement is testing against your code changes
(whether it be local or from the PR in CI).

```
COMPLEMENT_DIR=../complement ./scripts-dev/complement.sh --in-repo -run TestSynapseVersion
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants