[issue-5397] [BE] Make ClickHouse health check timeout configurable #5414

Merged
dsblank merged 6 commits into main from ollie/issue-5397-custom-healthcheck-timeout-110903
Apr 7, 2026

@dsblank
Contributor

@dsblank dsblank commented Feb 26, 2026

Details

The ClickHouse health check had a hardcoded 1-second timeout, causing pod restarts in environments where the round-trip to ClickHouse over HTTPS exceeds that threshold.

This PR introduces a new environment variable ANALYTICS_DB_HEALTH_CHECK_TIMEOUT_SECONDS (default: 1) so operators can tune the timeout without code changes.

Changes:

  • DatabaseAnalyticsFactory.java: Added healthCheckTimeoutSeconds field (default 1) mapped from config
  • config.yml: Added healthCheckTimeoutSeconds: ${ANALYTICS_DB_HEALTH_CHECK_TIMEOUT_SECONDS:-1} under databaseAnalytics
  • DatabaseAnalyticsModule.java: Exposes the value as a @Named Guice binding
  • ClickHouseHealthyCheck.java: Injects the configured timeout and uses it in place of the hardcoded Duration.ofSeconds(1)
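
The idea behind these changes can be sketched in plain Java. This is only an illustration under stated assumptions: `TimedHealthCheck` and its `probe` supplier are hypothetical stand-ins, not the PR's `ClickHouseHealthyCheck`, and the sketch leaves out the Dropwizard and Guice wiring. It shows a health check whose timeout is passed in from configuration instead of being fixed at `Duration.ofSeconds(1)`.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical stand-in for a health check; not the PR's actual class.
final class TimedHealthCheck {
    private final Duration timeout;        // configured value instead of a hardcoded 1 second
    private final Supplier<Boolean> probe; // stand-in for the real database round-trip

    TimedHealthCheck(Duration timeout, Supplier<Boolean> probe) {
        if (timeout.isZero() || timeout.isNegative()) {
            throw new IllegalArgumentException("timeout must be positive");
        }
        this.timeout = timeout;
        this.probe = probe;
    }

    /** Returns true only if the probe reports success within the configured timeout. */
    boolean isHealthy() {
        CompletableFuture<Boolean> result = CompletableFuture.supplyAsync(probe);
        try {
            return Boolean.TRUE.equals(result.get(timeout.toNanos(), TimeUnit.NANOSECONDS));
        } catch (TimeoutException e) {
            // Marks the future as cancelled; it does not interrupt the running probe.
            result.cancel(true);
            return false;
        } catch (ExecutionException e) {
            return false; // the probe itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Simulates a backend whose round-trip takes about 300 ms.
        Supplier<Boolean> slowProbe = () -> { sleep(300); return true; };
        System.out.println(new TimedHealthCheck(Duration.ofMillis(50), slowProbe).isHealthy());  // false
        System.out.println(new TimedHealthCheck(Duration.ofSeconds(5), slowProbe).isHealthy());  // true
    }
}
```

With a timeout shorter than the round-trip the check reports unhealthy even though the backend is reachable, which is the failure mode described above; raising the timeout lets the same probe pass.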

Change checklist

  • User facing
  • Documentation update

Issues

Testing

To verify that the default behaviour is unchanged, start the backend normally; the health check continues to use a 1-second timeout.

To test a custom timeout, set the env var before starting the backend:

export ANALYTICS_DB_HEALTH_CHECK_TIMEOUT_SECONDS=10

Then check the /health-check endpoint; the ClickHouse check should now allow up to 10 seconds before reporting unhealthy.
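
To make the override behaviour concrete, here is a minimal sketch of how a whole-seconds variable with a default of 1 could be resolved. It is an assumption-laden illustration: `HealthCheckTimeoutEnv` and `resolveSeconds` are hypothetical names, the backend itself resolves the value through its config.yml substitution, and treating a blank value as "unset" is a choice made here, not something the PR states.

```java
import java.util.Map;

// Hypothetical helper; the real backend resolves this value via config.yml substitution.
final class HealthCheckTimeoutEnv {
    static final String VAR = "ANALYTICS_DB_HEALTH_CHECK_TIMEOUT_SECONDS";
    static final int DEFAULT_SECONDS = 1;

    /** Returns the configured timeout in seconds, or the default when the variable is missing or blank. */
    static int resolveSeconds(Map<String, String> env) {
        String raw = env.get(VAR);
        if (raw == null || raw.isBlank()) {
            return DEFAULT_SECONDS; // assumption: blank is treated like unset
        }
        int seconds;
        try {
            seconds = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(VAR + " must be a whole number of seconds, got: " + raw, e);
        }
        if (seconds <= 0) {
            throw new IllegalArgumentException(VAR + " must be positive, got: " + seconds);
        }
        return seconds;
    }

    public static void main(String[] args) {
        // Reads the real process environment; prints 1 when the variable is not set.
        System.out.println(resolveSeconds(System.getenv()));
    }
}
```

Note that a later commit in this thread renames the variable to ANALYTICS_DB_HEALTH_CHECK_TIMEOUT and switches to a duration string, so the integer parsing above applies only to the original proposal.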

Documentation

The new environment variable should be added to the deployment / environment variable reference docs:

| Variable | Default | Description |
| --- | --- | --- |
| ANALYTICS_DB_HEALTH_CHECK_TIMEOUT_SECONDS | 1 | Timeout in seconds for the ClickHouse readiness health check query |

@github-actions github-actions bot added the java Pull requests that update Java code label Feb 26, 2026
@github-actions
Contributor

Backend Tests Results

  435 files    435 suites   1h 3m 22s ⏱️
6 854 tests 6 841 ✅ 13 💤 0 ❌
6 746 runs  6 733 ✅ 13 💤 0 ❌

Results for commit f9440e1.

@github-actions
Contributor

Backend Tests - Unit Tests

1 433 tests   1 431 ✅  55s ⏱️
  170 suites      2 💤
  170 files        0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

Backend Tests - Integration Group 7

244 tests   244 ✅  2m 1s ⏱️
 24 suites    0 💤
 24 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

Backend Tests - Integration Group 14

185 tests   185 ✅  1m 40s ⏱️
 19 suites    0 💤
 19 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

Backend Tests - Integration Group 5

113 tests   113 ✅  2m 11s ⏱️
 22 suites    0 💤
 22 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

Backend Tests - Integration Group 11

159 tests   157 ✅  2m 35s ⏱️
 21 suites    2 💤
 21 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

Backend Tests - Integration Group 12

186 tests   185 ✅  3m 22s ⏱️
 32 suites    1 💤
 32 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

github-actions bot commented Mar 13, 2026

Backend Tests - Integration Group 15

273 tests   273 ✅  4m 43s ⏱️
 19 suites    0 💤
 19 files      0 ❌

Results for commit a29249d.

♻️ This comment has been updated with latest results.

@github-actions
Contributor

github-actions bot commented Mar 13, 2026

Backend Tests - Integration Group 16

227 tests   225 ✅  5m 23s ⏱️
 15 suites    2 💤
 15 files      0 ❌

Results for commit 3ccc041.

♻️ This comment has been updated with latest results.

@github-actions
Contributor

Backend Tests - Integration Group 8

288 tests   288 ✅  4m 1s ⏱️
 24 suites    0 💤
 24 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

Backend Tests - Integration Group 6

1 129 tests   1 129 ✅  6m 40s ⏱️
    7 suites      0 💤
    7 files        0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

Backend Tests - Integration Group 4

    5 files      5 suites   3m 5s ⏱️
1 361 tests 1 361 ✅ 0 💤 0 ❌
1 272 runs  1 272 ✅ 0 💤 0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

github-actions bot commented Mar 13, 2026

Backend Tests - Integration Group 1

413 tests   413 ✅  13m 53s ⏱️
 23 suites    0 💤
 23 files      0 ❌

Results for commit a29249d.

♻️ This comment has been updated with latest results.

@github-actions
Contributor

Backend Tests - Integration Group 9

326 tests   322 ✅  8m 40s ⏱️
 34 suites    4 💤
 34 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

Backend Tests - Integration Group 3

307 tests   307 ✅  9m 32s ⏱️
 28 suites    0 💤
 28 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

Backend Tests - Integration Group 10

 22 files   22 suites   6m 29s ⏱️
220 tests 218 ✅ 2 💤 0 ❌
182 runs  180 ✅ 2 💤 0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

Backend Tests - Integration Group 13

430 tests   428 ✅  7m 53s ⏱️
 12 suites    2 💤
 12 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

Backend Tests - Integration Group 2

256 tests   256 ✅  17m 38s ⏱️
 19 suites    0 💤
 19 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

TS SDK E2E Tests - Node 22

236 tests   234 ✅  19m 7s ⏱️
 25 suites    2 💤
  1 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

TS SDK E2E Tests - Node 20

236 tests   234 ✅  19m 24s ⏱️
 25 suites    2 💤
  1 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

TS SDK E2E Tests - Node 18

236 tests   234 ✅  22m 21s ⏱️
 25 suites    2 💤
  1 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

Python SDK E2E Tests Results (Python 3.13)

243 tests   241 ✅  8m 53s ⏱️
  1 suites    2 💤
  1 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

Python SDK E2E Tests Results (Python 3.10)

243 tests   241 ✅  9m 15s ⏱️
  1 suites    2 💤
  1 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

Python SDK E2E Tests Results (Python 3.12)

243 tests   241 ✅  8m 38s ⏱️
  1 suites    2 💤
  1 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

Python SDK E2E Tests Results (Python 3.14)

243 tests   241 ✅  8m 42s ⏱️
  1 suites    2 💤
  1 files      0 ❌

Results for commit 781f4f4.

@github-actions
Contributor

Python SDK E2E Tests Results (Python 3.11)

243 tests   241 ✅  9m 10s ⏱️
  1 suites    2 💤
  1 files      0 ❌

Results for commit 781f4f4.

@dsblank dsblank marked this pull request as ready for review March 13, 2026 13:34
@dsblank dsblank requested a review from a team as a code owner March 13, 2026 13:34
Member

@andrescrz andrescrz left a comment

LGTM, all optional comments, except the missing new param in the test config file. Please add that before moving forward.

@ew0s

ew0s commented Apr 2, 2026

Hi!

Any updates on this PR?

We are waiting so much 🙏

@dsblank dsblank force-pushed the ollie/issue-5397-custom-healthcheck-timeout-110903 branch from 781f4f4 to 1d6c12b Compare April 2, 2026 12:11
@github-actions github-actions bot added the tests Including test files, or tests related like configuration. label Apr 2, 2026
Douglas Blank and others added 4 commits April 2, 2026 09:38
…able

Add ANALYTICS_DB_HEALTH_CHECK_TIMEOUT_SECONDS env var (default: 1) to
allow overriding the hardcoded 1-second timeout in ClickHouseHealthyCheck.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…heckTimeout

- Change healthCheckTimeout field type from int to Dropwizard Duration in
  DatabaseAnalyticsFactory, allowing flexible config values (e.g. 1s, 500ms)
- Rename @Named binding to snake_case "health_check_timeout"
- Update config.yml and config-test.yml to use Duration string format
- Env var renamed from ANALYTICS_DB_HEALTH_CHECK_TIMEOUT_SECONDS to
  ANALYTICS_DB_HEALTH_CHECK_TIMEOUT

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
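
The commit above moves from a whole number of seconds to duration strings such as 1s or 500ms. As a rough sketch of what that buys, the parser below turns a few such strings into `java.time.Duration` values. It is a simplified stand-in written for illustration: `SimpleDurationParser` is a hypothetical name, it is not Dropwizard's `Duration` type, and it accepts only the `ms`, `s` and `m` suffixes.

```java
import java.time.Duration;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a tiny subset of what a real config duration type would accept.
final class SimpleDurationParser {
    // A non-negative integer, optional whitespace, then a unit suffix.
    private static final Pattern FORMAT = Pattern.compile("(\\d+)\\s*(ms|s|m)");

    static Duration parse(String text) {
        Matcher matcher = FORMAT.matcher(text.trim());
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Unsupported duration: " + text);
        }
        long count = Long.parseLong(matcher.group(1));
        switch (matcher.group(2)) {
            case "ms":
                return Duration.ofMillis(count);
            case "s":
                return Duration.ofSeconds(count);
            default: // "m", the only remaining alternative in FORMAT
                return Duration.ofMinutes(count);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("1s").toMillis());    // 1000
        System.out.println(parse("500ms").toMillis()); // 500
    }
}
```

Sub-second values such as 500ms cannot be expressed with an integer number of seconds, which is the flexibility the commit message refers to.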
@dsblank dsblank force-pushed the ollie/issue-5397-custom-healthcheck-timeout-110903 branch from 63e64f5 to e035e44 Compare April 2, 2026 13:38
@dsblank
Contributor Author

dsblank commented Apr 2, 2026

@andrescrz I think your review comments have been addressed.

andrescrz
andrescrz previously approved these changes Apr 7, 2026
Member

@andrescrz andrescrz left a comment

LGTM, just a couple of minor comments to follow-up. The one about the injector naming is highly recommended.

…nversion

- Rename @Named binding from "health_check_timeout" to "clickhouse_health_check_timeout" for clarity (per andrescrz)
- Replace manual Duration.ofMillis(toMilliseconds()) with toJavaDuration() (per andrescrz)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
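
One general reason to prefer a direct conversion over a detour through milliseconds is precision. The sketch below uses only the JDK and assumes a configuration duration modelled as a count plus a `TimeUnit`; `DurationConversion`, `viaMillis` and `direct` are hypothetical names, and the code does not use Dropwizard's `Duration` class or claim to reproduce its internals.

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;

// Compares two ways of turning (count, unit) into a java.time.Duration.
final class DurationConversion {
    /** Converts through whole milliseconds, truncating anything finer. */
    static Duration viaMillis(long count, TimeUnit unit) {
        return Duration.ofMillis(unit.toMillis(count));
    }

    /** Converts using the unit itself, so sub-millisecond values survive. */
    static Duration direct(long count, TimeUnit unit) {
        return Duration.of(count, unit.toChronoUnit());
    }

    public static void main(String[] args) {
        // 500 microseconds is truncated to zero by the millisecond detour.
        System.out.println(viaMillis(500, TimeUnit.MICROSECONDS).isZero()); // true
        System.out.println(direct(500, TimeUnit.MICROSECONDS).toNanos());   // 500000
        // For whole seconds both routes agree.
        System.out.println(viaMillis(2, TimeUnit.SECONDS).equals(direct(2, TimeUnit.SECONDS))); // true
    }
}
```

For the timeouts discussed in this PR (seconds or hundreds of milliseconds) both routes give the same value, so the change is mainly about readability.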
@dsblank dsblank merged commit 35545ee into main Apr 7, 2026
76 checks passed
@dsblank dsblank deleted the ollie/issue-5397-custom-healthcheck-timeout-110903 branch April 7, 2026 17:16
Labels

  • Backend
  • java: Pull requests that update Java code
  • Made by Ollie 🦉
  • tests: Including test files, or tests related like configuration.

Development

Successfully merging this pull request may close these issues.

[FR]: Support custom clickhouse health check timeout

3 participants