
Kernel config changes and resource usage monitoring#1318

Merged
troglobit merged 2 commits into main from rusage on Dec 12, 2025

Conversation

@troglobit
Contributor

Description

This PR enables panic-on-oops, panic on soft and hard lockups, panic on hung tasks, and extra workqueue watchdog reporting in the kernel. A latent stall now triggers a visible panic instead of a silent freeze, which should make issues like the recent resource-pressure lockup easier to diagnose.
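
One way to check on a running target that these panic settings are in effect is to read the corresponding sysctl files. The sketch below assumes the standard Linux knob names under /proc/sys/kernel (panic_on_oops, softlockup_panic, hardlockup_panic, hung_task_panic). The description above does not list the exact config symbols changed, so the mapping is an assumption, and the helper name read_panic_knobs is hypothetical.

```python
from pathlib import Path

# Assumed runtime knobs matching the features named in the description.
# These are standard Linux sysctl names, not taken from the PR diff.
PANIC_KNOBS = (
    "panic_on_oops",
    "softlockup_panic",
    "hardlockup_panic",
    "hung_task_panic",
)


def read_panic_knobs(base="/proc/sys/kernel"):
    """Return {knob: int or None}; None means the knob is absent or unreadable."""
    result = {}
    for name in PANIC_KNOBS:
        try:
            result[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            # The knob may not exist if the matching detector is not built in.
            result[name] = None
    return result


if __name__ == "__main__":
    for knob, value in read_panic_knobs().items():
        state = "missing" if value is None else ("on" if value else "off")
        print(f"{knob}: {state}")
```

A knob reported as missing usually means the matching detector is not compiled into the kernel, which is worth telling apart from a knob that exists but is set to 0.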

Additionally, watchdogd now logs disk, memory, and file descriptor (filenr) usage every hour:

Dec  8 15:22:44 ix-00-00-00 watchdogd[2599]: Memory usage: 195036 kB, cached: 69740 kB, total: 423628 kB
Dec  8 15:22:44 ix-00-00-00 watchdogd[2599]: File system /var usage: blocks 4710/52564 inodes 80/65456
Dec  8 15:22:44 ix-00-00-00 watchdogd[2599]: File descriptor usage: 640/34603
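
For trend analysis, the hourly lines can be turned into percentages. This is a minimal sketch that assumes the three message formats shown in the sample above and assumes the first figure on each line is the amount in use; it is not part of watchdogd, and the names parse_usage and pct are hypothetical.

```python
import re

# Patterns mirror the three message formats in the sample log above.
MEM_RE = re.compile(r"Memory usage: (\d+) kB, cached: (\d+) kB, total: (\d+) kB")
FS_RE = re.compile(r"File system (\S+) usage: blocks (\d+)/(\d+) inodes (\d+)/(\d+)")
FD_RE = re.compile(r"File descriptor usage: (\d+)/(\d+)")


def pct(used, total):
    """Percentage of total in use; 0.0 when total is zero."""
    return 100.0 * used / total if total else 0.0


def parse_usage(line):
    """Return (kind, percentages) for a usage log line, or None if no match."""
    m = MEM_RE.search(line)
    if m:
        used, cached, total = map(int, m.groups())
        return "memory", {"used": pct(used, total), "cached": pct(cached, total)}
    m = FS_RE.search(line)
    if m:
        b_used, b_total, i_used, i_total = map(int, m.groups()[1:])
        return "fs " + m.group(1), {
            "blocks": pct(b_used, b_total),
            "inodes": pct(i_used, i_total),
        }
    m = FD_RE.search(line)
    if m:
        used, total = map(int, m.groups())
        return "fd", {"used": pct(used, total)}
    return None
```

Applied to the sample, this gives memory at roughly 46% of total, /var blocks at roughly 9%, and file descriptors at under 2%, so a steady climb in any of the three would stand out well before the limit is reached.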

Checklist

Tick relevant boxes, this PR is-a or has-a:

  • Bugfix
    • Regression tests
    • ChangeLog updates (for next release)
  • Feature
    • YANG model change => revision updated?
    • Regression tests added?
    • ChangeLog updates (for next release)
    • Documentation added?
  • Test changes
    • Checked in changed Readme.adoc (make test-spec)
    • Added new test to group Readme.adoc and yaml file
  • Code style update (formatting, renaming)
  • Refactoring (please detail in commit messages)
  • Build related changes
  • Documentation content changes
    • ChangeLog updated (for major changes)
  • Other (please describe): observability, error detection

Dec  8 15:22:44 ix-00-00-00 watchdogd[2599]: Memory usage: 195036 kB, cached: 69740 kB, total: 423628 kB
Dec  8 15:22:44 ix-00-00-00 watchdogd[2599]: File system /var usage: blocks 4710/52564 inodes 80/65456
Dec  8 15:22:44 ix-00-00-00 watchdogd[2599]: File descriptor usage: 640/34603

Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Turn on panic-on-oops, soft/hard lockup panic, hung-task panic, and
extra workqueue watchdog reporting. This makes latent stalls visible
instead of silently freezing, improving diagnosis of issues like the
recent resource-pressure lockup.

Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
@troglobit troglobit requested a review from wkz December 12, 2025 09:51
Contributor

@wkz wkz left a comment

Super! 🚀

@troglobit troglobit merged commit 231bf1e into main Dec 12, 2025
7 checks passed
@troglobit troglobit deleted the rusage branch December 12, 2025 10:55
@troglobit troglobit mentioned this pull request Dec 18, 2025
17 tasks