Skip to content

Fix disk resource job to fetch SMART status (BugFix)#940

Merged
kissiel merged 6 commits intomainfrom
fix_block_device_smart
Jan 17, 2024
Merged

Fix disk resource job to fetch SMART status (BugFix)#940
kissiel merged 6 commits intomainfrom
fix_block_device_smart

Conversation

@Hook25
Copy link
Copy Markdown
Collaborator

@Hook25 Hook25 commented Jan 16, 2024

Description

Currently the disk_block_device_resource script doesn't correctly fetch the SMART support status of some devices. The reason is that some devices (namely, nvme drives, although more may exist) fail to be reported as supporting SMART by smartctl even though they do. This is due to the fact that this support is not reported in smartctl -i but it is there in smartctl -x, although in a slightly different format.

This changes the command that the resource job runs. This also includes some new unit tests for this resource and a small refactoring.

Resolved issues

Fixes: #102

Documentation

N/A

Tests

I have an nvme drive that was wrongly reported by this resource as not supporting SMART. After these changes it correctly reports the SMART status.

@codecov
Copy link
Copy Markdown

codecov bot commented Jan 16, 2024

Codecov Report

Attention: 4 lines in your changes are missing coverage. Please review.

Comparison is base (228b716) 38.98% compared to head (ded7389) 39.18%.
Report is 2 commits behind head on main.

Files Patch % Lines
providers/resource/bin/block_device_resource.py 93.10% 3 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #940      +/-   ##
==========================================
+ Coverage   38.98%   39.18%   +0.19%     
==========================================
  Files         315      315              
  Lines       34893    34899       +6     
  Branches     5971     5972       +1     
==========================================
+ Hits        13604    13676      +72     
+ Misses      20678    20610      -68     
- Partials      611      613       +2     
Flag Coverage Δ
provider-resource 20.18% <93.10%> (+8.17%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Copy Markdown
Contributor

@kissiel kissiel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version on main doesn't crash on problems with smartctl. With this patch it crashes on my system.

@kissiel
Copy link
Copy Markdown
Contributor

kissiel commented Jan 16, 2024

The "crash-y" setup I've got:

➜  bin git:(main) ✗ ll /sys/block/*/device            

lrwxrwxrwx 1 root root 0 sty 16 18:36 /sys/block/nvme0n1/device -> ../../nvme0
lrwxrwxrwx 1 root root 0 sty 16 18:36 /sys/block/nvme1n1/device -> ../../nvme1
lrwxrwxrwx 1 root root 0 sty 16 18:36 /sys/block/nvme2c2n1/device -> ../../nvme2
lrwxrwxrwx 1 root root 0 sty 16 18:36 /sys/block/nvme2n1/device -> ../../nvme-subsys2
lrwxrwxrwx 1 root root 0 sty 16 18:36 /sys/block/nvme3n1/device -> ../../nvme3
lrwxrwxrwx 1 root root 0 sty 16 18:36 /sys/block/sda/device -> ../../../5:0:0:0
lrwxrwxrwx 1 root root 0 sty 16 18:36 /sys/block/sdb/device -> ../../../8:0:0:0
lrwxrwxrwx 1 root root 0 sty 16 21:43 /sys/block/sdc/device -> ../../../9:0:0:0

The nvme2c2n1 doesn't exist at /dev/ with the same exact name.

The c2 part is thrown in for ssd with multipath enabled, but the same thing is not added in /dev/
More context is in the nvme_alloc_ns function in the drivers/nvme/host/core.c

@Hook25 Hook25 requested a review from kissiel January 17, 2024 09:52
Copy link
Copy Markdown
Contributor

@kissiel kissiel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TYVM for the improvements!
Works better now!

@kissiel kissiel merged commit d90cf60 into main Jan 17, 2024
@kissiel kissiel deleted the fix_block_device_smart branch January 17, 2024 10:56
LiaoU3 pushed a commit to LiaoU3/checkbox that referenced this pull request Mar 20, 2024
* Black block_device_resource.py

* Add new validator and change flags to smartctl

* Small refactor, remove some dead code

* Add unittests for block_device_resource.py

* Remove warning inhibitor

* Handle errors and test the failure
binli pushed a commit to binli/checkbox that referenced this pull request Mar 22, 2024
* Black block_device_resource.py

* Add new validator and change flags to smartctl

* Small refactor, remove some dead code

* Add unittests for block_device_resource.py

* Remove warning inhibitor

* Handle errors and test the failure
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

LP1816445: SMART resource should use -x instead of -i for NVME

2 participants