Error: Failed to get SMART data #1

@Jay2k1

Hi,

I am trying to deploy the plugin to a host and test it there by running the Python plugin file manually. I get:

<<<oposs_smart_error:sep(124)>>>
/dev/sdc|ERROR|Failed to get SMART data

The host has four disks (sda through sdd): two SATA SSDs and two SATA HDDs. All disks are connected to the Supermicro mainboard (no RAID controller, no HBA).

I looked through the code and ran the commands manually. I noticed that the plugin accepts smartctl exit codes of 0, 1, 2 and 4. However, running smartctl against sdc exits with 64. This is the output:

# smartctl --info --log=error /dev/sdc; echo $?
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.8.12-14-pve] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Hitachi/HGST Ultrastar 7K4000
Device Model:     HGST HUS724040ALA640
Serial Number:    redacted
LU WWN Device Id: redacted
Firmware Version: MFAOAA70
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database 7.3/5319
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Sep  4 11:32:57 2025 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
ATA Error Count: 1
	CR = Command Register [HEX]
	FR = Features Register [HEX]
	SC = Sector Count Register [HEX]
	SN = Sector Number Register [HEX]
	CL = Cylinder Low Register [HEX]
	CH = Cylinder High Register [HEX]
	DH = Device/Head Register [HEX]
	DC = Device Command Register [HEX]
	ER = Error register [HEX]
	ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1 occurred at disk power-on lifetime: 34583 hours (1440 days + 23 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 01 a3 af 3c 0c  Error: ICRC, ABRT at LBA = 0x0c3cafa3 = 205303715

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 80 28 24 af 3c 40 00      06:38:30.297  WRITE FPDMA QUEUED
  61 80 38 a4 ae 3c 40 00      06:38:30.297  WRITE FPDMA QUEUED
  61 80 20 24 ae 3c 40 00      06:38:30.297  WRITE FPDMA QUEUED
  61 80 18 a4 ad 3c 40 00      06:38:30.296  WRITE FPDMA QUEUED
  61 80 10 24 ad 3c 40 00      06:38:30.296  WRITE FPDMA QUEUED

64

So I thought I'd add 64 to the "allowed exit codes", but then the check just doesn't output anything for my disks.
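
For context, the smartctl(8) man page documents the exit status as a bitmask rather than a list of discrete codes. The value 64 is bit 6, which means the device error log contains error records, and that matches the "ATA Error Count: 1" in the output above. Below is a minimal sketch of decoding the status in Python. The names `SMARTCTL_EXIT_BITS`, `decode_smartctl_status`, `FATAL_MASK` and `is_fatal` are hypothetical and not taken from the plugin, and treating only bits 0 and 1 as fatal is an assumed policy, not necessarily what the plugin intends.

```python
# Bit meanings paraphrased from the "EXIT STATUS" section of smartctl(8).
SMARTCTL_EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed or device is in a low-power mode",
    2: "a SMART or ATA command failed, or a checksum error occurred",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes are at or below threshold",
    5: "attributes were at or below threshold at some time in the past",
    6: "device error log contains records of errors",
    7: "device self-test log contains records of errors",
}

# Assumption: only bits 0 and 1 mean that no usable output was produced.
FATAL_MASK = 0b00000011


def decode_smartctl_status(status: int) -> list[str]:
    """Return the description of every bit set in a smartctl exit status."""
    return [
        description
        for bit, description in SMARTCTL_EXIT_BITS.items()
        if status & (1 << bit)
    ]


def is_fatal(status: int) -> bool:
    """True if the status has any bit from the (assumed) fatal mask set."""
    return bool(status & FATAL_MASK)


if __name__ == "__main__":
    for line in decode_smartctl_status(64):
        print(line)
```

With a mask-based check like this, a status of 64 (or a combination such as 64 + 128) would still let the output be parsed, whereas comparing against a fixed list of values rejects any combination that is not listed explicitly.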

Did I miss something?
