HD, Network, FC – Checking Error Counts

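The test environment is Ubuntu 14.04; a few of the transcripts below come from a host running a 3.10.0-514.el7 kernel.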

This post shows how to check whether error counts have accumulated on three common I/O interfaces: HD (hard disks), Network, and FC (Fibre Channel).

HD (SATA, SAS)

A drive's built-in S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology) support lets you read that drive's error counters. On a SAS drive, smartctl -l error prints an error counter log with these columns: Errors Corrected by ECC (fast | delayed), Errors Corrected by rereads/rewrites, Total errors corrected, Correction algorithm invocations, and Total uncorrected errors.

root@ubuntu:~# smartctl -l error /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-24-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0     123263.011           0
write:         0        0         0         0          0       5218.671           0
verify:        0        0         0         0          0         24.721           0

Non-medium error count:      148
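
The column that matters most in the log above is the last one, Total uncorrected errors. As a minimal sketch, assuming the row layout shown above (rows starting with read:, write:, or verify:, with the uncorrected count as the last field), the hypothetical helper below pulls that column out and returns a non-zero status when any row is above zero. The function name is illustrative and is not part of smartmontools.

```shell
# Read a SCSI error counter log on stdin, print "<operation> <uncorrected>"
# for each data row, and return 1 if any row reports uncorrected errors.
check_error_counter_log() {
    awk '
        BEGIN { bad = 0 }
        # Data rows start with read:, write: or verify:; the last field
        # is the "Total uncorrected errors" column.
        $1 ~ /^(read|write|verify):$/ {
            op = $1
            sub(/:$/, "", op)
            print op, $NF
            if (($NF + 0) > 0) bad = 1
        }
        END { exit bad }
    '
}
```

It could be used as: smartctl -l error /dev/sdb | check_error_counter_log
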

On a SATA drive, the vendor-specific attribute table is available with -A:

[root@localhost ~]# smartctl -A /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.10.0-514.el7.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
 
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   050    Pre-fail  Always       -       13
  5 Reallocated_Sector_Ct   0x0032   100   100   001    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   001    Old_age   Always       -       1586
 12 Power_Cycle_Count       0x0032   100   100   001    Old_age   Always       -       554
170 Unknown_Attribute       0x0033   100   100   010    Pre-fail  Always       -       44
171 Unknown_Attribute       0x0032   100   100   001    Old_age   Always       -       0
172 Unknown_Attribute       0x0032   100   100   001    Old_age   Always       -       0
173 Unknown_Attribute       0x0033   100   100   000    Pre-fail  Always       -       39
174 Unknown_Attribute       0x0032   100   100   001    Old_age   Always       -       466
184 End-to-End_Error        0x0033   100   100   050    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   001    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   001    Old_age   Always       -       1
194 Temperature_Celsius     0x0022   078   065   000    Old_age   Always       -       22 (Min/Max 14/35)
195 Hardware_ECC_Recovered  0x003a   100   100   001    Old_age   Always       -       1908
197 Current_Pending_Sector  0x0032   100   100   001    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   001    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   100   100   001    Old_age   Always       -       0
202 Unknown_SSD_Attribute   0x0018   100   100   001    Old_age   Offline      -       0
206 Unknown_SSD_Attribute   0x000e   100   100   001    Old_age   Always       -       0
247 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       928166108
248 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       21320209
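
The attribute table is long, and the raw values of some attributes (Raw_Read_Error_Rate and Hardware_ECC_Recovered, for example) are vendor-specific and often non-zero on healthy drives. As a rough sketch, the hypothetical helper below looks only at a short list of attribute IDs (5, 187, 197, 198, 199) and prints those whose RAW_VALUE is above zero. The ID list is an illustrative choice, not a rule from smartmontools, and the code assumes RAW_VALUE is the tenth field as in the table above.

```shell
# Read `smartctl -A` output on stdin and print "ID NAME RAW_VALUE" for
# watched attributes with a non-zero raw value. Returns 1 if any is found.
report_nonzero_attrs() {
    awk '
        BEGIN {
            # Watched attribute IDs (an assumption made for this sketch).
            n = split("5 187 197 198 199", ids, " ")
            for (i = 1; i <= n; i++) watch[ids[i]] = 1
            found = 0
        }
        # Attribute rows start with a numeric ID; RAW_VALUE is field 10.
        ($1 ~ /^[0-9]+$/) && ($1 in watch) && (($10 + 0) > 0) {
            print $1, $2, $10
            found = 1
        }
        END { exit found }
    '
}
```

It could be used as: smartctl -A /dev/sdb | report_nonzero_attrs
For the table shown above it prints nothing, because all five watched attributes have a raw value of 0.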

For more on using smartctl, see

HD (NVMe)

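Can smartctl also read data from NVMe storage devices? The smartmontools site documents NVMe support at https://www.smartmontools.org/wiki/NVMe_Support, but smartctl reports comparatively little for NVMe devices, and the site also suggests using the nvme command (provided by the nvme-cli package); see http://benjr.tw/98887.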

[root@localhost ~]# smartctl -A /dev/nvme0
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-3.10.0-514.el7.x86_64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
 
=== START OF SMART DATA SECTION ===
Read NVMe SMART/Health Information failed: NVMe Status 0x04
[root@localhost ~]# nvme error-log /dev/nvme0
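
Besides error-log, nvme-cli also has a smart-log subcommand whose output includes the media_errors and num_err_log_entries fields. As a sketch only, assuming the "name : value" line layout that nvme smart-log prints (which can differ between nvme-cli versions, and may include thousands separators), the hypothetical helper below extracts those two fields. The function name is illustrative.

```shell
# Read `nvme smart-log` output on stdin and print the two error-related
# fields as name=value, with spaces and thousands separators removed.
nvme_error_fields() {
    awk -F: '
        {
            key = $1
            val = $2
            gsub(/[ \t]/, "", key)
            gsub(/[ \t,]/, "", val)
        }
        key == "media_errors" || key == "num_err_log_entries" {
            print key "=" val
        }
    '
}
```

It could be used as: nvme smart-log /dev/nvme0 | nvme_error_fields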

Network

Adding the -s (statistics) option to the ip command shows per-interface statistics: bytes, packets, errors, dropped, overrun, and mcast for RX (receive), and bytes, packets, errors, dropped, carrier, and collsns for TX (transmit).

root@ubuntu:~# ip -s link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    RX: bytes  packets  errors  dropped overrun mcast
    0          0        0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    0          0        0       0       0       0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
    link/ether 08:00:27:cb:a9:8b brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast
    2756198    8601     0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    682123     7551     0       0       0       0

The error statistics can also be read directly from the network device's directory under /sys/class/net.

root@ubuntu:~# ls /sys/class/net/eth0/statistics/
collisions           rx_dropped           rx_missed_errors     tx_carrier_errors    tx_heartbeat_errors
multicast            rx_errors            rx_over_errors       tx_compressed        tx_packets
rx_bytes             rx_fifo_errors       rx_packets           tx_dropped           tx_window_errors
rx_compressed        rx_frame_errors      tx_aborted_errors    tx_errors            
rx_crc_errors        rx_length_errors     tx_bytes             tx_fifo_errors 
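
Each file in that directory holds a single decimal counter, so a short loop is enough to show only the ones that have moved. A minimal sketch follows, assuming that files ending in _errors or _dropped, plus collisions, are the counters of interest. The function name is hypothetical, and the directory is passed as an argument so the same helper works for any interface.

```shell
# Print name=value for every non-zero error, drop, or collision counter.
# $1 is a statistics directory, e.g. /sys/class/net/eth0/statistics
nonzero_net_errors() {
    for f in "$1"/*_errors "$1"/*_dropped "$1"/collisions; do
        # Skip patterns that matched nothing and files we cannot read.
        [ -r "$f" ] || continue
        v=$(cat "$f")
        if [ "$v" -gt 0 ] 2>/dev/null; then
            echo "${f##*/}=$v"
        fi
    done
    return 0
}
```

It could be used as: nonzero_net_errors /sys/class/net/eth0/statistics
No output means that all of the selected counters are zero.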

For other network error checks, see http://benjr.tw/94371

FC

The error statistics for an FC host can be read directly under /sys/class/fc_host.

root@ubuntu:~# ls /sys/class/fc_host/host2/statistics/
dumped_frames                fcp_output_requests          loss_of_sync_count           seconds_since_last_reset
error_frames                 invalid_crc_count            nos_count                    tx_frames
fcp_control_requests         invalid_tx_word_count        prim_seq_protocol_err_count  tx_words
fcp_input_megabytes          link_failure_count           reset_statistics             
fcp_input_requests           lip_count                    rx_frames                    
fcp_output_megabytes         loss_of_signal_count         rx_words                   
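
The error-related files in that listing can be read the same way as the network counters. The sketch below rests on two assumptions that are worth verifying on your own HBA and kernel: that the values are printed in hexadecimal (for example 0x0), and that an all-ones value (0xffffffffffffffff) means the driver does not maintain that counter. The function name and the choice of counters are illustrative.

```shell
# Print name=value (decimal) for the error-related FC host counters.
# $1 is a statistics directory, e.g. /sys/class/fc_host/host2/statistics
fc_error_counters() {
    for name in error_frames dumped_frames invalid_crc_count \
                invalid_tx_word_count link_failure_count \
                loss_of_signal_count loss_of_sync_count \
                prim_seq_protocol_err_count; do
        f="$1/$name"
        [ -r "$f" ] || continue
        raw=$(cat "$f")
        # Assumption: all-ones marks a counter the driver does not support.
        if [ "$raw" = "0xffffffffffffffff" ]; then
            continue
        fi
        # printf converts the 0x-prefixed value to decimal.
        echo "$name=$(printf '%d' "$raw")"
    done
    return 0
}
```

It could be used as: fc_error_counters /sys/class/fc_host/host2/statistics
The host number (host2 here) depends on the system.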
