Linux Server SCSI / SATA Hard Disk Failure check
I/O errors in /var/log/messages indicate that something is wrong with the hard disk and that it may be failing. You can check a hard disk for errors using the smartctl command, a control and monitoring utility for SMART disks under Linux / UNIX-like operating systems.
smartctl for servers
smartctl is a command line utility designed to perform SMART tasks such as printing the SMART self-test and error logs, enabling and disabling SMART automatic testing, and initiating device self-tests. First, make sure S.M.A.R.T. support is enabled in the BIOS. Next, run the following command to see whether your hard disk supports S.M.A.R.T. technology:
# smartctl -i /dev/sdb
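If you want to check the result from a script, something along these lines may help. It is only a sketch: smart_support_state is a made-up helper name, and it assumes the -i report contains "SMART support is: ..." lines, which can differ between drive types and smartmontools versions.

```shell
#!/bin/sh
# smart_support_state: hypothetical helper, not part of smartmontools.
# Reads `smartctl -i` output on stdin and prints enabled, disabled or unknown.
# Assumption: the report contains lines starting with "SMART support is:".
smart_support_state() {
    awk '
        /^SMART support is: *Enabled/  { state = "enabled" }
        /^SMART support is: *Disabled/ { state = "disabled" }
        END { print (state == "" ? "unknown" : state) }
    '
}

# Usage (needs root and smartmontools installed):
#   smartctl -i /dev/sdb | smart_support_state
```

A result of unknown means the expected line was not found, for example when the device has no SMART capability at all.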
To enable SMART, run:
# smartctl -s on -d ata /dev/sdb
Sample outputs:
smartctl version 5.33 [x86_64-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Enabled.
To print the drive's overall-health self-assessment result, enter:
# smartctl -d ata -H /dev/sdb
Sample outputs:
smartctl version 5.33 [x86_64-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
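For a cron job or monitoring check you may want to turn that last line into a yes/no answer. A minimal sketch follows, assuming the ATA-style result line shown above (SCSI/SAS disks word their health line differently, so this would need adjusting there). Both function names are hypothetical.

```shell
#!/bin/sh
# smart_health_result: hypothetical helper that reads `smartctl -H` output on
# stdin and prints only the result word (for example PASSED).
# Assumption: the ATA-style "SMART overall-health self-assessment test result:" line.
smart_health_result() {
    sed -n 's/^SMART overall-health self-assessment test result: *//p'
}

# disk_is_healthy: succeeds (exit status 0) only when the result is exactly PASSED.
disk_is_healthy() {
    [ "$(smart_health_result)" = "PASSED" ]
}

# Usage (needs root and smartmontools installed):
#   smartctl -d ata -H /dev/sdb | disk_is_healthy || echo "check /dev/sdb"
```

Note that a missing result line is treated the same as a failure, which seems the safer default for an alerting script.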
Sample detailed report for a failing hard disk
# smartctl -a /dev/sda
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Please note the following marginal Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
190 Airflow_Temperature_Cel 0x0022   044   033   045    Old_age   Always   FAILING_NOW 56 (96 110 58 25)
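Notice that the overall result is still PASSED even though one attribute is marked FAILING_NOW, so it can be worth scanning the attribute table itself. The sketch below assumes the ten-column layout shown in the sample, where WHEN_FAILED is the ninth column and healthy attributes show "-" there. failed_attributes is a made-up name, not a smartmontools command.

```shell
#!/bin/sh
# failed_attributes: hypothetical helper that reads smartctl attribute output
# on stdin and prints "ID NAME WHEN_FAILED" for every attribute whose
# WHEN_FAILED column is not "-".
# Assumptions: column 1 is the numeric ID, column 3 is the hex FLAG (0x....),
# column 9 is WHEN_FAILED. Each attribute ID is reported once, because
# `smartctl -a` can list a marginal attribute twice.
failed_attributes() {
    awk '$1 ~ /^[0-9]+$/ && $3 ~ /^0x/ && NF >= 10 && $9 != "-" && !seen[$1]++ {
        print $1, $2, $9
    }'
}

# Usage (needs root and smartmontools installed):
#   smartctl -A /dev/sda | failed_attributes
```

Fed the sample report above, this would print the Airflow_Temperature_Cel line; empty output means no attribute is currently flagged.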
The following command provides even more information about the failing hard disk:
# smartctl --attributes --log=selftest /dev/sda
You can read all available SMART data from the hard disk by typing the following command:
# smartctl -d ata -a /dev/sdb
A note about RAID controller
To look at ATA disks behind 3ware SCSI RAID controllers, the syntax is:
# smartctl -a -d 3ware,2 /dev/sda
# smartctl -a -d 3ware,0 /dev/twe0
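The number after "3ware," selects the disk (port) on the controller, so checking a whole array means repeating the command once per port. Here is a rough sketch that just prints the commands so you can review them first. threeware_commands is a hypothetical name, and the port count is something you have to supply yourself.

```shell
#!/bin/sh
# threeware_commands: hypothetical dry-run helper. Prints one smartctl health
# command per port (0 .. count-1) for disks behind a 3ware controller.
# Arguments: $1 = controller device (e.g. /dev/twe0), $2 = number of ports.
# It only prints the commands; it does not run smartctl.
threeware_commands() {
    dev=$1
    ports=$2
    port=0
    while [ "$port" -lt "$ports" ]; do
        printf 'smartctl -H -d 3ware,%s %s\n' "$port" "$dev"
        port=$((port + 1))
    done
}

# Usage: review the list, then pipe it to a shell as root if it looks right:
#   threeware_commands /dev/twe0 4
#   threeware_commands /dev/twe0 4 | sh
```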
SATA Health Check Disk Syntax
# smartctl -d sat --all /dev/sgX
# smartctl -d sat --all /dev/sg1
Check the health status:
# smartctl -d sat --all /dev/sg1 -H
For SAS disks, use the following syntax:
# smartctl -d scsi --all /dev/sgX
# smartctl -d scsi --all /dev/sg1
# smartctl -d scsi --all /dev/sg1 -H
Configure SMARTD
Red Hat Linux
- Install smartd (it ships in the smartmontools package): # yum install smartmontools
- Enable smartd monitoring by editing the /etc/smartd.conf file.
- smartd configuration file: /etc/smartd.conf
- Start/stop smartd: /etc/init.d/smartd start | stop
Example
The first item below is a directive for the smartd configuration file; the other two are commands you run from the shell:
(a) Send an email to alert@nixcraft.in for /dev/sdb:
/dev/sdb -m alert@nixcraft.in
(b) Read error log:
# smartctl -l error /dev/hdb
(c) Testing hard disk (short or long test):
# smartctl -t short /dev/hdb
# smartctl -t long /dev/hdb
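Instead of starting the tests from (c) by hand, smartd can schedule them through the -s directive, and that can sit on the same line as the email directive from (a). The sketch below only builds such a line. smartd_directive is a made-up helper name, and the schedule (short test daily between 2 and 3 am, long test on Saturdays between 3 and 4 am) is just an example to adapt.

```shell
#!/bin/sh
# smartd_directive: hypothetical helper that prints one /etc/smartd.conf line.
# Arguments: $1 = device, $2 = email address for warnings.
#   -a  monitor all SMART properties
#   -m  mail warnings to the given address
#   -s  self-test schedule as a T/MM/DD/d/HH regular expression:
#       S/../.././02 = short test every day in the 02:00 hour
#       L/../../6/03 = long test every Saturday in the 03:00 hour
smartd_directive() {
    dev=$1
    addr=$2
    printf '%s -a -m %s -s (S/../.././02|L/../../6/03)\n' "$dev" "$addr"
}

# Usage: print the line, check it, then add it to /etc/smartd.conf as root
# and restart smartd:
#   smartd_directive /dev/sdb alert@nixcraft.in
```

Printing the line rather than editing the file directly keeps the helper harmless to run and leaves the final change to you.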
Source : http://sourceforge.net/apps/trac/smartmontools/wiki