Today I decided to install the smartmontools package on one of my Solaris 10 file servers. This toolset lets an administrator use the self-monitoring features built into S.M.A.R.T.-capable hard drives. My goal is to configure a storage server to run the smartd daemon and email me when a disk starts throwing errors. With luck, this will let me preemptively replace disks before a failure occurs.
Before we begin, I have to give credit to "Matty" for both of these posts: Blog O' Matty #1 and Blog O' Matty #2. Without them, I would probably still be trying to figure this out.
Here are the refined steps I used to set this up on one of my Solaris 10 storage systems. Enjoy.
Installing Smartmon on Solaris 10
Download the smartmontools source from SourceForge, then unpack, configure, and build it:
# wget http://downloads.sourceforge.net/smartmontools/smartmontools-5.38.tar.gz
# gunzip -c smartmontools-5.38.tar.gz | tar xvf -
# cd smartmontools-5.38
# ./configure --sbindir=/usr/sbin \
--sysconfdir=/etc \
--mandir=/usr/share/man \
--with-docdir=/usr/share/doc/smartmontools-5.38 \
--with-initscriptdir=/etc/init.d
# make
# su
# make install
Create three service scripts in /usr/local/bin: smartd.start, smartd.stop, and smartd.restart. Make each one executable (chmod 755), since SMF runs them directly as methods:
#!/bin/sh
/etc/init.d/smartd start
#!/bin/sh
/etc/init.d/smartd stop
#!/bin/sh
/etc/init.d/smartd restart
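The three wrappers differ only in the action they pass to the init script, so they can be generated in a loop. This is a minimal sketch; make_smartd_wrappers is a hypothetical helper name of my own, and the target directory defaults to /usr/local/bin as used above.

```shell
#!/bin/sh
# Sketch: create smartd.start, smartd.stop and smartd.restart in one pass.
# make_smartd_wrappers is a hypothetical helper, not part of smartmontools.
make_smartd_wrappers() {
    dir=${1:-/usr/local/bin}    # directory that will hold the wrappers
    for action in start stop restart; do
        script="$dir/smartd.$action"
        # Each wrapper only hands off to the init script from "make install".
        printf '#!/bin/sh\n/etc/init.d/smartd %s\n' "$action" > "$script" || return 1
        # SMF executes the method directly, so it has to be executable.
        chmod 755 "$script" || return 1
    done
}
```

Calling make_smartd_wrappers with no argument writes to /usr/local/bin; passing a scratch directory first is a safe way to inspect the result.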
Now create a new XML manifest file called "smartd.xml":
<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<service_bundle type='manifest' name='smartd'>
<service
name="application/smartd"
type="service"
version="1">
<create_default_instance enabled="true"/>
<exec_method
type='method'
name='start'
exec='/usr/local/bin/smartd.start'
timeout_seconds='3'>
</exec_method>
<exec_method
type='method'
name='stop'
exec='/usr/local/bin/smartd.stop'
timeout_seconds='3'>
</exec_method>
<exec_method
type='method'
name='restart'
exec='/usr/local/bin/smartd.restart'
timeout_seconds='3'>
</exec_method>
</service>
</service_bundle>
Save the file and test it with svccfg:
# svccfg validate smartd.xml
# echo $?
0
If you get the unhelpful error "svccfg: couldn't parse document", use xmllint to find the offending portion:
# xmllint -valid smartd.xml
Correct any errors and revalidate with svccfg.
Now import the new manifest:
# svccfg import smartd.xml
List the properties of the new service for verification:
# svccfg -s application/smartd listprop
Edit /etc/smartd.conf to your liking, so that it will run whatever tests you require for your environment. For my purposes, I simply added a line for every disk in the server:
/dev/rdsk/c1t0d0 -d scsi -o on -a
/dev/rdsk/c1t1d0 -d scsi -o on -a
/dev/rdsk/c1t2d0 -d scsi -o on -a
/dev/rdsk/c1t3d0 -d scsi -o on -a
/dev/rdsk/c3t0d0 -d scsi -S on -o on -a
/dev/rdsk/c3t1d0 -d scsi -S on -o on -a
/dev/rdsk/c3t2d0 -d scsi -S on -o on -a
/dev/rdsk/c3t3d0 -d scsi -S on -o on -a
/dev/rdsk/c3t4d0 -d scsi -S on -o on -a
/dev/rdsk/c3t5d0 -d scsi -S on -o on -a
/dev/rdsk/c3t8d0 -d scsi -S on -o on -a
/dev/rdsk/c3t9d0 -d scsi -S on -o on -a
/dev/rdsk/c3t10d0 -d scsi -S on -o on -a
/dev/rdsk/c3t11d0 -d scsi -S on -o on -a
/dev/rdsk/c3t12d0 -d scsi -S on -o on -a
/dev/rdsk/c3t13d0 -d scsi -S on -o on -a
/dev/rdsk/c3t14d0 -d scsi -S on -o on -a
/dev/rdsk/c3t15d0 -d scsi -S on -o on -a
/dev/rdsk/c5t0d0 -d scsi -S on -o on -a
/dev/rdsk/c5t1d0 -d scsi -S on -o on -a
/dev/rdsk/c5t2d0 -d scsi -S on -o on -a
/dev/rdsk/c5t3d0 -d scsi -S on -o on -a
/dev/rdsk/c5t4d0 -d scsi -S on -o on -a
/dev/rdsk/c5t5d0 -d scsi -S on -o on -a
/dev/rdsk/c5t8d0 -d scsi -S on -o on -a
/dev/rdsk/c5t9d0 -d scsi -S on -o on -a
/dev/rdsk/c5t10d0 -d scsi -S on -o on -a
/dev/rdsk/c5t11d0 -d scsi -S on -o on -a
/dev/rdsk/c5t12d0 -d scsi -S on -o on -a
/dev/rdsk/c5t13d0 -d scsi -S on -o on -a
/dev/rdsk/c5t14d0 -d scsi -S on -o on -a
/dev/rdsk/c5t15d0 -d scsi -S on -o on -a
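With this many disks, typing the list by hand invites mistakes. Here is a rough sketch of generating the same lines from a controller name and a list of target numbers; gen_smartd_conf is a hypothetical helper of my own, and it assumes every disk is LUN 0 (the "d0" suffix), as in the list above.

```shell
#!/bin/sh
# Sketch: print smartd.conf entries for one controller.
# gen_smartd_conf is a hypothetical helper, not part of smartmontools.
# Usage: gen_smartd_conf CONTROLLER "DIRECTIVES" TARGET...
gen_smartd_conf() {
    ctrl=$1          # controller name, e.g. c3
    directives=$2    # smartd directives appended to every line
    shift 2
    for target in "$@"; do
        # Assumes LUN 0 on every target, hence the fixed "d0".
        printf '/dev/rdsk/%st%sd0 %s\n' "$ctrl" "$target" "$directives"
    done
}

# Example calls matching the first two groups above:
#   gen_smartd_conf c1 "-d scsi -o on -a" 0 1 2 3
#   gen_smartd_conf c3 "-d scsi -S on -o on -a" 0 1 2 3 4 5 8 9 10 11 12 13 14 15
```

The output goes to standard output, so it can be reviewed before being appended to /etc/smartd.conf.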
Enable the new service and verify that it's running as expected:
# svcadm enable application/smartd
# ps -elf |grep smartd
Originally, I was going to use the "-m" directive in smartd.conf to send email alerts, but I found that smartd works quite well with syslog. Since I already have a centralized syslog server, I'll just add a swatch rule to watch for smartd entries and send email alerts from there.
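For anyone without swatch, the same idea can be approximated with a plain filter. This is only a sketch, not the swatch setup described above: smartd_lines is a hypothetical helper of my own, and it assumes the log lines carry the conventional syslog "smartd[pid]:" tag.

```shell
#!/bin/sh
# Sketch: print only the syslog lines written by smartd.
# smartd_lines is a hypothetical helper; it reads log text on standard input
# and matches the conventional "program[pid]:" syslog tag.
smartd_lines() {
    grep 'smartd\[[0-9]*\]:'
}
```

On the syslog server, this could be fed from the messages file and piped to a mailer from cron; the log path and recipient address depend on your environment.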
Thursday, February 12, 2009