Promise Pegasus2: Scripting a disk check with promise_disk_check.sh

When you deploy a Promise Pegasus2, you want to run regular disk health checks and send an email notification if there’s a problem. The Promise Utility app can theoretically do this* when there’s someone logged in at the console, but we’re rarely running these in environments where there’s anyone logged at the console.

The solution is to script a check of the disks using the promiseutil command line utility and then create a cronjob to run it at regular intervals.

Here’s an example disk check that parses the output of phydrv, logs each run to system.log and can optionally send email if a problem is found.

#!/bin/bash
#
# promise_disk_check.sh
#
# Checks the phydrv status of a Promise Pegasus, logs and mails the output if there's an issue.
#
# Author: A @ Modest Industries
# Last update: 2014-07-19
# 2014-07-19 - tweaked grep to allow for Media Patrol
#
# Works with Promise Utility for Pegasus2 v3.18.0000.18 (http://www.promise.com)
# Requires sendemail for email alerts (http://caspian.dotconf.net/menu/Software/SendEmail/)

export DATESTAMP=`date +%Y-%m-%d\ %H:%M:%S`

# Editable variables

# Path to sendemail
sendemail_path="/Library/Scripts/Monitoring/sendemail"
# Email alert?
send_email_alert=true

# Variables for sendemail
# Sender's address
alert_sender="[email protected]"

# Recipient's addresses, comma separated.
#alert_recipient='[email protected], [email protected]'
alert_recipient="[email protected]"

# SMTP server to send the messages through
# alert_smtp_server="smtp.example.com:port"
alert_smtp_server="smtp.example.com"

# Subject line of the alert.
alert_subject="Alert: Promise disk problem detected on $HOSTNAME."

# Header line at the top of the alert message 
alert_header="At $DATESTAMP, a problem was detected on this device:\n"

# Pass / Fail messages
pass_msg="Promise disk check successful."
fail_msg=" *** Promise disk check FAILED!!! ***"

# ------------ Do not edit below this line ------------------
# Variables
pass=true
results=""

# Create temp files
unit_ID_tmp=`mktemp "/tmp/$$_ID.XXXX"`
results_tmp=`mktemp "/tmp/$$_results.XXXX"`

# Get header information for this Promise unit. Includes workaround for promiseutil tty issue.
screen -D -m sh -c "promiseutil -C subsys -v >$tmpdir$unit_ID_tmp"
unit_ID=$(<$tmpdir$unit_ID_tmp)

# Get status of the disks.  Includes workaround for promiseutil tty issue.
screen -D -m sh -c "promiseutil -C phydrv >$tmpdir$results_tmp"

# Check each line of the output the test results.
while read -r line
do
        if grep '^[0-9]' <<< "$line" | grep -Eqv 'OK|Media'
        then
                results=$results"BAD DRIVE DETECTED: $line\n\n"
                pass=false
        fi
done < $tmpdir$results_tmp

# Log the results, conditionally send email on failure.
if [ "$pass" = false ] ; then
        results="$alert_header$unit_ID\n\n$results\n$alert_footer"
        echo "$DATESTAMP: $fail_msg\n\n$results" >> /var/log/system.log
        if [ "$send_email_alert" = true ] ; then
                "$sendemail_path" -f $alert_sen:der -t $alert_recipient -u $alert_subject -m "$results" -s $alert_smtp_server
        fi
else
        echo "$DATESTAMP: $pass_msg" >> /var/log/system.log
fi

# Cleanup
rm -f $tmpdir$unit_ID_tmp $tmpdir$results_tmp

Note that the script uses sendemail for sending mail, a very useful little drop in for when the local machine isn’t running mail services.

*I say “theoretically” because configuring email in the Promise Utility is a mess and I’ve yet to see a single successful notification after configuring it.

2014-05-04 – Updated to make the path to sendemail a variable.

2014-07-19 – Changed grep to handle false positive during Media Patrol runs

Leave a Reply