
Logging time-stamped ping results to a file using AppleScript and bash.

I deal with a number of remote workers who, for one reason or another, don’t work in the company office. Often, they’re using a VPN tunnel to connect to a server back at the company.

Occasionally, we’ll see intermittent connectivity issues from the client. Perhaps it’s their ISP, perhaps it’s the VPN tunnel, perhaps it’s a piece of software triggering IDS on a managed firewall.

In any case, we can triangulate the problem by launching a script on the client’s side that pings endpoints of our choosing to check connectivity. But we also want to timestamp the ping results and capture them to a text file we can review later.

This is where tee is your friend. As the man entry says, tee is a “pipe fitting”.

The tee utility copies standard input to standard output, making a copy in zero or more files.
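To see the pipe fitting at work before it gets buried in AppleScript, here is a minimal sketch in bash. It stamps each line of input with the current date, then uses tee to show the result and save it at the same time. The function name stamp_lines and the temporary log file are placeholders of mine for illustration; they are not part of the script further down.

```shell
#!/bin/bash
# stamp_lines: prefix every line read from stdin with the current date.
# (Hypothetical helper name, used only for this illustration.)
stamp_lines() {
    while IFS= read -r line; do
        echo "$(date): $line"
    done
}

# Placeholder log location for the demo.
log_file="$(mktemp)"

# Two lines go to the terminal AND into the log file.
printf 'first\nsecond\n' | stamp_lines | tee "$log_file"
```

Swap the printf for a ping command and you have the core of the monitor.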

So, here are our requirements:

  1. Script is user-initiated.
  2. Script gets out of the user’s way.
  3. Script timestamps and logs the pings to a text file in a folder on the Desktop.

This AppleScript, which makes a bunch of bash calls, does all of that.

# Simple ping monitor
# A script that pings servers of your choice by IP or DNS name and logs the results to a text file in a folder on the Desktop.
#
# Written by AB @ Modest Industries (modestindustries.com)
#
# 2012-07-25 - AB: First draft.
# 2014-07-25 - AB: Formatting cleanup. 

#Servers to ping. For each server you name here, you'll need to set up a ping statement below.
set server1 to "google.com"
set server2 to "8.8.8.8"
set server3 to "yahoo.com"

property the_prefix : space

property the_sep : "-"

# Format a date to use as a datestamp.
on myDate()
    
    set myYear to "" & year of (current date)
    
    set myMth to text -2 thru -1 of ("0" & (month of (current date)) * 1)
    
    set myDay to text -2 thru -1 of ("0" & day of (current date))
    
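    # Zero-pad hours and minutes so 9:05 is stamped "0905" rather than "95".
    set myHours to text -2 thru -1 of ("0" & hours of (current date))
    
    set myMinutes to text -2 thru -1 of ("0" & minutes of (current date))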
    
    return {myYear, myMth, myDay, myHours, myMinutes}
    
end myDate

# Check for a folder called Monitoring on the Desktop. If it doesn't exist, make one.
tell application "Finder"
    set the directory to desktop
    if (exists folder "Monitoring") is false then
        make new folder at desktop with properties {name:"Monitoring"}
    end if
    
    set the_path to folder "Monitoring" of desktop
    
    set the_name to (item 1 of my myDate())
    
    set the_name to (the_name & the_sep & item 2 of my myDate())
    
    set the_name to (the_name & the_sep & item 3 of my myDate())
    
    set the_timestamp to item 4 of my myDate() & item 5 of my myDate()
    
    -- set the directory to "Monitoring"
    if (exists folder the_name of folder "Monitoring" of desktop) is false then
        make new folder at the_path with properties {name:the_name}
    end if
    
    set the_path to folder the_name of folder "Monitoring" of desktop as alias
    
    set posixPath to POSIX path of the_path
end tell

# Ping servers of your choice. You'll need one statement for each server named above.

tell application "Terminal" to do script "ping " & server1 & " | while read pong; do echo \"$(date): $pong\"; done | tee " & quoted form of posixPath & the_name & the_sep & the_timestamp & the_sep & server1 & ".txt"

tell application "Terminal" to do script "ping " & server2 & " | while read pong; do echo \"$(date): $pong\"; done | tee " & quoted form of posixPath & the_name & the_sep & the_timestamp & the_sep & server2 & ".txt"

tell application "Terminal" to do script "ping " & server3 & " | while read pong; do echo \"$(date): $pong\"; done | tee " & quoted form of posixPath & the_name & the_sep & the_timestamp & the_sep & server3 & ".txt"

# Hide all the windows.
tell application "System Events" to set visible of process "Terminal" to false

# Tell the user it's running.
display dialog "Ping monitor is running!" buttons {"OK"} default button 1

# Switch back to the Finder.
tell application "Finder" to activate

You might want to tweak the dialogue to tell the user to leave the Terminal app running.

Should this be a bash script? Probably. But this works and can be launched by the user and hides most of the gubbins so that the user can get on with their business.
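For the curious, here is roughly what a pure bash version might look like. Treat it as a sketch under assumptions, not a tested replacement: the function names and the MONITOR_BASE variable are my own inventions, and it writes straight to the log files instead of using tee, since background jobs have no window to display in. It keeps the same folder layout (Monitoring, then a dated folder, then one dated file per server).

```shell
#!/bin/bash
# Sketch of the same ping monitor in plain bash.

# Servers to ping (same examples as the AppleScript above).
servers=("google.com" "8.8.8.8" "yahoo.com")

# log_path_for SERVER: create Monitoring/YYYY-MM-DD under the base folder
# and print the log file path for that server.
# MONITOR_BASE is a hypothetical override; it defaults to the Desktop.
log_path_for() {
    local server="$1"
    local base="${MONITOR_BASE:-$HOME/Desktop}"
    local day stamp
    day="$(date +%Y-%m-%d)"
    stamp="$(date +%H%M)"
    mkdir -p "$base/Monitoring/$day"
    echo "$base/Monitoring/$day/$day-$stamp-$server.txt"
}

# monitor SERVER: ping until interrupted, timestamping each line into the log.
monitor() {
    local server="$1" log
    log="$(log_path_for "$server")"
    ping "$server" | while IFS= read -r pong; do
        echo "$(date): $pong"
    done >> "$log"
}

# start_all: launch one background monitor per server, then wait for them.
start_all() {
    local server
    for server in "${servers[@]}"; do
        monitor "$server" &
    done
    wait
}

# Uncomment to run; stop with Ctrl-C.
# start_all
```

The catch is that the user then has to open Terminal and run it, which is exactly the step the AppleScript wrapper saves them.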

Promise Pegasus2: The gap between a failing disk and a failed disk.

We were recently called in to diagnose a relatively new Promise Pegasus2 R6 that intermittently refused to mount. The Promise Utility app reported nothing amiss with the RAID or the drives, green lights everywhere, so we used the command line to dig a little deeper.

We ran a verbose SMART check on the unit:

promiseutil -C smart -v

The first three drives checked out. Drive 4 indicated that SMART thought everything was fine:

PdId: 4
Model Number: TOSHIBA DT01ACA2
Drive Type: SATA
SMART Status: Enable
SMART Health Status: OK

But a little further down, the error log showed CRC errors:

Error 165 occurred at disk power-on lifetime: 1176 hours (49 days + 0 hours)
  When the command that caused the error occurred,
  the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 50 b0 ee 81 0d

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 80 a8 80 ee 81 40 00      18:38:48.276  WRITE FPDMA QUEUED
  61 80 a0 00 ee 81 40 00      18:38:48.276  WRITE FPDMA QUEUED
  61 80 98 80 ed 81 40 00      18:38:48.276  WRITE FPDMA QUEUED
  61 80 90 00 ed 81 40 00      18:38:48.276  WRITE FPDMA QUEUED
  61 80 88 80 ec 81 40 00      18:38:48.275  WRITE FPDMA QUEUED

Error 164 occurred at disk power-on lifetime: 1175 hours (48 days + 23 hours)
  When the command that caused the error occurred,
  the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 10 f0 ad 6b 0d  Error: ICRC, ABRT 16 sectors at LBA = 0x0d6badf0 = 225160688

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  35 00 80 80 ad 6b 40 00      18:36:07.145  WRITE DMA EXT
  35 00 80 00 ae 6b 40 00      18:36:07.144  WRITE DMA EXT
  35 00 80 00 ad 6b 40 00      18:36:07.144  WRITE DMA EXT
  35 00 80 80 ab 6b 40 00      18:36:07.139  WRITE DMA EXT
  35 00 80 00 ab 6b 40 00      18:36:07.139  WRITE DMA EXT

Error 163 occurred at disk power-on lifetime: 1175 hours (48 days + 23 hours)
  When the command that caused the error occurred,
  the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 f0 10 5e 5d 0d  Error: ICRC, ABRT 240 sectors at LBA = 0x0d5d5e10 = 224222736
...

The client confirmed that he’d seen a warning light on drive 4, but that it had “gone away”. We had him back up the data immediately. Promise support subsequently verified from the logs that the drive had failed and sent out a replacement.

If the drive had failed completely, I assume the RAID would have kicked in, taken the bad drive offline and continued spinning, but since the drive hadn’t actually failed, the volume was struggling with a failing member and that was causing boot and performance issues.

The take-away is that there’s a generous gap between a drive that’s beginning to fail and a drive that’s failed enough for the Promise Utility app to detect it. Verbose mode is your friend.
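Scrolling through verbose output for every drive gets old quickly, so here is a small sketch that tallies the CRC errors per drive. It assumes the output looks like the excerpt above: a "PdId:" line introduces each drive, and logged CRC errors carry the text "Error: ICRC". The function name count_icrc is mine, and I haven't checked this against other firmware versions, so sanity-check it against the raw output before trusting a zero.

```shell
#!/bin/bash
# count_icrc: read verbose SMART output on stdin and print, for each drive,
# how many "Error: ICRC" lines appear after its "PdId:" header.
# (Hypothetical helper; assumes the output format shown in the excerpt.)
count_icrc() {
    awk '
        /^PdId:/      { id = $2; if (!(id in n)) { n[id] = 0; order[++k] = id } }
        /Error: ICRC/ { if (id != "") n[id]++ }
        END           { for (i = 1; i <= k; i++)
                            printf "PdId %s: %d ICRC errors\n", order[i], n[order[i]] }
    '
}

# Usage on the Mac attached to the Pegasus (command taken from above):
# promiseutil -C smart -v | count_icrc
```

Note that the drive's error log may only retain the most recent entries, so a small count is still worth a closer look.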