I just installed a new server from HP, a ProLiant DL180 G6. Here are some notes about the setup.

To check the hardware status you need to install the ProLiant Support Pack (PSP). On Debian/Ubuntu, add the HP PSP mirror to your sources.list with a line like:

deb http://downloads.linux.hp.com/SDR/downloads/proliantsupportpack/Debian stable current/non-free

After an aptitude update you’ll find some new packages. I recommend installing hpacucli to talk to your RAID controllers and hp-health to query the rest of your hardware.

With hpacucli you can ask the RAID controllers for some information:

usr@srv % hpacucli ctrl all show status

Smart Array P123 in Slot 1
   Controller Status: OK
   Cache Status: OK
   Battery/Capacitor Status: OK

usr@srv % hpacucli ctrl slot=1 show config

Smart Array P123 in Slot 1                (sn: SOMESN  )

   array A (SAS, Unused Space: 0 MB)


      logicaldrive 1 (99.99 GB, RAID 1, OK)

      physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 99 GB, OK)
      physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 99 GB, OK)
      [...]

   array B (SAS, Unused Space: 0 MB)


      logicaldrive 2 (99.99 TB, RAID 5, OK)

      physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SAS, 99 TB, OK)
      physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SAS, 99 TB, OK)
      [...]

   Expander 250 (WWID: SOMESN, Port: 1I, Box: 1)

   Enclosure SEP (Vendor ID HP, Model SOMEMD) 248 (WWID: SOMESN, Port: 1I, Box: 1)

   SEP (Vendor ID SOMEVNDR, Model  SOMEMD) 249 (WWID: SOMESN)

That gives you a good idea of your storage.
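If you want to process this in a script, the status output is easy to parse. Here is a minimal Python sketch, assuming the layout of the `ctrl all show status` output shown above (an unindented controller heading followed by indented `Name: value` lines) and assuming that any value other than OK needs attention. The function names are made up for illustration.

```python
SAMPLE = """\
Smart Array P123 in Slot 1
   Controller Status: OK
   Cache Status: OK
   Battery/Capacitor Status: OK
"""


def parse_controller_status(text):
    """Map each controller heading to a dict of its 'Name: value' lines."""
    controllers = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if not line[0].isspace():
            # Unindented line: treated as a controller heading (assumption)
            current = line.strip()
            controllers[current] = {}
        elif current is not None and ":" in line:
            key, _, value = line.strip().partition(":")
            controllers[current][key.strip()] = value.strip()
    return controllers


def failing_components(controllers):
    """List (controller, field, value) for every value that is not 'OK'."""
    return [
        (ctrl, key, value)
        for ctrl, fields in controllers.items()
        for key, value in fields.items()
        if value != "OK"  # assumption: only "OK" counts as healthy
    ]


if __name__ == "__main__":
    print(failing_components(parse_controller_status(SAMPLE)))
```

An empty list from `failing_components` means every reported status was OK.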

The hp-health package comes with a tool called hpasmcli. It’s used to query the state of all the hardware:

usr@srv % hpasmcli -s "SHOW"

Invalid Arguments
         SHOW ASR
         SHOW DIMM
         SHOW FANS
         SHOW HT
         SHOW NAME
         SHOW PORTMAP
         SHOW POWERMETER
         SHOW POWERSUPPLY
         SHOW SEL
         SHOW SERVER
         SHOW TEMP
         SHOW TPM
         SHOW UID

usr@srv % hpasmcli -s "SHOW POWERSUPPLY"

Power supply #1
        Present  : Yes
        Redundant: Yes
        Condition: Ok
        Hotplug  : Not supported
Power supply #2
        Present  : Yes
        Redundant: Yes
        Condition: Ok
        Hotplug  : Not supported
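To feed this into a script, you can call hpasmcli from Python and split the output into one record per power supply. This is a rough sketch, assuming hpasmcli is on the PATH and may be run by the calling user, and that the output keeps the layout shown above. `run_hpasmcli`, `parse_power_supplies` and `is_healthy` are hypothetical helper names.

```python
import re
import subprocess

SAMPLE = """\
Power supply #1
        Present  : Yes
        Redundant: Yes
        Condition: Ok
        Hotplug  : Not supported
Power supply #2
        Present  : Yes
        Redundant: Yes
        Condition: Ok
        Hotplug  : Not supported
"""


def run_hpasmcli(command):
    """Run `hpasmcli -s "<command>"` and return its raw text output."""
    result = subprocess.run(
        ["hpasmcli", "-s", command],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_power_supplies(text):
    """Return {supply number: {field: value}} from SHOW POWERSUPPLY output."""
    supplies = {}
    current = None
    for line in text.splitlines():
        heading = re.match(r"\s*Power supply #(\d+)\s*$", line)
        if heading:
            current = int(heading.group(1))
            supplies[current] = {}
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            supplies[current][key.strip()] = value.strip()
    return supplies


def is_healthy(fields):
    # Assumption: "Present: Yes" plus "Condition: Ok" means the supply is fine
    return fields.get("Present") == "Yes" and fields.get("Condition") == "Ok"


if __name__ == "__main__":
    for number, fields in parse_power_supplies(SAMPLE).items():
        print(number, "healthy" if is_healthy(fields) else "check me")
```

On a real machine you would pass `run_hpasmcli("SHOW POWERSUPPLY")` to `parse_power_supplies` instead of the sample text.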

Both tools are very easy to use and give a great overview of the server’s health, so I immediately developed a monitoring plugin that parses their output. Then I hit a wall: I couldn’t find any documentation for the hpasmcli tool. Most of its output is clear, but I don’t know what it reports when a fan breaks. The output with working fans looks like:

usr@srv % hpasmcli -s "SHOW FANS"

Fan  Location        Present Speed  of max  Redundant  Partner  Hot-pluggable
---  --------        ------- -----  ------  ---------  -------  -------------
#1   SYSTEM          Yes     NORMAL  45%     Yes        0        No            
#2   SYSTEM          Yes     NORMAL  43%     Yes        0        No            
#3   SYSTEM          Yes     HIGH    100%    Yes        0        No            
#4   SYSTEM          Yes     HIGH    100%    Yes        0        No            
#5   SYSTEM          Yes     NORMAL  22%     Yes        0        No            
#6   SYSTEM          Yes     NORMAL  21%     Yes        0        No            
#7   SYSTEM          Yes     NORMAL  47%     Yes        0        No            
#8   SYSTEM          Yes     NORMAL  46%     Yes        0        No

So what if a fan is broken? Is it still Present, and does the Speed string just change to NONE or something like that? I sent a support request to HP, but all they responded with was a premium-rate number to call. It seems my understanding of service differs from theirs. Since I don’t know what the output looks like in an error case (I don’t want to stick pencils into new machines), the plugin can’t decide whether the fans are OK. If you want to use my plugin, you need to skip the fan checks until HP publishes a document listing the possible values. IMHO a public tool should be open source, so I can get that information on my own, or at least be well documented!
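Until the failure values are documented, one conservative alternative to skipping the fan check is to accept only the values seen on healthy hardware and report everything else as UNKNOWN instead of guessing. Below is a minimal Python sketch of that idea. It is an assumption, not the plugin's actual logic: the set of known-good speeds comes only from the sample output above, the row layout is assumed to be eight whitespace-separated fields, and the function names are hypothetical.

```python
# Only the Speed values observed on working fans in the sample output.
KNOWN_GOOD_SPEEDS = {"NORMAL", "HIGH"}

SAMPLE = """\
Fan  Location        Present Speed  of max  Redundant  Partner  Hot-pluggable
---  --------        ------- -----  ------  ---------  -------  -------------
#1   SYSTEM          Yes     NORMAL  45%     Yes        0        No
#3   SYSTEM          Yes     HIGH    100%    Yes        0        No
"""


def classify_fan_row(line):
    """Return (fan, state) for a fan row, or None for header/separator lines."""
    fields = line.split()
    if not fields or not fields[0].startswith("#"):
        return None  # not a fan row
    # Fields: number, location, present, speed, percent, redundant, partner, hotplug
    if len(fields) == 8 and fields[2] == "Yes" and fields[3] in KNOWN_GOOD_SPEEDS:
        return fields[0], "OK"
    # Anything unexpected is UNKNOWN: the error values are undocumented,
    # so the sketch refuses to label it as either healthy or broken.
    return fields[0], "UNKNOWN"


def check_fans(text):
    """Return (overall state, per-fan results) for SHOW FANS output."""
    results = [r for r in map(classify_fan_row, text.splitlines()) if r is not None]
    if not results:
        return "UNKNOWN", results  # no fan rows found at all
    overall = "OK" if all(state == "OK" for _, state in results) else "UNKNOWN"
    return overall, results


if __name__ == "__main__":
    print(check_fans(SAMPLE))
```

The design choice here is to never claim a fan is broken or fine on the basis of a value nobody has documented. An UNKNOWN result simply tells you to look at the machine yourself.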

Btw, HP, if you read this: please include some permanent links in your web interface ;-)


Martin Scharm



2 Comments

Andreas N | 2011-08-03 22:42:39

Seems the URL got swallowed by WordPress. I meant to say:

Have a look at http://labs.consol.de/lang/en/nagios/check_hpasm/ for a well-working Nagios plugin to check the health of an HP server.

Martin Scharm | 2011-08-08 00:43:34

Hi Andreas,

thanks for the link, looks like my previous searches weren’t thorough enough. I’ll add it to the plugin page.
