TomTheGeek

All the geeky stuff that gets me hot.

Saturday, November 22, 2008

Pop up notification of software RAID problems in Ubuntu

I finally got a software RAID set up in my latest install of Ubuntu 8.10, and I wanted to set up monitoring so I would know about any problems immediately. Ubuntu sets up monitoring automatically when you create an MDADM array, but by default it only sends an email to the local root account. That would be fine for a server, but on a desktop system a visual notification is better. Here are the steps I took to enable a pop up notification whenever something bad happens to my RAID array.

To make the notification bubble I used notify-send. On Ubuntu, notify-send ships in the libnotify-bin package, and it needs notification-daemon running to display the bubble, so install both with this command:
sudo apt-get install notification-daemon libnotify-bin
To make sure our notifications are working we need to be able to send a test notification. MDADM has a testing mode built right in. To activate it, edit /etc/default/mdadm and add " --test" to the end of the DAEMON_OPTIONS string. This tells MDADM to send a test message every time it starts.
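If you'd rather not edit the file by hand, something like the following should do the same job. It's only a sketch: add_test_flag is a name I made up, and it assumes the DAEMON_OPTIONS line is a single double-quoted string such as DAEMON_OPTIONS="--syslog".

```shell
#!/bin/bash
# Hypothetical helper: append " --test" to the DAEMON_OPTIONS string
# in an mdadm defaults file (assumes a double-quoted value on one line).
add_test_flag() {
    local file="$1"
    # Do nothing if the flag is already present, so repeated runs are harmless.
    if grep -q '^DAEMON_OPTIONS=".*--test' "$file"; then
        return 0
    fi
    # Insert " --test" just before the closing quote of the DAEMON_OPTIONS line.
    sed -i 's/^\(DAEMON_OPTIONS=".*\)"/\1 --test"/' "$file"
}
```

Run as root, add_test_flag /etc/default/mdadm would make the change. Try it on a copy of the file first.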

Now we need to configure MDADM to run a program when there's a problem, and set the email address where we want the notifications sent. Edit /etc/mdadm/mdadm.conf and set the MAILADDR and MAILFROM email addresses. MAILADDR is where you want the email sent; MAILFROM is what shows up in the From field of the email. You may have to add the MAILFROM line. To run our program, add a PROGRAM line underneath the MAILADDR and MAILFROM lines with the following:
# display local notification
PROGRAM /usr/sbin/mdadm-notify.sh
This tells MDADM to run our script when there's a problem. Now we just need to create /usr/sbin/mdadm-notify.sh and add the following:
#!/bin/bash

#get dbus session
eval `dbus-launch --sh-syntax --exit-with-session`

#show alert
/usr/bin/notify-send -u critical -c device.error -i /usr/share/icons/Human/32x32/status/dialog-warning.png -t 0 "RAID Status" "<b>$2</b>: $1" -h int:x:1250 -h int:y:20
There are a couple of settings you'll want to change on the last line of the script. Most importantly, you'll have to alter the int:x and int:y values so they work on your system. I have a triple screen setup, so my settings will likely put the notification off screen if you don't change them. The warning icon I used should be standard for Ubuntu 8.10, but if it's not showing up you'll have to set the path to an icon on your system. Finally, be sure to run "sudo chmod +x /usr/sbin/mdadm-notify.sh" to make the script executable.
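The script above only uses the first two arguments. MDADM passes the event name, the array device and, for some events, a third argument naming the component device involved. If you want that in the bubble too, here's a small sketch of how the message could be built (format_raid_message is a hypothetical helper, not something MDADM provides):

```shell
#!/bin/bash
# Build the notification body from the arguments MDADM passes to the PROGRAM:
#   $1 = event name, $2 = array device, $3 = component device (some events only)
format_raid_message() {
    local event="$1" array="$2" component="${3:-}"
    if [ -n "$component" ]; then
        # Three-argument events: include the affected component device.
        printf '<b>%s</b>: %s (%s)' "$array" "$event" "$component"
    else
        # Two-argument events: same layout as the original script.
        printf '<b>%s</b>: %s' "$array" "$event"
    fi
}
```

In the notify script you would then pass "$(format_raid_message "$@")" to notify-send in place of "<b>$2</b>: $1".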

Now we are ready to test. Run "sudo /etc/init.d/mdadm restart" to restart MDADM and send the test message. Adjust the int:x and int:y values if the pop up isn't where you want it, fix the icon path if the icon isn't showing up, and test again. Once you are satisfied with the appearance of the notification, go back into /etc/default/mdadm and remove the " --test" from the DAEMON_OPTIONS string so you don't keep getting the test message.
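Removing the flag can be scripted as well. Again just a sketch, with remove_test_flag being a made-up name, and assuming " --test" appears once on the DAEMON_OPTIONS line:

```shell
#!/bin/bash
# Hypothetical helper: strip " --test" from the DAEMON_OPTIONS line only,
# leaving every other line of the file untouched.
remove_test_flag() {
    local file="$1"
    sed -i '/^DAEMON_OPTIONS=/ s/ --test//' "$file"
}
```

Run remove_test_flag /etc/default/mdadm as root, then restart MDADM once more so it picks up the change.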

That's it! Now if something happens to the RAID array you'll get a nice visual notification of the problem.
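A commenter below points out that dbus-launch starts a new session bus rather than connecting to the one the desktop is already using, and suggests reading the address from the environment of the running gnome-session process instead. Here's a rough sketch of that idea. The function names are hypothetical, it assumes the session process is called gnome-session, and it uses env to pass the variables because sudo normally resets most of the environment. Treat it as a starting point rather than a tested replacement for the script above.

```shell
#!/bin/bash
# Print the value of one variable from a NUL-separated environ file,
# such as /proc/<pid>/environ. Prints nothing if the variable is absent.
read_environ_var() {
    local name="$1" file="$2"
    tr '\0' '\n' < "$file" | sed -n "s/^${name}=//p" | head -n 1
}

# Send a critical notification to the user who owns the desktop session.
# Meant to be run as root (as the MDADM monitor does); $1 is the message text.
notify_desktop() {
    local pid user display bus
    # Assumption: the oldest process matching "gnome-session" is the session.
    pid=$(pgrep -o gnome-session) || return 1
    user=$(ps -o user= -p "$pid" | tr -d ' ')
    display=$(read_environ_var DISPLAY "/proc/$pid/environ")
    bus=$(read_environ_var DBUS_SESSION_BUS_ADDRESS "/proc/$pid/environ")
    sudo -u "$user" env DISPLAY="$display" DBUS_SESSION_BUS_ADDRESS="$bus" \
        /usr/bin/notify-send -u critical "RAID Status" "$1"
}
```

If no session is running (for example during boot), notify_desktop returns 1 without showing anything, which matches how the original script behaves before X11 starts.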


7 Comments:

  • Great! Just what I was looking for. I set notification coordinates to -h int:x:200 -h int:y:20 which (for me) sets the notification bubble under the (gnome) System menu. I had to install libnotify-bin separately as notify-send wasn't in the system although notification-daemon was installed. I placed the mdadm-notify.sh script into /usr/local/sbin (and changed paths accordingly). A nice time saver this was. Thank you.

    By Anonymous, at 3:54 AM CST

  • hi,

    Google is always your friend. Was looking for this exact thing and found your page.

    I have issues with it showing up when doing the /etc/init.d/mdadm restart.

    The script by itself works, just not in combo with the restart option... I wonder if my --test is in the correct place.

    Thanks for doing the hard yards on this for me!

    Cheers.

    By Anonymous, at 4:13 AM CST

  • You have not thought this through at all. Let's dissect it:

    * "eval `dbus-launch --sh-syntax --exit-with-session`":
    This creates a NEW DBUS session, it DOES NOT CONNECT TO THE EXISTING GRAPHICAL SHELL DBUS. So what your script is in fact doing is creating an invalid DBUS, then running notify-send, which in turn will see that the DBUS is invalid and will instead use the XAUTHORITY environment variables to send the message.

    * " "$2: $1"":
    This is terrible. Mdadm has about 8 events, and about 4 of them have three parameters. You're throwing away the third parameter like a moron. That third parameter contains stuff like: "Fail /dev/md0 /dev/sda1", i.e. the affected disk that died in the array.

    * No error checking:
    This one is self-explanatory.

    And lastly (and by far the worst):
    * "PROGRAM /usr/sbin/mdadm-notify.sh"
    You're telling mdadm to immediately run this script on errors. Well, if you boot a system with degraded/inactive arrays, the mdadm monitor will trigger this script multiple times before X11 has even started. Those messages will not be shown in any way.

    Or the short version of what I've just written:
    This script doesn't work properly, has no error checking, throws away important parameters, and breaks on boot.

    By Anonymous, at 1:49 AM CDT

  • It's a pity you didn't leave a return number or a suggestion on how to fix these egregious errors.

    > "eval `dbus-launch --sh-syntax --exit-with-session`":
    > This creates a NEW DBUS session, it DOES NOT CONNECT TO THE EXISTING GRAPHICAL SHELL DBUS.

    So what's the correct way to do this then? Why go through all the effort to tell me I'm wrong yet leave me wondering what the correct solution is? As it is, it works. So there.

    > * " "$2: $1"":
    > This is terrible. Mdadm has about 8 events, and about 4 of them have three parameters. You're throwing away the third parameter like a moron.

    I extracted the information I was looking for. You may have different goals for your raid notifications, if so you're welcome to modify this to suit your needs.

    > * No error checking:
    > This one is self-explanatory.

    It's a simple notification script, how robust does it need to be? Again your comment goes from constructive criticism to juvenile harassment because you failed to offer a solution to the problem. What things should we be error checking?

    > And lastly (and by far the worst):
    > * "PROGRAM /usr/sbin/mdadm-notify.sh"
    > You're telling mdadm to immediately run this script on errors. Well, if you boot a system with degraded/inactive arrays, the mdadm monitor will trigger this script multiple times before X11 has even started. Those messages will not be shown in any way.

    Maybe you missed the title of this post, "Pop up notification of software RAID problems in Ubuntu". "Pop up" implies a graphical element. Messages being discarded during boot is perfectly fine. I only needed notification in the X11 session because the OS will spit out warnings during boot but I needed notification of a problem while I was using the computer.

    > Or the short version of what I've just written:
    > This script doesn't work properly, has no error checking, throws away important parameters, and breaks on boot.

    "Doesn't work properly" - False. It works exactly the way I wanted it to.

    "Has no error checking" - True. I'm not sure what kind of errors I should be looking for though.

    "throws away important parameters" - False. I use the parameters I need. My needs are not the same as yours.

    "Breaks on boot" - False. It was never intended to work during boot.

    By Blogger TomTheGeek, at 11:20 AM CDT  

  • Wow, I've woken you up from being an inactive blogger, cool! Alright you want me to post some solutions. Fine.

    The correct way to get the DBUS session of the graphical user interface is to find the process ID of the graphical shell, then read its entry under /proc, which contains a file called "environ" holding all of its environment variables. Here's an example for Gnome:

    # Find the first gnome-session process: "user pid cmd", single-spaced
    PSGNOME=$(ps -e -o user:30,pid,cmd | grep -G "[0-9] gnome-session" | head -1 | tr -s " ")
    if [[ -z "$PSGNOME" ]]; then
        exit 1 ##error: no gnome session running
    fi
    ACTUALUSER=$(echo "$PSGNOME" | cut -d " " -f 1)
    GNOMEPID=$(echo "$PSGNOME" | cut -d " " -f 2)

    # environ is NUL-separated: match whole entries by name, then strip the NUL
    export DISPLAY=$(grep -z "^DISPLAY=" /proc/${GNOMEPID}/environ | tr -d '\0' | sed -e "s/^DISPLAY=//")
    export DBUS_SESSION_BUS_ADDRESS=$(grep -z "^DBUS_SESSION_BUS_ADDRESS=" /proc/${GNOMEPID}/environ | tr -d '\0' | sed -e "s/^DBUS_SESSION_BUS_ADDRESS=//")

    Now you know the user that the Gnome shell runs under, the DISPLAY, and the DBUS session. Now simply issue this command to display the alert:

    sudo -u $ACTUALUSER /usr/bin/notify-send --urgency=critical --category=device.error --icon=gtk-stop "RAID Warning" "$MESSAGE"

    Have fun.

    By Anonymous, at 5:36 AM CDT

  • Note that the above is suitable for inclusion in a root cron job, or the mdadm monitor (which also runs as root), as it is able to find the desktop and logged-in user effortlessly and without relying on any hacks (as in, I've seen others do stuff like greping the list of logged in users and finding who uses DISPLAY ":0", not thinking about how arbitrary that is (the graphical shell might be running as another display)).

    By Anonymous, at 5:38 AM CDT

  • Thank you, that looks great. I'll test it on my system then update the script in this post. I'll also add some info about the other variables available from Mdadm.

    By Blogger TomTheGeek, at 10:59 AM CDT  
