Thursday, February 9, 2012

Using logwarn to manage logs along with Nagios

I needed to watch a reject file and notify Nagios right away. The reject file rotates every day and is named in the format "C"yyyymmdd.rej (for example, C20120209.rej). The file looks like this:


The objectives were the following:
- Check the file every minute from 9:30 am to 4:00 pm
- Ignore the first line, which contains "ACMEFixLogs" and is not a reject payload
- Report the number of rejects and the name of the file
- If there are no new rejects after a notification, do not notify again

The simplest way is to use a log checker that reads the content and saves the position of the last line it read in a temporary file. This way, the next check starts from the saved line and content is not reported twice. I found logwarn and I really like it, mostly because it has a Nagios plugin and Nagios is my notification engine. Also, logwarn has a regex filter that I could use for my file.
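To make the idea of a saved position concrete, here is a minimal sketch in plain shell. It only illustrates the technique; it is not how logwarn stores its state, and the function name `new_lines` and the state-file layout are hypothetical.

```shell
#!/bin/sh
# Illustrative only: print the lines added to a file since the previous call.
# new_lines and the state file format are assumptions, not logwarn internals.
new_lines() {
    file=$1
    state=$2
    last=0

    # Read the line count saved by the previous run, if there was one.
    if [ -f "$state" ]; then
        last=$(cat "$state")
    fi

    total=$(wc -l < "$file" | tr -d ' ')

    # If the file shrank, assume it was rotated and start from the top.
    if [ "$total" -lt "$last" ]; then
        last=0
    fi

    # Print only the lines after the saved position, then save the new one.
    tail -n "+$((last + 1))" "$file"
    echo "$total" > "$state"
}
```

Calling `new_lines /path/to/file /tmp/file.state` twice in a row prints the whole file the first time and nothing the second time, unless lines were appended in between.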

Here is how I solved the problem:
- Create a shell script as a Nagios plugin (exit 0 if everything is "OK", exit 2 if there are any errors)
- Leverage the capabilities of logwarn to handle the reading of the file
- Use a logwarn "!" (negated) regex pattern to disregard the first line, "ACMEFixLogs"
- Use wc (word count) to find out how many rejected lines there are
- If there are any reject payloads, notify (exit 2)
- If there are none, report that everything is fine (exit 0)

Here is the script:
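A minimal sketch of a plugin along these lines follows; it is not necessarily the exact script used. The function name `check_rejects`, the `LOGWARN` variable, the status messages, and the choice to return OK when today's file does not exist yet are all assumptions. The logwarn call uses only a log file followed by patterns, relying on logwarn's default state handling; check the invocation against the logwarn man page for your version.

```shell
#!/bin/sh
# Sketch of the plugin logic. Names and messages are illustrative assumptions.
# LOGWARN may be overridden to point at the logwarn binary.
LOGWARN=${LOGWARN:-logwarn}

# check_rejects DIR
# Prints a Nagios status line; returns 0 (OK) or 2 (CRITICAL).
check_rejects() {
    dir=$1
    # Today's reject file, for example C20120209.rej
    file="C$(date +%Y%m%d).rej"

    # Assumption: no file yet means nothing has been rejected today.
    if [ ! -f "$dir/$file" ]; then
        echo "OK - $file does not exist yet"
        return 0
    fi

    # The "!" pattern drops the ACMEFixLogs header line; "." then matches
    # any other non-empty line. logwarn remembers where the previous run
    # stopped, so only new lines are printed. Its exit status is ignored
    # here; only its output is counted.
    new=$("$LOGWARN" "$dir/$file" '!ACMEFixLogs' '.') || true

    # Count the new reject lines with wc.
    if [ -z "$new" ]; then
        count=0
    else
        count=$(printf '%s\n' "$new" | wc -l | tr -d ' ')
    fi

    if [ "$count" -gt 0 ]; then
        echo "CRITICAL - $count new rejects in $file"
        return 2
    fi
    echo "OK - no new rejects in $file"
    return 0
}
```

To use this as a plugin, the script would end by calling `check_rejects` with the directory holding the reject files and exiting with its return status, so that Nagios sees 0 or 2.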

3 comments:

  1. Hi Marcelo,
    your solution sounds interesting.
    Can you estimate how much of a performance impact the log monitoring was?
    I'm thinking of doing it myself but am a bit afraid of the overhead

    1. Hey Ittai, so far, not much and I'm using a pretty big log (sometimes it goes over 10+ GB).
