Error message checking script - mhulse/mhulse.github.io GitHub Wiki

Note: The information below is from our head/senior DBA/SA/NA.


The script I use to check the /var/adm/messages file for error lines is called msgscheck:

#!/bin/bash
#
# Checks the /var/adm/messages file for today's error lines and emails
# them; the email is blank if none are found.

grep -n "`date +%b\ %d`" /var/adm/messages | grep -i error > /tmp/msgerrs.today

mailx -s"Errors" [email protected] [email protected] < /tmp/msgerrs.today

The script is run via crontab on our Caché data server, three times a day: at 07:55 to catch any errors since midnight, at 15:55 to catch any that popped up during the day, and at 23:55 for a summary of all the errors from that day. Each run reports every error logged so far that day, so the later emails repeat the earlier ones.
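
As a rough sketch, the three run times above could be expressed as a single crontab line. The install path /usr/local/bin/msgscheck is an assumption for illustration; the original does not say where the script lives.

```shell
#!/bin/bash
# Sketch: build the crontab line for the three daily runs.
# The script path is a hypothetical install location.
MSGSCHECK=/usr/local/bin/msgscheck

# Fields: minute hour day-of-month month day-of-week command
CRON_LINE="55 7,15,23 * * * $MSGSCHECK"

printf '%s\n' "$CRON_LINE"
```

The printed line can then be added with crontab -e for whichever user should run the check.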

About the script:

  • #!/bin/bash selects the shell used to run the script, in this case bash.
  • grep -n "`date +%b\ %d`" /var/adm/messages returns all lines in the file that contain the current day's date.
    • The grep command searches the file for the string that follows.
    • The "`date +%b\ %d`" part runs the date command and substitutes its output, the abbreviated month name and the two-digit day separated by a space (for example, Jul 23), as the search string. The -n flag puts the line number in front of each matching line in case we want to go back and see which lines surround the error.
  • The result is piped to grep -i error, which returns the lines with the current day's date that also contain the word "error" anywhere in the line, including inside longer words such as ThrowError.
    • The -i switch tells grep to ignore case, so it returns lines containing "Error", "error", or "ERROR".
  • > /tmp/msgerrs.today creates (or overwrites) a file containing the lines with the error messages.
  • The "mailx" program creates the email.
    • The -s switch adds a subject to the email.
    • The "To:" addresses are the space-separated email addresses that follow the subject.
    • The < /tmp/msgerrs.today redirect feeds the file to mailx as the body of the message.
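
To try the two grep stages described above without touching the real log or sending mail, something like the following should work. It is only a sketch: the sample log lines are made up, and the temporary file stands in for /var/adm/messages.

```shell
#!/bin/bash
# Sketch: run the same two-stage grep against a throwaway sample file.
# The sample lines are invented for illustration.

today=$(date '+%b %d')      # same format the script uses, e.g. "Jul 23"
sample=$(mktemp)            # stand-in for /var/adm/messages

cat > "$sample" <<EOF
$today 16:00:20 XYZ app[1]: [ID 1 user.alert] ThrowError raised
$today 16:00:21 XYZ app[1]: [ID 2 user.info] routine message
Jan 00 00:00:00 XYZ app[1]: [ID 3 user.alert] error on another day
EOF

# Stage 1 keeps today's lines (prefixed with line numbers);
# stage 2 keeps only those containing "error" in any letter case.
matches=$(grep -n "$today" "$sample" | grep -i error)
printf '%s\n' "$matches"

rm -f "$sample"
```

Only the first sample line should come back: the second has today's date but no "error", and the third has "error" but a different date.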

If there are no errors, the msgerrs.today file is empty and you get a blank email. No news is good news when dealing with error messages.
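
If the blank email is not wanted, one possible variation is to test whether the results file has any content before mailing. This is not part of the original script, and report_has_errors is a hypothetical helper name used only for this sketch.

```shell
#!/bin/bash
# Sketch: decide whether to mail based on whether the results file has content.
# report_has_errors is a hypothetical helper, not part of the original script.

report_has_errors() {
    [ -s "$1" ]     # true only if the file exists and is larger than zero bytes
}

# In the script, the mailx line could then be wrapped like this:
#   if report_has_errors /tmp/msgerrs.today; then
#       mailx -s"Errors" <recipients> < /tmp/msgerrs.today
#   fi
```

The trade-off is that a blank email also confirms the cron job ran, so skipping it removes that signal.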

Example output:

3942:Jul 23 16:00:20 XYZ Cache(FOO)[21724]: [ID 392709 user.alert] Skipping KILL due to global ^["^^/FOO/data/Cms/"]oddCOM("csp.rg.templates.stories.full.full04","p","ERRORPAGE",2) already exist for records in /FOO/data/Cms/ addr=1016139856
3955:Jul 23 16:00:20 XYZ Cache(FOO)[21724]: [ID 870028 user.alert] Skipping KILL due to global ^["^^/FOO/data/Cms/"]oddCOM("csp.rg.templates.stories.full.full04","m","ThrowError",2) already exist for records in /FOO/data/Cms/ addr=1016138488
3958:Jul 23 16:00:20 XYZ Cache(FOO)[21724]: [ID 438196 user.alert] Skipping KILL due to global ^["^^/FOO/data/Cms/"]oddCOM("csp.rg.templates.stories.full.full04","m","ShowError",2) already exist for records in /FOO/data/Cms/ addr=1016138176
3975:Jul 23 16:00:21 XYZ Cache(FOO)[21724]: [ID 174594 user.alert] Skipping KILL due to global ^["^^/FOO/data/Cms/"]oddCOM("csp.rg.templates.stories.full.full04","m","OnPageError",2) already exist for records in /FOO/data/Cms/ addr=1016136368
4040:Jul 23 16:00:21 XYZ Cache(FOO)[21724]: [ID 317560 user.alert] Skipping KILL due to global ^["^^/FOO/data/Cms/"]rMAP("csp.rg.templates.stories.full.full04","CLS","INT","ThrowError",3) already exist for records in /FOO/data/Cms/ addr=1016129232
4041:Jul 23 16:00:21 XYZ Cache(FOO)[21724]: [ID 957416 user.alert] Skipping KILL due to global ^["^^/FOO/data/Cms/"]rMAP("csp.rg.templates.stories.full.full04","CLS","INT","ThrowError",1) already exist for records in /FOO/data/Cms/ addr=1016129128
4048:Jul 23 16:00:21 XYZ Cache(FOO)[21724]: [ID 272400 user.alert] Skipping KILL due to global ^["^^/FOO/data/Cms/"]rMAP("csp.rg.templates.stories.full.full04","CLS","INT","ShowError",2) already exist for records in /FOO/data/Cms/ addr=1016128420
4049:Jul 23 16:00:21 XYZ Cache(FOO)[21724]: [ID 141296 user.alert] Skipping KILL due to global ^["^^/FOO/data/Cms/"]rMAP("csp.rg.templates.stories.full.full04","CLS","INT","ShowError",1) already exist for records in /FOO/data/Cms/ addr=1016128320
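
The sample above is from the 23rd, so the date pattern matches as expected. One caveat worth checking on your own system: traditional syslog timestamps pad single-digit days with a space ("Jul  3"), while date's %d pads with a zero ("Jul 03"), so the pattern may find nothing from the 1st through the 9th. The sketch below uses a made-up log line to show the difference; %e is the space-padded day format.

```shell
#!/bin/bash
# Sketch: why a zero-padded day can miss syslog lines early in the month.
# The sample line is made up but uses the traditional space-padded day.
line='Jul  3 16:00:20 XYZ app[1]: [ID 1 user.alert] ThrowError raised'

pattern_d='Jul 03'    # what date '+%b %d' prints on July 3rd
pattern_e='Jul  3'    # what date '+%b %e' prints on July 3rd

# grep -c prints the number of matching lines (and exits non-zero on 0).
hits_d=$(printf '%s\n' "$line" | grep -c "$pattern_d" || true)
hits_e=$(printf '%s\n' "$line" | grep -c "$pattern_e" || true)

echo "%d pattern matches: $hits_d"
echo "%e pattern matches: $hits_e"
```

If your log uses the space-padded form, a drop-in change to the script would be grep -n "$(date '+%b %e')" /var/adm/messages.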

Important: On a Solaris system, the system log is /var/adm/messages; on most Linux systems, logs are kept under /var/log (for example, /var/log/messages or /var/log/syslog, depending on the distribution). YMMV.
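
Since the location varies, the script could pick the first log file that exists rather than hard-coding one path. This is a minimal sketch, not something the original script does; first_readable is a hypothetical helper, and the candidate paths are common defaults rather than a guarantee for any given system.

```shell
#!/bin/bash
# Sketch: pick the first readable log file from a list of candidates.
# first_readable is a hypothetical helper name.

first_readable() {
    local f
    for f in "$@"; do
        if [ -r "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Try the Solaris path first, then two common Linux locations.
LOGFILE=$(first_readable /var/adm/messages /var/log/messages /var/log/syslog) \
    || echo "no known log file found" >&2
```

The grep line in the script would then read from "$LOGFILE" instead of the fixed /var/adm/messages path.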