Error message checking script - mhulse/mhulse.github.io GitHub Wiki
Note: The information below is from our head/senior DBA/SA/NA.
The script I use to check the /var/adm/messages file for error lines is called msgscheck:
#!/bin/bash
#
# Checks the /var/adm/messages file for today's error lines and forwards
# them by email (the email is blank if none were found).
grep -n "`date +%b\ %d`" /var/adm/messages | grep -i error > /tmp/msgerrs.today
mailx -s"Errors" [email protected] [email protected] < /tmp/msgerrs.today
It is run from a crontab on our Caché data server. I run it three times a day: at 07:55 to get any errors occurring since midnight; at 15:55 to get any that popped up during the day; and finally, at 23:55 to get a summary of all of the errors occurring during the day.
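For illustration, the three run times above could be expressed as a single crontab entry. This is only a sketch: the install path /usr/local/bin/msgscheck is a hypothetical location, not one given by the original author.

```shell
# Hypothetical install location for the script; adjust to wherever msgscheck lives.
MSGSCHECK=/usr/local/bin/msgscheck

# crontab fields: minute hour day-of-month month day-of-week command
# Minute 55 of hours 7, 15 and 23, every day.
CRON_ENTRY="55 7,15,23 * * * $MSGSCHECK"

# Print the entry so it can be pasted into the editor opened by `crontab -e`.
echo "$CRON_ENTRY"
```

The comma-separated hour list keeps all three runs on one line; three separate entries would work just as well.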
About the script:
- #!/bin/bash determines the shell used to run the script; in this case, the bash shell.
- grep -n "`date +%b\ %d`" /var/adm/messages returns all lines in the file with the current day's date.
  - The grep command searches the file for the string that follows.
  - The "`date +%b\ %d`" part inserts the abbreviated month and the day from the date command into the search string (for example, Jul 23).
  - The -n flag puts line numbers in front of each matching line, in case we want to go back and see what lines are around the error line.
- The result is piped to grep -i error; this returns all lines that have the current day's date and also contain the word "error".
  - The -i switch tells grep to ignore the case of the letters, so it returns lines containing "Error", "error", or "ERROR".
- The > /tmp/msgerrs.today redirection creates a file containing the lines with the error messages.
- The mailx program creates the email.
  - The -s switch adds a subject to the email.
  - The "To:" is handled by the space-separated email addresses.
  - The < /tmp/msgerrs.today redirection feeds the file to the email program as the body of the message.
If there were no errors, the msgerrs.today file is empty and you get a blank email. No news is good news when dealing with error messages.
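Two details of the walkthrough above can be sketched as a variant, offered as an assumption-laden illustration rather than the author's script. First, %d zero-pads the day (Jul 03), while the traditional syslog timestamp pads single-digit days with a space (Jul  3); if your log uses that format, %e is the matching conversion. Second, a [ -s file ] test can skip the email when nothing was found, if a blank email is not wanted. The function names collect_errors and has_errors, and the RECIPIENTS variable, are hypothetical.

```shell
#!/bin/bash
# Sketch of a msgscheck variant. LOGFILE and OUTFILE default to the paths
# used in the original script; the function names are illustrative only.
LOGFILE=${LOGFILE:-/var/adm/messages}
OUTFILE=${OUTFILE:-/tmp/msgerrs.today}

# collect_errors LOG OUT
# Writes today's lines from LOG that contain "error" (any case) to OUT.
# %e pads single-digit days with a space, e.g. "Jul  3".
collect_errors() {
    grep -n "$(date '+%b %e')" "$1" | grep -i error > "$2"
}

# has_errors FILE
# Succeeds only when FILE exists and is not empty.
has_errors() {
    [ -s "$1" ]
}

# Possible usage (RECIPIENTS is a placeholder for the real addresses):
#   collect_errors "$LOGFILE" "$OUTFILE"
#   has_errors "$OUTFILE" && mailx -s "Errors" $RECIPIENTS < "$OUTFILE"
```

Whether to suppress the blank email is a design choice: the original's blank message doubles as proof that the cron job actually ran.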
Example output:
3942:Jul 23 16:00:20 XYZ Cache(FOO)[21724]: [ID 392709 user.alert] Skipping KILL due to global ^["^^/FOO/data/Cms/"]oddCOM("csp.rg.templates.stories.full.full04","p","ERRORPAGE",2) already exist for records in /FOO/data/Cms/ addr=1016139856
3955:Jul 23 16:00:20 XYZ Cache(FOO)[21724]: [ID 870028 user.alert] Skipping KILL due to global ^["^^/FOO/data/Cms/"]oddCOM("csp.rg.templates.stories.full.full04","m","ThrowError",2) already exist for records in /FOO/data/Cms/ addr=1016138488
3958:Jul 23 16:00:20 XYZ Cache(FOO)[21724]: [ID 438196 user.alert] Skipping KILL due to global ^["^^/FOO/data/Cms/"]oddCOM("csp.rg.templates.stories.full.full04","m","ShowError",2) already exist for records in /FOO/data/Cms/ addr=1016138176
3975:Jul 23 16:00:21 XYZ Cache(FOO)[21724]: [ID 174594 user.alert] Skipping KILL due to global ^["^^/FOO/data/Cms/"]oddCOM("csp.rg.templates.stories.full.full04","m","OnPageError",2) already exist for records in /FOO/data/Cms/ addr=1016136368
4040:Jul 23 16:00:21 XYZ Cache(FOO)[21724]: [ID 317560 user.alert] Skipping KILL due to global ^["^^/FOO/data/Cms/"]rMAP("csp.rg.templates.stories.full.full04","CLS","INT","ThrowError",3) already exist for records in /FOO/data/Cms/ addr=1016129232
4041:Jul 23 16:00:21 XYZ Cache(FOO)[21724]: [ID 957416 user.alert] Skipping KILL due to global ^["^^/FOO/data/Cms/"]rMAP("csp.rg.templates.stories.full.full04","CLS","INT","ThrowError",1) already exist for records in /FOO/data/Cms/ addr=1016129128
4048:Jul 23 16:00:21 XYZ Cache(FOO)[21724]: [ID 272400 user.alert] Skipping KILL due to global ^["^^/FOO/data/Cms/"]rMAP("csp.rg.templates.stories.full.full04","CLS","INT","ShowError",2) already exist for records in /FOO/data/Cms/ addr=1016128420
4049:Jul 23 16:00:21 XYZ Cache(FOO)[21724]: [ID 141296 user.alert] Skipping KILL due to global ^["^^/FOO/data/Cms/"]rMAP("csp.rg.templates.stories.full.full04","CLS","INT","ShowError",1) already exist for records in /FOO/data/Cms/ addr=1016128320
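The number at the start of each output line (3942, 3955, and so on) is the line number added by grep -n. A minimal sketch of how it might be used to view the neighbouring lines follows; show_context is a hypothetical helper, not part of the original script.

```shell
# show_context FILE LINE [RADIUS]
# Prints LINE from FILE together with RADIUS lines before and after it
# (RADIUS defaults to 3).
show_context() {
    file=$1
    line=$2
    radius=${3:-3}
    start=$((line - radius))
    # Clamp the start so sed is never given a line number below 1.
    if [ "$start" -lt 1 ]; then
        start=1
    fi
    end=$((line + radius))
    sed -n "${start},${end}p" "$file"
}

# Possible usage, taking the first line number from the example output:
#   show_context /var/adm/messages 3942
```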
Important: On a Solaris system, the log file should be located at /var/adm/messages; on other Unix/Linux systems, the logs are typically kept under /var/log. YMMV.
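Since the location varies, one option is to let the script pick the first log file that actually exists. The sketch below assumes the candidate paths /var/log/messages and /var/log/syslog, which are common on Linux but are not named in the original; first_existing is a hypothetical helper.

```shell
# first_existing PATH...
# Prints the first argument that is an existing regular file.
# Returns 1 (and prints nothing) if none of them exist.
first_existing() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Possible usage: Solaris path first, then two common Linux locations.
#   LOGFILE=$(first_existing /var/adm/messages /var/log/messages /var/log/syslog) \
#       || echo "no log file found" >&2
```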