20141121 convert openfire audit logs to csv - plembo/onemoretech GitHub Wiki
title: Convert Openfire audit logs to csv link: https://onemoretech.wordpress.com/2014/11/21/convert-openfire-audit-logs-to-csv/ author: phil2nc description: post_id: 8907 created: 2014/11/21 17:04:56 created_gmt: 2014/11/21 22:04:56 comment_status: closed post_name: convert-openfire-audit-logs-to-csv status: publish post_type: post
Here's a script that lets you parse Openfire's audit log and export data from it in comma separated values (csv) format. The Openfire audit log contains a lot of useful stuff about client interactions with the server and each other, this includes presence information on what clients are connected to the server at any given time. Because the audit log is written in XML an XML parser is needed to extract data from it. The script is built around perl's XML::Twig module and performs searches on the LDAP directory that users authenticate to for additional business information not included in the log data. In the example below LDAP attribute names that begin with the word "example" represent custom attributes that have been added to the base schema. The script uses a simple on-disk database created with the DB_File module to store the multiple unique resourceId values that are assigned to those who use many different clients types, sometimes simultaneously. [code lang="perl"] #!/home/chatuser/perl5/bin/perl # ofauditreport.pl # Read Openfire xml audit log and convert to csv # 1. Work with multiple logs, in chrono order (approx 90M of xml data) # 2. Do LDAP lookup to append sn, givenname, title, o, ou, exampletimezone, # postaladdress. # 3. XML attributes: # presence (id, to, from), status, priority # 4. Output field order: # UserID,Last,First,ResourceID,Title,Company,Department,TimeZone,Address # # XML data format is as follows: # # # # # Online # 1 # # # # * * * # # # Online # 1 # # # # * * * # # Run this on a prod Openfire server as the gctech user. # Created 11/11/2014 by P Lembo # On RHEL 6 had to do an alternate install of perl 5.18.4 from source to get around a serious perl bug only resolved in 5.15 and later. use strict; use XML::Twig; use Date::Calc qw(Today Add_Delta_Days); use File::Sort qw(sort_file); use Text::ParseWords; use Net::LDAP; use Net::LDAP::Entry; use DB_File; use Fcntl; my $HOME = $ENV{'HOME'}; our($ofdirHost, $ofdirUsr, $ofdirPass); require "$HOME/etc/admin.conf"; # Get yesterday's date my( $year, $month, $day ) = Today(); ( $year, $month, $day ) = Add_Delta_Days ( $year, $month, $day, -1 ); my $yesterday = sprintf("%04d%02d%02d",$year,$month,$day); my $rawName = "ofrawfile" . "-" . $yesterday . ".csv"; my $rawFile = "$HOME/data/export/$rawName"; my $rawSorted = "$HOME/data/export/$rawName\.sorted"; my $reptName = "ofauditreport" . "-" . $yesterday . ".csv"; my $reptFile = "/data/www/html/reports/$reptName"; my $errLog = "$HOME/data/logs/ofauditreport.log"; my $dirHost = $ofdirHost; my $dirUsr = $ofdirUsr; my $dirPass = $ofdirPass; my $dbmName = "ridb"; my $dbmFile = "$HOME/data/export/$dbmName"; open LOGZ, ">$errLog" or die $!; get_logs(); make_rept(); clean_up(); close LOGZ; sub get_logs { # Read all files in /data/logs/im-server from yesterday # with pattern: jive.audit-[YYYYMMDD-NNN].log, and pass # to parse_doc subroutine my $logDir = "/data/logs/im-server"; my $time = localtime(); print LOGZ "$time\tGetting audit logs for $yesterday\n"; print "$time\tGetting audit logs for $yesterday\n"; open FH, ">$rawFile" or die $!; close FH; # Get list of file names from log directory opendir(DIR, "$logDir") or die $!; my @files = readdir DIR; closedir DIR; foreach my $fileName(@files) { if($fileName =~ /$yesterday/g) { # Send each file to the XML parser my $logFile = "/data/logs/im-server/$fileName"; parse_doc($logFile); } } } sub parse_doc { # Parse the XML file at hand and write the user and # resourceIDs my $xmlFile = $_[0]; print LOGZ "Processing ", $xmlFile, "\n"; print "Processing ", $xmlFile, "\n"; open FH, ">>$rawFile" or die $!; my $twig = XML::Twig->new(); $twig->parsefile($xmlFile); my $root = $twig->root; foreach my $packets ($root->children()) { my $type = $packets->first_child()->att('type'); # Include packets that have no type, i.e. that contain # messages between clients and not inquiries (iq). if($type !/.+/) { my $timestamp = $packets->att('timestamp'); for($timestamp) { s/,/ /g; } my $from = $packets->first_child()->att('from'); my $to = $packets->first_child()->att('to'); my $id = $packets->first_child()->att('id'); my $packet = $packets->first_child(); my $status = $packet->first_child_text(); my($frjid, $resourceId) = split('/', $from); my($fruid, $frdomain) = split('@', $frjid); my($touid, $todomain) = split('@', $to); print FH $fruid; print FH ","; print FH $resourceId; print FH "\n"; } # Include iq packets with a type of 'get', these are # pings of the server by clients elsif($type = /get/g) { my $from = $packets->first_child()->att('from'); # Only record those packets that have the from attr # (these should be real people) if($from =~ /.+/) { my $id = $packets->first_child()->att('id'); my($frjid, $resourceId) = split('/', $from); my($fruid, $frdomain) = split('@', $frjid); print FH $fruid; print FH ","; print FH $resourceId; print FH "\n"; } } } close FH; } sub make_rept { # Create a report from the log data gathered my $time = localtime(); print LOGZ "$time\tMaking session report\n"; print "$time\tMaking session report\n"; # Create a hash table keyed to userIDs make_riddb(); # Open the hash table for reading my $db = tie my %riddb, "DB_File",
Copyright 2004-2019 Phil Lembo