mod_compare logs - Orange-OpenSource/mod_dup GitHub Wiki
The output of mod_compare provides information about one request executed through 2 differents webservice.
Two output destinations are supported for the "comparison/diff report"
- CompareLog FILE /path/to/my/file.log
- CompareLog SYSLOG facility
4 formats are supported:
- CompareLogType MultiLine - Human Readable diff type log
- CompareLogType Utf8Json(default) - Json for ELK (Elastic Search - Logstash - Kibana) - contains no invalid utf8 characters
- CompareLogType SimpleJson - Json for ELK (Elastic Search - Logstash - Kibana) - no check for content
- CompareLogType archive - For storing both requests and responses for later processing.
Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: BEGIN NEW REQUEST DIFFERENCE n°: 1108314454 Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: Date : 2015-Mar-30 06:41:48.112083 Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: Elapsed time for requests (ms): DUP 0 COMP 1 DIFF -1 Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: /my/api?sid=DEADBEEF Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: Accept: application/json Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: Content-Length: 315 Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: Content-Type: application/x-dup-serialized Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: Duplication-Type: Response Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: ELAPSED_TIME_BY_DUP: 0 Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: Host: me.hello.net Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: UNIQUE_ID: 1108314454 Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: User-Agent: Jakarta Commons-HttpClient/3.1 Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: User-RealAgent: mod-dup Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: X-COMPARE-TRANSLATED: 1 Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: X-DUPLICATED-REQUEST: 1 Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: X-Forwarded-For: 1.1.1.1 Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: X_DUP_HTTP_STATUS: 404 Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: ------------------- Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: Http Status Codes: DUP 404 COMP 200 Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: ------------------- Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: ------------------- Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: @@ -1,11 +1,13 @@ Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: <?xml version='1.0' encoding='UTF-8'?> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: <myReturn> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: <Process> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: -<Status>Error</status> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: -<Logs> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: -<Code>103</code> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: -<Message>Unknown ID</message> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: -<Severity>Warning</logs> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: +<Status>OK</status> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: </process> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: +<Data> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: +<PreferencesData> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: +<Extension> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: +</extension> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: +</preferencesdata> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: +</data> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: </myreturn> Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: Mar 30 06:41:48 www04.front3.prod.pns.s0.p.fti.net apache2[18608]: END DIFFERENCE n°:1108314454
BEGIN NEW REQUEST DIFFERENCE n°: 1292598226 / Elapsed time for diff computation : 1ms Date : 2015-Mar-30 15:23:20.091237 Elapsed time for requests (ms): DUP 0 COMP 13 DIFF -13 /my/api?filter=field Accept: */* Accept-Charset: utf-8, iso-8859-1, utf-16, *;q=0.7 Accept-Language: fr-FR, en-US Client-DNS: a.blabla.fr Client-IP: 109.25.5.19 Content-Length: 3743 Content-Type: application/x-dup-serialized Cookie: xxx Duplication-Type: Response User-Agent: Mozilla/5.0 (Linux; U; Android 3.2; fr-fr; GT-P7510 Build/HTJ85B) AppleWebKit/534.13 (KHTML, like Gecko) Version/4.0 Safari/534.13 User-RealAgent: mod-dup WT_SSL_SRC: 0 X-COMPARE-TRANSLATED: 1 X-DUPLICATED-REQUEST: 1 X-Forwarded-For: 1.1.1.1 X-Forwarded-Host: my.dns.fr X-Forwarded-Server: my.dns.fr X_DUP_HTTP_STATUS: 206 X_WASSUP_MSISDN: UNAVAILABLE screencapability: high x-wap-profile: http://wap.samsungmobile.com/uaprof/GT-P7510.xml ------------------- Key missing in the destination map : Key with value differences : 'Content-Range' ==> 'bytes 0-3604/3605'/'bytes 0-3601/3602' ------------------- END DIFFERENCE n°:1292598226
- Pro: The log is prettyfied for readibility
- Pro: Sequentiality is garanteed in one file on one server: each thread acquires a system wide mutex before writing to the log.
- Con: When centralizing the logs with syslog through relay servers, since it is multi-line, lines from different requests may be interleaved
- Con: The format is not simple to analyze programmatically
{ "id":1292598226, "date":"2015-03-30 15:23:20.091237", "runtime":{ "DUP":0, "COMP":13, "DIFF":-13 }, "request": { "uri":"/my/api?filter=field" "url":"/spp/main", "args":{ "version":"1.0.0", "request":"getPNS", "sid":"PUSHMS", ... }, "header":{ "Accept-Charset":"utf-8, iso-8859-1, utf-16, *;q=0.7", ..., "x-wap-profile":"http://wap.samsungmobile.com/uaprof/GT-P7510.xml" } }, "diff":{ "header":{ "Content-Range": { "src":"bytes 0-3604/3605", "dst":"bytes 0-3601/3602" } } } }
{ "id": Int : ID of the request "date": String : Date of the request format YYYY-MM-dd HH:mm:ss.SSSSSS "runtime": Dictionnary of ResponseTime for each PnS component (DUP,COMP,...) "request": Dictionnary of request's information { "uri": (Optionnal) String : URL complete "url": String : Requested url : all which precede the '?' "args":Dictionnary of the arguments of the request : split the string following the '?' with '&' to determine all the couples key/values then split on '=' to extract those into json, "header": Dictionnary of the http header with the same key/values }, "status": Dictionnary of each status code returned by the PnS components (DUP,COMP,...) "diff":{ "header": Dictionnary of diff concerning the header { "key" : Header key name presenting a difference {"src":String, "dst":String}, The two string having a diff ... } "body": Dictionnary of information concerning the body { "posDiff": Int : Line Counter of positive differences "negDiff": Int : Line Counter of negative differences "negList": list(string) : List of all the xml tags in negative differences "posList": list(String) : List of all the xml tags in negative differences "full": (Optionnal) String : Complete diff (same as log) } } }
The counters provide us a way to exclude diff where only a response time is different for example :
<Process> -<ResponseTime>104</responsetime> +<ResponseTime>92</responsetime> </process>will be transform into :
{ "negDiff":1, "posDiff":1, "negList":["ResponseTime"], "posList":["ResponseTime"], "full":" <Process>\n-<ResponseTime>104</responsetime>\n+<ResponseTime>92</responsetime>\n </process>" }which can be easily filtered/excluded then in ElasticSearch/Kibana, such as :
ResponseTime in postList && posDiff==1
The json can then be injected in an ElasticSearch cluster with the ELK stack
TODO: Add logstash example configuration
The responses are printed in boost::serialization text archive format: http://www.boost.org/doc/libs/1_64_0/libs/serialization/doc/index.html
In cases where your body is too big, your logging system (e.g: rsyslog) might not allow you to print a line of more than N thousands of characters, in these cases, mod_compare will truncate (json) or split the diff (multiline).
Lines are automatically split per XML Tag, allowing a simpler line-by-line diff even if your original xml is not one tag per line.