mod_dup configuration - Orange-OpenSource/mod_dup GitHub Wiki

This page explains how to activate and configure mod_dup.

mod_dup works by building a list of commands per Apache Location and per DupDestination.

There must be at least one DupFilter or DupRawFilter per DupDestination for requests to be duplicated.

Example of a minimal configuration in /etc/apache2/mods-enabled/dup.conf:

<Location /my_duplicated_location>
    # Duplicate 15% of all matched requests to www.stagingserver.com
    DupDestination www.stagingserver.com 15
    DupDuplicationType COMPLETE_REQUEST
    # Match all the requests that contain anything
    DupRawFilter ".*"
</Location>

Match ordering

Filters are evaluated in the following order: prevent filters come before duplication filters; within each group, "key" filters come before "raw" filters; and within each category, the scopes are checked in the natural order of an HTTP request. In detail, this gives the following order:

  1. DupPreventFilter
    1. QUERY_STRING
    2. HEADERS
    3. BODY
  2. DupPreventRawFilter
    1. METHOD
    2. PATH
    3. QUERY_STRING
    4. HEADERS
    5. BODY
  3. DupFilter
    1. QUERY_STRING
    2. HEADERS
    3. BODY
  4. DupRawFilter
    1. METHOD
    2. PATH
    3. QUERY_STRING
    4. HEADERS
    5. BODY
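
As a sketch of what this ordering implies, assuming the semantics described above: in the Location below, the prevent filter is declared last but is still evaluated before the DupFilter. The path /ordering_example and the parameter names infos and debug are hypothetical and only serve the illustration.

```apache
<Location /ordering_example>
    DupDestination www.stagingserver.com
    DupDuplicationType COMPLETE_REQUEST
    DupApplicationScope QUERY_STRING
    # Declared first, but evaluated after every prevent filter
    DupFilter "infos" "(A|B|C)"
    # Declared last, but evaluated first: a match here prevents any duplication
    DupPreventFilter "debug" "true"
</Location>
```

With such a configuration, a request like /ordering_example?infos=A&debug=true would be expected not to be duplicated, because the prevent filter matches before the DupFilter is consulted.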

Configuration Limitations

Appliers set options on the filters, and it is the filters that actually hold the configuration in memory. Configurations are currently indexed per destination; for each destination, they form a list of RawFilters in the order in which you declare them in the Apache configuration.

This means that you cannot duplicate to the same destination with the same filter options but a different scope. If you declare different filters on the same destination, a request is duplicated to it at most once, so you should order the filters from most precise to least precise.

Example of an unexpected behavior in /etc/apache2/mods-enabled/dup.conf:

<Location /my_duplicated_location>
    # Duplicate 15% of all matched requests to www.stagingserver.com
    DupDestination www.stagingserver.com 15
    DupDuplicationType COMPLETE_REQUEST
    # Match all the requests that contain anything
    DupRawFilter ".*"
    DupDestination www.stagingserver.com 15
    DupDuplicationType COMPLETE_REQUEST
    # Match all the requests that contain anything
    DupRawFilter ".*"
</Location>
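
By contrast, a minimal sketch of the recommended ordering on a single destination might look like the following. The path fragment /special is a hypothetical value chosen for the illustration; the sketch assumes the at-most-once behavior described above.

```apache
<Location /my_duplicated_location>
    DupDestination www.stagingserver.com
    DupDuplicationType COMPLETE_REQUEST
    DupApplicationScope PATH
    # Most precise filter first
    DupRawFilter "/my_duplicated_location/special"
    # Least precise (catch-all) filter last
    DupRawFilter ".*"
</Location>
```

Since a request is duplicated at most once per destination, the catch-all filter only matters for requests that the more precise filter did not match.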

1. Location dependent directives

The following directives are only accessible under an Apache location.

1.1. Appliers: must be declared before Filters to select what the Filters apply to

DupDestination <host>[:<port>] [dup_percentage]

  • The host and port to which requests matching the following filters are sent.
  • Port defaults to 80 if unspecified.
  • By default, 100% of matched requests are duplicated, but you may also specify a percentage (ranging from 0 to 1000), for instance if your staging servers are less powerful than production, or if you want to amplify requests to test performance.
  • This directive is mandatory: it must be present at least once per Location.
  • Every time this directive is defined, it changes the destination of the next DupFilter, DupRawFilter, DupPreventFilter, DupPreventRawFilter, DupSubstitute, DupRawSubstitute.
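
The following sketch illustrates how successive DupDestination declarations could be combined, using only the syntax described above. The port 8080 and the percentage are illustrative values, not recommendations.

```apache
<Location /my_duplicated_location>
    # No port given, so port 80 is used; 15% of matched requests are duplicated
    DupDestination www.stagingserver.com 15
    DupDuplicationType COMPLETE_REQUEST
    DupRawFilter ".*"

    # Explicit port, no percentage: 100% of matched requests are duplicated
    DupDestination www.otherserver.com:8080
    DupDuplicationType COMPLETE_REQUEST
    DupRawFilter ".*"
</Location>
```

Each DupRawFilter is attached to the DupDestination declared most recently before it, so each destination gets its own filter here.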

DupApplicationScope <METHOD|PATH|QUERY_STRING|URL|HEADERS|URL_AND_HEADERS|BODY|ALL>

  • METHOD - apply the following filters on the HTTP method, e.g. GET, POST, PATCH, ...
  • PATH - apply the following filters on the path part of the URL
  • QUERY_STRING - default - apply the following filters on the query string part of the URL (characters after the question mark: ?)
  • URL - apply the following filters on the whole URL, i.e. PATH and QUERY_STRING
  • HEADERS - apply the following filters on the HTTP headers
  • URL_AND_HEADERS - apply the following filters on the whole URL (PATH and QUERY_STRING) and the HTTP headers
  • BODY - apply the following filters on the request body only
  • ALL - apply the following filters on the whole request: method, path, query string, headers, and body.
  • Every time this directive is defined, it changes the scope to which the next DupFilter, DupRawFilter, DupPreventFilter, DupPreventRawFilter, DupSubstitute, DupRawSubstitute are applied.
                           URL
              |-----------------------------------|
        METHOD     PATH                QUERY_STRING
          |-| |---------------------| |-----------|
          PUT /mylocation/myfile.json?myarg=myvalue HTTP/1.1
HEADERS { Host: www.example.com:8080
        { Content-Length: 1234

BODY    { Body content of length 1234...

DupDuplicationType <NONE|HEADER_ONLY|COMPLETE_REQUEST|REQUEST_WITH_ANSWER>

  • Sets the duplication type of all the filters following this declaration. Possible values are:
  • NONE - Do not duplicate, the default
  • HEADER_ONLY - Duplicates only the HTTP headers of matching requests; use it to skip body analysis (HTTP GET only)
  • COMPLETE_REQUEST - Duplicates the HTTP headers and body of matching requests; this is the most common use case
  • REQUEST_WITH_ANSWER - Duplicates the HTTP request and answer of matching requests; works only in conjunction with mod_compare on the receiving end. Changes the Content-Type to application/x-dup-serialized.
  • Every time this directive is defined, it changes the duplication type of the next DupFilter, DupRawFilter, DupPreventFilter, DupPreventRawFilter, DupSubstitute, DupRawSubstitute.
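
A minimal sketch of mixing duplication types within one Location, assuming the per-filter behavior described above. The host names are reused from other examples on this page.

```apache
<Location /my_duplicated_location>
    # First destination receives the headers only
    DupDestination www.stagingserver.com
    DupDuplicationType HEADER_ONLY
    DupRawFilter ".*"

    # Second destination receives headers and body
    DupDestination www.otherserver.com
    DupDuplicationType COMPLETE_REQUEST
    DupRawFilter ".*"
</Location>
```

Since the default type is NONE, leaving out DupDuplicationType entirely would mean nothing is duplicated.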

1.2. Filters: Incoming requests must be "filtered" to be matched and duplicated.

If multiple filters are specified, at least one of them needs to match for a request to be duplicated.
You may add as many filters and prevent-filters as you like.


DupFilter <param> <regexp>

  • Matches the content of GET or POST parameters against a regular expression.
  • For DupApplicationScope QUERY_STRING and BODY, it matches arguments with the format param=regexp
  • For DupApplicationScope HEADERS, it matches arguments with the format param: regexp
  • The param name is matched case-insensitively
  • The scope on which the filter applies depends on the last DupApplicationScope directive call
  • The destination to which the request is duplicated depends on the last DupDestination directive call

Example:

   DupDestination www.otherserver.com
   DupApplicationScope QUERY_STRING
   DupFilter "INFOS" "(A|B|C)"

This matches curl 'localhost/mypath?param1=x&infos=A&other=none' and does not apply the filter to the body of the request.


DupRawFilter <regexp>

  • Filters the content of the whole application scope previously specified
  • Useful for matching text other than key=value query string arguments or key: value HTTP headers
  • The scope on which the filter applies depends on the last DupApplicationScope directive call
  • The destination to which the request is duplicated depends on the last DupDestination directive call

Example 1:

 DupRawFilter ".*"

This matches anything, whatever the application scope

Example 2:

 DupApplicationScope BODY
 DupRawFilter "."

This matches anything that has at least one character in the body


1.3. Prevent Filters: the following directives define filters that, when matched, will stop the parsing of the request and prevent any duplication.

They are applied before any Filter or RawFilter.
If multiple PreventFilters are specified, one of them matching is enough to prevent duplication of the request.
You may add as many filters and prevent-filters as you like.


DupPreventFilter <param> <regexp>

  • Filters the content of GET or POST parameters using a regular expression which needs to match.
  • For DupApplicationScope QUERY_STRING and BODY, it matches arguments with the format param=regexp
  • For DupApplicationScope HEADERS, it matches arguments with the format param: regexp
  • When the filter matches, no duplication is performed

Example:

   DupApplicationScope "HEADERS"
   DupPreventFilter "Content-Encoding" ".*gzip.*"
   DupFilter "Content-Type" "application/json"

The DupPreventFilter matches curl -H'Content-Encoding: gzip' 'localhost?param1=x&infos=A&other=none', so that request is not duplicated. In contrast, curl -H'Content-Type: application/json' 'localhost?param1=x&infos=A&other=none' is duplicated, because it does not match the DupPreventFilter and it does match the DupFilter.


DupPreventRawFilter <regexp>

  • Filters the content of the whole application scope using a regular expression which needs to match.
  • Useful for matching queries which don't use HTTP query string args or HTTP headers
  • When the filter matches, no duplication is performed

Example:

 DupPreventRawFilter "Some secret sentence"

1.4. Substitutions: Once mod_dup decides to duplicate a request it will apply all substitutions in their defined order.


DupSubstitute <argument> <regexp> <replacementvalue>

  • Applies the regexp regular expression on specified argument.
  • All matches found will be replaced by replacementvalue, not just the first match.
  • argument is matched case insensitively

Example 1:

 DupSubstitute "param3" "(.+)" "new_value"

Replaces the value of the attribute param3 with "new_value".

Example 2:

 DupSubstitute "param4" "bar" "foo"

Duplicates the query /mypath?param4=tiki_bar,dive_bar into /mypath?param4=tiki_foo,dive_foo and sends that to DupDestination


DupRawSubstitute <regexp> <replace>

  • Same as DupSubstitute but applies to the whole application scope previously defined

Example:

 DupRawSubstitute "(.*) wrong words (.*)" "\1 fixed stuff \2. FTFY"
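
To show where a substitution sits relative to the appliers and filters, here is a sketch that builds on Example 2 of DupSubstitute. The Location path and destination are assumptions, and the example assumes substitutions follow the last declared destination and scope, as stated for the appliers above.

```apache
<Location /mypath>
    DupDestination www.otherserver.com
    DupDuplicationType COMPLETE_REQUEST
    DupApplicationScope QUERY_STRING
    # Duplicate requests whose param4 contains "bar"
    DupFilter "param4" "bar"
    # Rewrite every "bar" in param4 to "foo" in the duplicated request
    DupSubstitute "param4" "bar" "foo"
</Location>
```

Under these assumptions, /mypath?param4=tiki_bar,dive_bar would be sent to the destination as /mypath?param4=tiki_foo,dive_foo.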

2. VirtualHost or Server Directives

These all have default values and don't need to be redefined for mod_dup to work.


DupErrorLogBodyMatch "<regex>"

  • In case of a duplication error, prints out in log only the part of the body that matches this regex
  • Default is to log the whole body of the request

Example:

 DupErrorLogBodyMatch "credentials.+?\}"

Generates logs such as this one (in syslog facility local2 with level error):

 local2.err: 2018-02-07T15:56:43.167265+01:00 ecourreges-virtual-machine - apache2: [DUP] Sending request failed with curl error code: 7, request uri: localhost:16555/dup_test/dup?love=hate&second=location, matched body: credentials":{                "phone":"12345",                "id":777}

DupQueue <min> <max>
Default 1 10
Sets the minimum and maximum size of the internal request queue of each thread.
Once the maximum size is reached, a new thread will be spawned.
If the size falls below the minimum a thread is destroyed.


DupThreads <min> <max>
Default 1 10
Sets the minimum and maximum number of threads per Apache process.
If the maximum number of threads is reached, and all queues are full, new requests will get dropped.


DupTimeout <ms>
Default 0 : no timeout
The timeout for outgoing duplication requests, in milliseconds. Sets the curl option CURLOPT_TIMEOUT_MS accordingly. Please set this to a value higher than your destination API's timeout, or you will receive curl error 28 (timeout) and no proper HTTP code explanation in mod_dup's logs.


DupName <name>
Default ModDup
The name of the program which gets displayed in the logs as a prefix.


DupUrlCodec apache
Default unset (the default URL codec is used)
Sets the URL codec to use during the request duplication process. By default, + is decoded as a space; the apache URL codec does not decode + as a space.
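
The directives of this section could be combined as in the sketch below, placed at server or VirtualHost level. All values are illustrative assumptions rather than recommended settings.

```apache
# Server or VirtualHost level, outside any <Location>
# Prefix shown in the logs
DupName MyDup
# Give up on an outgoing duplication request after 2000 ms
DupTimeout 2000
# Do not decode + as a space
DupUrlCodec apache
# On duplication errors, log only the matching part of the body
DupErrorLogBodyMatch "credentials.+?\}"
```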

3. A complete example

DupQueue 1 100
DupThreads 10 10
DupTimeout 600

# We activate DupModule on the location /main/server
# Requests whose path does not begin with this path will therefore not be interpreted by mod_dup
<Location /main/server>
  # For this location we duplicate the request with the server answer
  DupDuplicationType REQUEST_WITH_ANSWER
  
  # The following instructions will only apply on the HEADERS part of the request
  DupApplicationScope HEADERS

  # The host (ip or name) to send the requests to and the port
  DupDestination "myTargetHost:8080"
          
  ## Duplication of any SID
  ## This filter only applies to the request HEADER
  DupFilter "SID" "."

  ## No duplication of requests with non migrated fields
  ## This filter only applies to the request HEADER
  DupPreventFilter "MOVIES" "Pulp|Fiction"

  # The following instructions will only apply on the BODY part of the request
  DupApplicationScope BODY

  ## Duplication of the requests containing &random=true in their body
  ## This filter only applies to the request BODY
  DupFilter "Random" "true"

  Allow from all
</Location>

4. Troubleshooting

  • First, if you have problems starting Apache, your configuration is probably incorrect. Read the error messages carefully: they hint at what went wrong.
  • Once Apache starts, it is often hard to know whether a request is duplicated or not.
    For this, you can add the "X_DUP_LOG: true" header to your request to get information about what happened or did not happen.

A path with no DupFilter or DupRawFilter defined returns nothing of interest (no X_DUP_LOG header in the answer):

curl -H'X_DUP_LOG: true' -i localhost:8042/dup_test/comp_test1
HTTP/1.1 200 OK
Date: Wed, 07 Feb 2018 12:25:13 GMT
Server: Apache/2.2.22 (Ubuntu)
Last-Modified: Mon, 10 Jul 2017 16:07:52 GMT
ETag: "40088-4-553f8c8f44600"
Accept-Ranges: bytes
Content-Length: 4

BODY

A path with mod_dup properly configured but no matching filter returns an X_DUP_LOG header stating that the request was not duplicated:

HTTP/1.1 200 OK
Date: Wed, 07 Feb 2018 12:25:34 GMT
Server: Apache/2.2.22 (Ubuntu)
UNIQUE_ID: 1157860589
Last-Modified: Mon, 10 Jul 2017 16:07:52 GMT
ETag: "4008b-3-553f8c8f44600"
Accept-Ranges: bytes
Content-Length: 3
X_DUP_LOG: The request is not duplicated, having found 3 DupDestination(s) and attempted to match 10 DupFilter or DupRawFilter
X-COMPARE-STATUS: No Duplication, No Comparison

DUP

A request that matches 2 destinations is duplicated twice:

HTTP/1.1 200 OK
Date: Wed, 07 Feb 2018 13:39:09 GMT
Server: Apache/2.2.22 (Ubuntu)
UNIQUE_ID: 697824997
Last-Modified: Mon, 10 Jul 2017 16:07:52 GMT
ETag: "4008b-3-553f8c8f44600"
Accept-Ranges: bytes
Content-Length: 3
X_DUP_LOG: The request is duplicated, URL_AND_HEADERS filter: "location" matched: "location" Destination: localhost:16555 AND URL_AND_HEADERS filter: "hate" matched: "hate" Destination: localhost:8043
X-COMPARE-STATUS: NOT REACHED - curl status Couldn't connect to server

DUP