Dispatcher - bartoszWesolowski/aem-tips GitHub Wiki
- Adobe Experience Manager's caching and/or load balancing tool
Content update:
- deletes modified files from the cache
- deletes all files matching the file path (if .html was invalidated then .json will be too)
- touches the `statfile`: updates its timestamp to indicate the date of the last change
- invalidated files are removed but not replaced immediately

Auto-invalidation:
- invalidates parts of the cache without deleting any files
- at content update the `statfile` timestamp is changed to reflect the last content update
- when a request for a file is made, Dispatcher checks whether the file is newer than the `statfile` and, depending on the outcome, serves the static file or fetches the content from AEM
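The statfile comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Dispatcher's actual code: the function name `serve_from_cache` is hypothetical, and treating a missing statfile as "nothing invalidated yet" is an assumption.

```python
import os


def serve_from_cache(cached_file: str, statfile: str) -> bool:
    """Return True if the cached file may be served without calling AEM."""
    if not os.path.exists(cached_file):
        # Nothing cached yet, so the content has to come from AEM.
        return False
    if not os.path.exists(statfile):
        # Assumption: without a statfile nothing has been invalidated so far.
        return True
    # The cached file is valid only if it is newer than the statfile.
    return os.path.getmtime(cached_file) > os.path.getmtime(statfile)
```

A cached page written before the last content update (older than the statfile) yields `False`, so the page is fetched from AEM again.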
By default Dispatcher will call AEM when:
- The request URI contains a question mark "?". This usually indicates a dynamic page, such as a search result, which does not need to be cached.
- The file extension is missing. The web server needs the extension to determine the document type (the MIME type).
- The authentication header is set (this can be configured).
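As a rough illustration of the three conditions above, the check could look like the Python sketch below. The function `must_call_aem` and its `allow_authorized` parameter are hypothetical names chosen for this example; the parameter stands in for the "can be configured" part and is an assumption about how that switch would behave.

```python
import posixpath
from typing import Mapping
from urllib.parse import urlsplit


def must_call_aem(request_uri: str, headers: Mapping[str, str],
                  allow_authorized: bool = False) -> bool:
    """Return True if the request should be passed to AEM instead of the cache."""
    # 1. A question mark usually indicates a dynamic page.
    if "?" in request_uri:
        return True
    # 2. Without an extension the web server cannot determine the MIME type.
    _, extension = posixpath.splitext(urlsplit(request_uri).path)
    if not extension:
        return True
    # 3. An authentication header is set (unless configured otherwise).
    has_auth = any(name.lower() == "authorization" for name in headers)
    if has_auth and not allow_authorized:
        return True
    return False
```

For example, `must_call_aem("/content/en.html", {})` is `False`, while the same path with a query string or an `Authorization` header is sent to AEM.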
- ensures that all documents for one user are composed on the same AEM instance
- it's important for personalized pages and session data (logged-in user)
- when using sticky connections, reconsider how caching should be implemented to avoid caching user-related data
- used to improve authoring performance
- [download dispatcher](https://docs.adobe.com/content/help/en/experience-manager-dispatcher/using/getting-started/release-notes.html)
- put `dispatcher-apache<...>.so` into the Apache httpd `modules` directory
- copy the `dispatcher.any` file to the Apache httpd `conf` directory
In Apache httpd.conf:
- add `LoadModule dispatcher_module modules/dispatcher-apache<...>.so`
- configure Dispatcher with directives like `DispatcherConfig`, `DispatcherLog` and `DispatcherLogLevel`
<IfModule disp_apache2.c>
DispatcherConfig conf/dispatcher.any
DispatcherLog logs/dispatcher.log
# log level: 3 - debug, 0 - error
DispatcherLogLevel 3
DispatcherDeclineRoot 0
# 0 - uses the original URL, 1 - uses the URL processed by handlers triggered before Dispatcher (rewrites)
DispatcherUseProcessedURL 0
# 0 - errors handled by AEM, 1 - errors handled by Apache
DispatcherPassError 0
DispatcherKeepAliveTimeout 60
</IfModule>
- Set handler:
<VirtualHost 123.45.67.89>
ServerName www.mycompany.com
DocumentRoot /usr/apachecache/docs
<Directory /usr/apachecache/docs>
<IfModule disp_apache2.c>
SetHandler dispatcher-handler
</IfModule>
AllowOverride None
</Directory>
</VirtualHost>
- by default stored in the `dispatcher.any` file
- define how Dispatcher should handle a specific website and URL
- single farm - all requests are handled in the same way
- the `/farms` property is multi-valued
Including config files:
/farms
{
$include "myFarm.any" # or "farm_*.any" to include all files matching
}
Environment variables can be used:
/renders {
/0001 {
/hostname "${PUBLISH_IP}"
/port "8443"
}
}
- `/clientheaders` - defines which headers will be passed from the client to AEM
- the list must contain all headers that will be passed (if customization is needed)
/clientheaders
{
"CSRF-Token"
"X-Forwarded-Proto"
"referer"
...
}
- list of all hostname/URI combinations that Dispatcher accepts for this AEM instance (farm)
- `*` can be used as a wildcard
- format: `[scheme]host[uri][*]`
/virtualhosts
{
"www.myCompany.com"
"www.mySubDivison.*"
}
All requests:
/virtualhosts
{
"*"
}
Matching a virtual host
- evaluation starts from the lowest farm and goes up
- within a farm it starts with the top virtual host and goes down
- the first virtual host that matches the scheme, host and URI is used
- if none is found, the first virtual host that matches the host is used
- if none is found, the topmost virtual host in the topmost farm is used (so the default one should be the first virtual host in the first farm)
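The matching order above can be sketched in Python as follows. This is a simplified model under stated assumptions, not Dispatcher's implementation: `split_pattern` and `select_farm` are hypothetical helpers, patterns are matched with `fnmatch` as an approximation of Dispatcher wildcards, and a pattern that omits the scheme or URI part is assumed to place no constraint on it.

```python
from fnmatch import fnmatchcase
from typing import List, Optional, Tuple

Farm = Tuple[str, List[str]]  # (farm name, /virtualhosts patterns in file order)


def split_pattern(pattern: str) -> Tuple[Optional[str], str, Optional[str]]:
    """Split a [scheme]host[uri][*] pattern into (scheme, host, uri)."""
    scheme, sep, rest = pattern.partition("://")
    if not sep:
        scheme, rest = None, pattern
    host, slash, path = rest.partition("/")
    uri = slash + path if slash else None
    return scheme, host.lower(), uri


def select_farm(farms: List[Farm], scheme: str, host: str, uri: str) -> Optional[str]:
    """Return the name of the farm that handles the request."""
    host = host.lower()
    # Lowest farm first; within a farm, top virtual host first.
    candidates = [(name, split_pattern(p))
                  for name, patterns in reversed(farms) for p in patterns]
    # Pass 1: scheme, host and URI all match (omitted parts match anything).
    for name, (p_scheme, p_host, p_uri) in candidates:
        if (fnmatchcase(host, p_host)
                and (p_scheme is None or fnmatchcase(scheme, p_scheme))
                and (p_uri is None or fnmatchcase(uri, p_uri))):
            return name
    # Pass 2: only the host matches.
    for name, (_, p_host, _) in candidates:
        if fnmatchcase(host, p_host):
            return name
    # Fallback: the topmost farm.
    return farms[0][0] if farms else None
```

With two hypothetical farms, `[("farm_a", ["www.mycompany.com"]), ("farm_b", ["www.mycompany.com/products/*"])]`, a request for `/products/gloves.html` is routed to `farm_b` and a request for `/about.html` to `farm_a`.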
- under the `/farms` property
- in `/cache`, `/allowAuthorized` must be set to `0`
- used along with CUGs to make pages login protected
/sessionmanagement
{
/directory "/usr/local/apache/.sessions" # required - directory where sessions are stored
/encode "md5" #(default to md5, can be hex)
/header "HTTP:authorization" # (optional), header (HTTP: prefix), or cookie (COOKIE: prefix) that defines where authorization info is stored
/timeout "800" #(optional number of seconds that will cause the session to expire after not being used)
}
- the `/renders` property defines the AEM instance that will be used to render the actual content

Options:
- `/receiveTimeout` - number of milliseconds that the response can take; the default value is 10 minutes (`"600000"`). If the timeout is reached while the response headers are being parsed, a 504 is returned. If it is reached while the response body is being read, the incomplete response is not kept (the cache file is deleted).
- `/secure` - if set to `"1"`, HTTPS is used to communicate with AEM
/renders
{
/myRenderer
{
# hostname or IP of the renderer, "127.0.0.1" if aem is running on same machine
/hostname "aem.myCompany.com"
# port of the renderer
/port "4503"
# connection timeout in milliseconds, "0" (default) waits indefinitely
/timeout "0"
}
}
- requests will be distributed equally between both machines
/renders
{
/myFirstRenderer
{
/hostname "aem.myCompany.com"
/port "4503"
}
/mySecondRenderer
{
/hostname "127.0.0.1"
/port "4503"
}
}
- for configuring access to content
- determines which requests are accepted
- every request that is not accepted gets a 404 response
- if no filter defined then all requests are accepted
- best to use with whitelist strategy -> deny everything, allow what's needed
A filter rule consists of:
- type: `/type` set to `"allow"` or `"deny"`
- element of the request: `/method`, `/url`, `/query`, `/protocol`, `/path`, `/selectors`, `/extension`, `/suffix` and the special `/glob`, which matches the whole request line
- rules are matched against a request line of the form `Method Request-URI HTTP-Version<CRLF>`, for example: `GET /content/geometrixx-outdoors/en.html HTTP/1.1<CRLF>`
- when creating filter rules use double quotes for simple patterns; if a pattern is a regular expression then use single quotes
- trace logging can be used to debug filtering
/filter {
/0001 { /glob "*" /type "deny" }
/0002 { /type "allow" /method "POST" /url "/content/[.]*.form.html" }
/0003 { /type "deny" /url "/publish/libs/cq/workflow/content/console/archive*" }
/0004 { /type "allow" /url "/libs/cq/workflow/content/console/archive*" }
/0005 { /type "allow" /extension '(css|gif|ico|js|png|swf|jpe?g)' }
/0081
{
/type "deny"
/selectors '((sys|doc)view|query|[0-9-]+)'
/extension '(json|xml)'
}
}
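To make the evaluation order of such a rule list easier to follow, here is a simplified Python model. It assumes that the last matching rule decides (as Dispatcher does when several filter patterns apply) and that a request matching no rule is rejected. The `Rule` class, `extension_of` and `is_allowed` are hypothetical names, only a few request elements are modelled, and `fnmatch` is only an approximation of Dispatcher's pattern syntax.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional


def extension_of(url: str) -> str:
    """Approximate the extension as the text after the last dot of the last segment."""
    last_segment = url.rsplit("/", 1)[-1]
    return last_segment.rsplit(".", 1)[1] if "." in last_segment else ""


@dataclass
class Rule:
    allow: bool                             # /type "allow" or "deny"
    method: Optional[str] = None            # /method
    url_glob: Optional[str] = None          # /url as a simple pattern
    extension_regex: Optional[str] = None   # /extension as a regular expression

    def matches(self, method: str, url: str) -> bool:
        if self.method is not None and method != self.method:
            return False
        if self.url_glob is not None and not fnmatchcase(url, self.url_glob):
            return False
        if self.extension_regex is not None and not re.fullmatch(
                self.extension_regex, extension_of(url)):
            return False
        return True  # a rule without conditions matches every request


def is_allowed(rules: List[Rule], method: str, url: str) -> bool:
    if not rules:
        return True  # no filter defined: all requests are accepted
    decision = False  # requests that match no rule are rejected
    for rule in rules:
        if rule.matches(method, url):
            decision = rule.allow  # a later matching rule overrides earlier ones
    return decision
```

This ordering is why the deny-everything rule comes first: later, more specific `allow` rules can override it, and later `deny` rules can narrow those again.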
- a filter can be used to restrict which query strings are allowed
- if a rule contains `/query` then it matches only if the request's query string matches the query pattern
/filter {
/0001 { /type "deny" /method "POST" /url "/etc/*" }
/0002 { /type "allow" /method "GET" /url "/etc/*" /query "a=*" } # request will be accepted only if it contains the query
}
- configured under the farm section
- when enabled, Dispatcher periodically calls AEM to get the list of vanity URLs
- if a page is denied by the `/filter` config then Dispatcher checks the vanity URL list, and if the denied URL is on the list access is allowed
- AEM requires an additional package to support vanity URLs
/vanity_urls {
/url "/libs/granite/dispatcher/content/vanityUrls.html" # path to vanity servlet
/file "/tmp/vanity_urls" # path to the file where vanity urls are stored
/delay 300 # seconds between calls to the aem servlet to update urls
}
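The interplay between the filter result, the vanity URL list and the `/delay` setting can be sketched as below. Both functions are hypothetical illustrations; how Dispatcher schedules its refresh internally is an assumption here.

```python
import time
from typing import Iterable, Optional


def vanity_refresh_due(last_fetch: float, delay_seconds: float,
                       now: Optional[float] = None) -> bool:
    """Return True once at least /delay seconds have passed since the last fetch."""
    now = time.time() if now is None else now
    return now - last_fetch >= delay_seconds


def request_allowed(path: str, allowed_by_filter: bool,
                    vanity_urls: Iterable[str]) -> bool:
    """A request denied by /filter is still allowed if its path is a known vanity URL."""
    return allowed_by_filter or path in set(vanity_urls)
```

For example, `request_allowed("/summer-sale", False, ["/summer-sale"])` is `True` even though the filter denied the path.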
- if multiple patterns apply for a request then the last one is effective
- `/docroot` - directory where files are cached; must be the same as the document root of the web server
- `/serveStaleOnError` - if set to `"1"`, Dispatcher will not delete invalidated content unless the render server returns a successful response; if AEM responds with an error then the outdated HTML file is served with HTTP status 111
- `/allowAuthorized` - if set to `"1"`, requests containing authorization headers (`authorization`) or cookies (`authorization` or `login-token`) can be cached. The default `"0"` prevents cached documents from being served to users who do not have the required rights.
/cache
{
/docroot "/opt/dispatcher/cache"
/statfile "/tmp/dispatcher-website.stat"
/allowAuthorized "0"
/rules
{
# List of files that are cached
}
/invalidate
{
# List of files that are auto-invalidated
}
}
Dispatcher will never cache when
- the request URI contains `?` (dynamic pages)
- the file extension is missing (the extension is needed to get the MIME type)
- the authentication header is set (can be configured)
- AEM responds with `no-cache`, `no-store` or `must-revalidate` headers
- when you have 2 dispatchers make sure that each request always goes through only one of them: Dispatcher does not handle requests that come from another dispatcher