EtagSupport - danielbodart/webfabric GitHub Wiki
ETags are a great way to reduce bandwidth usage at the expense of a slight increase in CPU and memory usage on the server. WebFabric adds strong ETag support to any resource (static or dynamic); the ETag is calculated from the MD5 sum of the data returned in the body of the HTTP message. As a bonus, it also adds the "Content-MD5" header so that clients that support it can verify the resource was not corrupted in transit.
Lastly, it sets the "Last-Modified" date to the current time (ignoring the filesystem date for static content) so that if you update your SiteMesh decorators you will always see the result when the client refreshes the page, even with no cache directives set.
The HTTP spec states that origin servers should send ETag and Last-Modified whenever possible, and this filter makes it very easy to do so.
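To illustrate how the ETag and Content-MD5 values relate to the body, here is a minimal Java sketch that derives both from the same MD5 digest. The class `Md5Headers` and its method names are hypothetical and are not WebFabric's own code; the sketch only assumes, as described above, that the ETag is the hex-encoded digest in double quotes and Content-MD5 is the Base64-encoded digest.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper for illustration only; not part of WebFabric.
final class Md5Headers {

    // MD5 digest of the complete response body.
    static byte[] md5(byte[] body) {
        try {
            return MessageDigest.getInstance("MD5").digest(body);
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    // Strong ETag: the hex-encoded digest wrapped in double quotes.
    static String strongEtag(byte[] body) {
        StringBuilder hex = new StringBuilder();
        for (byte b : md5(body)) {
            hex.append(String.format("%02x", b));
        }
        return "\"" + hex + "\"";
    }

    // Content-MD5: the same digest, Base64-encoded.
    static String contentMd5(byte[] body) {
        return Base64.getEncoder().encodeToString(md5(body));
    }

    public static void main(String[] args) {
        byte[] body = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println("ETag: " + strongEtag(body));        // ETag: "900150983cd24fb0d6963f7d28e17f72"
        System.out.println("Content-MD5: " + contentMd5(body)); // Content-MD5: kAFQmDzST7DWlj99KOF/cg==
    }
}
```

Because both headers come from one digest, the body only has to be hashed once per response.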
ETags can be used with or without caching. A simple way to think about it is that caching controls whether a request is sent to the server at all, while ETags (and Last-Modified) control whether the server sends the full response body back to the client.
Let's look at some requests in action.
A client visits a website for the first time and requests http://www.webfabric.org/test/abc.txt:
GET /test/abc.txt HTTP/1.1
Host: www.webfabric.org
The server responds:
HTTP/1.1 200 OK
Last-Modified: Mon, 03 Aug 2009 20:46:47 GMT
Etag: "900150983cd24fb0d6963f7d28e17f72"
Content-MD5: kAFQmDzST7DWlj99KOF/cg==
Content-Type: text/plain
Content-Length: 3
abc
The next time the client goes to the site, it tells the server which version it currently has via the "If-None-Match" header (along with "If-Modified-Since"):
GET /test/abc.txt HTTP/1.1
Host: www.webfabric.org
If-Modified-Since: Mon, 03 Aug 2009 20:46:47 GMT
If-None-Match: "900150983cd24fb0d6963f7d28e17f72"
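From the client side, the conditional request above could be built roughly as follows. This is a minimal sketch using the JDK's `java.net.http` API (Java 11 or later); the class `ConditionalRequest` and the method `revalidate` are hypothetical names, and the sketch only constructs the request without sending it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side helper for illustration only.
final class ConditionalRequest {

    // Builds a GET that carries the validators from the previous response.
    static HttpRequest revalidate(String url, String etag, String lastModified) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("If-None-Match", etag)             // ETag value, quotes included
                .header("If-Modified-Since", lastModified) // Last-Modified value, unchanged
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = revalidate(
                "http://www.webfabric.org/test/abc.txt",
                "\"900150983cd24fb0d6963f7d28e17f72\"",
                "Mon, 03 Aug 2009 20:46:47 GMT");
        // Print the headers that would be sent with the request.
        request.headers().map().forEach((name, values) ->
                System.out.println(name + ": " + String.join(", ", values)));
    }
}
```

Browsers do this automatically; a sketch like this is only relevant for hand-written HTTP clients that keep their own copy of a resource.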
The server still processes the request, but before sending the response it computes the MD5 hash; if the hash matches the ETag supplied by the client, it returns
HTTP/1.1 304 Not Modified
instead of the actual response.
Now "abc" isn't a very large resource, but if this were an HTML page or an XML feed the savings would be quite large. What's great about this is that the client will always have the latest version, so if you really can't cache (are you sure?) then at least make sure you don't keep sending the same content over and over.
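The decision the server makes in the exchange above can be sketched as a small Java function. This is an assumption-laden illustration, not the filter's actual logic: `ConditionalGet` and `statusFor` are hypothetical names, and splitting the header on commas assumes entity tags that contain no commas themselves (true for hex-encoded MD5 values).

```java
// Hypothetical helper for illustration only; not part of WebFabric.
final class ConditionalGet {

    // Returns 304 when If-None-Match names the current ETag, otherwise 200.
    static int statusFor(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch == null) {
            return 200; // unconditional request: send the full response
        }
        if (ifNoneMatch.trim().equals("*")) {
            return 304; // "*" matches any current representation
        }
        // The header may list several entity tags separated by commas.
        for (String candidate : ifNoneMatch.split(",")) {
            String tag = candidate.trim();
            if (tag.startsWith("W/")) {
                tag = tag.substring(2); // If-None-Match uses weak comparison
            }
            if (tag.equals(currentEtag)) {
                return 304; // client already has this version
            }
        }
        return 200;
    }

    public static void main(String[] args) {
        String current = "\"900150983cd24fb0d6963f7d28e17f72\"";
        System.out.println(statusFor(current, current));   // 304
        System.out.println(statusFor("\"stale\"", current)); // 200
        System.out.println(statusFor(null, current));      // 200
    }
}
```

Note that the response still has to be generated to compute the hash, so the saving is in bandwidth rather than in server work.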
The current version has been tested with the following jars (older versions should work):
- sitemesh-2.4.jar (Used for faster byte buffering)
- scala-library.jar (2.7.7)
- commons-codec-1.3.jar (Used for Base64 and Hex encoding)
Add the following to [web-app]/WEB-INF/web.xml within the web-app tag:
<filter>
    <filter-name>etag</filter-name>
    <filter-class>org.webfabric.http.StrongEtagFilter</filter-class>
</filter>
<filter-mapping>
    <filter-name>etag</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
This filter should be declared before the SiteMesh filter (and before any content-encoding or content-modifying filters) but after any transfer-encoding filters.
There are currently known bugs in Apache and Google App Engine that illegally modify the content without updating the ETag.