XSS Defense: No Silver Bullets - ESAPI/esapi-java-legacy GitHub Wiki

Introduction

Cross-Site Scripting (XSS) defense is hard. That is one reason why it is so pervasive. Unfortunately, for those who have not done a deep dive into XSS vulnerabilities, it appears alarmingly simple to solve. But these naive "solutions" inevitably turn out to be either incomplete at best or completely wrong. This wiki page explores some of these naive approaches and examines why there is "No Silver Bullet" (as Frederick Brooks so aptly said, all the way back in 1986) for XSS defense either.

We shall not discuss how to defend against XSS on this wiki page. The section "How to Avoid Cross-site scripting Vulnerabilities" on the OWASP Cross-Site Scripting page already provides some excellent resources. Take time to read them!

Instead, we are going to explore two common anti-patterns that we've observed several times a year. We will start out with the less common and less egregious one.

Content-Security-Policy (CSP) as an Anti-pattern

First, let us be clear: we are strong proponents of CSP when it is used properly. What we are against is a blanket CSP policy for the entire enterprise. Generally this fails for the following reasons:

  • There is an assumption that all customer browsers support all the CSP constructs that your blanket CSP policy uses, and generally this assumption is made without testing the User-Agent request header to see whether the customer's browser actually supports them. Let's face it: most businesses don't want to turn away customers because they are using an outdated browser that doesn't support some CSP Level 2 or Level 3 constructs. (Almost all browsers support CSP Level 1, unless you are worried about Grandpa pulling out his old Windows 98 laptop and using some ancient version of Internet Explorer to access your site.)
  • Mandatory, universal, enterprise-wide CSP response headers inevitably break some web applications, especially legacy ones. This causes the business to push back against AppSec guidelines and inevitably results in AppSec issuing waivers and/or security exceptions until the application code can be patched up. But these security exceptions open cracks in your XSS armor, and even temporary cracks can impact your business.

What works, and works well, is when CSP response headers are customized for each web application and reviewed by your AppSec team for effectiveness. What works even better, though, is when CSP is used as a defense-in-depth mechanism along with appropriate contextual output encoding.
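As a rough sketch of per-application policies, the Java snippet below maps application names to reviewed CSP values. The application names and policy strings are hypothetical examples; in a real deployment a servlet filter (or the web tier) would emit the chosen policy via response.setHeader("Content-Security-Policy", policy).

```java
import java.util.Map;

// Hypothetical per-application CSP policies, each tuned and reviewed
// for its app, rather than one blanket enterprise-wide policy.
// A real servlet filter would emit the chosen policy with:
//   response.setHeader("Content-Security-Policy", policy);
public class PerAppCspPolicies {
    private static final Map<String, String> POLICIES = Map.of(
            "billing", "default-src 'self'; script-src 'self'; object-src 'none'",
            "catalog", "default-src 'self'; script-src 'self' https://cdn.example.com; object-src 'none'"
    );

    // Fall back to a restrictive CSP Level 1 policy for unknown apps.
    static String policyFor(String appName) {
        return POLICIES.getOrDefault(appName, "default-src 'self'");
    }

    public static void main(String[] args) {
        System.out.println(policyFor("billing"));
    }
}
```

The point of the lookup is that each policy can be as strict as its own application allows, instead of being loosened to the weakest legacy app in the enterprise.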

Interceptors and Servlet Filters as an Anti-pattern

The other common anti-pattern that we have observed is the attempt to handle validation and/or output encoding in some sort of interceptor, such as a Spring interceptor that implements org.springframework.web.servlet.HandlerInterceptor, or in a JavaEE servlet filter that implements javax.servlet.Filter. While this can be successful for very specific applications (for instance, if you validate that all request inputs that are ever rendered contain only alphanumeric data), it violates the major tenet of XSS defense: perform output encoding as close as possible to where the data is rendered. Generally, the HTTP request is examined for query and POST parameters, but other parts of the request that might be rendered, such as HTTP request headers and cookie data, are not examined.

The common approach that we've seen is that someone will call either ESAPI.validator().getValidSafeHTML() or ESAPI.encoder().canonicalize() and, depending on the results, will redirect to an error page or call something like ESAPI.encoder().encodeForHTML(). Aside from the fact that this approach often misses tainted input such as request headers or "extra path information" in a URI, it completely ignores the fact that the output encoding is non-contextual. For example, how does a servlet filter know whether an input query parameter is going to be rendered in an HTML context (i.e., between HTML tags) rather than in a JavaScript context, such as within a <script> tag or in a JavaScript event handler attribute? It doesn't. And because JavaScript and HTML encoding are not interchangeable, you are still left open to XSS attacks.
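To make the non-contextual encoding pitfall concrete, here is a minimal, self-contained Java sketch. The htmlEncode and htmlAttributeDecode methods are toy stand-ins (not ESAPI's real implementations) that mimic, respectively, what a naive filter does and what a browser's HTML parser does: when HTML-encoded data lands inside a JavaScript event handler attribute, the browser decodes the entities before the JavaScript engine ever runs, so the attacker's payload comes back intact.

```java
// Simplified demo: HTML entity encoding does NOT protect data placed
// inside a JavaScript event handler attribute, because the browser's
// HTML parser decodes the entities before the JS engine sees the value.
// htmlEncode/htmlAttributeDecode are toy stand-ins, NOT ESAPI methods.
public class NonContextualEncodingDemo {

    // What a naive servlet filter might do to every parameter.
    static String htmlEncode(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;")
                .replace("'", "&#39;");
    }

    // What the browser's HTML parser does to an attribute value
    // before handing it to the JavaScript engine.
    static String htmlAttributeDecode(String s) {
        return s.replace("&#39;", "'").replace("&quot;", "\"")
                .replace("&lt;", "<").replace(">", ">")
                .replace("&gt;", ">").replace("&amp;", "&");
    }

    public static void main(String[] args) {
        String payload = "');attack();//";        // tainted "lastname"
        String filtered = htmlEncode(payload);    // filter "sanitizes" it
        // The template renders: <div onclick="showName('FILTERED')">
        String seenByJsEngine = htmlAttributeDecode(filtered);
        // The HTML parser has undone the filter's work:
        System.out.println(seenByJsEngine.equals(payload)); // prints true
    }
}
```

The filter's HTML encoding was correct for an HTML-body context, but in an event handler attribute it is a no-op from the attacker's point of view; only JavaScript-aware contextual encoding at the render site closes this hole.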

Unless your filter or interceptor has full knowledge of your application, and specifically an awareness of how your application uses each parameter for a given request, it can't succeed for all the possible edge cases. And we would contend that it never will be able to with this approach, because providing that additional required context is far too complex a design, and accidentally introducing some other vulnerability (possibly one whose impact is far worse than XSS) is almost inevitable if you attempt it.

Four Problems with the Interceptor Approach

This naive approach usually has four main problems.

Problem 1 - Encoding for specific context not satisfactory for all URI paths

One problem is improper encoding that can still allow exploitable XSS on some URI paths of your application. An example might be a 'lastname' form parameter from a POST that is normally displayed between HTML tags, so that HTML encoding is sufficient; but there may be an edge case or two where lastname is actually rendered as part of a JavaScript block, where HTML encoding is not sufficient, leaving those paths vulnerable to XSS attacks.
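What correct contextual encoding for that JavaScript edge case looks like can be sketched as follows. jsStringEncode below is a toy hex-escaping encoder, a simplified stand-in for a real JavaScript string encoder such as ESAPI's encodeForJavaScript (it is not ESAPI's actual implementation); the point is that it makes it impossible for tainted data to terminate the surrounding <script> block.

```java
// Toy contextual encoder for a JavaScript string context: hex-escapes
// every character outside ASCII letters/digits, so tainted data can
// never close the surrounding <script> block or break out of quotes.
// This is a simplified stand-in, NOT ESAPI's encodeForJavaScript.
public class JsContextDemo {

    static String jsStringEncode(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c < 128 && Character.isLetterOrDigit(c)) {
                out.append(c);
            } else {
                out.append(String.format("\\x%02X", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String payload = "</script><script>alert(1)</script>";
        String encoded = jsStringEncode(payload);
        // Rendered as: var lastname = 'ENCODED';
        // The payload can no longer terminate the script block:
        System.out.println(encoded.contains("</script>")); // prints false
    }
}
```

A filter that blindly HTML-encodes the same parameter would be using the wrong encoder for this render site; only code that knows the parameter lands inside a script block can pick the JavaScript encoder.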

Problem 2 - Interceptor approach can lead to broken rendering caused by improper or double encoding

A second problem with this approach is that it can result in incorrect or double encoding. For example, suppose that in the previous example a developer has done proper output encoding for the JavaScript rendering of lastname. But if the value has already been HTML output encoded as well, then when it is rendered, a legitimate last name like "O'Hara" might come out looking like "O&#39;Hara".
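The double-encoding artifact is easy to reproduce. In this toy Java sketch (htmlEncode is a simplified stand-in, not ESAPI's encodeForHTML), a filter encodes the value once and the page encodes it again; since the browser decodes only one layer of entities, the customer sees the leftover entity text.

```java
// Toy demonstration of double encoding: when a filter has already
// HTML-encoded a value and the page encodes it again, the browser
// decodes only one layer of entities, so the leftover entity is
// displayed to the user. htmlEncode is NOT ESAPI's encodeForHTML.
public class DoubleEncodingDemo {

    static String htmlEncode(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;")
                .replace("'", "&#39;");
    }

    public static void main(String[] args) {
        String name = "O'Hara";
        String onceByFilter = htmlEncode(name);         // "O&#39;Hara"
        String twiceByPage  = htmlEncode(onceByFilter); // "O&amp;#39;Hara"
        // The browser decodes one layer, so the customer sees
        // "O&#39;Hara" instead of "O'Hara".
        System.out.println(twiceByPage); // prints O&amp;#39;Hara
    }
}
```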

While this second case is not strictly a security problem, if it happens often enough it can result in business push-back against the use of the filter. The business may then decide to disable the filter, or demand a way to exempt certain pages or parameters from filtering, which in turn weakens whatever XSS defense the filter was providing.

Problem 3 - Interceptors not effective against DOM-based XSS

The third problem is that this approach is not effective against DOM-based XSS. To address DOM-based XSS, an interceptor or filter would have to scan all the JavaScript content going out as part of an HTTP response, try to figure out which outputs are tainted, and determine whether they are susceptible to DOM-based XSS. That simply is not practical.

Problem 4 - Interceptors not effective where data from responses originates outside your application

The last problem with interceptors is that they generally are oblivious to data in your application's responses that originates from other internal sources, such as an internal REST-based web service or even an internal database. Unless your application strictly validates that data at the point where it is retrieved (which generally is the only point at which your application has enough context to do strict data validation using an allow-list approach), that data should always be considered tainted. But if you attempt output encoding or strict data validation of all tainted data on the HTTP response side of an interceptor (such as a Java servlet filter), your application's interceptor will have no idea whether tainted data from those REST web services or other databases is present. The approach generally used by response-side interceptors attempting to provide an XSS defense is to treat only the matching "input parameters" as tainted, do output encoding or HTML sanitization on them, and consider everything else safe. But sometimes it's not. While it frequently is assumed that all internal web services and all internal databases can be "trusted" and used as is, this is a very bad assumption to make unless you have accounted for it in some deep threat modeling for your application.
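A hedged sketch of "strict validation at the point of retrieval" follows. The field name and allow-list regular expression are hypothetical and illustrative, not a production-grade name rule; in an ESAPI application, a Validator rule configured with an allow-list pattern would typically play this role.

```java
import java.util.regex.Pattern;

// Illustrative allow-list check applied where the data enters the
// application (e.g., immediately after a REST call or database read),
// which is the only point with enough context for strict validation.
// The pattern below is a hypothetical example, not a complete name rule.
public class RetrievalValidationDemo {

    private static final Pattern NAME_ALLOW_LIST =
            Pattern.compile("^[\\p{L}][\\p{L} .,'-]{0,99}$");

    static boolean isValidCustomerName(String value) {
        return value != null && NAME_ALLOW_LIST.matcher(value).matches();
    }

    public static void main(String[] args) {
        // Legitimate name from an internal service: accepted.
        System.out.println(isValidCustomerName("O'Hara"));
        // Stored-XSS payload riding in from a "trusted" source: rejected.
        System.out.println(isValidCustomerName("<script>alert(1)</script>"));
    }
}
```

Validating at retrieval means the data is checked against what the field is supposed to contain; a response-side interceptor, by contrast, never even sees that this value came from an external, potentially compromised source.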

For example, suppose you are working on an application that shows customers their detailed monthly bills. Let's assume that your application queries a foreign (as in, not part of your specific application) internal database or REST web service to obtain the user's full name, address, etc. But that data originates from another application, which you are assuming is "trusted" but which actually has an unreported persistent XSS vulnerability in its various customer address-related fields. Furthermore, let's assume that your company's customer support staff can examine a customer's detailed bill to assist when customers have questions about their bills. So a nefarious customer decides to plant an XSS bomb in the address field and then calls customer service for assistance with the bill. Should a scenario like that ever play out, an interceptor attempting to prevent XSS is going to miss it completely, and the result is going to be something much worse than just popping an alert box to display "1" or "XSS" or "pwn'd".

Summary

One final note: if deploying interceptors or filters were a useful approach against XSS attacks, don't you think it would be incorporated into all commercial Web Application Firewalls (WAFs) and be an approach that OWASP recommends in the OWASP Cross-Site Scripting Prevention Cheat Sheet?
