Web Application Firewall (WAF) Troubleshooting and Maintenance - CDCgov/prime-simplereport GitHub Wiki

SimpleReport is, by its nature, a dynamic application that grows and changes with each and every development sprint. As the application changes, so too does its userbase, in response to the rising and falling tides of infectious disease spread.

As the stewards of PII and confidential healthcare information, we must undertake due diligence to protect the data with which we are entrusted. In doing so, however, we must not make the application so secure that its functionality is unduly restricted for the average user. In its current configuration, the WAF has been tuned to permit the vast majority of user functionality; however, unforeseen cases may arise in which a user’s routine transactions are blocked.

This guide is intended to diagnose and provide mitigation instructions for these instances.

Diagnosis of WAF Blocking

If a transactional request is blocked by the WAF, it will manifest in the following manner:

The user will typically be presented with a generic error message:

Unexpected token < in JSON at position 0

This message will also usually be accompanied by a more user-friendly failure message.

Analysis of the browser console and associated application logs will show an HTTP response code 403 ("Forbidden").

For confirmation, please collect the following information:

  • Which screen the user was on at the time of the error
  • The approximate timestamp of the error
  • Any pertinent information about the user that would allow DevOps personnel to identify the problematic request. If this is PII, please ensure it is properly safeguarded when communicating.

To confirm that the WAF was involved, navigate in Azure to the simple-report-app-gateway Application Gateway, housed in the prime-simple-report-prod resource group. In the left-side menu, under Monitoring, select Logs.

Run the following query to pull results from the WAF:

AzureDiagnostics 
| where ResourceProvider == "MICROSOFT.NETWORK" and Category == "ApplicationGatewayFirewallLog"

Mitigation of WAF Blocking

Unfortunately, we are not currently able to grant exceptions on the basis of field content or the content of query string arguments. Likewise, we are not able to allowlist specific users, diseases, etc.

When developing new features and functionality, the goal is to develop in a manner that preserves as much security as possible. If changes to the WAF are necessary, they should be as narrow as possible, so that the WAF continues to block as much as it can while still allowing the new feature to function as intended. The same principle applies when a problem is discovered in production.

At present, four kinds of items can be excepted: Cookies, Headers, Arguments, and entire Rules. Only three (Rules, Cookies, and Arguments) are actively used in the WAF's current implementation; instructions for each follow.

Knowledge of Terraform is required to make the following changes. For assistance, please contact the DevOps team.

Adding a Rule Exception

The WAF operates using the OWASP 3.1 rule set as a baseline. Exceptions consist of a rule_group_override, which is further broken down by rule_group_name, and a list of disabled_rules:

      rule_group_override {
        rule_group_name = "REQUEST-920-PROTOCOL-ENFORCEMENT"
        disabled_rules = [
          "920300",
          "920320"
        ]
      }

All rule_group_override blocks are nested directly inside the managed_rule_set block.

To add a new rule exception, first gather the rule group and rule ID from the Application Gateway logs. If the associated rule group is already present in a rule_group_override block, simply add the rule ID as a new entry in that block's disabled_rules array.
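As a minimal sketch of that step, suppose the logs reported rule 920230 in the REQUEST-920-PROTOCOL-ENFORCEMENT group. The new rule ID is purely illustrative and not a recommendation; substitute whatever the Application Gateway logs actually report.

```hcl
rule_group_override {
  rule_group_name = "REQUEST-920-PROTOCOL-ENFORCEMENT"
  disabled_rules = [
    "920300",
    "920320",
    "920230" // illustrative new entry; use the rule ID from the logs
  ]
}
```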

If a new rule_group_override block must be added, ensure you follow the syntax as it exists in the file.
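The sketch below shows where a second rule_group_override block would sit. It assumes the policy is defined with the azurerm_web_application_firewall_policy resource, whose schema uses these block names; the source does not name the resource type. The resource label, policy name, location, and the second rule group and rule ID are all placeholders chosen for illustration.

```hcl
// Sketch only: resource label, name, and location are placeholders.
resource "azurerm_web_application_firewall_policy" "example" {
  name                = "example-waf-policy"
  resource_group_name = "prime-simple-report-prod"
  location            = "eastus"

  managed_rules {
    managed_rule_set {
      type    = "OWASP"
      version = "3.1"

      // Existing override, unchanged.
      rule_group_override {
        rule_group_name = "REQUEST-920-PROTOCOL-ENFORCEMENT"
        disabled_rules = [
          "920300",
          "920320"
        ]
      }

      // New override: one block per rule group.
      rule_group_override {
        rule_group_name = "REQUEST-931-APPLICATION-ATTACK-RFI" // illustrative
        disabled_rules = [
          "931130" // illustrative; use the rule ID from the logs
        ]
      }
    }
  }
}
```

Keeping one block per rule group, with only the specific rule IDs listed, leaves every other rule in that group active.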

Adding a Cookie Exception

Occasionally, changes to application structure or third-party integrations will necessitate the addition of a cookie exception. The Application Gateway logs will indicate when this is necessary by providing information about which cookie tripped the firewall.

To create a cookie exception, you will need to create a new exclusion block, like so:

    exclusion {
      match_variable          = "RequestCookieNames"
      selector                = "ai_session" //Part of Azure Application Insights
      selector_match_operator = "StartsWith"
    }

The match_variable will always be RequestCookieNames for cookie exceptions. Adjust the values for selector and selector_match_operator to properly match the data given to you by the Application Gateway logs.

All exclusion blocks are daughters of the managed_rules superblock.

Adding an Argument Exception

Occasionally, changes to application structure, or an unfortunate combination of user data with an overly sensitive security rule, will require entire arguments within a GraphQL query or other query string to be excepted. (Fun fact: the example below was added because any user or facility address with the word "Union" in the street name made the WAF think an active SQL injection attempt was ongoing.) The Application Gateway logs will indicate when this is necessary by providing information about which argument tripped the firewall.

A good indicator that an argument exception will need to be applied is if the triggering request was made against the /api/graphql endpoint. Additionally, the keyword "ARGS" will make a prominent appearance in the log entry.

To create an argument exception, you will need to create a new exclusion block, like so:

    exclusion {
      match_variable          = "RequestArgNames"
      selector                = "variables.street"
      selector_match_operator = "Equals"
    }

The match_variable will always be RequestArgNames for argument exceptions. Adjust the values for selector and selector_match_operator to properly match the data given to you by the Application Gateway logs.

All exclusion blocks are daughters of the managed_rules superblock.
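For orientation, here is a hedged sketch of how the cookie and argument exclusions from this guide sit next to each other inside managed_rules, alongside the managed_rule_set block. The third exclusion illustrates the header case, for which this guide gives no instructions; its selector is a hypothetical name, and RequestHeaderNames is assumed from the azurerm provider's list of accepted match_variable values.

```hcl
managed_rules {
  // Cookie exception (from this guide).
  exclusion {
    match_variable          = "RequestCookieNames"
    selector                = "ai_session" // Part of Azure Application Insights
    selector_match_operator = "StartsWith"
  }

  // Argument exception (from this guide).
  exclusion {
    match_variable          = "RequestArgNames"
    selector                = "variables.street"
    selector_match_operator = "Equals"
  }

  // Header exception: hypothetical example, not part of the current configuration.
  exclusion {
    match_variable          = "RequestHeaderNames"
    selector                = "x-example-header" // placeholder name
    selector_match_operator = "Equals"
  }

  managed_rule_set {
    type    = "OWASP"
    version = "3.1"
    // rule_group_override blocks go here.
  }
}
```

Preferring Equals over StartsWith where the logs allow it keeps each exclusion as narrow as possible.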

Team Responsibilities

Help and support will be needed from all members of the SimpleReport team to keep the WAF functioning at its best, without disrupting the flow of the application.

If you are a member of any of the following groups, we ask for your support in executing the following tasks:

Software Engineers

  • Test any changes to query strings, new features, etc. in a live Azure environment. (All environments have a functioning WAF.)
  • Report any WAF complications to the DevOps team as soon as you can, so that a mitigation can be included before your changes go to Production.
  • If a mitigation is needed, please refrain from merging your changes until the mitigation is included in your branch.

DevOps Team

  • Regularly audit the WAF logs for signs that additional tuning or mitigations may be needed.
  • Quickly respond to Engineering and Support requests for tuning and mitigations.
  • Proactively advise on the need for new mitigations as new features are designed.

Support Team

  • Use the diagnostic steps above when gathering information about a user encounter with the WAF.
  • Raise the issue to the engineering teams as soon as practicable for proper mitigation.