11 min read 1 March 2021

Web application firewall (WAF)

Security, as one of the top priorities, cannot rely on merely a single service.

AWS WAF
Amazon Athena
Amazon Route 53
Amazon CloudFront
Terraform

Alexandr Balakirev Cloud DevOps Engineer

The main reason for using any kind of firewall is that once a project reaches a certain scale, it attracts a growing audience, and that audience includes attackers who try to cause harm by exploiting various kinds of vulnerabilities, including database vulnerabilities, cross-site scripting, HTTP floods and many others. Unfortunately, the list is almost endless.

AWS services

Amazon Web Services has a number of products that are capable of countering these kinds of threats, AWS Network Firewall and AWS Web Application Firewall to name but two. The main difference between them, among many others, lies in the OSI layers at which they operate: layers 3-4 and layer 7, respectively. AWS WAF inspects the traffic between external users and a web application and blocks malicious requests before they reach the application. It can be associated with resources such as Application Load Balancer, API Gateway, AWS AppSync and CloudFront distributions.

Basic AWS WAF pipeline with Route53 and CloudFront.
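As a rough illustration of the association step, the following Terraform sketch attaches an existing web ACL to an Application Load Balancer. The variable names `web_acl_arn` and `alb_arn` are hypothetical and only stand in for resources you already have. This form applies to regional resources; for a CloudFront distribution, the web ACL is instead referenced from the distribution's own `web_acl_id` argument, and the ACL has to be created with the CLOUDFRONT scope in the us-east-1 region.

```hcl
# Hypothetical inputs: ARNs of an existing web ACL (REGIONAL scope) and load balancer.
variable "web_acl_arn" {
  description = "ARN of an existing AWS WAF (v2) web ACL with REGIONAL scope"
  type        = string
}

variable "alb_arn" {
  description = "ARN of the Application Load Balancer to protect"
  type        = string
}

# Associates the web ACL with the load balancer so that
# incoming requests are evaluated by the ACL's rules first.
resource "aws_wafv2_web_acl_association" "alb" {
  resource_arn = var.alb_arn
  web_acl_arn  = var.web_acl_arn
}
```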

AWS WAF offers various kinds of rules (managed rule groups, your own rules and your own rule groups) and actions that can be applied to matching requests (allow, block, count). In our project, we decided to use AWS Managed Rules, such as AWSManagedRulesSQLiRuleSet, AWSManagedRulesCommonRuleSet, AWSManagedRulesAmazonIpReputationList and AWSManagedRulesKnownBadInputsRuleSet, as well as our own rules for rate limits. Each AWS Managed Rules group in turn includes many subrules, e.g. AWSManagedRulesCommonRuleSet also contains rules against cross-site scripting, size restrictions, bad bots etc.

Using Terraform

As a matter of good practice, it's better to describe all of the infrastructure as code from the very beginning.

Example of Terraform code

resource "aws_wafv2_web_acl" "this" {
  name        = var.web_acl_name
  description = var.web_acl_description
  scope       = var.scope

  default_action {
    allow {}
  }

  // custom rule based on waf rule group
  rule {
    name     = "user_defined_rules"
    priority = 1

    override_action {
      count {}
    }

    statement {
      rule_group_reference_statement {
        arn = aws_wafv2_rule_group.custom_rules_group.arn
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = var.metrics_enabled
      sampled_requests_enabled   = var.metrics_enabled
      metric_name                = "custom_xss_rule"
    }
  }

  // managed rules based on managed-rules variable
  dynamic "rule" {
    for_each = var.managed_rules
    iterator = object
    content {
      name     = lookup(object.value, "name")
      priority = lookup(object.value, "priority")

      override_action {
        dynamic "count" {
          for_each = lookup(object.value, "override_action", {}) == "count" ? [1] : []
          content {}
        }
        dynamic "none" {
          for_each = lookup(object.value, "override_action", {}) == "none" ? [1] : []
          content {}
        }
      }

      statement {
        managed_rule_group_statement {
          name        = lookup(object.value, "name")
          vendor_name = "AWS"
        }
      }

      visibility_config {
        cloudwatch_metrics_enabled = var.metrics_enabled
        sampled_requests_enabled   = var.metrics_enabled
        metric_name                = "metric-name-${lookup(object.value, "name")}"
      }
    }
  }

  // rate based rules
  dynamic "rule" {
    for_each = var.rate_based_rules
    iterator = object
    content {
      name     = lookup(object.value, "name")
      priority = lookup(object.value, "priority")

      action {
        dynamic "count" {
          for_each = lookup(object.value, "action", {}) == "count" ? [1] : []
          content {}
        }
        dynamic "block" {
          for_each = lookup(object.value, "action", {}) == "block" ? [1] : []
          content {}
        }
      }

      statement {
        rate_based_statement {
          limit              = lookup(object.value, "limit")
          aggregate_key_type = "IP"

          scope_down_statement {
            byte_match_statement {
              field_to_match {
                uri_path {}
              }
              positional_constraint = "CONTAINS"
              search_string         = lookup(object.value, "search_string")
              text_transformation {
                priority = 0
                type     = "NONE"
              }
            }
          }
        }
      }

      visibility_config {
        cloudwatch_metrics_enabled = var.metrics_enabled
        sampled_requests_enabled   = var.metrics_enabled
        metric_name                = "rate-based-${lookup(object.value, "name")}"
      }
    }
  }

  tags = var.tags

  visibility_config {
    cloudwatch_metrics_enabled = var.metrics_enabled
    metric_name                = var.web_acl_metric_name
    sampled_requests_enabled   = var.metrics_enabled
  }
}
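The resource above reads its rules from `var.managed_rules` and `var.rate_based_rules`, whose shape is not shown. A minimal sketch of how these two variables might be declared is given below. The attribute names are taken from the `lookup` calls in the resource, and the managed rule group names are the ones listed earlier; the priorities, the limit, the rule name `login-rate-limit` and the `/login` path are illustrative assumptions, not values from the project.

```hcl
# Managed rule groups to attach; every group starts in count mode.
variable "managed_rules" {
  description = "AWS managed rule groups and the override action for each"
  type = list(object({
    name            = string
    priority        = number
    override_action = string # "count" or "none"
  }))
  default = [
    { name = "AWSManagedRulesCommonRuleSet", priority = 10, override_action = "count" },
    { name = "AWSManagedRulesSQLiRuleSet", priority = 20, override_action = "count" },
    { name = "AWSManagedRulesAmazonIpReputationList", priority = 30, override_action = "count" },
    { name = "AWSManagedRulesKnownBadInputsRuleSet", priority = 40, override_action = "count" },
  ]
}

# Rate limits per client IP, scoped to URIs containing search_string.
variable "rate_based_rules" {
  description = "Rate-based rules; all values here are illustrative assumptions"
  type = list(object({
    name          = string
    priority      = number
    action        = string # "count" or "block"
    limit         = number # requests per 5-minute window per IP
    search_string = string
  }))
  default = [
    { name = "login-rate-limit", priority = 50, action = "count", limit = 1000, search_string = "/login" },
  ]
}
```

Priorities have to be unique within a web ACL, which is why the sketch starts at 10 and leaves priority 1 to the `user_defined_rules` rule. Each entry needs `override_action` (or `action`) set to one of the two listed values, since otherwise neither dynamic block is generated and the rule is left without an action.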

Typical logs flow

It should also be noted that rolling out AWS WAF on large real-world projects is a rather time-consuming, iterative process. Mostly because of false positives, it usually can't be implemented 'out of the box'. The most common practice is to follow this scheme: collect logs in count mode, analyze them, and correct the AWS WAF rules based on that analysis. The logs are collected over a period of time that depends on many factors, including the volume of traffic.

Log collection in count mode is built according to the pipeline shown.
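The pipeline itself is not spelled out in the text, so the following is only a sketch of one common setup: AWS WAF sends its logs to a Kinesis Data Firehose delivery stream, which in turn writes them to an S3 bucket. The variable `firehose_stream_arn` is hypothetical and stands for a delivery stream that already exists; AWS WAF requires the stream name to start with `aws-waf-logs-`.

```hcl
# Hypothetical input: an existing Firehose delivery stream that delivers to S3.
variable "firehose_stream_arn" {
  description = "ARN of a Kinesis Data Firehose delivery stream named aws-waf-logs-*"
  type        = string
}

# Turns on logging for the web ACL defined above, so that every
# evaluated request (including COUNT matches) is written to the stream.
resource "aws_wafv2_web_acl_logging_configuration" "this" {
  resource_arn            = aws_wafv2_web_acl.this.arn
  log_destination_configs = [var.firehose_stream_arn]
}
```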

Everything depends on analysis

After the logs get into Amazon S3, one quite effective option for analysis is Amazon Athena. This service allows you to create a table from the data in a bucket and run SQL queries against it.

Example of logs received from AWS WAF:

{
  "timestamp": 1612420137433,
  "formatVersion": 1,
  "webaclId": "***************",
  "terminatingRuleId": "Default_Action",
  "terminatingRuleType": "REGULAR",
  "action": "ALLOW",
  "terminatingRuleMatchDetails": [],
  "httpSourceName": "CF",
  "httpSourceId": "****************",
  "ruleGroupList": [
    {
      "ruleGroupId": "****************",
      "terminatingRule": null,
      "nonTerminatingMatchingRules": [],
      "excludedRules": null
    },
    {
      "ruleGroupId": "AWS#AWSManagedRulesSQLiRuleSet",
      "terminatingRule": null,
      "nonTerminatingMatchingRules": [],
      "excludedRules": null
    },
    {
      "ruleGroupId": "AWS#AWSManagedRulesCommonRuleSet",
      "terminatingRule": {
        "ruleId": "GenericRFI_BODY",
        "action": "BLOCK",
        "ruleMatchDetails": null
      },
      "nonTerminatingMatchingRules": [],
      "excludedRules": null
    },
    {
      "ruleGroupId": "AWS#AWSManagedRulesAmazonIpReputationList",
      "terminatingRule": null,
      "nonTerminatingMatchingRules": [],
      "excludedRules": null
    },
    {
      "ruleGroupId": "AWS#AWSManagedRulesKnownBadInputsRuleSet",
      "terminatingRule": null,
      "nonTerminatingMatchingRules": [],
      "excludedRules": null
    }
  ],
  "rateBasedRuleList": [],
  "nonTerminatingMatchingRules": [
    {
      "ruleId": "AWSManagedRulesCommonRuleSet",
      "action": "COUNT",
      "ruleMatchDetails": []
    }
  ],
  "requestHeadersInserted": null,
  "responseCodeSent": null,
  "httpRequest": {
    "clientIp": "***************",
    "country": "**",
    "headers": [
      {
        "name": "user-agent",
        "value": "ReactorNetty/0.9.12.RELEASE"
      },
      {
        "name": "host",
        "value": "***************"
      },
      {
        "name": "Accept",
        "value": "application/json"
      },
      {
        "name": "Content-Type",
        "value": "application/json"
      },
      {
        "name": "content-length",
        "value": "1317"
      }
    ],
    "uri": "***************",
    "args": "",
    "httpVersion": "HTTP/1.1",
    "httpMethod": "POST",
    "requestId": "***************"
  }
}

Amazon Athena table creation (from the AWS documentation):

CREATE EXTERNAL TABLE `waf_logs`(
  `timestamp` bigint,
  `formatversion` int,
  `webaclid` string,
  `terminatingruleid` string,
  `terminatingruletype` string,
  `action` string,
  `terminatingrulematchdetails` array<
    struct<
      conditiontype:string,
      location:string,
      matcheddata:array<string>
    >
  >,
  `httpsourcename` string,
  `httpsourceid` string,
  `rulegrouplist` array<
    struct<
      rulegroupid:string,
      terminatingrule:struct<
        ruleid:string,
        action:string,
        rulematchdetails:string
      >,
      nonterminatingmatchingrules:array<
        struct<
          ruleid:string,
          action:string,
          rulematchdetails:array<
            struct<
              conditiontype:string,
              location:string,
              matcheddata:array<string>
            >
          >
        >
      >,
      excludedrules:string
    >
  >,
  `ratebasedrulelist` array<
    struct<
      ratebasedruleid:string,
      limitkey:string,
      maxrateallowed:int
    >
  >,
  `nonterminatingmatchingrules` array<
    struct<
      ruleid:string,
      action:string
    >
  >,
  `httprequest` struct<
    clientip:string,
    country:string,
    headers:array<
      struct<
        name:string,
        value:string
      >
    >,
    uri:string,
    args:string,
    httpversion:string,
    httpmethod:string,
    requestid:string
  >
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
  'paths'='action,formatVersion,httpRequest,httpSourceId,httpSourceName,nonTerminatingMatchingRules,rateBasedRuleList,ruleGroupList,terminatingRuleId,terminatingRuleMatchDetails,terminatingRuleType,timestamp,webaclId'
)
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://athenawaflogs/WebACL/'

Sample SQL query for analysis

-- rule_number and header_number are placeholders: replace them with the
-- 1-based array index of the rule group and of the header to inspect.
SELECT COUNT(httpRequest.clientIp) as count,
  httpRequest.clientIp,
  ruleGroupList[rule_number].ruleGroupId as managed_group,
  ruleGroupList[rule_number].terminatingRule.ruleId as rule_id,
  httpRequest.headers[header_number].value as host,
  httpRequest.uri as uri
FROM waf_logs_for_report
WHERE ruleGroupList[rule_number].terminatingRule.action='BLOCK'
GROUP BY httpRequest.clientIp,
  ruleGroupList[rule_number].ruleGroupId,
  ruleGroupList[rule_number].terminatingRule.ruleId,
  httpRequest.headers[header_number].value,
  httpRequest.uri
ORDER BY count
LIMIT 100;

Next steps

After such analysis, we can understand which subrules produced the largest number of false positives, then correct them and repeat the process of log collection and analysis. After several iterations, once we have eliminated the overwhelming majority of false positives, we can switch to block mode while intensively monitoring the logs, so that we can roll back quickly if anything unforeseen happens.
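One way to correct a noisy subrule is to leave its managed group enforcing while forcing just that subrule back to count. The sketch below assumes, purely for illustration, that the analysis showed `GenericRFI_BODY` (the subrule seen in the sample log) to be a false positive for your traffic. The resource name `tuned`, the ACL name and the metric names are hypothetical. It also assumes a recent version of the Terraform AWS provider, where this is expressed with `rule_action_override`; older provider versions used `excluded_rule` blocks instead.

```hcl
resource "aws_wafv2_web_acl" "tuned" {
  name  = "tuned-web-acl" # hypothetical name
  scope = "REGIONAL"

  default_action {
    allow {}
  }

  rule {
    name     = "AWSManagedRulesCommonRuleSet"
    priority = 10

    # "none" keeps the group's own actions, i.e. the group now blocks.
    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name        = "AWSManagedRulesCommonRuleSet"
        vendor_name = "AWS"

        # Only this subrule is downgraded to count; it still shows up in the logs.
        rule_action_override {
          name = "GenericRFI_BODY"
          action_to_use {
            count {}
          }
        }
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      sampled_requests_enabled   = true
      metric_name                = "common-rule-set"
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    sampled_requests_enabled   = true
    metric_name                = "tuned-web-acl"
  }
}
```

Because the overridden subrule keeps counting, its matches remain visible in the logs, which makes it easier to confirm later whether the exception is still needed.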

In conclusion, security should generally be one of the top priorities, and it cannot be based on a single service. Rather, it should be a mix of services and security best practices, as this helps you avoid negative consequences for the entire project.