GeoLocation Redirection with WAF and Lambda@Edge to Bypass AWS Rate Limit

When your business serves multiple countries, content can vary by location. For example, if your website supports only a few languages, you might serve static pages with instructions to visitors whose language is not supported.

You can handle geolocation redirection in the backend, but this increases compute usage and prevents the CDN from caching the static page content. Handling it in the front end with JavaScript is also possible, but the browser can only route the customer by language; to learn the location from the IP address you need an extra API call, which adds round trips and latency and hurts the customer experience.

AWS CloudFront supports Lambda@Edge, a lightweight version of Lambda that runs directly on CloudFront's infrastructure. At first sight Lambda@Edge is a good solution: it is fast, easy to attach to CloudFront, and manageable. But if the website consists of many dynamic pages, problems start to appear.

Firstly, when you attach Lambda@Edge to the / path, customers who do not need a redirect still invoke your Lambda on every request, and you eventually run into the Lambda@Edge rate limit and 5xx errors.

Secondly, if you want to make exceptions so that bots can crawl the site, you have to keep updating your Lambda function for different bots. We want to remove bot detection from our code and invoke Lambda only when a redirect is required.

Thirdly, and probably most importantly, you cannot define even basic logic in CloudFront about which paths and conditions must invoke the Lambda. WAF can do that: it supports regex matching, bot detection, country matching, and combinations of these.


AWS WAF CDK Implementation

The stack below combines regular rules, such as blocking anonymous source IPs, with our redirection rule.

allow-site-assets

First, we allow static assets through the firewall because they may be used outside the website, for example by Android or iPhone app integrations. To keep the backend secure, you can give these paths their own origin configuration that allows only GET and strips headers and query strings before the request reaches your origin.
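The allow-site-assets rule in the stack further down matches the request path against a regular expression built from a list of file names. The sketch below rebuilds a similar pattern with JavaScript's RegExp to show which paths it accepts. `escapeRegex` and `buildAssetPattern` are hypothetical helper names, the dot escaping and the dropped `\b` are changes made for this sketch, and AWS WAF uses its own regex engine, so treat the results as an approximation.

```typescript
// Hypothetical helper: escape regex metacharacters so that "manifest.json"
// matches a literal dot instead of any character.
function escapeRegex(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build an anchored pattern for exact asset paths with an
// optional trailing slash, e.g. ^\/(manifest\.json|sitemap\.xml)\/?$
function buildAssetPattern(assets: string[]): string {
  return '^\\/(' + assets.map(escapeRegex).join('|') + ')\\/?$';
}

const siteAssetNames = ['manifest.json', 'assetlinks.json', 'sitemap.xml'];
const assetPattern = new RegExp(buildAssetPattern(siteAssetNames));

const assetChecks = {
  exact: assetPattern.test('/manifest.json'),         // listed asset
  trailingSlash: assetPattern.test('/sitemap.xml/'),  // optional trailing slash
  wrongSeparator: assetPattern.test('/manifestXjson'), // dot must be literal
  nested: assetPattern.test('/blog/sitemap.xml'),     // only top-level paths
};
console.log(assetChecks);
```

With the dot left unescaped, as in the stack's pattern, a path such as /manifestXjson would also match, which is why the sketch escapes the file names first.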

To allow only GET and ignore those fields, you can use the settings below in the additionalBehaviors part of the CloudFront distribution.

// On the behavior:
allowedMethods: cdk.aws_cloudfront.AllowedMethods.ALLOW_GET_HEAD,
// In the behavior's CachePolicy:
headerBehavior: cdk.aws_cloudfront.CacheHeaderBehavior.none(),
cookieBehavior: cdk.aws_cloudfront.CacheCookieBehavior.none(),
queryStringBehavior: cdk.aws_cloudfront.CacheQueryStringBehavior.none(),

AWS-AnonymousIPList

This is an example of a regular built-in rule that anyone can use. It is not required for the redirection; you can remove it if you want.

AWS-AWSManagedRulesBotControlRuleSet

Labels from managed rules can be matched with labelMatchStatement in NAMESPACE scope. To use AWS managed rule labels in your own rules, however, AWS WAF requires the managedRuleGroupStatement to be evaluated before your rule, because the managed rule group is what adds the labels to the request; otherwise labelMatchStatement will never match.
In our case, the managedRuleGroupStatement is defined in the AWS-AWSManagedRulesBotControlRuleSet rule, and the labelMatchStatement that lets bots crawl without being redirected sits inside the geolocationRedirection rule.
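The difference between the two labelMatchStatement scopes, and the reason the rule order matters, can be modelled in a few lines. This is a simplified sketch and not how AWS WAF is implemented: `matchesLabel` is a hypothetical function, it assumes fully qualified label keys, and `awswaf:managed:aws:bot-control:bot:verified` is used only as an example of a label that Bot Control can add.

```typescript
type LabelScope = 'LABEL' | 'NAMESPACE';

// Simplified model: LABEL needs a label equal to the key,
// NAMESPACE needs a label that starts with the namespace given by the key.
function matchesLabel(requestLabels: string[], scope: LabelScope, key: string): boolean {
  return requestLabels.some((label) =>
    scope === 'LABEL' ? label === key : label.indexOf(key) === 0,
  );
}

// Labels exist on a request only after the managed rule group has evaluated it.
const labelsAfterBotControl = ['awswaf:managed:aws:bot-control:bot:verified'];
const labelsWithoutBotControl: string[] = [];

const botNamespace = 'awswaf:managed:aws:bot-control:bot:';

console.log(matchesLabel(labelsAfterBotControl, 'NAMESPACE', botNamespace));   // true
console.log(matchesLabel(labelsAfterBotControl, 'LABEL', botNamespace));       // false
console.log(matchesLabel(labelsWithoutBotControl, 'NAMESPACE', botNamespace)); // false
```

The last call mirrors a web ACL where the label match rule has a lower priority number than the managed rule group: no labels are present yet, so nothing matches.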

geolocationRedirection

The rule matches when all of the following are true:

  • the client is not a bot
  • the access location is not one of the countries served by the dynamic site
  • the path is not an asset or a static page

In that case WAF returns a 302 to /redirect, the path where your redirect function is attached.
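The three conditions can be sketched as a plain function before reading them as WAF statements. This is only a model of the decision: `RequestFacts` and `shouldRedirect` are hypothetical names, the path list is shortened, and the pattern here requires a full path segment, which is stricter than the prefix pattern used in the stack.

```typescript
interface RequestFacts {
  isBot: boolean;      // in WAF this comes from Bot Control labels
  countryCode: string; // in WAF this comes from the geo match
  path: string;
}

// Assumed values mirroring the stack: one dynamic country, a shortened allow list.
const dynamicCountries = ['TR'];
const allowedPathPrefixes = ['redirect', 'world-wide', 'uk', 'ca', 'gl', 'br'];
const allowedPathPattern = new RegExp('^\\/(' + allowedPathPrefixes.join('|') + ')(\\/|$)');

// Redirect only when every condition holds, like the andStatement of notStatements.
function shouldRedirect(req: RequestFacts): boolean {
  const notBot = !req.isBot;
  const outsideDynamicCountries = dynamicCountries.indexOf(req.countryCode) === -1;
  const notAllowListedPath = !allowedPathPattern.test(req.path);
  return notBot && outsideDynamicCountries && notAllowListedPath;
}

console.log(shouldRedirect({ isBot: false, countryCode: 'DE', path: '/products' })); // true
console.log(shouldRedirect({ isBot: true, countryCode: 'DE', path: '/products' }));  // false
console.log(shouldRedirect({ isBot: false, countryCode: 'TR', path: '/products' })); // false
console.log(shouldRedirect({ isBot: false, countryCode: 'DE', path: '/redirect' })); // false
```

The last case is what prevents a redirect loop: /redirect itself is allow-listed, so the 302 target is never redirected again. The full stack that follows expresses the same three conditions as notStatement entries inside one andStatement.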

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
export class AhmetEngineerWAF extends cdk.Stack {
  public readonly WafResource: cdk.aws_wafv2.CfnWebACL;
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    const wafRules: Array<cdk.aws_wafv2.CfnWebACL.RuleProperty> = [];
    let priorityNumber = 1;
    function priorityCounter(): number { priorityNumber++; return priorityNumber * 10;}

    const siteAssets = [
      'manifest.json',
      'assetlinks.json',
      'sitemap.xml',
    ];

    wafRules.push({
      name: 'allow-site-assets',
      priority: priorityCounter(),
      statement: {
        regexMatchStatement: {
          fieldToMatch: { uriPath: {}, },
          textTransformations: [ { type: 'NONE',  priority: 0,}, ],
          //! Redirect paths must be allow-listed (see pathWhiteList below) to prevent an infinite redirect loop
          regexString: '^\\/(\\b(' + siteAssets.join('|') + '))\\/?$',
        },
      },
      action: { allow: {},},
      visibilityConfig: {
        sampledRequestsEnabled: false,
        cloudWatchMetricsEnabled: false,
        metricName: 'ahmet-engineer-site-assets',
      },
    });


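    // Example built-in rule: the AWS managed Anonymous IP list
    wafRules.push({
      name: 'AWS-AnonymousIPList',
      priority: priorityCounter(),
      statement: {
        // The managed rule group must run for its labels and actions to apply;
        // a labelMatchStatement alone never matches without it.
        managedRuleGroupStatement: {
          vendorName: 'AWS',
          name: 'AWSManagedRulesAnonymousIpList',
        },
      },
      // Managed rule groups take overrideAction; none keeps the group's own block actions
      overrideAction: { none: {} },
      visibilityConfig: {
        sampledRequestsEnabled: false,
        cloudWatchMetricsEnabled: false,
        metricName: 'AWS-AWSManagedRulesAnonymousIPList',
      },
    });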

    wafRules.push({
      name: 'AWS-AWSManagedRulesBotControlRuleSet',
      priority: priorityCounter(),
      statement: {
        managedRuleGroupStatement: {
          vendorName: 'AWS',
          name: 'AWSManagedRulesBotControlRuleSet',
          excludedRules: [],
        },
      },
      overrideAction: { count:{} },
      visibilityConfig: {
        sampledRequestsEnabled: true,
        cloudWatchMetricsEnabled: true,
        metricName: 'AWS-AWSManagedRulesBotControlRuleSet',
      },
    });

    // Static paths that do not require redirection
    const pathWhiteList = [...siteAssets, 'redirect', 'dynamic\\/images', 'world-wide', 'uk', 'ca', 'gl', 'br'];

    // Geo Location Redirection
    let geolocationRedirection: cdk.aws_wafv2.CfnWebACL.RuleProperty = {
      name: 'geolocationRedirection',
      priority: priorityCounter(),
      statement: {
        andStatement: {
          statements: [
            // The client is not a bot: if bots were redirected, they could not crawl the dynamic version of the website.
            {
              notStatement: {
                statement: {
                  labelMatchStatement: {
                    scope: 'NAMESPACE',
                    key: 'awswaf:managed:aws:bot-control:bot:',
                  },
                },
              },
            },
            // The country is not in the list of dynamically served countries, so only the static pages are available
            {
              notStatement: {
                statement: {
                  geoMatchStatement: { countryCodes: ['TR'], },
                },
              },
            },
            // The path is not allow-listed, i.e. not the redirect path, a static page, or a path shared by everyone
            {
              notStatement: {
                statement: {
                  regexMatchStatement: {
                    fieldToMatch: { uriPath: {}, },
                    textTransformations: [ { type: 'NONE', priority: 0,}, ],
                    regexString: '^\\/(\\b(' + pathWhiteList.join('|') + '))/?',
                  },
                },
              },
            },
          ],
        },
      },
      action: {
        block: {
          // Redirect the client to the path where the CloudFront redirect function is attached
          customResponse: {
            responseCode: 302,
            responseHeaders: [{ name: 'location', value: '/redirect' }],
            customResponseBodyKey: 'geolocationRedirection-body',
          },
        },
      },
      visibilityConfig: {
        sampledRequestsEnabled: true,
        cloudWatchMetricsEnabled: true,
        metricName: 'geolocationRedirection',
      },
    };
    wafRules.push(geolocationRedirection);


    // Create New waf
    this.WafResource = new cdk.aws_wafv2.CfnWebACL(this, 'ahmetengineerWebACL', {
      name: 'ahmet-engineer-waf',
      description: 'protecting ahmet.engineer web resource',
      defaultAction: {
        allow: {},
      },
      scope: 'CLOUDFRONT',
      visibilityConfig: {
        cloudWatchMetricsEnabled: false,
        metricName: 'ahmet-engineer-waf',
        sampledRequestsEnabled: true,
      },
      rules: wafRules,
      customResponseBodies: {
        [geolocationRedirection.name + '-body']: {
          content: JSON.stringify({
            detectedRule: geolocationRedirection.name,
            priority: geolocationRedirection.priority,
          }),
          contentType: 'APPLICATION_JSON',
        },
      },
    });
  }
}

CloudFront Implementation

We configured AWS WAF so that the redirection function is triggered only under specific conditions. With the example CDK stack below, we attach the WAF we created to the CloudFront distribution that serves our website.

I am not including the function code in this post; you can find details and examples at CloudFront/example-function-redirect-url.
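For orientation, a redirect handler in the style of that example might look like the sketch below. It is not the function used on this site: the country-to-path mapping is an assumption based on the static paths in the WAF stack, and the event and response interfaces are trimmed, hand-written versions of the CloudFront Functions event structure. CloudFront Functions run plain JavaScript, so the type annotations would need to be removed before deploying.

```typescript
// Trimmed-down shapes of the CloudFront Functions viewer-request event.
interface CfHeaders { [name: string]: { value: string } }
interface CfRequest { method: string; uri: string; headers: CfHeaders }
interface CfEvent { request: CfRequest }
interface CfResponse { statusCode: number; statusDescription: string; headers: CfHeaders }

// Assumed mapping from viewer country to a static path; adjust to your site.
const countryPaths: { [country: string]: string } = {
  GB: '/uk',
  CA: '/ca',
  BR: '/br',
};
const fallbackPath = '/world-wide';

function handler(event: CfEvent): CfResponse {
  // CloudFront adds this header when it is listed in the cache or origin request policy.
  const countryHeader = event.request.headers['cloudfront-viewer-country'];
  const country = countryHeader ? countryHeader.value.toUpperCase() : '';
  const target = countryPaths[country] || fallbackPath;

  // Answer directly from the edge instead of forwarding to the origin.
  return {
    statusCode: 302,
    statusDescription: 'Found',
    headers: { location: { value: target } },
  };
}

const sampleEvent: CfEvent = {
  request: { method: 'GET', uri: '/redirect', headers: { 'cloudfront-viewer-country': { value: 'GB' } } },
};
console.log(handler(sampleEvent).headers.location.value); // "/uk"
```

Because WAF sends only the visitors who need it to /redirect, a handler like this runs for a small share of the traffic rather than for every request.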

import * as constructs from 'constructs';
import * as cdk from 'aws-cdk-lib';
import path = require('path');
export class AhmetEngineerWeb extends cdk.Stack {
  constructor(scope: constructs.Construct, id: string, props: cdk.StackProps) {
    super(scope, id, props);
      const redirectPageCloudfrontFunction = new cdk.aws_cloudfront.Function(this, 'GeoRedirect', {
        functionName: 'GeoRedirect',
        code: cdk.aws_cloudfront.FunctionCode.fromFile({
          filePath: path.join(__dirname, 'fn/geo-redirect.js'),
        }),
      });

      const appOrigin = new cdk.aws_cloudfront_origins.HttpOrigin("origin.ahmet.engineer", {
        protocolPolicy: cdk.aws_cloudfront.OriginProtocolPolicy.HTTPS_ONLY,
      });

      new cdk.aws_cloudfront.Distribution(this, 'ahmetEngineerDist', {
        defaultBehavior: {
          origin: appOrigin,
          allowedMethods: cdk.aws_cloudfront.AllowedMethods.ALLOW_ALL,
          cachePolicy: cdk.aws_cloudfront.CachePolicy.CACHING_DISABLED,
          viewerProtocolPolicy: cdk.aws_cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
          originRequestPolicy: cdk.aws_cloudfront.OriginRequestPolicy.ALL_VIEWER,
          compress: true,
        },
        additionalBehaviors: {
          ['/redirect*']: {
            origin: appOrigin,
            viewerProtocolPolicy: cdk.aws_cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
            allowedMethods: cdk.aws_cloudfront.AllowedMethods.ALLOW_GET_HEAD_OPTIONS,
            compress: true,
            cachePolicy: new cdk.aws_cloudfront.CachePolicy(this, 'GeoRedirector',{
              enableAcceptEncodingBrotli: true,
              enableAcceptEncodingGzip: true,
              cachePolicyName: 'GeoRedirector',
              comment: 'redirect path and lambda at edge function',
              defaultTtl: cdk.Duration.minutes(10),
              headerBehavior: cdk.aws_cloudfront.CacheHeaderBehavior.allowList('CloudFront-Viewer-Country'), // To cache only country based
              cookieBehavior: cdk.aws_cloudfront.CacheCookieBehavior.none(),
              queryStringBehavior: cdk.aws_cloudfront.CacheQueryStringBehavior.none(),
            }),
            functionAssociations: [
              {
                function: redirectPageCloudfrontFunction,
                eventType: cdk.aws_cloudfront.FunctionEventType.VIEWER_REQUEST,
              },
            ],
          },
          
        },
        enableIpv6: true,
        minimumProtocolVersion: cdk.aws_cloudfront.SecurityPolicyProtocol.TLS_V1_2_2019,
        enabled: true,
        comment: 'ahmet.engineer',
        domainNames: ["ahmet.engineer", "www.ahmet.engineer"],
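        // Placeholder ARNs: a web ACL for CloudFront has CLOUDFRONT scope, so its ARN is in us-east-1 under global/
        webAclId: "arn:aws:wafv2:us-east-1:123456789012:global/webacl/ahmet-engineer-waf/12345678-1234-1234-1234-123456789012",
        httpVersion: cdk.aws_cloudfront.HttpVersion.HTTP2_AND_3,
        // Certificates used by CloudFront must also be in us-east-1
        certificate: cdk.aws_certificatemanager.Certificate.fromCertificateArn(this, 'ahmetEngineerSSL', "arn:aws:acm:us-east-1:123456789012:certificate/12345678-1234-1234-1234-123456789012"),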
      });
  }
}

Conclusion

With this design, the Lambda function is invoked only under the required conditions. This reduces the number of Lambda invocations, and customers no longer hit the AWS Lambda rate limit.


© 2024 All rights reserved.
