Tools, AWS services and third-party frameworks that speed up different parts of the process.
After reading the first part of this series, you should have a general idea about the business and technical goals that have a direct impact on the choice of an event-based architecture. This time, I am going to cover the tools, AWS services and third-party frameworks that speed up different parts of the process. I will also show you some of the value we got and the problems we came across.
Basically, we’ve collected the tools we use in event-based scenarios (serverless for some people) into one bag called “BaseGear”:
We’ve already posted articles about Amplify, Vue (https://chaosgears.com/lets-start-developing-using-vue-and-amplify/) and CI/CD (https://chaosgears.com/serverless-pipeline-for-ec2-configuration-management-3-solution-architecture-and-implementation-of-serverless-pipeline/) on our blog, so I’ll skip those topics in this article.
The Serverless Framework has been described in one of the previous articles, so I won’t focus on its features. However, let me give you some advice that, hopefully, will be valuable to your projects:
NOTE (from AWS documentation): “Lambda polls shards in your DynamoDB Stream for records at a base rate of 4 times per second. When records are available, Lambda invokes your function and waits for the result. If processing succeeds, Lambda resumes polling until it receives more records.”
To avoid less efficient synchronous Lambda function invocations, set the batchSize or batchWindow parameter. The batch size for Lambda configures the limit parameter in the GetRecords API call: DynamoDB Streams will return up to that many records if they are available in the buffer. The batchWindow property, in turn, specifies the maximum amount of time to wait before triggering a Lambda invocation with a batch of records.
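For instance, on a DynamoDB stream event both parameters can be set like this (the function and table names below are placeholders):

```yaml
functions:
  processOrders:
    handler: handler.process_orders
    events:
      - stream:
          type: dynamodb
          arn:
            Fn::GetAtt: [OrdersTable, StreamArn]
          startingPosition: LATEST
          batchSize: 100     # maps to the "limit" parameter of GetRecords
          batchWindow: 10    # max seconds to buffer records before invoking the function
```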
The issue with the configuration in “serverless.yml” is that when you use the schema presented below, it doesn’t respect the Lambda function trigger configuration seen in the AWS Console.
NOTE: We suggest moving the configuration of the DynamoDB Stream and the DynamoDB Table to a separate file in another directory. It simply makes the structure tidier and better organized:
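For example (the file path and resource names below are placeholders):

```yaml
# serverless.yml
resources:
  - ${file(resources/dynamodb-table.yml)}

# resources/dynamodb-table.yml
Resources:
  OrdersTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: orders
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: pk
          AttributeType: S
      KeySchema:
        - AttributeName: pk
          KeyType: HASH
      StreamSpecification:
        StreamViewType: NEW_AND_OLD_IMAGES
```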
Lambda function trigger configuration in AWS console:
Generally, I haven’t come across any problems with layers defined this way. However, I did notice one thing about CloudFormation templates that carry layer versions as parameter values. Let’s use a real case as an example:
We published changes for a selected group of Lambda functions via a CloudFormation stack. Below, I pasted only the necessary part of the parameters section:
The point is that we’ve automated the whole process. We coded a function for getting the latest available Lambda version and then publishing a newer one accordingly. However, there is no information in the AWS documentation (https://boto3.amazonaws.com/v1/documentation/api/1.9.42/reference/services/lambda.html#Lambda.Client.list_versions_by_function) that you can only collect the last 50 versions via a single API call (we had over 60 at that time). This led to a problem, because we thought we were publishing a new function version (via def publish_new_version(self, uploadId) presented below) and re-pointing the “prod” alias to it. So, as you can see below, we had a situation where a version greater than 270 was available in the AWS Console, but get_latest_published was returning only 50 items with a maximum value of 185, and that was the value the alias was pointed to. Moreover, it generated additional problems, because each time we updated the layer version, the change wasn’t seen by the Lambda function.
To sum it up: if you’ve exceeded 50 function versions, combined with aliasing, and your Lambda layer/function update workflows are done via CloudFormation/Boto3, use the “NextMarker” pagination token.
```python
versionsPublished = ['$LATEST', '9', '13', '17', '21', '25', '29', '33', '37', '41',
                     '45', '49', '53', '57', '61', '65', '69', '73', '77', '81', '85',
                     '89', '93', '97', '101', '102', '105', '109', '113', '117', '121',
                     '125', '129', '130', '131', '132', '133', '137', '141', '145', '149',
                     '153', '157', '161', '165', '169', '173', '177', '181', '185']
```
Getting the latest version of the Lambda function by following the “NextMarker” key will allow you to collect more than 50 versions (especially the most recently published one) via the Lambda API:
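A minimal sketch of that pagination with Boto3 (get_latest_published here is a simplified, standalone variant of our helper):

```python
import boto3

lambda_client = boto3.client('lambda')

def get_latest_published(function_name):
    """Return the highest published version number, walking all result pages."""
    versions = []
    marker = None
    while True:
        kwargs = {'FunctionName': function_name}
        if marker:
            kwargs['Marker'] = marker
        response = lambda_client.list_versions_by_function(**kwargs)
        versions.extend(v['Version'] for v in response['Versions'])
        marker = response.get('NextMarker')  # present only if there are more pages
        if not marker:
            break
    numbers = [int(v) for v in versions if v != '$LATEST']
    return str(max(numbers)) if numbers else None
```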
Publish new version of Lambda function:
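A simplified, standalone sketch of the publishing step (the real publish_new_version(self, uploadId) is a class method tied to our deployment code; the Boto3 calls below are the relevant part):

```python
import boto3

lambda_client = boto3.client('lambda')

def publish_new_version(function_name, description=''):
    """Publish a new version from the current $LATEST code and configuration."""
    response = lambda_client.publish_version(
        FunctionName=function_name,
        Description=description,
    )
    return response['Version']

def repoint_alias(function_name, alias_name, version):
    """Re-point an existing alias (e.g. 'prod') to the freshly published version."""
    lambda_client.update_alias(
        FunctionName=function_name,
        Name=alias_name,
        FunctionVersion=version,
    )

# Example usage (names are placeholders):
# new_version = publish_new_version('orders-processor', 'layer bump')
# repoint_alias('orders-processor', 'prod', new_version)
```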
Serverless Framework plugins pretty much save you development time. Just keep an eye on the issues in their repositories. Below, you can see some of the plugins we’ve been using:
serverless-python-requirements - a Serverless v1.x plugin that automatically bundles dependencies from requirements.txt and makes them available in your PYTHONPATH.
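A typical setup looks roughly like this (the exact options depend on the plugin version):

```yaml
plugins:
  - serverless-python-requirements

custom:
  pythonRequirements:
    dockerizePip: non-linux   # build wheels in a Lambda-like Docker image when not on Linux
    slim: true                # strip tests/caches to keep the package small
```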
serverless-plugin-aws-alerts - adds CloudWatch alarms to functions. Below is a simple example showing how to use the default “functionErrors” alarm:
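A minimal configuration along these lines (the stage and topic name are placeholders):

```yaml
plugins:
  - serverless-plugin-aws-alerts

custom:
  alerts:
    stages:
      - prod
    topics:
      alarm:
        topic: ${self:service}-${opt:stage}-alerts-alarm   # SNS topic used for alarm actions
    alarms:
      - functionErrors   # one of the plugin's predefined alarms
```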
There are more default alarms:
With the following default configurations:
If you want, you can create your own alarms or override the parameters of the default ones. We used that in another microservice. Here’s the example:
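In outline (the thresholds and the custom alarm below are placeholders, not our actual values):

```yaml
custom:
  alerts:
    definitions:
      functionErrors:        # override a default alarm's parameters
        period: 300
        evaluationPeriods: 1
        threshold: 5
      longDuration:          # a custom alarm defined from scratch
        namespace: AWS/Lambda
        metric: Duration
        statistic: Maximum
        threshold: 3000
        period: 300
        evaluationPeriods: 1
        comparisonOperator: GreaterThanOrEqualToThreshold
    alarms:
      - functionErrors
      - longDuration
```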
serverless-pseudo-parameters - lets you use #{AWS::AccountId}, #{AWS::Region}, etc., in any of your config strings. The plugin replaces these values with the proper pseudo parameters using the Fn::Sub CloudFormation function. Example from our code:
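Schematically (the function name and topic below are placeholders):

```yaml
functions:
  notifier:
    handler: handler.notify
    environment:
      TOPIC_ARN: arn:aws:sns:#{AWS::Region}:#{AWS::AccountId}:deployment-notifications
```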
serverless-plugin-lambda-dead-letter - assigns a DeadLetterConfig to a Lambda function and optionally creates a new SQS queue or SNS topic with a simple syntax. Keeping in mind the principle that everything fails, we implement “backup forces” for our Lambdas in order to handle unprocessed events. Below, one of our functions with a DLQ configured:
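In outline (handler and queue names are placeholders):

```yaml
plugins:
  - serverless-plugin-lambda-dead-letter

functions:
  ingestEvents:
    handler: handler.ingest
    deadLetter:
      sqs: ingest-events-dlq   # the plugin creates this queue and sets it as the function's DeadLetterConfig
```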
serverless-step-functions – basically, it simplifies the definition of Step Functions state machines in a serverless project.
The definition: part points to a YAML file with the configuration of the particular states. I prefer to configure it this way instead of putting the whole definition in the “serverless.yml” file.
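Wired up, it can look like this (the state machine and file names are placeholders):

```yaml
plugins:
  - serverless-step-functions

stepFunctions:
  stateMachines:
    provisioningFlow:
      name: provisioning-flow-${opt:stage}
      definition: ${file(stepfunctions/provisioning.yml)}   # states live in a separate YAML file
```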
After the deployment, we got the diagram shown below. As you can see, a loop has been used in order to wait for the result returned by the CloudFormation stack. Basically, it allows you to get rid of synchronous calls and react depending on the status returned by another service.
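The waiting loop itself can be expressed in that definition file roughly like this (state names are placeholders, and the sketch assumes a Lambda that returns the stack status in $.stackStatus):

```yaml
# stepfunctions/provisioning.yml
StartAt: CheckStackStatus
States:
  CheckStackStatus:
    Type: Task
    Resource: arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:check-stack-status
    Next: IsStackReady
  IsStackReady:
    Type: Choice
    Choices:
      - Variable: $.stackStatus
        StringEquals: CREATE_COMPLETE
        Next: Done
      - Variable: $.stackStatus
        StringEquals: CREATE_IN_PROGRESS
        Next: WaitBeforeRetry
    Default: Failed
  WaitBeforeRetry:
    Type: Wait
    Seconds: 30
    Next: CheckStackStatus
  Done:
    Type: Succeed
  Failed:
    Type: Fail
```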
I am pretty sure each of you knows these tools, so I’ll add only a short annotation. Generally, there is some debate about whether Terragrunt is necessary at all. According to the description from its repo, “Terragrunt is a thin wrapper for Terraform that provides extra tools for working with multiple Terraform modules”. Therefore, do not expect Terragrunt to solve all the issues you have with Terraform. Personally, I like it for the way it organizes the repo:
Then, with a simple “terraform.tfvars”, you can set the variables used in the particular Terraform modules located under source = "../../..//modules/", as well as point at the specific module/product you want to use. Our example depicts a CI/CD pipeline for Terraform infrastructural changes built on top of CodePipeline + CodeBuild + Lambda notifications (terragrunt-deployments).
An attentive reader might have noticed the buildspec_path = "./terraform/deployment/buildspec.yml" statement. In this particular case, we used it as the source buildspec file for CodeBuild, which invokes the Terragrunt commands and makes the changes in the environment.
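Putting it together, a leaf terraform.tfvars can look roughly like this (pre-0.19 Terragrunt syntax; the directory layout is illustrative, while the source path and buildspec_path follow the snippets above):

```hcl
# e.g. live/dev/terragrunt-deployments/terraform.tfvars
terragrunt = {
  include {
    path = "${find_in_parent_folders()}"
  }

  terraform {
    source = "../../..//modules/terragrunt-deployments"
  }
}

# variables passed to the module
buildspec_path = "./terraform/deployment/buildspec.yml"
```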
So far, I’ve covered the tools we use to make things easier and to save us time. Nonetheless, I would deceive you and blur the reality if I said that they work out of the box. For me and my team, it’s all about estimation: how much time we need to start using a new tool effectively, and how much time we save by using a particular tool. Business doesn’t care about tools, it cares about time. My advice is: don’t bind yourself to tools, but rather to the question “what/how much will I achieve if I use it?”. Roll up your sleeves, more chapters are on the way...
We'd love to answer your questions and help you thrive in the cloud.