One of our customers needed backup protection for their EBS volumes. We took a serverless approach and added a few extras along the way. Here's how.
One of our current customers decided that they needed backup protection for their EBS volumes. There are plenty of solutions, both custom-built and provided out of the box by Amazon Web Services, but for me it was a good chance to test myself against the Serverless Framework (which I promised to write about last time), write some more Python, and customize the solution a little bit. Apart from the snapshot and retention features, saving information to DynamoDB and passing notification messages to Slack have been added as extras.
Based on a cron event configured in CloudWatch Events (we've since changed that into an SSM Maintenance Window, which I'll describe later), a Lambda function is invoked and, based on a specific tag value on Auto Scaling Groups or single instances, snapshots of the attached volumes are created. The final step of 'Round 1' is to save the data about the new snapshots in a DynamoDB table called "created-snapshots" (example below). For us it was just a simple way of keeping track of when tasks finished and which snapshots were created.
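The table item itself isn't reproduced in this post, so here is an illustrative sketch of how the snapshot Lambda could write such a record with boto3; the attribute names (SnapshotId, VolumeId, InstanceId, CreatedOn) are assumptions, not the project's exact schema:

import datetime

import boto3

# Illustrative only -- the real table layout in the project may differ
dynamodb = boto3.resource('dynamodb', region_name='eu-central-1')
table = dynamodb.Table('created-snapshots')

table.put_item(Item={
    'SnapshotId': 'snap-08b8b0819720d9f80',
    'VolumeId': 'vol-0123456789abcdef0',
    'InstanceId': 'i-0123456789abcdef0',
    'CreatedOn': datetime.datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S'),
})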
Apart from the small bunch of information in DynamoDB, the Lambda puts a response into CloudWatch Logs like the following:
('Response Body:', set(['
{
"Status": "SUCCESS",
"Reason": "See the details in CloudWatch Log Stream: 2018/08/21/[$LATEST]a1e01c47b0a44b79af9d26c8ea2b6979",
"Data": {
"SnapshotId": [
"snap-08b8b0819720d9f80",
"snap-02da7b4af1edd5633",
"snap-05fe0099524fd7f11"
],
"Change": true
},
"PhysicalResourceId": "2018/08/21/[$LATEST]a1e01c47b0a44b79af9d26c8ea2b6979"
}
']))
There's also a second part focused on deleting the created snapshots based on the retention policy and a tag value set by the 'snapshot' Lambda. Periodically (the schedule is defined via CloudWatch Events), a Lambda checks the "DeleteOn" tag to see whether it's time to delete the snapshot (by comparing the current date against the one set in the tag).
We've glanced at the general concept of the solution, but I'd like to talk a little bit about the change of approach in terms of deployment. Remember, last time I mentioned replacing (just a little bit of) CloudFormation with a new framework. And here it is…
In my previous article about 'pythoning' I unveiled some information about replacing the well-known CloudFormation with a fancy-named framework called Serverless. Essentially, I've ditched CloudFormation for Lambda provisioning. Simply put, in a project focused on Lambdas (as the main force) I've started using the Serverless Framework because it's much EASIER AND FASTER to launch the environment. I'll show you this later, but in the meantime let me tell you briefly about that comprehensive Swiss Army knife.
The definition of the framework says "The Serverless Framework is a CLI tool that allows users to build & deploy auto-scaling, pay-per-execution, event-driven functions", but for me it is the easiest way to deploy prepared Lambda functions together with the additional AWS services they need and, of course, to invoke them. Serverless is written in Node.js, which might not be perfect for everyone, because you still need to install Node and npm.
First of all, you have to install this shiny tool by typing:
$ npm install serverless -g
The CLI can be accessed using either serverless or sls. To create the template for your new project type:
$ serverless create --template TEMPLATE_NAME
The variety of available templates is quite huge:
“aws-nodejs”, “aws-nodejs-typescript”, “aws-nodejs-ecma-script”, “aws-python”, “aws-python3”, “aws-groovy-gradle”, “aws-java-maven”, “aws-java-gradle”, “aws-kotlin-jvm-maven”, “aws-kotlin-jvm-gradle”, “aws-kotlin-nodejs-gradle”, “aws-scala-sbt”, “aws-csharp”, “aws-fsharp”, “aws-go”, “aws-go-dep”, “azure-nodejs”, “fn-nodejs”, “fn-go”, “google-nodejs”, “kubeless-python”, “kubeless-nodejs”, “openwhisk-java-maven”, “openwhisk-nodejs”, “openwhisk-php”, “openwhisk-python”, “openwhisk-swift”, “spotinst-nodejs”, “spotinst-python”, “spotinst-ruby”, “spotinst-java8”, “webtasks-nodejs”, “plugin” and “hello-world”.
After generating the selected template, you should see the following in your project directory:
-rw-r--r-- 1 user staff 497B Jun 27 23:03 handler.py
-rw-r--r-- 1 user staff 2.8K Jun 27 23:04 serverless.yml
Where handler.py contains your Lambda function code and serverless.yml holds the whole service configuration.
The Serverless Framework translates all the syntax in serverless.yml into a single AWS CloudFormation template, which makes the whole process trivial. In other words, you define the services you want to add in a declarative way and the framework is then responsible for this little bit of magic. Going a bit deeper, the process can be broken down into packaging your functions, generating a CloudFormation template from serverless.yml, uploading the artifacts to S3, and creating or updating the stack.
There’s much more information on the official website but now, more or less, we know what’s hidden inside. Let’s get back to the code.
The first part of the serverless.yml file contains general configuration regarding the AWS environment and, if needed (and in my case it was), some custom variables:
provider:
  name: aws
  runtime: python2.7
  region: eu-central-1
  memorySize: 128
  timeout: 60 # optional, in seconds
  versionFunctions: true
  tags: # Optional service wide function tags
    Owner: chaosgears
    ContactPerson: chaosgears
    Environment: dev

custom:
  region: ${opt:region, self:provider.region}
  app_acronym: ebs-autobackup
  default_stage: dev
  owner: YOUR_ACCOUNT_ID
  stage: ${opt:stage, self:custom.default_stage}
  stack_name: basic-${self:custom.app_acronym}-${self:custom.stage}
  dynamodb_arn_c: arn:aws:dynamodb:${self:custom.region}:*:table/${self:custom.dynamodb_created}
  dynamodb_arn_d: arn:aws:dynamodb:${self:custom.region}:*:table/${self:custom.dynamodb_deleted}
  dynamodb_created: created-snapshots
  dynamodb_deleted: deleted-snapshots
I won't focus on this part, but keep in mind that if you want to use a custom-defined variable inside another variable, use this pattern: variable_a: ${self:custom.variable_b}. The really important part is the 'functions' one. Here's the place for the Lambda functions you've been forging for weeks. Look how simple it is: with a couple of lines you define environment variables, timeout, event scheduling and even roles. I've omitted the obvious elements like tags, names and descriptions.
functions:
  ebs-snapshots:
    name: ${self:custom.app_acronym}-snapshots
    description: Create EBS Snapshots and tag them
    timeout: 120 # optional, in seconds
    handler: snapshot.lambda_handler
    # events:
    #   - schedule: cron(0 21 ? * THU *)
    role: EBSSnapshots
    environment:
      region: ${self:custom.region}
      owner: ${self:custom.owner}
      slack_url: ${self:custom.slack_url}
      input_file: input_1.json
      slack_channel: ${self:custom.slack_channel}
      tablename: ${self:custom.dynamodb_created}
    tags:
      Name: ${self:custom.app_acronym}-snapshots
      Project: ebs-autobackup
      Environment: dev
  ebs-retention:
    name: ${self:custom.app_acronym}-retention
    description: Deletes old snapshots according to the retention policy
    handler: retention.lambda_handler
    timeout: 120 # optional, in seconds
    environment:
      region: ${self:custom.region}
      owner: ${self:custom.owner}
      slack_url: ${self:custom.slack_url}
      input_file: input_2.json
      slack_channel: ${self:custom.slack_channel}
      tablename: ${self:custom.dynamodb_deleted}
    events:
      - schedule: cron(0 21 ? * WED-SUN *)
    role: EBSSnapshotsRetention
    tags:
      Name: ${self:custom.app_acronym}-retention
      Project: ebs-autobackup
      Environment: dev
As you've probably noticed, the variables are presented as plain text and are easy to capture. Obviously, hardcoding anything into code is a very, very bad idea, and putting sensitive data into plain text is just as bad. My code is just an example, but I'd like to show you an easy way I often use, called Parameter Store, to avoid problems with sensitive data leaking. It's an AWS service that acts as centralized config and secrets storage for a whole bunch of your applications.
First of all, you might use the AWS CLI to store your new SSM parameters:
aws ssm put-parameter --name PARAM_NAME --type String --value PARAM_VALUE
then, in serverless.yml (starting from version 1.22), reference it like this:
environment:
  VARIABLE: ${ssm:PARAM_NAME}
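For genuinely sensitive values you would typically store the parameter as a SecureString rather than a plain String, for example:

$ aws ssm put-parameter --name PARAM_NAME --type SecureString --value PARAM_VALUE

Depending on the framework version, referencing an encrypted parameter from serverless.yml may need extra syntax, so check the documentation.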
NOTE: Personally, I am not a fan of hardcoding sensitive data into variables. I would rather fetch the parameter at runtime:
import boto3

# Fetch (and decrypt) the parameter at runtime instead of baking it into the template
ssm = boto3.client('ssm')
parameter = ssm.get_parameter(Name='NAME_OF_PARAM', WithDecryption=True)
api_token = parameter['Parameter']['Value']
Having the functions and variables configured, we can seamlessly jump to the 'resources' part, which is nothing more than the additional CloudFormation resources required for our solution.
NOTE: If you don't want to keep the resources code in the serverless.yml file, use the following method, which is simply a path to a YAML file containing your CloudFormation template (only the Resources part):
resources:
  - ${file(FOLDER/FILE.yml)}
project_folder
├── serverless.yml
└── FOLDER
    └── FILE.yml
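For illustration, such an external file could define one of the DynamoDB tables roughly like this; it's a minimal sketch assuming a simple table keyed on the snapshot id, not the project's exact resource definitions:

Resources:
  CreatedSnapshotsTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: created-snapshots
      AttributeDefinitions:
        - AttributeName: SnapshotId
          AttributeType: S
      KeySchema:
        - AttributeName: SnapshotId
          KeyType: HASH
      ProvisionedThroughput:
        ReadCapacityUnits: 1
        WriteCapacityUnits: 1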
As you'll see in my serverless.yml file, the IAM roles and DynamoDB tables are added this way. Last but not least, for a Python project you need to add the plugin that handles installing the requirements:
plugins:
- serverless-python-requirements
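The plugin is a Node.js package, so it needs to be installed in the project first; during packaging it bundles whatever is listed in requirements.txt (as you'll see in the deployment output below):

$ npm install --save-dev serverless-python-requirements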
So now, the time has come to launch your serverless project. Simply type:
$ serverless deploy --aws-profile YOUR_AWS_PROFILE

or simply:

$ sls deploy --aws-profile YOUR_AWS_PROFILE
then the deployment output should appear on the screen:
Serverless: Installing requirements of requirements.txt in .serverless...
Serverless: Packaging service...
Serverless: Excluding development dependencies...
Serverless: Injecting required Python packages to package...
Serverless: Creating Stack...
Serverless: Checking Stack create progress...
.....
Serverless: Stack create finished...
Serverless: Uploading CloudFormation file to S3...
Serverless: Uploading artifacts...
Serverless: Uploading service .zip file to S3 (4.6 KB)...
Serverless: Validating template...
Serverless: Updating Stack...
Serverless: Checking Stack update progress…
It's important to note that you can deploy the same stack to different accounts simply by switching the well-known AWS profile. After a while you'll get a final message saying that your new stack has been deployed:
Serverless: Stack update finished...
Service Information
service: ebs-autobackup
stage: dev
region: eu-central-1
stack: ebs-autobackup-dev
api keys:
None
endpoints:
None
functions:
ebs-snapshots: ebs-autobackup-dev-ebs-snapshots
ebs-retention: ebs-autobackup-dev-ebs-retention
Serverless: Publish service to Serverless Platform...
Service successfully published! Your service details are available at:
https://platform.serverless.com/services/YOUR_PROFILE/ebs-autobackup
After you check that everything has been launched properly, you're able to invoke your deployed function directly via a serverless call:
serverless invoke -f FUNCTION_NAME
Additional arguments you might use include --stage and --region to pick the target environment, --data or --path to pass input to the function, and --log to print the execution logs along with the result.
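For example, to run the snapshot function defined earlier and print its logs (the profile name is a placeholder):

$ sls invoke -f ebs-snapshots --log --aws-profile YOUR_AWS_PROFILE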
After all the hard work you've put into your project, it can also be easily removed. Just type:
serverless remove --aws-profile AWS_PROFILE
and our "new" baby will take care of the rest. Honestly, I have to say that the Serverless Framework has literally made my day. Deploying new Lambda functions, even with extra AWS services, is extremely easy. Of course, after some time spent with the framework you'll realize that 'the devil's in the details'. Nonetheless, I strongly encourage you to test it and, believe me or not, after the first day of trying it you're gonna love it.
Next, let me talk a little bit about the Lambda functions I prepared.
In the snapshot.py file you'll see a function called "determine_snap_retention" which first rolls the date forward to the nearest Friday and then adds the number of days the newly created snapshot should be kept. The result is the date of deletion:
import datetime
from datetime import timedelta

def determine_snap_retention(retention_type='monthly', mdays=21, wdays=7):
    d_today = datetime.datetime.today()
    d_today = d_today.replace(hour=23, minute=0, second=0, microsecond=0)
    snapshot_expiry = ""
    # Roll forward to the nearest Friday (weekday() == 4)
    while d_today.weekday() != 4:
        d_today += datetime.timedelta(1)
    if retention_type == 'monthly':
        snapshot_expires = d_today + timedelta(days=mdays)
        snapshot_expiry = snapshot_expires.strftime('%Y-%m-%d %H:%M:%S')
    elif retention_type == 'weekly':
        snapshot_expires = d_today + timedelta(days=wdays)
        snapshot_expiry = snapshot_expires.strftime('%Y-%m-%d %H:%M:%S')
    return snapshot_expiry
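Called with the 'weekly' policy, the function simply returns a timestamp one week after the upcoming Friday, which then ends up in the DeleteOn tag:

# Assuming this is run on Tuesday 2018-08-21, the nearest Friday is 2018-08-24
from snapshot import determine_snap_retention

print(determine_snap_retention(retention_type='weekly'))   # -> '2018-08-31 23:00:00'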
I've also added a class called "Volumes" which has a method:
def create_snapshot(self, owner, slack_url, file, slack_channel, tablename='created-snapshots')
It creates snapshots of all attached EBS volumes belonging to instances with a specific tag. Then, after a successful job, it tags each snapshot with a specific key/value pair: "DeleteOn"/DATE. This lets the retention Lambda determine the deletion date (a minimal sketch of this core flow follows the Slack snippet below). As an extra feature, the function puts info about the snapshots into the DynamoDB table and sends Slack notifications via:
slack_notify_snap(slack_url=slack_url, file=file, channel=slack_channel, snap_num=len(snaps_notify), snap_ids=snaps_notify, owner=owner , region=self.region)
which is imported: from slack_notification import slack_notify_snap
Snippet:
import requests

# custom_message() is a project helper that builds the Slack webhook payload
def slack_notify_snap(slack_url, file, channel, snap_num, region, snap_ids, owner):
    snap_ids = ', '.join(snap_ids)
    slack_message = custom_message(filename=file, snapshot_number=snap_num, snapshot_ids=snap_ids, region=region, owner=owner)
    try:
        req = requests.post(slack_url, json=slack_message)
        if req.status_code != 200:
            print(req.text)
            raise Exception('Received non 200 response')
        else:
            print("Successfully posted message to channel: ", channel)
    except Exception as e:
        print("Slack notification failed: %s" % e)
In the handler pasted below, environment variables are used to avoid hardcoding. The function is still under development, so elements like 'event' or 'context' are intended for future use (literally, each time I look at the code I find something that could be done another way). What they're actually for is…
As you probably know, AWS Lambda is an event-driven service; simply put, invoking a function means triggering an event within AWS Lambda. You've already seen the handler definition def lambda_handler(event, context). The first argument contains the event which triggers the function; it is represented as a JSON object inside Lambda, but in Python code it arrives as a dictionary. In my particular case it is an empty dictionary. If you were using API Gateway, the whole HTTP request would be represented as a dictionary. For your Lambda it's just an input with the additional parameters you want to pass.
The second argument is the context object, with metadata about the invocation. The moment you start debugging your function, you'll find the context very useful.
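For instance, here are a few of the standard attributes the context object exposes in Python (this is generic Lambda runtime behaviour, not project code):

def inspect_context(event, context):
    # Standard attributes of the Lambda context object
    print(context.function_name)                   # e.g. ebs-autobackup-dev-ebs-snapshots
    print(context.aws_request_id)                  # unique id of this invocation
    print(context.log_stream_name)                 # the CloudWatch log stream seen in the output above
    print(context.get_remaining_time_in_millis())  # milliseconds left before the timeout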
import os

def lambda_handler(event, context):
    # Configuration comes from environment variables defined in serverless.yml
    region = os.environ['region']
    owner = os.environ['owner']
    slack_url = os.environ['slack_url']
    file = os.environ['input_file']
    slack_channel = os.environ['slack_channel']
    tablename = os.environ['tablename']
    # Volumes is the class described above, defined in snapshot.py
    ec2 = Volumes('ec2', region, event, context)
    ec2.create_snapshot(owner, slack_url, file, slack_channel, tablename)
I've followed the same methodology as with the snapshot code and another class has been created. A method called "delete_old_snapshots" filters snapshots based on the tag and compares the current date with the one saved in the tag. If they match, or the current date is past the "deletion" date, the snapshot is removed immediately. Information about the job is sent to the Slack channel. Similarly to the snapshot function, information is put into DynamoDB, but into a different table; this one contains only info about the deleted snapshots.
# Method of the retention class; datetime, time, Dynamodb, slack_notify_snap
# and sendResponse are imported/defined at module level in the project
def delete_old_snapshots(self, owner, slack_url, file, slack_channel, tablename='deleted-snapshots'):
    delete_on = datetime.date.today().strftime('%Y-%m-%d')
    deleted_snapshots = []
    dynamo = Dynamodb('dynamodb', self.region)
    change = False
    filters = [
        {
            'Name': 'owner-id',
            'Values': [
                owner,
            ]
        },
        {
            'Name': 'tag-key',
            'Values': [
                'DeleteOn',
            ]
        },
    ]
    try:
        snapshot_response = self.client.describe_snapshots(Filters=filters, OwnerIds=[owner])['Snapshots']
        for snap in snapshot_response:
            for i in snap['Tags']:
                if i['Key'] == 'DeleteOn':
                    data = i['Value'][:10]
                    # Delete when the DeleteOn date is today or already in the past
                    if time.strptime(data, '%Y-%m-%d') == time.strptime(delete_on, '%Y-%m-%d') or time.strptime(delete_on, '%Y-%m-%d') > time.strptime(data, '%Y-%m-%d'):
                        print('Deleting snapshot "%s"' % snap['SnapshotId'])
                        deleted_snapshots.append(snap['SnapshotId'])
                        self.client.delete_snapshot(SnapshotId=snap['SnapshotId'])
                        dynamo.batch_write(tablename, deleted_snapshots, region=self.region)
                        change = True
                        slack_notify_snap(slack_url=slack_url, file=file, channel=slack_channel, snap_num=len(deleted_snapshots), snap_ids=deleted_snapshots, owner=owner, region=self.region)
                    elif time.strptime(delete_on, '%Y-%m-%d') < time.strptime(data, '%Y-%m-%d'):
                        print(str(snap['SnapshotId']) + ' has to be deleted on %s. Now we keep it' % i['Value'])
                        change = False
        responseData = {
            'SnapshotId': deleted_snapshots,
            'Changed': change
        }
        sendResponse(self.event, self.context, 'SUCCESS', responseData)
    except Exception as e:
        # Error handling trimmed for the article
        print('Failed to process snapshots: %s' % e)
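sendResponse itself isn't shown in the post; judging from the CloudWatch output at the beginning, it assembles a CloudFormation-style response body (Status, Reason, Data, PhysicalResourceId) and logs it. A rough, assumed sketch of such a helper:

import json

def sendResponse(event, context, status, data):
    # Hypothetical shape -- builds a CloudFormation-custom-resource-style body and logs it
    response_body = json.dumps({
        'Status': status,
        'Reason': 'See the details in CloudWatch Log Stream: ' + context.log_stream_name,
        'Data': data,
        'PhysicalResourceId': context.log_stream_name,
    })
    print('Response Body:', response_body)
    return response_body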
I'm fully aware that there are a bunch of similar tools, some of them with out-of-the-box features, but honestly it was quite a nice lesson in implementing (almost from scratch) and, more importantly, in combining some additional features together. That mix of add-ons is now used quite frequently by our team in other projects, like keeping data outside the function in DynamoDB or another lightweight store and notifying ourselves about different events coming from AWS. Moreover, it's the next step towards being more familiar and comfortable with the Serverless Framework which, in our case, has become the first choice for implementing serverless architectures. Hmm, but what's next? My piece of advice is to start using the framework if you're an enthusiast of the IaC approach. Last but not least, find other areas to automate! At the end of the day it will bring some order to everyday chaos and give you time for more proactive tasks.
We'd love to answer your questions and help you thrive in the cloud.