
Remove waste in AWS


I Hate a Mess

In the first article of this series I'll present the way we solved a problem with empty S3 buckets, which had been created in large numbers during various tests and then left forgotten. I don't like the mess left behind by unnecessary services or their leftovers, nor the situation when, after some time, we're asking ourselves: 'Why the hell do we have those buckets? Is anybody using them?'. If you know the AWS environment, you're pretty aware that empty buckets don't generate costs, but as I mentioned, it doesn't have to be buckets. It can be any other AWS service that leaves you confused because nobody is able to tell what it's used for.

The case below was my last encounter with provisioning Lambda functions via CloudFormation. Stay with me, and later I'll tell you why I've since chosen the Serverless Framework and, obviously, how I'm doing it right now.

Ok, so let’s jump into AWS…

Take Your Cleaning Army

Having a wide variety of solutions at hand, I've decided to stick with the KISS (Keep It Simple, Stupid) rule, therefore the following players have been chosen:

  • AWS Lambda
  • AWS DynamoDB
  • CloudFormation
  • Step Functions
  • CloudWatch Events

  1. A CloudWatch Events rule (cron job) fires and invokes a Lambda function which checks whether any empty buckets exist in the account
  2. The result of step 1 is put into a response and forwarded to the Step Functions workflow (the number of deleted buckets and their names):
responseData = {
    'NumDeletedbuckets': count,
    'DeletedBuckets': deleted
}
  3. Step Functions is invoked. Based on the input it has two choices: if NumDeletedbuckets is 0, a "NO CHANGE" response is sent; otherwise another Lambda function is invoked, which puts the names of the deleted buckets into a DynamoDB table with the following attributes:
  • bucket name
  • date_of_deletion
  • owner name
  • random hash (for further actions)

Customize it however you want. In other, similar workflows I sometimes use SNS to push a notification whenever I want to be informed about some action (a quick sketch of such a call is shown below).
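For illustration, here is a minimal boto3 sketch of such a notification; the topic ARN, subject and message are made up:

import boto3

def notify(topic_arn, subject, message):
    # Publish a short notification to an SNS topic (the topic ARN is hypothetical)
    sns = boto3.client('sns')
    return sns.publish(TopicArn=topic_arn, Subject=subject, Message=message)

# notify('arn:aws:sns:eu-west-1:123456789012:bucket-cleaner',
#        'S3 cleaner report', '3 empty buckets deleted')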

Snippet of dynamodb.py (I’ll get back to it later):

    def put_dynamo_items(self, bucketlist, owner='chaosgears'):
        try:
            for bucket in bucketlist:
                response = self.table.put_item(
                    Item={
                        'bucketname': bucket,
                        'date': datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
                        'hash': generate_random_string(),
                        'owner': owner
                    })
            # note: only the response of the last put_item is returned
            return response
        except ClientError as err:
            print(err.response['Error']['Code']+':', err.response['Error']['Message'])
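The snippet relies on self.table, which is set up in the class constructor. The constructor isn't listed in the article; a minimal sketch of how it might look (the article calls the class Dynamo later on; the constructor arguments here are my assumption):

import boto3

class Dynamo(object):
    # Thin wrapper around a single DynamoDB table (a sketch, not the original class)
    def __init__(self, table_name, region='eu-west-1'):
        self.resource = boto3.resource('dynamodb', region_name=region)
        self.table = self.resource.Table(table_name)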

CloudFormation Sisters – Stacks/Modules

Because I don't feel comfortable with the AWS Management Console, I prefer using the CloudFormation service, which is nothing more than a description language for your AWS infrastructure (JSON, YAML). It gives you quite an easy way to put your infrastructure into common code (easy to maintain and modify). I do recommend an "object-oriented" approach. I know CloudFormation is not a programming language, which is why I put that in quotation marks; I'm only saying that dividing your code into small pieces/modules not only keeps everything ordered and clean but also makes it possible to take a particular module and reuse it within another template.

In the Resources section of the CloudFormation template (snippet from my file below) I've used the TemplateURL attribute to point to my module file. Generally, the rule is simple: you combine the necessary components inside a small module and then put it into a single stack, or into several stacks in case you'd like to implement several identical pieces of your infrastructure. One note: the US East (N. Virginia) region has a slightly different URL format, which is why the condition has been added. Note the !GetAtt MyStateMachineStack.Outputs.StepFunctionsName statement, which references the output (the Step Functions ARN) of another CloudFormation stack (in this example, the stack creating the short Step Functions workflow I'll describe later). There is a similar one in MyStateMachineStack, which takes as a reference the output from DynamoDbStack (the Lambda Function ARN).

Conditions:
  IfVirginia: !Equals
    - !Ref AWS::Region
    - us-east-1
Resources:
  LambdaStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      Parameters:
        S3Bucket: !Ref ResourcesBucket
        ParentStackName: !Ref ParentStackName
        StepFunctionsArn: !GetAtt MyStateMachineStack.Outputs.StepFunctionsName
        S3Key: lambda/functions/s3_cleaner/s3_cleaner.zip
      TemplateURL: !If [IfVirginia, !Sub 'https://s3.amazonaws.com/${ResourcesBucket}/modules/aws/lambda_s3_cleaner.yml', !Sub 'https://s3-${AWS::Region}.amazonaws.com/${ResourcesBucket}/modules/aws/lambda_s3_cleaner.yml']
      TimeoutInMinutes: 5
    DependsOn:
      - MyStateMachineStack
  DynamoDbStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      Parameters:
        ParentStackName: !Ref ParentStackName
        S3Bucket: !Ref ResourcesBucket
        S3Key: lambda/functions/dynamodb/dynamodb.zip
        DynamodbTable: !Ref TableName
        AccountId: !Ref AccountId
      TemplateURL: !If [IfVirginia, !Sub 'https://s3.amazonaws.com/${ResourcesBucket}/modules/aws/lambda_dynamodb.yml', !Sub 'https://s3-${AWS::Region}.amazonaws.com/${ResourcesBucket}/modules/aws/lambda_dynamodb.yml']
      TimeoutInMinutes: 5
  MyStateMachineStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      Parameters:
        LambdaArn: !GetAtt DynamoDbStack.Outputs.DynamodbFunctionArn
        Environment: !Ref Environment
      TemplateURL: !If [IfVirginia, !Sub 'https://s3.amazonaws.com/${ResourcesBucket}/modules/aws/step_functions.yml', !Sub 'https://s3-${AWS::Region}.amazonaws.com/${ResourcesBucket}/modules/aws/step_functions.yml']
      TimeoutInMinutes: 5

Let's move into the modules and see what they look like. The first one is the Lambda function which deletes empty buckets:

AWSTemplateFormatVersion: 2010-09-09
Description: Lambda for S3 bucket cleaner
Parameters:
  S3Bucket:
    Description: S3 bucket containing the zipped lambda function
    Type: String
  S3Key:
    Description: S3 bucket key of the zipped lambda function
    Type: String
  ParentStackName:
    # assumption: referenced by the condition and function name below, but not declared in the original snippet
    Description: Name of the parent stack
    Type: String
    Default: ''
  LambdaName:
    Type: String
    Default: s3cleaner
  StepFunctionsArn:
    Type: String
  Environment:
    Type: String
    Description: Type of environment.
    Default: dev
    AllowedValues:
      - dev
      - dev/test
      - prod
      - test
      - poc
      - qa
      - uat
Conditions:
  IfSetParentStack: !Not
    - !Equals
      - !Ref ParentStackName
      - ''
Resources:
  S3CleanerRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
            Action:
              - sts:AssumeRole
      Path: /
      Policies:
        -
          PolicyName: S3Cleaner
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              -
                Effect: Allow
                Action:
                  - lambda:ListFunctions
                  - lambda:InvokeFunction
                Resource:
                  - "*"
              -
                Effect: Allow
                Action:
                  - s3:*
                Resource:
                  - "*"
              -
                Effect: Allow
                Action:
                  - states:*
                Resource:
                  - "*"
              -
                Effect: Allow
                Action:
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                  - logs:CreateLogGroup
                  - logs:DescribeLogStreams
                Resource:
                  - 'arn:aws:logs:*:*:*'
  S3CleanerFunction:
    Type: AWS::Lambda::Function
    Properties:
      Description: Empty S3 bucket cleaner
      FunctionName: !Sub '${ParentStackName}-${Environment}-${LambdaName}'
      Environment:
        Variables:
          StepFunctionsArn: !Ref StepFunctionsArn
      Code:
        S3Bucket: !Ref S3Bucket
        S3Key: !Ref S3Key
      Handler: s3_cleaner.lambda_handler
      Runtime: python2.7
      MemorySize: 3008
      Timeout: 300
      Role: !GetAtt S3CleanerRole.Arn

NOTE: The function needs some time to complete. Even though it uses only about 78 MB of memory, with 128 MB configured it times itself out, hence the generous MemorySize above. I've opened a ticket with AWS to find out how to mitigate that.

Outputs:
  S3CleanerArn:
    Description: Lambda Function Arn for empty S3 buckets cleaning
    Value: !GetAtt S3CleanerFunction.Arn
    Export:
      Name: !If
        - IfSetParentStack
        - !Sub '${ParentStackName}-${LambdaName}'
        - !Sub '${AWS::StackName}-${LambdaName}'
  S3CleanerName:
    Description: Lambda Function Name for empty S3 buckets cleaning
    Value: !Ref S3CleanerFunction

 

The created role is simple; one point to remember is logging. Always log your Lambda invocations. To avoid increasing costs, you can adjust the retention policies so that the logs automatically expire within an acceptable time frame. You should also limit the policies to particular resources; for example, states:* attaches all Step Functions actions, so literally you're allowing your Lambda function to perform any action on the Step Functions service. The rest of the code, apart from the outputs, which are obvious, is the Lambda function object (AWS::Lambda::Function). The most confusing part could be:

      Environment:
        Variables:
          StepFunctionsArn: !Ref StepFunctionsArn

Don't be scared. This Environment fragment is an attribute that provides a section of environment variables you'd like to pass into the function, like the following:

def lambda_handler(event, context):
    bucket = gears3('s3', event, context)
    stepfunctions_arn = os.environ['StepFunctionsArn']
    bucket.call_step_functions(arn=stepfunctions_arn)

The !Ref StepFunctionsArn parameter is passed in like this:

Resources:
  LambdaStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      Parameters:
        S3Bucket: !Ref ResourcesBucket
        ParentStackName: !Ref ParentStackName
        StepFunctionsArn: !GetAtt MyStateMachineStack.Outputs.StepFunctionsName

As I mentioned before, the output of MyStateMachineStack (the ARN of the Step Functions workflow) is passed on after the Step Functions stack is created. Then we pick this parameter up in the Environment section of the module. Hmm, but wait, what's inside MyStateMachineStack? Soo… there are two main parts: StatesExecutionRole, which implements the IAM role allowing the state machine to invoke Lambda, and the state machine itself. One of the steps in our workflow is:

"DeletedBuckets": {
"Type": "Task",
"Resource": "${LambdaArn}",
"End": true
},

which takes the LambdaArn variable passed as a parameter in the main product file. Do you remember this line: "LambdaArn: !GetAtt DynamoDbStack.Outputs.DynamodbFunctionArn"? It means: take the output from DynamoDbStack and give me the ARN I need in my Step Functions workflow.
The workflow is pretty simple and is based on a "Choice" state:

"ChoiceState": {
"Type" : "Choice",
"Choices": [
{
"Variable": "$.NumDeletedbuckets",
"NumericGreaterThan": 0,
"Next": "DeletedBuckets"
}],
"Default": "DefaultState"
},

Depending on NumDeletedbuckets (the Lambda cleaner function sends it in its response message), the proper next step is chosen. Either we put the names of the deleted buckets into the DynamoDB table (via the invoked Lambda function), or by default the state machine passes through a "Nothing Changed" result just to signal that none of the buckets has been deleted. A sketch of the DynamoDB-writing handler is shown below.
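The handler of that second Lambda isn't listed in the article; a minimal sketch of what it could look like, assuming the wrapper class from dynamodb.py is called Dynamo and the table name is exposed as an environment variable (the variable name is my assumption):

import os

def lambda_handler(event, context):
    # The Task state passes the cleaner's response as the event,
    # so 'DeletedBuckets' carries the list of removed bucket names
    table = Dynamo(os.environ['DynamodbTable'])  # Dynamo is the wrapper class from dynamodb.py
    if event.get('NumDeletedbuckets', 0) > 0:
        table.put_dynamo_items(event['DeletedBuckets'])
    return event

Now, back to the full Step Functions module: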

Parameters:
  LambdaArn:
    Description: LambdaArn
    Type: String
  Environment:
    Type: String
    Description: Type of environment.
    Default: dev
    AllowedValues:
      - dev
      - dev/test
      - prod
      - test
      - poc
      - qa
      - uat
Resources:
  StatesExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Sub 'StatesExecution-${Environment}'
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          -
            Effect: Allow
            Principal:
              Service:
                - !Sub states.${AWS::Region}.amazonaws.com
            Action:
              - sts:AssumeRole
      Path: /
      Policies:
        -
          PolicyName: "InvokeLambdaFunction"
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              -
                Effect: "Allow"
                Action: "lambda:InvokeFunction"
                Resource: "*"
  MyStateMachine:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      DefinitionString: !Sub |
        {
          "Comment": "An example of the Amazon States Language using a choice state.",
          "StartAt": "ChoiceState",
          "States": {
            "ChoiceState": {
              "Type" : "Choice",
              "Choices": [
                {
                  "Variable": "$.NumDeletedbuckets",
                  "NumericGreaterThan": 0,
                  "Next": "DeletedBuckets"
                }],
              "Default": "DefaultState"
            },
            "DeletedBuckets": {
              "Type": "Task",
              "Resource": "${LambdaArn}",
              "End": true
            },
            "DefaultState": {
              "Type": "Pass",
              "Result": "Nothing Changed",
              "End": true
            }
          }
        }
      RoleArn: !GetAtt StatesExecutionRole.Arn

Generally, the flow is as follows: create the Step Functions stack -> pass its output to LambdaStack -> take that output and pass it as a parameter -> use it in the environment variables section of the Lambda function (StepFunctionsArn: !Ref StepFunctionsArn).

To make it all happen we need two more things:

  • DynamoDb
  • CloudWatch Events

The first one I've created via another stack called DynamoDBTableStack. The table was added with minimal WCU/RCU (Write/Read Capacity Units) and tags for easier management. The module used in this stack is attached in the repo (too long to present here).

  DynamoDBTableStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      Parameters:
        TableName: !Ref TableName
        FirstAttributeName: !Ref FirstAttributeName
        FirstAttributeType: !Ref FirstAttributeType
        FirstSchemaAttributeName: !Ref FirstSchemaAttributeName
        ProvisionedThroughputRead: !Ref ProvisionedThroughputRead
        ProvisionedThroughputWrite: !Ref ProvisionedThroughputWrite
        Customer: !Ref Customer
        ContactPerson: !Ref ContactPerson
        Environment: !Ref Environment
        Project: !Ref Project
        Application: !Ref Application
        Jira: !Ref Jira
        AWSNight: !Ref AWSNight
      TemplateURL: !If [IfVirginia, !Sub 'https://s3.amazonaws.com/${ResourcesBucket}/modules/aws/dynamoDB.yml', !Sub 'https://s3-${AWS::Region}.amazonaws.com/${ResourcesBucket}/modules/aws/dynamoDB.yml']
      TimeoutInMinutes: 5
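The module itself isn't reproduced here, but to give a rough idea of what it provisions, this is a boto3 sketch of an equivalent table with a single hash key and minimal throughput (names and tag values are illustrative, not taken from the repo):

import boto3

def create_buckets_table(table_name='deleted-buckets'):
    # Rough equivalent of the DynamoDB module: one hash key, minimal RCU/WCU, a tag
    dynamodb = boto3.client('dynamodb')
    return dynamodb.create_table(
        TableName=table_name,
        AttributeDefinitions=[{'AttributeName': 'bucketname', 'AttributeType': 'S'}],
        KeySchema=[{'AttributeName': 'bucketname', 'KeyType': 'HASH'}],
        ProvisionedThroughput={'ReadCapacityUnits': 1, 'WriteCapacityUnits': 1},
        Tags=[{'Key': 'Environment', 'Value': 'dev'}]
    )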

In terms of events, I haven't migrated the code into a module, simply because at that time I had started analyzing the Serverless Framework and decided that the next cases would be done with the framework instead of CloudFormation (pure laziness :)). We'll get back to it in the next article, presenting another customer's problem whose implementation pain I decided to ease with the Serverless Framework. Coming back to the code below, I've used Events::Rule, on which a cron expression (!Sub 'cron(${EventsCronExpression})') has been set. Additionally, permission for the Lambda invocation has been added in the Lambda::Permission section.

  EventsCronExpression:
    Description: Cron expression
    Type: String
    Default: '0 23 ? * FRI *'

  ScheduledRule:
    Type: AWS::Events::Rule
    Properties:
      Name: !Ref EventName
      Description: ScheduledRule
      ScheduleExpression: !Sub 'cron(${EventsCronExpression})'
      State: "ENABLED"
      Targets:
        -
          Arn: !GetAtt LambdaStack.Outputs.S3CleanerArn
          Id: AsgManageFunction
    DependsOn:
      - LambdaStack
  PermissionForEventsToInvokeLambda:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !GetAtt LambdaStack.Outputs.S3CleanerName
      Action: lambda:InvokeFunction
      Principal: events.amazonaws.com
      SourceArn: !GetAtt ScheduledRule.Arn
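If you'd rather wire the schedule up ad hoc instead of through a stack, the boto3 equivalent looks roughly like this (the rule and statement names are illustrative); the default expression above fires every Friday at 23:00 UTC:

import boto3

def schedule_cleaner(function_arn, function_name, rule_name='s3-cleaner-weekly'):
    # Create the scheduled rule, point it at the Lambda and allow CloudWatch Events to invoke it
    events = boto3.client('events')
    awslambda = boto3.client('lambda')
    rule_arn = events.put_rule(
        Name=rule_name,
        ScheduleExpression='cron(0 23 ? * FRI *)',
        State='ENABLED'
    )['RuleArn']
    events.put_targets(Rule=rule_name,
                       Targets=[{'Id': 'S3CleanerFunction', 'Arn': function_arn}])
    awslambda.add_permission(
        FunctionName=function_name,
        StatementId='AllowCloudWatchEventsInvoke',
        Action='lambda:InvokeFunction',
        Principal='events.amazonaws.com',
        SourceArn=rule_arn
    )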

More or less, we've analyzed the CloudFormation part. Now it's high time to dive into the Lambda code. Honestly, I'm not a programmer, but my experience is that many more tasks become easier and faster to do through code. Of course, it depends on your coding skills, but believe me or not, if you do it once (provided it works) and spend some hours/days on preparation, you're going to be more satisfied because you won't ever do the same thing over and over again. It will be automated. I am aware of people who are against automation because it's a "waste of time" – "we can do the same via the GUI" – but those lazy boys forget about some key points:

  • code versioning – it simplifies teamwork and gives a general, declarative view of the infrastructure and deployed services
  • the code is easy to copy and reuse in different environments (dev, qa, poc)
  • human error avoidance
  • it simplifies processes and, contrary to humans, it never sleeps
  • tasks can be launched simultaneously
  • it may not be visible at first, but it speeds up your work, so you've got more time for other tasks
  • tasks can be launched periodically

Those are only some key aspects, but the more you automate the more benefits you encounter.

Python Lambda Soldiers – Heavy Duty Tanks

I'll start with s3_cleaner.py, which is my heavy-duty tank against S3 bucket enemies. I've built a class called gears3 which, apart from deleting empty buckets, handles some other tasks and might be inherited by other tools.

In the method called delete_empty_buckets we first check whether there's anything to delete. If not, only a NOCHANGE message is sent. Otherwise the buckets marked for deletion are counted and removed, and a message containing the number of deleted buckets and their names is sent.

    def delete_empty_buckets(self):
        count = 0
        deleted = []
        try:
            todelete = self.return_empty_s3()
            length = len(todelete)
            if length == 0:
                responseData = {
                    'NumDeletedbuckets': count,
                }
                sendResponse(self.event, self.context, 'NOCHANGE', responseData)
                return responseData
            else:
                for bucket in todelete:
                    self.resource.Bucket(bucket).delete()
                    count += 1
                    deleted.append(bucket)
                if count != length:
                    raise Exception('Failed to delete all ' + str(length) + ' buckets')
                else:
                    responseData = {
                        'NumDeletedbuckets': count,
                        'DeletedBuckets': deleted
                    }
                    sendResponse(self.event, self.context, 'SUCCESS', responseData)
                    return responseData
        except ClientError as e:
            print(e.response['Error']['Code']+':', e.response['Error']['Message'])
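The method above relies on return_empty_s3(), which isn't listed in the article. A minimal sketch of what such a method could look like, assuming self.resource is a boto3 S3 resource:

    def return_empty_s3(self):
        # Return the names of all buckets in the account that contain no objects (sketch)
        empty = []
        for bucket in self.resource.buckets.all():
            # limit(1) avoids listing the whole bucket just to check whether it's empty
            if not list(bucket.objects.limit(1)):
                empty.append(bucket.name)
        return empty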

sendResponse is a helper function used for building a human-readable message:

def sendResponse(event, context, responseStatus, responseData):
    responseBody = {
        'Status': responseStatus,
        'Reason': 'See the details in CloudWatch Log Stream: {}'.format(context.log_stream_name),
        'PhysicalResourceId': context.log_stream_name,
        'Data': responseData}
    print('Response Body:', json.dumps(responseBody))

As you remember from the description of the whole solution, this Lambda sends a response to the Step Functions service, which then does its job. Below you can see my "9mm" gun for calling the Step Functions service. The arn value is taken from an environment variable defined in the following way:

stepfunctions_arn = os.environ['StepFunctionsArn']

    def call_step_functions(self, arn):
        try:
            client = boto3.client('stepfunctions')
            response = client.start_execution(
                stateMachineArn=arn,
                name=generate_random_string(),
                input=json.dumps(self.delete_empty_buckets())
            )
        except ClientError as e:
            print(e.response['Error']['Code']+':', e.response['Error']['Message'])

The final part is simply defining the Lambda handler, in which the object is created and the call_step_functions method is invoked.

def lambda_handler(event, context):
    bucket = gears3('s3', event, context)
    stepfunctions_arn = os.environ['StepFunctionsArn']
    bucket.call_step_functions(arn=stepfunctions_arn)
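If you want to smoke-test the handler locally before the CloudWatch event is in place (assuming AWS credentials and the StepFunctionsArn variable are exported in your shell), a tiny fake context is enough, since sendResponse only needs log_stream_name:

import collections

if __name__ == '__main__':
    FakeContext = collections.namedtuple('FakeContext', ['log_stream_name'])
    lambda_handler({}, FakeContext(log_stream_name='local-test'))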

Python Lambda Soldiers – Navy Seals

The second Lambda uses a small class called Dynamo which, apart from the put_dynamo_items method listed below (it simply puts items into a particular DynamoDB table), has some additional methods I often reach for when I need to work with the DynamoDB service. Check them, use them, change them.

    def put_dynamo_items(self, bucketlist, owner='chaosgears'):
        try:
            for bucket in bucketlist:
                response = self.table.put_item(
                    Item={
                        'bucketname': bucket,
                        'date': datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
                        'hash': generate_random_string(),
                        'owner': owner
                    })
            return response
        except ClientError as err:
            print(err.response['Error']['Code']+':', err.response['Error']['Message'])
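To give an idea of the kind of extra helper I mean, a simple lookup by bucket name could look like this (a sketch, not necessarily one of the methods from the repo):

    def get_dynamo_item(self, bucketname):
        # Fetch a single record by its hash key
        response = self.table.get_item(Key={'bucketname': bucketname})
        return response.get('Item')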

When the Dust Settles

This short story shows how a combination of serverless AWS services might ease the pain of repeatable tasks. Honestly, it's a trivial case, and the benefits are quite obvious:

  • We stopped babysitting empty buckets, which after different kinds of tests grow into a huge swarm of maggots that can slip your memory. That small step lets us, at the end of the day, spend time on new challenges and automation tasks rather than coming back to clean up unnecessary buckets
  • We've gained some knowledge about the integration between Lambda and Step Functions, picked up new skills with the boto3 library and, last but not least,
  • We deployed the whole solution via CloudFormation, which showed us that the Serverless Framework saves much more time and makes it easier to put everything together in one place.

As for the code deployed via the Serverless Framework, you'll have to wait for the next episode of this series… which is coming soon.