Store with Automation

Lazy Engineers

Do you sometimes feel that technology is developing too fast? Do you have the impression that you are trying to catch up but keep standing in the same place? A place where your time is running away from you? You are, after all, taking care of very important tasks. But maybe you are spending too much time doing the same things over and over, without any improvement? How about delegating the ‘dirty’ work to automation and using that time for something different? That is exactly what Lazy Engineers do. They use ready-to-go solutions and native services to handle as much of the repetitive work as possible. Lazy Engineers know that the number of tasks will only grow, and if there is no chance of getting extra hands to help, automation is the remedy.

We have recently been talking about automation in the context of cleaning up useless S3 buckets, but this time we would like to focus on the possibilities that native AWS services (CodePipeline, CodeBuild, ECR) provide and how they can improve our everyday workflow.

Docker Shop

In today’s article we would like to create a mechanism for building Docker images. The solution is deliberately simple, but it illustrates well how AWS automation services work and what benefits we can gain from them. The whole environment will be created in accordance with the Infrastructure as Code principle using the HashiCorp Terraform tool, whose possibilities are often compared to the AWS CloudFormation service.


Terraform is one of the most popular open source tools to define and manage infrastructure for many providers. Based on the HashiCorp Configuration Language (HCL), it codifies APIs into declarative configuration files that can be shared amongst team members.

The guiding idea behind creating a Docker image is the need for a common, compatible environment for the whole team, with exactly the same versions of packages installed. In this scenario we want an environment prepared for running the Terraform tool. Yes, the very same image is what we used to define our pipeline in Terraform code and deploy it on AWS (inception, yeah!). However, before we start the game, we need to go shopping and pick up our friends straight from the automation store:

AWS CodePipeline – allows you to automate the release process for your application or service
AWS CodeBuild – helps to build and test code with continuous scaling
Amazon ECR – private Docker registry to store your images

Every great thing grows from a small seed. This is presented in the diagram below:

Picture: Docker image building workflow

  1. Dockerfile is stored in the source control repository (GitHub)
  2. AWS CodePipeline gets source code from this repository
  3. AWS CodeBuild runs a build job specified in the buildspec.yml file included in the source code
  4. The build project installs the required packages and pushes the built Docker image to Amazon ECR
  5. AWS CodeBuild stores logs in Amazon CloudWatch

At one’s fingertips

Dockerfile

The form and content of your image really depend on your needs. Remember, however, not to create overly heavy images, and try to build large applications as separate images.

FROM alpine:latest
ENV TERRAFORM_VERSION=0.11.10
ENV TERRAFORM_SHA256SUM=43543a0e56e31b0952ea3623521917e060f2718ab06fe2b2d506cfaa14d54527
# unzip is needed later in this RUN step to extract the Terraform release archive
RUN apk add --no-cache git curl openssh bash unzip && \
    curl https://releases.hashicorp.com/terraform/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_linux_amd64.zip > terraform_${TERRAFORM_VERSION}_linux_amd64.zip && \
    echo "${TERRAFORM_SHA256SUM}  terraform_${TERRAFORM_VERSION}_linux_amd64.zip" > terraform_${TERRAFORM_VERSION}_SHA256SUMS && \
    sha256sum -cs terraform_${TERRAFORM_VERSION}_SHA256SUMS && \
    unzip terraform_${TERRAFORM_VERSION}_linux_amd64.zip -d /bin && \
    rm -f terraform_${TERRAFORM_VERSION}_linux_amd64.zip

Buildspec.yml

When you use a long list of commands, or just want to repeat a step in another stage, it is recommended to use a YAML-formatted buildspec file. You can define multiple build specifications and store them in the same repository, but only one buildspec file can be used per build project.

In buildspec version 0.1 each command runs in isolation from all other commands (an isolated shell in the build environment), which is why it is recommended to use version 0.2, which removes this limitation. With the run-as parameter we can indicate which user runs particular commands, or all of them; root is the default. Environment variables can be defined in the buildspec or passed in from the CodeBuild project. Keep in mind that they are stored in plain text, so avoid putting sensitive values in them.

version: 0.2
phases:
  install:
    commands:
      - echo "install step..."
      - apt-get update -y
  pre_build:
    commands:
      - echo "logging in to AWS ECR..."
      - $(aws ecr get-login --no-include-email --region $AWS_REGION)
  build:
    commands:
      - echo "build Docker image started on `date`"
      - docker build -t $ECR_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG $IMAGE_NAME/.
  post_build:
    commands:
      - echo "build Docker image complete `date`"
      - echo "push Docker image to ECR..."
      - docker push $ECR_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG

Builds in AWS CodeBuild proceed through a few phases:

  • install – for installing packages
  • pre_build – for installing dependencies or signing in to another tool
  • build – for running build or testing tools
  • post_build – for packaging artifacts into a JAR/WAR file, pushing a Docker image to Amazon ECR or sending a build notification through Amazon SNS

Picture: AWS CodeBuild project action phases

The important thing is to be aware of the transition rules between phases, especially that POST_BUILD executes even if BUILD fails. This allows you to recover partial artifacts for debugging build/test failures.

If you are building and pushing a Docker image to Amazon ECR, or running unit tests on your source code without building it, there is no need to define artifacts. Otherwise, AWS CodeBuild can upload your artifact to an Amazon S3 output bucket.

Amazon ECR

We run our private repository in AWS. Just remember, everything is up to you: more work and more control, or no need to think about storage limits? Well, ECR does put some limits in place, such as 1,000 repositories and 1,000 images per repository. On the other hand, you can easily create lifecycle policy rules to remove unnecessary images (a sketch follows the repository definition below) or simply request a limit increase, for instance to 5,000.

To be able to use your own private Docker repository hosted in AWS, all you have to do is create it and attach a policy which defines who can access it. When defining permissions, it is good practice to follow the principle of least privilege and restrict access only to the services which actually use the repository.

resource "aws_ecr_repository" "terraform" {
  name = "tf-images"
}
resource "aws_ecr_repository_policy" "terraform_policy" {
  repository = "${aws_ecr_repository.terraform.name}"
  policy = <<EOF
{
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "new policy",
            "Effect": "Allow",
            "Principal": {
                "Service": "

codebuild.amazonaws.com

"
            },
            "Action": [
                "ecr:BatchCheckLayerAvailability",
                "ecr:BatchGetImage",
                "ecr:CompleteLayerUpload",
                "ecr:GetDownloadUrlForLayer",
                "ecr:InitiateLayerUpload",
                "ecr:PutImage",
                "ecr:UploadLayerPart"
            ]
        }
    ]
}
EOF
}
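
As mentioned above, lifecycle policy rules help keep the registry clean. The snippet below is a minimal sketch, not part of the original setup; the rule name and the threshold of 30 images are arbitrary choices:

resource "aws_ecr_lifecycle_policy" "terraform_expire" {
  repository = "${aws_ecr_repository.terraform.name}"
  policy = <<EOF
{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Keep only the 30 most recent images",
      "selection": {
        "tagStatus": "any",
        "countType": "imageCountMoreThan",
        "countNumber": 30
      },
      "action": {
        "type": "expire"
      }
    }
  ]
}
EOF
}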

AWS CodePipeline

This stage-based tool can build, test and deploy source code every time there is a code change, detect issues occurring at any step and thereby prevent broken changes from being deployed automatically into an environment. CodePipeline is a fully managed AWS service providing Continuous Integration and Delivery features; it wraps other services depending on the action type:

Source action integration

  • Amazon S3
  • AWS CodeCommit
  • GitHub
  • Amazon ECR

Build action integration

  • AWS CodeBuild
  • CloudBees
  • Jenkins
  • Solano CI
  • TeamCity

Test action integration

  • AWS CodeBuild
  • AWS Device Farm
  • BlazeMeter
  • Ghost Inspector
  • HPE StormRunner Load
  • Nouvola
  • Runscope

Deploy action integration

  • AWS CloudFormation
  • AWS CodeDeploy
  • AWS ECS
  • AWS OpsWorks Stacks
  • AWS Service Catalog
  • XebiaLabs

Approval action integration

  • Amazon Simple Notification Service (SNS)

Invoke action integration

  • AWS Lambda

Picture: AWS CodePipeline basic stages for building Docker images

Once you add GitHub as a source repository, you can use dedicated webhooks that start your pipeline when a change occurs in the repository (a Terraform sketch follows the pipeline definition below). Currently, access granted to AWS CodePipeline covers all repositories to which that GitHub account has access. If you want to limit the access AWS CodePipeline has to a specific set of repositories, create a dedicated GitHub account and grant it access only to the repositories you want to integrate with your pipeline.

resource "aws_codepipeline" "tf_image_pipeline" {
  name     = "tf-image-pipeline"
  role_arn = "${aws_iam_role.codepipeline_role.arn}"
  # artifact_store is required by aws_codepipeline; here we reuse the CodeBuild cache
  # bucket defined further below (a dedicated bucket would work just as well), and the
  # pipeline role also needs read/write access to whichever bucket is used
  artifact_store {
    location = "${aws_s3_bucket.codebuild.bucket}"
    type     = "S3"
  }
  stage {
    name = "Source"
    action {
      name             = "GitHub-Source"
      category         = "Source"
      owner            = "ThirdParty"
      provider         = "GitHub"
      version          = "1"
      output_artifacts = ["terraform-image"]
      configuration {
        Owner  = "chaosgears"
        Repo   = "common-docker-images"
        Branch = "master"
        OAuthToken = "${data.aws_ssm_parameter.github_token.value}"
      }
    }
  }
  stage {
    name = "Build"
    action {
      name            = "Build-Image"
      category        = "Build"
      owner           = "AWS"
      provider        = "CodeBuild"
      input_artifacts = ["terraform-image"]
      version         = "1"
      configuration {
        ProjectName = "${

aws_codebuild_project.tf_image_build.id

}"
      }
    }
  }
}
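
A pipeline created this way detects changes by polling GitHub. The webhook mentioned earlier can be wired up with the aws_codepipeline_webhook resource; the sketch below reuses the GitHub token from SSM as the HMAC secret (an arbitrary choice), and the resulting webhook URL still has to be registered on the GitHub repository itself:

resource "aws_codepipeline_webhook" "github" {
  name            = "tf-image-pipeline-webhook"
  authentication  = "GITHUB_HMAC"
  target_action   = "GitHub-Source"
  target_pipeline = "${aws_codepipeline.tf_image_pipeline.name}"

  authentication_configuration {
    secret_token = "${data.aws_ssm_parameter.github_token.value}"
  }

  # Trigger only on pushes to the branch configured in the source action
  filter {
    json_path    = "$.ref"
    match_equals = "refs/heads/{Branch}"
  }
}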

If you have data that you don’t want users to alter or reference in clear text, such as tokens, passwords or license keys, create those parameters using the SecureString data type in Parameter Store, provided by AWS Systems Manager (SSM).

data "aws_ssm_parameter" "github_token" {
  name = "github_token"
}
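
The parameter itself is often created out of band (for example with the AWS CLI) so that the token never lands in Terraform code or state. For completeness, a minimal sketch of managing it with Terraform, typically in a separate configuration, could look like this; the github_token variable is an assumption:

variable "github_token" {}

resource "aws_ssm_parameter" "github_token" {
  name  = "github_token"
  type  = "SecureString"
  value = "${var.github_token}"
}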

To retrieve custom environment variables stored in SSM, you need to add the ssm:GetParameters action to your AWS CodePipeline service role, together with the permissions required to run the CodeBuild project. The role must also be assumable by the CodePipeline service, which asks AWS Security Token Service (STS) for temporary security credentials to make API calls on your behalf.

resource "aws_iam_role" "codepipeline_role" {
  name = "codepipeline-terraform-images"
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "

codepipeline.amazonaws.com

"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}
resource "aws_iam_role_policy" "codepipeline_policy" {
  name = "codepipeline-terraform-policy"
  role = "${

aws_iam_role.codepipeline_role.id

}"
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
        "Action": [
            "codebuild:BatchGetBuilds",
            "codebuild:StartBuild"
        ],
        "Resource": "${aws_codebuild_project.tf_image_build.arn}",
        "Effect": "Allow"
    },
    {
        "Action": [
            "ssm:GetParameters"
        ],
        "Resource": "${data.aws_ssm_parameter.github_token.arn}",
        "Effect": "Allow"
    }
  ]
}
EOF
}

It is worth mentioning that a multi-stage pipeline workflow increases the total time a bit compared to the combined launch time of the individual stages run on their own. You can speed up this process by keeping your CodePipeline container ‘warm’, e.g. by pushing garbage commits to a throwaway branch every 15 seconds or so. Moreover, if a pipeline contains multiple source actions, all of them run again, even if a change is detected for only one source action.

AWS CodeBuild

With CodeBuild you are charged for compute resources based on the time it takes for your build to execute; the per-minute rate depends on the selected compute type. This can turn out to be more expensive if you have long-running builds or a lot of builds per month. But once your build server becomes a single point of failure, builds start taking too long and everyone complains about delays, you may want a hosted solution that rids you of maintaining a build server.

resource "aws_codebuild_project" "tf_image_build" {
  name         = "tf-image-build"
  service_role = "${aws_iam_role.codebuild_role.arn}"
  artifacts {
    type = "CODEPIPELINE"
  }
  # The cache as a list with a map object inside.
  cache = {
    type     = "S3"
    location = "${aws_s3_bucket.codebuild.bucket}"
  }
  environment {
    compute_type    = "BUILD_GENERAL1_SMALL"
    image           = "aws/codebuild/docker:17.09.0"
    type            = "LINUX_CONTAINER"
    privileged_mode = "true"
    environment_variable = [{
      "name"  = "AWS_REGION"
      "value" = "${data.aws_region.current.name}"
      },
      {
        "name"  = "ECR_ACCOUNT_ID"
        "value" = "${data.aws_caller_identity.current.account_id}"
      },
      {
        "name"  = "IMAGE_NAME"
        "value" = "terraform"
      },
      {
        "name"  = "IMAGE_REPO_NAME"
        "value" = "${aws_ecr_repository.terraform.name}"
      },
      {
        "name"  = "IMAGE_TAG"
        "value" = "0.11.10"
      }
    ]
  }
  source {
    type = "CODEPIPELINE"
  }
}
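
In this project the artifacts type is CODEPIPELINE, because the build output is handed back to the pipeline. If a project produced a standalone artifact instead, an S3 artifacts block could be used, roughly as in the sketch below; the artifact name is arbitrary, the cache bucket is reused purely for illustration, and such a project would also need a non-CODEPIPELINE source:

  artifacts {
    type      = "S3"
    location  = "${aws_s3_bucket.codebuild.bucket}"
    name      = "tf-image-build-output"
    packaging = "ZIP"
  }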

Build environment

A build environment includes the operating system, programming language runtime and tools required to run a build. For non-enterprise or test workloads you can simply reach for ready-to-use images provided and managed by AWS and choose an available language, e.g. Android, Java, Python, Ruby, Go, Node.js, or Docker. The Windows platform is supported only as a base image with Windows Server Core 2016.

If not, you should consider a custom environment image from ECR or another external repository. Custom Docker images can be up to 20 GB uncompressed on Linux and 50 GB uncompressed on Windows, regardless of the compute type. Three compute levels are supported, from 3 GB of memory/2 vCPUs/64 GB of disk up to 15 GB/8 vCPUs/128 GB.

For data such as the region, account ID, the authorized user or other values which are already established, it is recommended to extract them dynamically. This can be achieved with Terraform data sources.

data "aws_caller_identity" "current" {}
data "aws_region" "current" {}

Enabling the cache for an AWS CodeBuild project can save a notable amount of build time. It also improves resiliency by avoiding external network connections to an artifact repository. Cached data is stored in an S3 bucket and can include whole directories or only the files that do not change frequently between builds.

resource "aws_s3_bucket" "codebuild" {
  bucket = "codebuild-terraform-images"
  acl    = "private"
  region = "${var.aws_region}"
  versioning {
    enabled = true
  }
  tags {
    Service = "CodepBuild",
    Usecase = "Terraform docker images",
    Owner = "ChaosGears"
  }
}
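
The bucket above references var.aws_region, which is not declared in any of the snippets; a minimal declaration could look like this (the default region is only an example):

variable "aws_region" {
  description = "AWS region where the pipeline resources live"
  default     = "eu-west-1"
}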

AWS CodeBuild requires an IAM service role to access the created ECR repository, the S3 bucket for cached data and a specific log group in CloudWatch Logs, and to make calls to AWS Security Token Service (STS).

resource "aws_iam_role_policy" "codebuild_policy" {
  name = "codebuild-terraform-policy"
  role = "${

aws_iam_role.codebuild_role.id

}"
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect":"Allow",
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:CompleteLayerUpload",
        "ecr:InitiateLayerUpload",
        "ecr:PutImage",
        "ecr:UploadLayerPart"
      ],
      "Resource": "${aws_ecr_repository.terraform.arn}"
    },
    {
      "Effect": "Allow",
      "Resource": [
        "arn:aws:logs:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:log-group:/aws/codebuild/${aws_codebuild_project.tf_image_build.name}",
        "arn:aws:logs:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:log-group:/aws/codebuild/${aws_codebuild_project.tf_image_build.name}:*"
      ],
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ]
    },
    {
      "Effect":"Allow",
      "Action": [
        "ecr:GetAuthorizationToken"
      ],
      "Resource": "*"
    },
    {
      "Effect":"Allow",
      "Action": [
        "s3:*"
      ],
      "Resource": [
        "${aws_s3_bucket.codebuild.arn}",
        "${aws_s3_bucket.codebuild.arn}/*"
      ]
    }
  ]
}
EOF
}
resource "aws_iam_role" "codebuild_role" {
  name               = "codebuild-terraform-images"
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "

codebuild.amazonaws.com

"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

How Can I Help You?

AWS Code* automation services have been available since around 2015, but they often seem to be underestimated. Even in environments running in the AWS cloud, we more often come across the universal Jenkins, which offers a wide variety of plugins but does not reduce the additional maintenance work, or the more limited GitLab CI, which is integrated with the source code repository. Of course, the choice of a target CI/CD tool should be made based on requirements. However, it sometimes happens that the decision results from insufficient knowledge of the capabilities a given tool offers. What we have shown today is just a drop in the sea of what native AWS services can do. In the near future we would like to introduce more details of the AWS Code* native tools served by one of the most popular cloud providers.