06.05, Katowice AWS Summit Poland

Creating an AMI bakery to make time for actual baking

Time is more valuable than money, but cake trumps both. With automated (pre)baking of your AMIs you can have your cake — and eat it too.



I don’t remember exactly where, but I once read that a way to reduce lead time is to eliminate ‘waste’, and waiting time is waste. Humans are a good example: force us to wait for something and a wave of frustration follows. We should have banners on our foreheads that say: “Don’t come any closer”.

Moreover, while working with customers we’ve noticed that the most common deployment strategy was to provision new nodes from top to bottom as they were being launched. As you’re probably aware, running system updates, downloading packages and setting up configurations can take quite a long time. Everything looks fine until something goes wrong while provisioning package updates or a new release of your app, which, in rolling-update scenarios (canary or blue/green are a different story), can prolong the whole process dramatically.

At Chaos Gears, we hire believers in automation and the ‘everything fails’ principle; we pay close attention to simplicity and avoid getting bogged down in repetitive tasks.

With all of this in mind, we considered a different approach for one of our latest customers.

Here are some thoughts…

Immutable = do it once and do it right

According to the 12-Factor App methodology, immutable components are replaced for every deployment rather than being updated in place. Following this method, you treat your instances as deployable, ephemeral artifacts. They should be standalone and runnable unchanged across environments.

Note:

Do not tag images per environment. In accordance with factor 10 of the 12-Factor App — dev/prod parity — keep images consistent across all of your environments, cloud or not. This will spare you a lot of headaches.

We all agreed that the application instances should have no dependency on external services such as package repositories or configuration services. Put plainly, the AMI must be discrete and hermetic.

What kind of cake do you want?

Honestly, there is no “best” answer to the question regarding AMI baking, but you can check which way works best for you:

  1. Would you like to bake the software, configuration and your code into the AMI (Netflix-style)?
  2. Would you like to bake only the software and configuration and then download the app code during the instance launch time?
  3. Would you like to use a clean OS AMI and then do everything on boot using Ansible, Chef, Salt or the simple ‘user-data’ feature in AWS?

We’ve chosen method number one. The reason was trivial: do it once and do it right. We were committed to cutting the time of each release to a minimum (we haven’t reached the final goal yet; getting under 5 minutes will require containers). Our AMI goes through a candidate test and release process every two weeks. We’ve divided the whole mechanism into several steps:

  1. Foundation AMI — AMI prepared by AWS.
  2. Base AMI — the one we’re talking about in this article with all necessary OS packages and tools.
  3. Base App AMI — image with all necessary packages required for proper app launch. It can be Tomcat, Apache, Python or anything else.
  4. Finally, mount the volume with app files and update them for each new release.

Packer, Aminator or maybe… something else?

After deciding how to ease our pain, we came across the next obstacle: which tool was the simplest to maintain (after all, we don’t necessarily want to participate in every run; call it the “rule of autonomy”) and which gave us the ability to automate AMI preparation.

We decided to investigate three “bakers”:

Aminator

Written in Python and designed by Netflix after a long period of using Bakery (Aminator’s predecessor). Bakery had some drawbacks: it was customized for a CentOS base AMI and didn’t allow experimenting with other Linux distributions out of the box. This is why the Netflix team rewrote Bakery into Aminator. It supports Debian, Red Hat and many other distributions and, more importantly, thanks to its plugin architecture it can be extended for other cloud providers, operating systems or packaging formats.

It works like this:

  1. Create a volume from the snapshot of the base AMI.
  2. Attach and mount the volume.
  3. Chroot into the mounted volume.
  4. Provision application onto mounted volume using rpm or deb package.
  5. Unmount the volume and create a snapshot.
  6. Register the snapshot as an AMI.
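As a rough sketch, Aminator’s loop maps onto a handful of EC2 API calls. The boto3-style function below is a hypothetical illustration (the function and parameter names are mine, not Aminator’s); the EC2 client is passed in, and the chroot provisioning itself happens on the build host:

```python
def bake_ami(ec2, base_snapshot_id, builder_instance_id, az, ami_name):
    """Hypothetical sketch of the Aminator bake loop (steps 1-6 above)."""
    # 1. Create a volume from the snapshot of the base AMI.
    vol = ec2.create_volume(SnapshotId=base_snapshot_id, AvailabilityZone=az)
    # 2. Attach the volume to the build instance (the OS then mounts it).
    ec2.attach_volume(VolumeId=vol["VolumeId"],
                      InstanceId=builder_instance_id, Device="/dev/sdf")
    # 3./4. On the build host: chroot into the mounted volume and install
    #       the application package, e.g. `chroot /mnt/ami rpm -i app.rpm`.
    # 5. Unmount/detach the volume and create a snapshot of it.
    ec2.detach_volume(VolumeId=vol["VolumeId"])
    snap = ec2.create_snapshot(VolumeId=vol["VolumeId"],
                               Description="baked " + ami_name)
    # 6. Register the snapshot as an AMI.
    image = ec2.register_image(
        Name=ami_name,
        RootDeviceName="/dev/xvda",
        BlockDeviceMappings=[{"DeviceName": "/dev/xvda",
                              "Ebs": {"SnapshotId": snap["SnapshotId"]}}],
    )
    return image["ImageId"]
```

In real code you would create the client with `boto3.client('ec2')` and poll the volume and snapshot states between steps; error handling and waiters are omitted for brevity.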

Packer

Packer allows you to use a framework such as Chef, Ansible or Puppet to install and configure software within your Packer-made images. An important aspect of Packer is that it creates identical images for multiple platforms: you can run production in AWS and staging/QA in a private cloud, if you wish. After a machine image is built, it can be quickly launched and smoke-tested to verify that things are working as expected.

AWS Systems Manager

AWS Systems Manager is quite a different tool, combining several features, like built-in safety controls that let you roll out changes incrementally and automatically halt the roll-out when errors occur. Additionally, it presents a detailed view of system configurations, operating system patch levels, software installations and application configurations. In terms of security, SSM helps maintain compliance by scanning your instances against your patch, configuration and custom policies. Last but not least, it supports hybrid environments containing Windows and Linux instances. Of course, we’re talking about AMI baking here, but having these extra features in one place, especially around patching and maintenance, sounds at least encouraging to me.

The choice

We have experience with both Packer and Aminator, but the winner for us was AWS Systems Manager. What tipped the scales is described below.

So, how can AWS Systems Manager bake this cake?

In general, SSM uses an agent, installed on each instance you want to maintain or monitor, and an IAM role required for the management of EC2 instances; the managed policy behind this role is called AmazonEC2RoleforSSM. Apart from that, we also leveraged the Maintenance Windows feature, which is really cool because it allows you to specify a recurring time window during which Run Command and other tasks are executed. SSM wasn’t the only AWS service we used to set up this baking; as the steps below show, Lambda, EC2 and S3 were also involved.

Then, to make the magic real, we decided to implement the whole process as a continuous one, and SSM lent a helping hand with its Maintenance Windows feature, which according to the documentation “lets you define a schedule for when to perform potentially disruptive actions on your instances such as patching an operating system, updating drivers, installing software or patches”. Below you can see screenshots of our maintenance window and, under the URL, a sample document that has been used for Amazon Linux AMI baking. We’re using quite a similar one for Ubuntu.
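For illustration, a recurring window like ours could be registered with boto3’s `create_maintenance_window` call. The name and timings below are hypothetical, and the SSM client is injected so the sketch stays testable:

```python
def create_bake_window(ssm, name):
    """Create a recurring maintenance window for AMI baking (illustrative values)."""
    resp = ssm.create_maintenance_window(
        Name=name,
        Schedule="rate(14 days)",        # bake a fresh AMI every two weeks
        Duration=3,                      # keep the window open for 3 hours
        Cutoff=1,                        # stop starting new tasks 1 hour before the end
        AllowUnassociatedTargets=False,  # run only on explicitly registered targets
    )
    return resp["WindowId"]
```

The individual steps are then attached to the window with `register_task_with_maintenance_window`, one task per step.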

Configuration 1/3
Configuration 2/3
Configuration 3/3

11 steps of baking — AWS SSM maintenance window

As you probably noticed in the picture above, we’ve set up 11 steps to prepare a single base AMI image. Let me briefly describe what each step does:

  1. Invoke a Lambda function, which looks for the newest Foundation AMI in the selected region based on parameters such as AMI_Name and Owner,
  2. Launch a temporary EC2 instance from the AMI selected in step 1,
  3. Verify that the SSM agent on the EC2 instance was installed correctly,
  4. Update all mandatory system packages,
  5. Install additional packages like aws-cli or ansible,
  6. Download and run Ansible playbooks (prepared beforehand) from an S3 bucket,
  7. Create an AMI image from the EC2 instance (from step 2),
  8. Add tags to make the image easily identifiable,
  9. Terminate the instance from step 2,
  10. Delete the instance from AWS,
  11. Invoke another Lambda function, which deletes all old images (older than the value given in the parameter).
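Steps 1 and 11 are where the two Lambda functions do their work. Their core logic can be sketched as pure helpers over `describe_images` results (the names are mine; the real functions additionally call the EC2 API to list, and later deregister, the images):

```python
from datetime import datetime, timedelta, timezone

# CreationDate format returned by EC2 DescribeImages, e.g. "2019-05-01T00:00:00.000Z"
ISO = "%Y-%m-%dT%H:%M:%S.%fZ"

def newest_image(images):
    """Step 1: pick the most recent of the images matching the AMI_Name/Owner filters.

    ISO-8601 timestamps sort lexicographically, so string comparison suffices.
    """
    return max(images, key=lambda img: img["CreationDate"])

def expired_images(images, max_age_days, now=None):
    """Step 11: list images older than the retention parameter, ready for deregistration."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [img for img in images
            if datetime.strptime(img["CreationDate"], ISO)
                       .replace(tzinfo=timezone.utc) < cutoff]
```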

With these 11 easy steps we receive an up-to-date image with all the required packages, updates and remedies for security vulnerabilities.

The finale

You can provision all packages manually, via Ansible/Chef or anything else, even AWS SSM; it generally doesn’t matter. We’ve chosen to put the repeatable tasks in one place and invoke the pipeline on a pre-set schedule.

At the end of the day you’ll find yourself at a point where you should ask yourself: “Do I want to be a cog in the machine, or automate the repetitive work and free up time for other cool things?” It’s up to you, but remember: no matter which way you choose, “do not reinvent the wheel, just adjust it to your requirements”. Our team has gained confidence and reduced the chance of mistakes during manual package provisioning (previously done via Ansible). Such mistakes happen easily, especially when you’re in a hurry, doing many things at once.

If time is money, then we managed to save both.
