Time is more valuable than money, but cake trumps both. With automated (pre)baking of your AMIs you can have your cake — and eat it too.
I don’t remember exactly where, but I once read that one way to reduce lead time is to eliminate ‘waste’, and waiting time is waste. Humans are a good illustration of the frustration that builds when they’re forced to wait for something; we might as well have a banner written on our foreheads: “don’t come any closer”. Moreover, while working with customers we’ve noticed that the most common deployment strategy was to provision new nodes from top to bottom as they were being launched. As you’re probably aware, running system updates, downloading packages and setting up configuration can take quite a long time. Everything looks fine until something goes wrong while provisioning package updates or a new release of your app, which in rolling-update scenarios (canary or blue/green are a different story) can prolong the whole process dramatically. In the Chaos Team we hire believers in automation and the ‘everything fails’ principle, so we pay close attention to simplicity and remove ourselves from every repeatable task.
With all of this in mind, we considered a different approach for one of our latest customers.
Here are some thoughts…
According to the 12-Factor App methodology, immutable components are replaced for every deployment rather than updated in place. Following this method, you treat your instances as deployable, ephemeral artifacts that are standalone and run unchanged across environments.
NOTE: Do not tag images. In accordance with the 10th rule of the 12-Factor App, ‘Dev/prod parity’, keep images consistent across your dev, qa, staging, prod or any other cloud environment. Believe me, you’re going to save a lot of time.
We all agreed that the application instances should have no dependency on external services such as package repositories or configuration services. In other words, the AMI must be self-contained and hermetic.
Honestly, there is no single “best” answer to the AMI baking question, but you can check which approach works best for you:
We’ve chosen method number one. The reason was trivial: do it once and do it right. We were obliged to bring the time of each release down to a minimum (we haven’t reached the final goal yet; getting under 5 minutes will require containers). Our AMI has to go through a candidate test and release process every two weeks. We’ve divided the whole mechanism into several steps:
After deciding how to ease our pain, we came across the next obstacle: which tool was the simplest to maintain (remember, I wrote that we don’t necessarily want to participate in every process; it’s called the ‘rule of autonomy’) and gave us the ability to automate AMI baking.
Generally, we decided to put three “bakers” under investigation:
Written in Python and designed by the Netflix team after a long period of using the Bakery, the predecessor of the current Aminator. The Bakery had some drawbacks: it was customized for a CentOS base AMI and did not allow experimenting with other Linux distributions out of the box. This is why the Netflix team rewrote the Bakery into Aminator. It supports Debian, RedHat and other Linux distributions and, more importantly, because it is structured around a plugin architecture it can be extended to other cloud providers, operating systems or packaging formats.
So, it works like this:
Allows you to use a framework such as Chef, Ansible or Puppet to install and configure the software within your Packer-made images. An important aspect of Packer is that it creates identical images for multiple platforms, so you can run production in AWS and staging/QA even in a private cloud. After a machine image is built, it can be quickly launched and smoke tested to verify that things are working.
AWS Systems Manager is quite a different tool, combining several features, like built-in safety controls that allow you to roll out changes incrementally and automatically halt the roll-out when errors occur. Additionally, it presents a detailed view of system configurations, operating system patch levels, software installations and application configurations. In terms of security, AWS SSM maintains security and compliance by scanning your instances against your patch, configuration and custom policies. Last but not least, it supports hybrid environments containing Windows and Linux instances. Of course, we’re talking about AMI baking, but having a few extra features in one place, especially around patching and maintenance, sounds at least encouraging to me. We have experience with Packer and Aminator as well but… the winner was AWS Systems Manager.
The key benefits that helped us select the most suitable solution:
In general, SSM uses an agent installed on each instance you want to maintain or monitor, plus an IAM role required for managing EC2 instances; the AWS managed policy attached to that role is called ‘AmazonEC2RoleforSSM’. Apart from that, we also leveraged the Maintenance Windows feature, which is really cool because it allows you to specify a recurring time window during which Run Command and other tasks are executed. SSM wasn’t the only AWS service we used to bring this baking to the end, so we added the following helpers:
First one: looking for the newest available Foundation AMI (an AMI provided by AWS) based on the proper input parameters,
Second one: deleting old images in accordance with the retention policy (CleanUp-Ami-Images); a rough sketch of both helpers follows below.
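For illustration, here is a minimal boto3 sketch of what those two helpers can look like. The function names (find_newest_foundation_ami, cleanup_old_amis), the AMI name patterns and the retention count are assumptions for this example, not our production values.

```python
import boto3

ec2 = boto3.client("ec2")

def find_newest_foundation_ami(name_pattern="amzn2-ami-hvm-*-x86_64-gp2"):
    """Return the ID of the newest Amazon-provided AMI matching the pattern."""
    response = ec2.describe_images(
        Owners=["amazon"],
        Filters=[
            {"Name": "name", "Values": [name_pattern]},
            {"Name": "state", "Values": ["available"]},
        ],
    )
    images = sorted(response["Images"], key=lambda i: i["CreationDate"], reverse=True)
    return images[0]["ImageId"] if images else None

def cleanup_old_amis(name_pattern="baked-base-ami-*", keep_last=3):
    """Deregister our own baked AMIs beyond the retention count and delete their snapshots."""
    response = ec2.describe_images(
        Owners=["self"],
        Filters=[{"Name": "name", "Values": [name_pattern]}],
    )
    images = sorted(response["Images"], key=lambda i: i["CreationDate"], reverse=True)
    for image in images[keep_last:]:
        ec2.deregister_image(ImageId=image["ImageId"])
        for mapping in image.get("BlockDeviceMappings", []):
            snapshot_id = mapping.get("Ebs", {}).get("SnapshotId")
            if snapshot_id:
                ec2.delete_snapshot(SnapshotId=snapshot_id)

if __name__ == "__main__":
    print(find_newest_foundation_ami())
    cleanup_old_amis()
```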
Then, to make the magic real, we decided to implement the whole process as a continuous one, and SSM lent a helping hand with its Maintenance Windows feature, which according to the documentation “lets you define a schedule for when to perform potentially disruptive actions on your instances such as patching an operating system, updating drivers, installing software or patches”. Below you can see a screenshot of our maintenance window and, under the URL, a sample document which we used for Amazon Linux AMI baking. We use quite a similar one for Ubuntu.
As you’ve probably noticed in the picture above, we set up 11 steps to prepare a single base AMI image. Let me briefly describe what each step does:
With these 11 easy steps we get an up-to-date image with all the required packages, updates and fixes for security vulnerabilities.
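To give you an idea of the general shape of such an Automation document (without pretending this is our 11-step production version), here is a heavily simplified sketch registered via boto3. The document name BakeBaseAmi-Sketch, the parameters SourceAmiId/TargetAmiName and the instance profile name are placeholders.

```python
import json
import boto3

document = {
    "schemaVersion": "0.3",
    "description": "Bake a patched base AMI from a Foundation AMI (simplified sketch).",
    "parameters": {
        "SourceAmiId": {"type": "String", "description": "Foundation AMI to start from."},
        "TargetAmiName": {"type": "String", "default": "baked-base-ami"},
    },
    "mainSteps": [
        {   # 1. Launch a temporary build instance from the Foundation AMI.
            "name": "LaunchInstance",
            "action": "aws:runInstances",
            "inputs": {
                "ImageId": "{{ SourceAmiId }}",
                "InstanceType": "t3.micro",
                "MinInstanceCount": 1,
                "MaxInstanceCount": 1,
                # Placeholder profile; it must grant the instance SSM access.
                "IamInstanceProfileName": "ssm-managed-instance-profile",
            },
        },
        {   # 2. Apply OS updates (the real document adds package installs, hardening, etc.).
            "name": "UpdateOs",
            "action": "aws:runCommand",
            "inputs": {
                "DocumentName": "AWS-RunShellScript",
                "InstanceIds": ["{{ LaunchInstance.InstanceIds }}"],
                "Parameters": {"commands": ["yum update -y"]},
            },
        },
        {   # 3. Snapshot the patched instance into a new AMI
            #    (in practice the name gets a timestamp to stay unique).
            "name": "CreateImage",
            "action": "aws:createImage",
            "inputs": {
                "InstanceId": "{{ LaunchInstance.InstanceIds }}",
                "ImageName": "{{ TargetAmiName }}",
                "NoReboot": False,
            },
        },
        {   # 4. Terminate the temporary build instance.
            "name": "TerminateInstance",
            "action": "aws:terminateInstances",
            "inputs": {"InstanceIds": ["{{ LaunchInstance.InstanceIds }}"]},
        },
    ],
}

ssm = boto3.client("ssm")
ssm.create_document(
    Content=json.dumps(document),
    Name="BakeBaseAmi-Sketch",
    DocumentType="Automation",
    DocumentFormat="JSON",
)
```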
You can provision all packages manually, via Ansible/Chef, or with anything else, even AWS SSM; it generally doesn’t matter. We’ve chosen to put the repeatable tasks in one place and invoke the pipeline at a pre-set interval.
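To close the loop on the “pre-set interval” part, here is a hedged sketch of how such a schedule can be wired up with SSM Maintenance Windows in boto3. The window name, cron expression and Automation document name are placeholders, not our real configuration.

```python
import boto3

ssm = boto3.client("ssm")

# Recurring window: every Sunday at 04:00 UTC, 3 hours long,
# with no new tasks started during the last hour of the window.
window = ssm.create_maintenance_window(
    Name="ami-baking-window",
    Schedule="cron(0 4 ? * SUN *)",
    Duration=3,
    Cutoff=1,
    AllowUnassociatedTargets=False,
)

# Register the baking Automation document as the window task.
# Targets/MaxConcurrency/MaxErrors are omitted because the Automation
# document launches and terminates its own build instance.
ssm.register_task_with_maintenance_window(
    WindowId=window["WindowId"],
    TaskArn="BakeBaseAmi-Sketch",   # name of the Automation document (placeholder)
    TaskType="AUTOMATION",
    Priority=1,
    TaskInvocationParameters={
        # Real parameters (e.g. the Foundation AMI ID) would be passed here.
        "Automation": {"DocumentVersion": "$DEFAULT", "Parameters": {}}
    },
)
```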
At the end of the day you’ll find yourself at a point where you should ask yourself: “Do I want to be a cog in the machine, or automate and free up time for other cool things?” It’s up to you, but whichever way you choose, remember: “do not reinvent the wheel, just adjust it to your requirements”. Our team has gained confidence and reduced the likelihood of mistakes during manual package provisioning (previously done via Ansible), which can happen all too easily when you’re in a hurry doing many things at once. And last but not least, if time is money then we managed to save both: cutting the time wasted on repeatable tasks lets us accelerate our activities in other areas, and therefore earn money. Keep in mind that in the contemporary world, time is the most valuable currency.
It’s high time to Tame your Chaos!
We'd love to answer your questions and help you thrive in the cloud.