Solutions:
- Specify custom block device mapping during instance launch.
- Rebuild Ubuntu AMI (with automation).
TL;DR
I don't like being woken up at 3 AM to restart a server, especially when the cause is the cloud provider's hardware misbehaving underneath my instances. Back in 2015, AWS made it possible to automatically recover EC2 instances from issues like:
- Loss of network connectivity
- Loss of system power
- Software issues on the physical host
- Hardware issues on the physical host that impact network reachability
With some not unreasonable limitations of course:
- Use a C3, C4, M3, M4, R3, R4, T2, or X1 instance type
- Run in a VPC (not EC2-Classic)
- Use shared tenancy (not dedicated hardware)
- Use EBS volumes (not ephemeral instance store volumes)
So if you're using official Ubuntu AMIs, you have a problem...
Let's examine the latest AMI of Ubuntu Trusty server from Canonical (HVM, EBS backed, x86_64). Run the following command (feel free to replace the --region value with one of your preference):

    aws ec2 describe-images \
      --region ap-southeast-1 \
      --owners 099720109477 \
      --filters \
        Name=root-device-type,Values=ebs \
        Name=architecture,Values=x86_64 \
        Name=name,Values='*ubuntu-trusty-14.04*' \
      --query 'sort_by(Images, &CreationDate)[-1]'

Take note of the "BlockDeviceMappings" array in the output. You should see two ephemeral devices defined:

    {
        "DeviceName": "/dev/sdb",
        "VirtualName": "ephemeral0"
    },
    {
        "DeviceName": "/dev/sdc",
        "VirtualName": "ephemeral1"
    }

Oops... This means that if you're using Canonical AMIs, your instances will not auto-recover.
My team and I have been unfortunate enough to confirm this a couple of times in the past month. AWS support and this forum post suggest overriding the block device mapping when launching the instance. However, this assumes you're launching instances with the AWS CLI or the AWS Console. What if you're launching your instances into an Auto Scaling Group?
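For a one-off launch from the CLI, the override amounts to passing a block device mapping that marks each ephemeral device as absent. The helper below is a minimal sketch: the function name no_device_mappings is my own invention, and the instance type in the commented usage line is only a placeholder. It relies on the EC2 convention that an empty "NoDevice" string suppresses a device defined in the AMI.

```shell
# Hypothetical helper (not part of the AWS CLI): print a JSON value for
# --block-device-mappings that suppresses every device name given as an argument.
no_device_mappings() {
    sep=''
    printf '['
    for dev in "$@"; do
        # An empty NoDevice string tells EC2 to omit the device from the mapping
        printf '%s{"DeviceName":"%s","NoDevice":""}' "$sep" "$dev"
        sep=','
    done
    printf ']\n'
}

# Assumed usage (image ID and instance type are placeholders):
#   aws ec2 run-instances \
#     --image-id ami-XXXXXXXX \
#     --instance-type t2.micro \
#     --block-device-mappings "$(no_device_mappings /dev/sdb /dev/sdc)"
```

This works, but the override has to be repeated on every launch, which is easy to forget.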
The alternative is to bake your own AMI without ephemeral block device mappings, which is what this post is about.
I'm a big fan of HashiCorp tools and have used Packer in the past to build my own VirtualBox/VMware Vagrant boxes, so I decided to revisit it as an AMI baking tool.
The Packer getting started example gave me a good starting point; I just had to make a few small alterations to accommodate my environment and security constraints.
Here's my packer template: https://github.com/sshvetsov/packer-ubuntu-rebake/blob/master/ubuntu-14.04-x86_64.json
Notable deviations from original example:
- Remove "access_key" and "secret_key", these will be read from ~/.aws/credentials or passed via environment with awsudo.
- Replace "source_ami" with "source_ami_filter" to automatically use the latest AMI.
- Add "ami_block_device_mappings" to remove ephemeral device mappings with the help of "no_device" property.
- Add "ami_description" and "tags", cause metadata is good.
- Explicitly set "vpc_id", "subnet_id" and "associate_public_ip_address" via external variables file.
- Use "force_deregister" and "force_delete_snapshot" to replace existing AMIs with the same name.
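To make the list above concrete, here is a rough sketch of what such a template could look like. It is not a copy of the linked template: the function name packer_template_sketch, the instance type, the SSH user, the description and the tag value are all assumptions of mine, and I have not run this exact file through Packer. The region, owner ID, name filter and AMI name are taken from the commands and output shown in this post.

```shell
# Hypothetical helper: print an illustrative amazon-ebs template to stdout.
# Assumed values: instance_type, ssh_username, ami_description, tags.
# The quoted 'EOF' keeps the backticks in {{user `...`}} literal.
packer_template_sketch() {
    cat <<'EOF'
{
  "variables": {
    "vpc_id": "",
    "subnet_id": "",
    "associate_public_ip_address": "false"
  },
  "builders": [{
    "type": "amazon-ebs",
    "region": "ap-southeast-1",
    "instance_type": "t2.micro",
    "ssh_username": "ubuntu",
    "ami_name": "hvm-ssd/ubuntu-trusty-14.04-x86_64-server",
    "ami_description": "Ubuntu Trusty 14.04 without ephemeral device mappings",
    "vpc_id": "{{user `vpc_id`}}",
    "subnet_id": "{{user `subnet_id`}}",
    "associate_public_ip_address": "{{user `associate_public_ip_address`}}",
    "source_ami_filter": {
      "filters": {
        "virtualization-type": "hvm",
        "root-device-type": "ebs",
        "architecture": "x86_64",
        "name": "*ubuntu-trusty-14.04*"
      },
      "owners": ["099720109477"],
      "most_recent": true
    },
    "ami_block_device_mappings": [
      { "device_name": "/dev/sdb", "no_device": true },
      { "device_name": "/dev/sdc", "no_device": true }
    ],
    "force_deregister": true,
    "force_delete_snapshot": true,
    "tags": { "Name": "ubuntu-trusty-14.04-rebake" }
  }]
}
EOF
}
```

Something like `packer_template_sketch > sketch.json && packer validate sketch.json` should be enough to check the syntax before attempting a real build.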
To build the AMI, run:

    packer build ubuntu-14.04-x86_64.json

In my dev environment I run:

    awsudo -u sa@dev -- packer build -var-file=vars-dev.json ubuntu-14.04-x86_64.json

and a couple of minutes later you have a new Ubuntu AMI without those pesky device mappings:
    amazon-ebs output will be in this color.
    ==> amazon-ebs: Force Deregister flag found, skipping prevalidating AMI Name
        amazon-ebs: Found Image ID: ami-19c87b7a
    ==> amazon-ebs: Creating temporary keypair: packer_58cec6dc-dff7-c642-cc8f-3e1fecd15019
    ==> amazon-ebs: Creating temporary security group for this instance...
    ==> amazon-ebs: Authorizing access to port 22 the temporary security group...
    ==> amazon-ebs: Launching a source AWS instance...
        amazon-ebs: Instance ID: i-0587ac0941f65f4f4
    ==> amazon-ebs: Waiting for instance (i-0587ac0941f65f4f4) to become ready...
    ==> amazon-ebs: Adding tags to source instance
    ==> amazon-ebs: Waiting for SSH to become available...
    ==> amazon-ebs: Connected to SSH!
    ==> amazon-ebs: Stopping the source instance...
    ==> amazon-ebs: Waiting for the instance to stop...
    ==> amazon-ebs: Deregistered AMI hvm-ssd/ubuntu-trusty-14.04-x86_64-server, id: ami-2af74a49
    ==> amazon-ebs: Deleted snapshot: snap-08dc6ce8b2e28c50d
    ==> amazon-ebs: Creating the AMI: hvm-ssd/ubuntu-trusty-14.04-x86_64-server
        amazon-ebs: AMI: ami-13e85570
    ==> amazon-ebs: Waiting for AMI to become ready...
    ==> amazon-ebs: Modifying attributes on AMI (ami-13e85570)...
        amazon-ebs: Modifying: description
    ==> amazon-ebs: Modifying attributes on snapshot (snap-01d7b1b146e61d64c)...
    ==> amazon-ebs: Adding tags to AMI (ami-13e85570)...
    ==> amazon-ebs: Tagging snapshot: snap-01d7b1b146e61d64c
    ==> amazon-ebs: Creating AMI tags
    ==> amazon-ebs: Creating snapshot tags
    ==> amazon-ebs: Terminating the source AWS instance...
    ==> amazon-ebs: Cleaning up any extra volumes...
    ==> amazon-ebs: No volumes to clean up, skipping
    ==> amazon-ebs: Deleting temporary security group...
    ==> amazon-ebs: Deleting temporary keypair...
    Build 'amazon-ebs' finished.
    ==> Builds finished. The artifacts of successful builds are:
    --> amazon-ebs: AMIs were created:
    ap-southeast-1: ami-13e85570

Check to confirm with this command (replacing ami-XXXXXXXX with AMI ID returned by packer):

    aws ec2 describe-images --image-ids ami-XXXXXXXX

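If the rebuild runs unattended, the same check can be turned into a pass/fail gate. The function below is a small sketch under a few assumptions: the name assert_no_ephemeral_mappings is hypothetical, the CLI is expected to emit JSON (its default, or with --output json), and the match is a plain text search for "VirtualName" entries starting with "ephemeral" rather than a real JSON parse.

```shell
# Hypothetical gate for a build script: reads describe-images JSON on stdin,
# returns 0 when no instance store mappings are present and 1 otherwise.
assert_no_ephemeral_mappings() {
    if grep -q '"VirtualName": *"ephemeral'; then
        echo "ephemeral device mappings still present" >&2
        return 1
    fi
    echo "no ephemeral device mappings"
}

# Assumed usage (replace the placeholder AMI ID):
#   aws ec2 describe-images --image-ids ami-XXXXXXXX --output json \
#     | assert_no_ephemeral_mappings
```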
References:
- AWS Blog: New – Auto Recovery for Amazon EC2
- AWS Docs: Recover Your Instance
- AWS Docs: Troubleshooting Instance Recovery Failures
- AWS Forums: Can not configure auto-recovery on an eligible instance
- AWS Docs: Updating the Block Device Mapping when Launching an Instance
- AWS Docs: Amazon EC2 Instance Store
- Packer: Build an Image