Encrypting data that is stored somewhere outside of your direct control is a good idea. When that system holds your customers' data, it's a requirement. Unfortunately, when your data lives in a cloud environment like Amazon EC2, your options can be limited, confusing, or both. Where do you store your key (hint: not in the cloud)? Which encryption method should you use? What should you, and can you, encrypt? What happens when your EC2 instance reboots? These are all important questions to think about up front.
This guide tries to cover all of those, and if you are using an Amazon Linux AMI (basically CentOS 5.6) I'll also describe the steps you need to get started. To be specific, I'm using the Basic 64-bit Amazon Linux AMI 2011.09 (AMI Id: ami-1b814f72), but these instructions should work fine with the 32-bit AMI as well.
4 Issues to Think About First
Issue 1: Managing your Encryption Keys
Before you even start to wonder what options are available, you need to think about key management and storage (this might be your private key or your encryption passphrase). For the same reason you don't leave your car keys taped to the door of your car in the parking lot, you never want to store your encryption key in the cloud. There are cloud solutions out there that can manage your keys for you and keep them separate from where you store your data, but if you have serious privacy and confidentiality concerns, it's best to keep those keys locked up nowhere near your data. In my case I keep my passphrase locked up in my head and in an encrypted locker on a system with whole disk encryption that is in my office. I'm not the first or the only one to say it, but it's my rule #5 for Cloud Computing: Encryption: Never store your keys in the cloud. I haven't published my rules for cloud computing publicly yet, but think ZombieLand. Perhaps that might be my next post.
Issue 2: Choosing an Encryption System
This really depends on the architecture of your system. Within Amazon AWS you have a few choices. If you use S3 you can encrypt your data using Amazon's server-side encryption feature (but this violates rule #5, because Amazon holds the keys). If you don't have root access you should consider a userspace system like EncFS. More than likely you do have root access if you are reading this, in which case you will want dm-crypt/LUKS, using cryptsetup to configure and set everything up. If you are wondering which encryption algorithm to use once you have picked a system, I went with AES-256 (often the default), which is more than adequate. I'm in no way going to debate algorithm strengths and weaknesses here or ever; I leave that to real mathematicians and you should too.
Issue 3: What to Encrypt?
I know, you are thinking this is a dumb question with an easy answer: you want to encrypt your data, of course. But is that all you need to encrypt? Don't forget swap space, your tmp files, and your users' home directories as well. You might be thinking, why not just encrypt everything? Unfortunately this is a difficult undertaking with EC2 because you don't have terminal access to your instances when they boot. While it is theoretically possible to place your keys on a temporary, unencrypted EBS volume and use that as a helper to start up your system, it violates rule #5. So do you really need to encrypt everything? If you have an EC2 instance that uses swap (check yours, not all do), consider encrypting at least that plus your data. In my case I didn't use swap and only needed to encrypt my data.
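Checking whether an instance uses swap can be sketched as below. This is a minimal sketch that assumes a Linux system exposing /proc/swaps (one header line, then one line per active swap area). The function name swap_in_use and its optional file argument are made up for illustration.

```shell
#!/bin/sh
# Sketch: report whether any swap area is active.
# /proc/swaps holds one header line followed by one line per active swap area.
# The optional argument (an alternate file to read) is hypothetical and only
# makes the function easy to try out.
swap_in_use() {
    swaps_file="${1:-/proc/swaps}"
    # Any line after the header means at least one swap device or file is active.
    awk 'NR > 1 { found = 1 } END { exit !found }' "$swaps_file"
}

# Example:
#   if swap_in_use; then echo "swap is active, consider encrypting it"; fi
```

If the function succeeds, swap belongs on your list of things to encrypt along with your data.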
Issue 4: Dealing with reboot
Sooner or later you are going to reboot your system (or a system failure will do it for you). When you do, your system will try to automount your encrypted file system, and your instance will hang waiting for you to enter the passphrase on the terminal. That's not good when your system is in EC2 and you don't have terminal access. You could completely unmount your file system (and remove the LUKS mapping) before you reboot, but what if you forget, or your system has an "unscheduled reboot"? Before recent Linux kernel updates you only needed to modify your /etc/crypttab to not automount, but now that the latest kernels use dracut you need to edit your /etc/grub.conf file instead to skip automounting dm-crypt/LUKS filesystems. Once your system has rebooted, you can easily ssh in and mount your encrypted volumes. You could automate all of this and put your keys on your EC2 system, but again that would violate rule #5. A better way is to use a helper system outside the cloud that monitors your EC2 instances and automatically detects when an instance has rebooted and needs post-boot configuration, like mounting your encrypted filesystem. Ultimately the system I'm building will use a local helper to monitor and maintain the parts of my system that I don't want in the cloud, but for now I ssh into my systems and take care of things by hand.
Side note: What to do if you forgot to unmount and now your instance is stuck?
You might have found this post because you just rebooted your instance and have now found out that it's stuck and won't start. You could try waiting a while to see if it ever boots, because according to various documentation the password prompt should time out eventually, but when this happened to me my instance remained stuck no matter how long I waited. Now, I could just be impatient, but getting out of this jam is easy. Just detach your encrypted EBS volume from your instance (from another EC2 instance or the AWS web console) and reboot. Then re-attach the volume after the instance is up and you are back in the running.
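The detach-and-reboot recovery can be sketched with the same EC2 API tools used in the steps below (ec2-detach-volume and ec2-reboot-instances). This is only a sketch. The helper names run and recover_stuck_instance and the DRY_RUN switch are made up for illustration, and it assumes you run it from a machine where the EC2 API tools are already configured.

```shell
#!/bin/sh
# Sketch: free an instance that is stuck at the LUKS passphrase prompt.
# Run this from another machine, not from the stuck instance.
# DRY_RUN=1 (a hypothetical switch) prints the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

recover_stuck_instance() {
    vol="$1"
    inst="$2"
    dev="${3:-/dev/sdf}"
    run ec2-detach-volume "$vol" || return 1
    run ec2-reboot-instances "$inst" || return 1
    # Re-attach only once the instance is reachable again, otherwise the
    # boot could hang on the passphrase prompt a second time.
    echo "When $inst answers over ssh again, re-attach with:"
    echo "  ec2-attach-volume $vol -i $inst -d $dev"
}

# Example:
#   DRY_RUN=1; recover_stuck_instance vol-XXXXXXXX i-XXXXXXXX
```

Trying it once with DRY_RUN=1 is a cheap way to check the volume and instance IDs before anything is actually detached.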
Setting up EC2 Encrypted File Storage
Enough already with the preamble, let's walk through the process of setting this all up.
Step 1: Create an EBS volume
You can do this from the AWS web interface or from the command line. For this example I'm going to do everything from the command line of the EC2 instance I want to configure.
ec2-create-volume --size 10 -z us-east-1c
Size is in gigabytes. The zone you pick must be the same as the zone your EC2 instance is located in, otherwise you won't be able to attach the volume to your instance. If you don't know what zone you are in, use the EC2 metadata query tool. You could also just check the AWS web console, but that's cheating and violates my Rule #1 of Cloud Computing: To Know the Cloud is to Know its APIs. When you have the time, get to know the EC2 metadata API. It's extremely handy for a lot of things, like getting the run time of your instance, but I digress...
Sample output will be something like this (save that vol-XXXXXXXX you will need it later):
VOLUME vol-XXXXXXXX 1 us-east-1c creating 2011-11-26T20:08:15+0000
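If you would rather query the metadata API directly than use the metadata query tool, a minimal sketch with curl looks like the following. The helper names metadata_url and ec2_metadata are made up for illustration. The sketch assumes curl is installed and that the instance allows plain unauthenticated metadata requests; instances configured to require session tokens need an extra step that is not shown here.

```shell
#!/bin/sh
# Sketch: query the EC2 instance metadata service from inside an instance.
METADATA_BASE="http://169.254.169.254/latest/meta-data"

# Build the URL for a metadata path such as "instance-id".
metadata_url() {
    echo "$METADATA_BASE/$1"
}

# Fetch one metadata value.
# -s silences progress output, -f turns HTTP errors into a non-zero exit.
ec2_metadata() {
    curl -sf --max-time 5 "$(metadata_url "$1")"
}

# Example (only works on an EC2 instance):
#   zone=$(ec2_metadata placement/availability-zone)
#   instance_id=$(ec2_metadata instance-id)
#   ec2-create-volume --size 10 -z "$zone"
```

The same lookup gives you the instance ID you need when attaching the volume.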
Step 2: Attach it
Using the vol-XXXXXXXX that you received earlier run the following.
ec2-attach-volume vol-XXXXXXXX -i i-XXXXXXXX -d /dev/sdf
Here i-XXXXXXXX is your instance ID and /dev/sdf is a free device to attach it to. Choose something between sdf and sdp that is not already in use. Note that the Amazon Linux AMI will likely translate /dev/sdf into /dev/xvdf. If you don't know your instance ID, use the EC2 metadata tool again or look it up in the AWS web console (cheater!).
ATTACHMENT vol-XXXXXXXX i-XXXXXXXX /dev/sdf attaching 2011-11-26T20:19:45+0000
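The attach command returns while the volume is still "attaching", so a script that carries on immediately can race the device node. A small polling sketch, with a made-up function name wait_for_device and an arbitrary default of 30 one-second tries:

```shell
#!/bin/sh
# Sketch: wait until a device node shows up, checking once per second.
# Usage: wait_for_device /dev/xvdf [tries]
wait_for_device() {
    dev="$1"
    tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        # -e also accepts a symlink that resolves to the real device node.
        [ -e "$dev" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# Example:
#   wait_for_device /dev/xvdf || echo "volume did not show up" >&2
```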
Step 3: Set up the volume for encrypted storage
Now that we have an EBS volume and it's attached, let's get it ready by formatting it for encryption:
cryptsetup -y luksFormat /dev/xvdf
Note that I went ahead and changed from /dev/sdf to /dev/xvdf because that's how your devices get mapped on the Amazon Linux AMI. Just make sure you choose the right device here: once you format your volume, there is no going back. The command will prompt you for confirmation, then for a passphrase, and then quickly prepare the volume (it should take a few seconds).
This will overwrite data on /dev/xvdf irrevocably.
Are you sure? (Type uppercase yes): YES
Enter LUKS passphrase: ***********
Verify passphrase: **********
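Since luksFormat is irreversible, a pre-flight check before running it can save you from formatting the wrong device. The following is a sketch only. The function name confirm_format_target is made up, and it relies on the cryptsetup isLuks action plus a Linux /proc/mounts.

```shell
#!/bin/sh
# Sketch: refuse obviously wrong targets before running luksFormat.
confirm_format_target() {
    dev="$1"
    if [ ! -b "$dev" ]; then
        echo "refusing: $dev is not a block device" >&2
        return 1
    fi
    # A device listed in /proc/mounts is in use by a mounted filesystem.
    if grep -qs "^$dev " /proc/mounts; then
        echo "refusing: $dev is currently mounted" >&2
        return 1
    fi
    # isLuks succeeds when the device already carries a LUKS header.
    if cryptsetup isLuks "$dev" 2>/dev/null; then
        echo "refusing: $dev already has a LUKS header" >&2
        return 1
    fi
    return 0
}

# Example:
#   confirm_format_target /dev/xvdf && cryptsetup -y luksFormat /dev/xvdf
```

It does not replace reading the device name twice, but it catches the cases where the target is mounted or already encrypted.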
Step 4: Verify the results of your handiwork
You can make sure your shiny new volume is indeed set up for encryption by typing the following:
cryptsetup luksDump /dev/xvdf
LUKS header information for /dev/xvdf
Cipher name: aes
Cipher mode: cbc-essiv:sha256
Hash spec: sha1
Payload offset: 4096
MK bits: 256
MK digest: xx 22 e1 53 6a 17 hj xx d8 d7 05 55 b7 ee 57 c0
MK salt: ec d3 2e 0s f6 e0 05 7e 30 rf xx 76 8d 26 fg 00
c3 kl a0 db xx 68 39 d9 a5 30 31 jk 51 dx 00 c0
MK iterations: 32375
Key Slot 0: ENABLED
Salt: xx 30 9d 9b 5e 6b e9 a4 dd g3 fa b6 80 dc 55 ze
9c b0 fg x8 11 9c ec 41 94 hf be cj 40 89 k3 fd
Key material offset: 8
AF stripes: 4000
Key Slot 1: DISABLED
Key Slot 2: DISABLED
Key Slot 3: DISABLED
Key Slot 4: DISABLED
Key Slot 5: DISABLED
Key Slot 6: DISABLED
Key Slot 7: DISABLED
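If you want to check this from a script, the dump is easy to parse. A sketch, assuming the LUKS1-style "Key Slot N: ENABLED" lines shown above (newer header formats print their key slots differently); the function name enabled_key_slots is made up for illustration.

```shell
#!/bin/sh
# Sketch: count the enabled key slots in luksDump output read from stdin.
enabled_key_slots() {
    # grep -c prints the count; "|| true" keeps a count of 0 from
    # looking like a failure to the caller.
    grep -c '^[[:space:]]*Key Slot [0-7]: ENABLED' || true
}

# Example:
#   cryptsetup luksDump /dev/xvdf | enabled_key_slots
```

On a freshly formatted volume like the one above, the count should be 1.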
Step 5: Create a device we can map to
Now that we have an encrypted volume set up, we need to set up a device mapping that we can actually mount. To do so, type the following, replacing my_enc_fs with something easy for you to remember:
cryptsetup luksOpen /dev/xvdf my_enc_fs
You will be prompted for your passphrase, and if everything runs right you will be left with a new device that you can mount or use at /dev/mapper/my_enc_fs (assuming you named your device "my_enc_fs").
Enter passphrase for /dev/xvdf: ***********
Step 6: Format with your filesystem of choice
At this point the volume is just like any other newly attached disk volume and needs to be formatted and initialized for use. To format it as ext4, a fine choice for a filesystem, type the following:
mkfs.ext4 -m 0 /dev/mapper/my_enc_fs
mke2fs 1.41.12 (17-May-2010)
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
65408 inodes, 261632 blocks
0 blocks (0.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=268435456
8 block groups
32768 blocks per group, 32768 fragments per group
8176 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376
Writing inode tables: done
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 23 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
Step 7: Mount it!
Everything's been initialized, configured and formatted, and your filesystem is now ready for use. This is the moment you have been waiting for: go ahead and create a mount point and mount the volume.
mkdir -p /encrypted_vol
mount /dev/mapper/my_enc_fs /encrypted_vol
You can now start copying files to /encrypted_vol, confident that your data is encrypted. In case you are paranoid, you can verify this by running the following:
cryptsetup status my_enc_fs
You should see something like the following:
/dev/mapper/my_enc_fs is active and is in use.
keysize: 256 bits
offset: 4096 sectors
size: 14752 sectors
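For the truly paranoid, the same check can be scripted. This sketch assumes the status wording shown above ("is active") and uses df -P to find which device backs a path. Both function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: succeed only if `cryptsetup status` output (read from stdin)
# reports the mapping as active.
status_reports_active() {
    grep -q ' is active'
}

# Print the device backing the filesystem that contains the given path.
backing_device_of() {
    df -P "$1" | awk 'NR == 2 { print $1 }'
}

# Example:
#   cryptsetup status my_enc_fs | status_reports_active && echo "mapping is active"
#   [ "$(backing_device_of /encrypted_vol)" = "/dev/mapper/my_enc_fs" ] \
#       && echo "/encrypted_vol is backed by the encrypted mapping"
```

The second check matters because an empty, unmounted /encrypted_vol directory will happily accept files onto the unencrypted root volume.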
Step 8: Disable automount
Your next step should be to disable auto-mounting of dm-crypt/LUKS filesystems to prevent your system from hanging on reboot. To do this you need to add rd_NO_LUKS to the kernel lines in your /etc/grub.conf.
Mine looked like this after I finished editing it:
# created by imagebuilder
title Amazon Linux 2011.09 (2.6.35.14-103.47.amzn1.x86_64)
kernel /boot/vmlinuz-2.6.35.14-103.47.amzn1.x86_64 root=LABEL=/ console=hvc0 LANG=en_US.UTF-8 rd_NO_LUKS KEYTABLE=us
title Amazon Linux AMI
kernel /boot/vmlinuz-2.6.35.14-97.44.amzn1.x86_64 root=LABEL=/ console=hvc0 rd_NO_LUKS
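If you have more than a couple of instances to fix, the edit can be scripted. A minimal sketch follows, with a made-up function name add_rd_no_luks. It appends rd_NO_LUKS to every kernel line that lacks it and keeps a copy of the previous file as grub.conf.bak. Look over the result before rebooting, since a broken grub.conf is no fun on a machine without a console.

```shell
#!/bin/sh
# Sketch: append rd_NO_LUKS to each "kernel" line that does not have it yet.
# Usage: add_rd_no_luks /etc/grub.conf
add_rd_no_luks() {
    conf="$1"
    cp -p "$conf" "$conf.bak" || return 1
    # Write back through the original path with a redirect so that a
    # symlinked grub.conf keeps pointing at the real file.
    sed -e '/^[[:space:]]*kernel[[:space:]]/{' \
        -e '/rd_NO_LUKS/!s/$/ rd_NO_LUKS/' \
        -e '}' "$conf.bak" > "$conf"
}

# Example:
#   add_rd_no_luks /etc/grub.conf && grep kernel /etc/grub.conf
```

Running it a second time changes nothing, because lines that already carry the option are skipped.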
Step 9: Profit! You're done!
That's it, your system is all set for secure file storage. Just remember to always store your encrypted data in the right place and if you forget, securely delete it after copying into the right place. If you should ever want to unmount and fully detach your encrypted volume, do the following:
umount /encrypted_vol
cryptsetup luksClose my_enc_fs
Your encrypted volume is now ready to be detached from your instance and re-attached to another one if desired. After re-attaching, run the following commands (again assuming the volume is attached as /dev/xvdf; you will be prompted for your passphrase after the first command):
cryptsetup luksOpen /dev/xvdf my_enc_fs
mount /dev/mapper/my_enc_fs /encrypted_vol
You will also need to run these commands after you reboot your instance to get access to your data. If you have application data like MySQL databases located in your encrypted volume, you should create a script that takes care of mounting your volume and making sure MySQL is running once things are set up. Which, by the way, is Rule #17 of the cloud: Automate everything. If it's worth doing once, it's worth automating.
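Such a script might look like the sketch below. It is an illustration, not the script I run: the function names, the DRY_RUN switch, and the service name mysqld are assumptions, and the defaults simply reuse the device, mapping name and mount point from this walkthrough. The passphrase is still typed by hand at the luksOpen prompt, so nothing here violates rule #5.

```shell
#!/bin/sh
# Sketch: post-reboot bring-up for the encrypted volume.
# DRY_RUN=1 (a hypothetical switch) prints the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

bring_up_encrypted_volume() {
    dev="${1:-/dev/xvdf}"
    name="${2:-my_enc_fs}"
    mnt="${3:-/encrypted_vol}"
    # Open the LUKS mapping unless it already exists (prompts for the passphrase).
    if [ ! -e "/dev/mapper/$name" ]; then
        run cryptsetup luksOpen "$dev" "$name" || return 1
    fi
    # Mount it unless something is already mounted at the mount point.
    if ! grep -qs " $mnt " /proc/mounts; then
        run mkdir -p "$mnt" || return 1
        run mount "/dev/mapper/$name" "$mnt" || return 1
    fi
    # Start the database only after its data directory is available.
    # "mysqld" is an assumed service name; adjust it for your install.
    run service mysqld start
}

# Example (as root, after a reboot):
#   bring_up_encrypted_volume /dev/xvdf my_enc_fs /encrypted_vol
```

Because each step is skipped when it has already been done, the function is safe to run again if it fails halfway, for example after a mistyped passphrase.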