I recently decided to put some of my most security-critical data in the cloud. I spent time over several weeks researching how best to do this, and got advice from some of my colleagues with the thickest tinfoil hats. The purpose of this post is to document my approach, mostly so that in the future I'll remember exactly what I did. Perhaps it will also be useful to others who care about security but, like me, are not infosec professionals.

Like many others, I had been using an encrypted USB drive which stored my most important secure data. For me, that's mostly my PGP revocation certificates and private master keys; I only keep subkeys on my laptop. If I had a bitcoin wallet, it would be on there, too, along with some other keys and credentials.

This always made me anxious, though. USB keys are really easy to lose and, if someone really wanted to, easy to steal. The drive is well-encrypted, so losing it wouldn't necessarily pose a great risk to the data (ignoring the wrench scenario), but it would be a royal PITA for me to recover from the data loss. And remembering to carry it around anytime I might need it was also somewhat obnoxious. Backing up data and making it easily available from anywhere are both things that cloud storage is really good at, so the solution seems pretty obvious.

Security and convenience are generally at two opposite ends of a sliding scale. Here, though, I honestly don't think I sacrificed much in the way of security for dramatically more convenience.

Background

My scheme really isn't anything special, and isn't all that different from an encrypted USB drive (i.e., an encrypted portable filesystem). The primary difference is that instead of keeping the filesystem on a USB drive, it's just a file that I store in the cloud. Or, put differently, it's an encrypted file that contains a filesystem. When I want to use it, I decrypt and mount it just like any other block device. I'm also using detached LUKS headers to provide something like 2FA.

This is a common approach, and really isn't novel in any way. As I said, this post is mostly for me to document my own research and process.

It's worth noting that, using the same process I describe below, you can easily compartmentalize data into different encrypted files (e.g., your PGP master key in one, your Bitcoin wallet in another, etc.). As long as you aren't re-using passwords, this compartmentalization provides even more security, since compromising one file doesn't expose the others.

LUKS

I'm using the Linux Unified Key Setup (LUKS) for encryption. It's easy to use on Linux systems, and well-trusted. On the downside, it is not supported on non-Linux systems, notably macOS. I don't have a Mac at the moment, though, so that's not currently a problem.

There are a few things worth being aware of when you create a LUKS block device, which I'll lay out, here. LUKS can be confusing to understand at first, especially if you don't have a background in cryptography. The ArchLinux wiki has a great ASCII visual that shows how LUKS works (available and used here under the GNU Free Documentation License 1.3+):

╭┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈╮                         ╭┈┈┈┈┈┈┈┈┈┈┈╮
┊ user passphrase  ┊━━━━━⎛key derivation⎞━━━▶┊    key    ┊
╰┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈╯ ,───⎝   function   ⎠    ╰┈┈┈┈┈┬┈┈┈┈┈╯
╭──────╮            ╱                              │
│ salt │───────────´                               │
╰──────╯                                           │
╭─────────────────────╮                            ▼         ╭┈┈┈┈┈┈┈┈┈┈┈┈╮
│ encrypted master key│━━━━━━━━━━━━━━━━━━━━━━(decryption)━━━▶┊ master key ┊
╰─────────────────────╯                                      ╰┈┈┈┈┈┈┈┈┈┈┈┈╯

When you format something with LUKS, you provide the user passphrase in the above visual. This passphrase is not the key that is used to encrypt the data. A salt is combined with the passphrase, and the result is input into a key derivation function. The output of this key derivation function is then used to protect the master key, which is what is actually used to encrypt the data. For the key derivation function, LUKS defaults to "Password Based Key Derivation Function 2" (PBKDF2), which is part of IETF RFC 2898. I found this Stack Overflow post on password hashing to be tremendously educational in understanding this structure.

LUKS is used through cryptsetup, which supports a number of other schemes, as well.

Passwords

The password you use makes a critical difference in the security of your data. The cryptsetup / LUKS FAQ has a good technical analysis of this. From that link, some back-of-the-envelope math on cost-to-break by password entropy:

Unfortunately, the way we have been taught to think about passwords is wrong. XKCD has a great comic on this topic (used under CC BY-NC 2.5):

I recommend reading the FAQ link on password entropy, and then playing with this Password Strength Test to get a feel for how strong different passwords are.
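To get a rough feel for the numbers, here is a minimal sketch of the usual entropy estimate: a password made of k symbols, each picked uniformly at random from n possibilities, has k * log2(n) bits of entropy. The function name entropy_bits is my own, and the estimate only holds if the symbols really are chosen at random; a password a human made up will have less.

```shell
# Estimate the entropy, in bits, of a randomly generated password.
# $1: number of possible symbols (alphabet or word-list size)
# $2: number of symbols picked uniformly at random
entropy_bits() {
    awk -v n="$1" -v k="$2" 'BEGIN { printf "%.1f\n", k * log(n) / log(2) }'
}

# Four words picked at random from a 2048-word list, as in the XKCD comic.
echo "4 random words from 2048: $(entropy_bits 2048 4) bits"

# Eight characters picked at random from the 94 printable ASCII symbols.
echo "8 random printable chars: $(entropy_bits 94 8) bits"
```

Each additional bit doubles the work of a brute-force search, which is why adding length tends to pay off more than adding odd characters.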

Password Hashing

The PBKDF2 algorithm used by LUKS applies a hash function many times over, which makes every password guess expensive and thwarts brute-force attacks.

If you run sudo cryptsetup --help, it will dump the defaults for your cryptsetup installation. On my system, this shows:

Default PBKDF2 iteration time for LUKS: 2000 (ms)

Default compiled-in device cipher parameters:  
    loop-AES: aes, Key 256 bits
    plain: aes-cbc-essiv:sha256, Key: 256 bits, Password hashing: ripemd160
    LUKS1: aes-xts-plain64, Key: 256 bits, LUKS header hashing: sha256, RNG: /dev/urandom

So, on my system, the LUKS default is a 256-bit key, with the passphrase run through PBKDF2 with sha256 for however many iterations it takes to consume two seconds of wall-clock time. You can change all of these settings with flags to cryptsetup, of course.

To see how different hash functions impact computation time, run sudo cryptsetup benchmark. The output of that command on my system shows:

PBKDF2-sha1       551301 iterations per second for 256-bit key  
PBKDF2-sha256     732245 iterations per second for 256-bit key  
PBKDF2-sha512     587108 iterations per second for 256-bit key  
PBKDF2-ripemd160  324435 iterations per second for 256-bit key  
PBKDF2-whirlpool  231575 iterations per second for 256-bit key  

The thing to keep in mind here is that performance differs from system to system. The iteration count is fixed when you format the device, so if you choose a complex hash algorithm with many iterations that completes in 5 seconds on your current system, but then need to open your LUKS device on some other, slower system, you may find it takes much longer. The reverse is also true, and upgrading your system will result in a faster computation time for the same algorithm.
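As a back-of-the-envelope sketch of that trade-off, the snippet below turns a benchmark rate and an iteration time into an approximate iteration count, then estimates how long the same count would take elsewhere. The function names and the 200000 iterations-per-second "slow machine" are made up for illustration, and cryptsetup's real bookkeeping may not be exactly rate times time, so treat the results as rough.

```shell
# Approximate PBKDF2 iteration count for a given iteration-time target.
# $1: iterations per second (from `cryptsetup benchmark`)
# $2: iteration time in milliseconds
pbkdf2_iterations() {
    echo $(( $1 * $2 / 1000 ))
}

# Seconds needed to run a fixed iteration count on another machine.
# $1: iteration count, $2: that machine's iterations per second
unlock_seconds() {
    awk -v n="$1" -v r="$2" 'BEGIN { printf "%.1f\n", n / r }'
}

# sha256 rate from my benchmark above, with the 2000 ms default.
iters=$(pbkdf2_iterations 732245 2000)
echo "approximate iterations: $iters"

# Hypothetical slower machine managing 200000 iterations per second.
echo "seconds on the slower machine: $(unlock_seconds "$iters" 200000)"
```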

Choosing a Cipher

This is the cipher with which your data is actually encrypted. With LUKS, the storage device never sees unencrypted data. Since the data is encrypted / decrypted on-the-fly, the cipher directly affects disk access speeds. It is entirely possible to outfit your computer with an ultra-fast SSD only to bottleneck its throughput with encryption.

I recommend running sudo cryptsetup benchmark, if you didn't do it earlier, to see the performance benchmarks for your system. On my laptop, I get:

#  Algorithm | Key |  Encryption |  Decryption
     aes-cbc   128b   585.8 MiB/s  2527.1 MiB/s
 serpent-cbc   128b    79.3 MiB/s   502.5 MiB/s
 twofish-cbc   128b   179.8 MiB/s   328.0 MiB/s
     aes-cbc   256b   442.3 MiB/s  1685.8 MiB/s
 serpent-cbc   256b    74.4 MiB/s   476.3 MiB/s
 twofish-cbc   256b   175.7 MiB/s   310.4 MiB/s
     aes-xts   256b  1771.8 MiB/s  2025.3 MiB/s
 serpent-xts   256b   471.8 MiB/s   457.7 MiB/s
 twofish-xts   256b   310.7 MiB/s   281.3 MiB/s
     aes-xts   512b  1561.7 MiB/s  1275.4 MiB/s
 serpent-xts   512b   468.7 MiB/s   455.3 MiB/s
 twofish-xts   512b   312.5 MiB/s   321.8 MiB/s

You'll note that xts mode has significantly better performance, which is expected as it was designed for efficient disk encryption. In terms of key length, 128 bits or more is pretty damned good. Note that when using an xts mode, your key length is actually half of what's listed (so aes-xts with a 256b key actually only uses a 128b key).
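The key-halving rule is easy to forget, so here is a tiny sketch of the arithmetic. XTS splits the key you supply into two equal halves, and only one half keys the block cipher proper. The function name xts_cipher_key_bits is my own.

```shell
# XTS splits the supplied key into two equal halves, so the cipher itself
# runs with half of the key size that is passed to cryptsetup.
# $1: key size in bits as given to cryptsetup
xts_cipher_key_bits() {
    echo $(( $1 / 2 ))
}

for bits in 256 512; do
    echo "aes-xts with a ${bits}b key -> AES-$(xts_cipher_key_bits "$bits")"
done
```

In other words, if you want AES-256 strength in xts mode, you would ask cryptsetup for a 512-bit key.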

Per the earlier dump from cryptsetup --help, the default on my system is aes-xts-plain64 with 256b keys. This looks excellent in terms of throughput and also provides great security, so if I were encrypting a portion of my SSD or a USB drive I would likely roll with that. Since I'm only encrypting a small amount of data contained in a single small file, though, I could go with basically any option and not see a real difference in performance.

Your (P)RNG

Random Number Generators (RNG) play a crucial role in cryptographic algorithms. Since software alone has no way of creating truly random data, these are usually called Pseudo RNGs (PRNG). Linux systems generally provide /dev/random and /dev/urandom for this purpose, and you'll see one of them listed as your RNG in the cryptsetup --help defaults.

Right now, there doesn't seem to be a reason to really worry about which one you are using. If you're interested in the difference between /dev/random and /dev/urandom, this post is helpful.

Detached Headers

At the highest level, a LUKS-encrypted block has two pieces: the header, and the encrypted data. The header is unencrypted, and provides the information necessary to interact with the encrypted data. With cryptsetup, you have the option of storing these two pieces, the header and the data, separately rather than in one 'file'. This is called a detached header. There are two main reasons to use detached headers, in my opinion: backups, and additional security.

First, even if you don't use detached headers, you should back up your LUKS headers. As the LUKS FAQ states, if you don't back up your headers you will eventually lose access to your data. It's just a fact of hardware decay.

Secondly, by storing the header separately, even if an attacker got their hands on the encrypted file, it would be totally useless without the header. Per the FAQ, in fact, simply blowing away the LUKS header is a good way to effectively make your data permanently inaccessible. Detached headers essentially provide a method of 2FA. Since the point of this exercise is for me to feel comfortable putting my encrypted data on the Internet, I like the additional security this provides.

Using detached headers with LUKS is super easy: you simply use the --header flag.
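On the backup point, here is a minimal sketch of both cases. For a LUKS device that keeps its header in place, cryptsetup has a luksHeaderBackup action; for a detached header, the header is already a standalone file, so a verified copy does the job. The function names are my own and any paths you pass in are up to you.

```shell
# Back up the header of a LUKS device whose header is NOT detached.
# $1: LUKS device or container file, $2: backup file to create
backup_luks_header() {
    sudo cryptsetup luksHeaderBackup "$1" --header-backup-file "$2"
}

# A detached header is just a file, so copy it and confirm the copy matches.
# $1: detached header file, $2: backup copy to create
backup_detached_header() {
    cp -- "$1" "$2" && cmp -s -- "$1" "$2"
}
```

Keep in mind that a header backup is as sensitive as the header itself, so it should not live next to the encrypted data.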

Process

So, given that background, here was my process:

Creating the LUKS-Encrypted File

First, allocate the two files: one for the detached header, and one for the encrypted data.

$ fallocate -l 2MiB enc.h
$ fallocate -l 10M encData.bin

Next, we'll create a LUKS block with those files. Here is where you would pass flags like --hash, --cipher, etc., if you wanted to use something other than your cryptsetup defaults.

$ sudo cryptsetup luksFormat -y --header enc.h encData.bin
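If you do want to override the defaults, a sketch along these lines shows where the flags go. The specific values below are illustrative assumptions, not what I used, and the function name format_vault is my own; --cipher, --key-size, --hash and --iter-time are the cryptsetup options involved.

```shell
# Illustrative values only; these are assumptions, not recommendations.
CIPHER="aes-xts-plain64"   # cipher and mode
KEY_BITS=512               # xts splits this into two 256-bit halves
HASH="sha512"              # hash function used by PBKDF2
ITER_MS=5000               # PBKDF2 iteration time, in milliseconds

# Format a data file with a detached header using the settings above.
# $1: detached header file, $2: data file
format_vault() {
    sudo cryptsetup luksFormat -y \
        --cipher "$CIPHER" --key-size "$KEY_BITS" \
        --hash "$HASH" --iter-time "$ITER_MS" \
        --header "$1" "$2"
}
```

Calling format_vault enc.h encData.bin would then take the place of the plain luksFormat command above.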

Now, I recommend inspecting your LUKS header to make sure it reflects what you asked for:

$ sudo cryptsetup luksDump enc.h

Now, we need to open up the LUKS block and create a filesystem on it.

$ sudo cryptsetup luksOpen --header ./enc.h ./encData.bin encVolume
$ sudo mkfs.ext4 /dev/mapper/encVolume

You're done. Now you can mount the device and use it as you would any other disk image in Linux.

Usage

Here's the quick summary of how you use your file day-to-day:

1) Map your LUKS device:

$ sudo cryptsetup luksOpen --header ./enc.h ./encData.bin encVolume

2) Mount the volume:

$ sudo mount /dev/mapper/encVolume <mount point>

3) Do stuff.

4) Unmount the volume:

$ sudo umount <mount point>

5) Unmap the LUKS device:

$ sudo cryptsetup luksClose encVolume
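Since I run the same handful of commands every time, the steps above can be wrapped in two small shell functions. This is only a convenience sketch; the names vault_open and vault_close are my own, and each step runs only if the previous one succeeded.

```shell
# Map the LUKS device and mount its filesystem.
# $1: detached header file, $2: data file, $3: mapper name, $4: mount point
vault_open() {
    sudo cryptsetup luksOpen --header "$1" "$2" "$3" &&
        sudo mount "/dev/mapper/$3" "$4"
}

# Unmount the filesystem and unmap the LUKS device.
# $1: mapper name, $2: mount point
vault_close() {
    sudo umount "$2" &&
        sudo cryptsetup luksClose "$1"
}
```

With a mount point of your choosing, that becomes vault_open ./enc.h ./encData.bin encVolume <mount point> before working, and vault_close encVolume <mount point> afterwards.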

Close

As I said at the top of this post, there isn't anything novel in my scheme or process. The purpose of this article is really for me to document resources I found helpful while researching how best to do this, and to provide a reference for myself in the future.

Special thanks to Johnathan Corgan, Chief Architect of GNU Radio and a core contributor to Bitcoin, for his guidance as I read up on this stuff.