Continuing in this series of blog posts taking a bit of a “warts and all” view of a few Amazon AWS features, below are a handful more tips and gotchas when designing and implementing solutions on Amazon AWS. This week, we talk about the latest feature of AWS, EFS (aka Elastic File System).
For the first post in this series with a bit of background on where it all originated from, see here:
Amazon #AWS Tips and Gotchas – Part 1
For more posts in this series, see here:
Index of AWS Tips and Gotchas
20. Amazon AWS Tips and Gotchas – Part 10 – EFS (Elastic File System)
A big challenge when designing highly available web infrastructures is historically how to provide a centralised content store for static content without wasting resources.
A classic model for this is a pair of web / file servers with either rsync or Gluster to replicate the content between them. In Windows world, this would be something like either a WSFC (failover cluster) or perhaps something evil like a DFS replicated share. This means that not only are you wasting money on multiple virtual machines / instances just to serve file content, but you also add significant risk and complexity in the replication and failover between these machines.
Enter, AWS EFS!At a simple level, EFS is basically an NFS (v4.1) share within the AWS cloud, which is replicated across all AZs in any one region. No need for managing and replicating between instances, or indeed paying for EC2 instances just to create file shares! Great!
As this is still a relatively immature product, there are still a few “features” to be aware of:
- There is no native EFS backup solution (yet!). I’m sure this will come very soon. As we have Re:invent coming up, it wouldn’t surprise me if something came out then. In the meantime, your main methods would be either to use Data Pipeline to backup to another EFS store or potentially mount EFS and backup through an EC2 instance using your own tools or scripts. I would be concerned about backing up EFS to EFS (if in the same region), as this is putting all your eggs in one basket. Hopefully, AWS will provide other target options in the future.
- There is no native encryption of EFS data as yet. If you need this right now, you could achieve it by simply pre-encrypting the data in your application first, before it is written to EFS. Alternatively, just hold your breath as AWS have already stated that:
“Amazon EFS does not currently provide the option to encrypt data at rest, but we will offer this option soon”.
- If you have less than about 100GB, then due to the way the performance burst credits work you may not get the performance you need. The more you buy, the more performance you get, so don’t short change your app for the sake of a few dollars!
“Amazon EFS uses a credit system to determine when file systems can burst. Each file system earns credits over time at a baseline rate that is determined by the size of the file system, and uses credits whenever it reads or writes data”.
In early testing, it has been seen that very small filesystems can lead to IO starvation and performance issues. I would recommend you start with 100GB as a minimum (subject to your workload requirements of course!). This is still pretty cheap at only about $30-33 a month; a lot less than even a pair of EC2 instances, never mind the complexity reduction benefits. KISS!
Of course, the more caching you can do on that content, e.g. using CloudFront as a CDN, the lower the IO requirements on your EFS store.
For more info on performance see here:
Amazon EFS Performance
- And finally… being NFS based, this is obviously primarily aimed at Linux solutions. It would be nice to think that AWS will release an SMB version in the future… we can but hope!
Thanks to my learned colleague Tom Ellis for the tip! As he says, “The size needs to be determined by the throughput needs, and not the storage capacity needs. “
Find more posts in this series here:
Index of AWS Tips and Gotchas