Amazon Elastic File System (Amazon EFS) provides a simple, scalable, fully managed elastic NFS file system.
It is a cloud-based file system for Linux-based applications and workloads that can be used in combination with AWS cloud services and on-premises resources.
Scales automatically on demand, up to petabytes.
For websites with dynamic user interactions, Amazon EFS is a good fit alongside Amazon S3, which can hold the static, unchanging data.
Provides massively parallel shared access to thousands of Amazon EC2 instances.
EFS supports multiple AZs within the same region.
Data is stored within and across multiple Availability Zones (AZs) for high availability and durability.
EFS uses the NFSv4 protocol for its file system structure, which mirrors a standard on-premises structure and simplifies transferring and accessing your files. For a list of Amazon EC2 Linux Amazon Machine Images (AMIs) that support this protocol, see the Amazon EFS documentation.
We recommend using a current generation Linux NFSv4.1 client, such as those found in the latest Amazon Linux, Red Hat, and Ubuntu AMIs, in conjunction with the Amazon EFS mount helper.
It can be used in combination with Elastic Compute Cloud (EC2) instances or as a stand-alone file system.
EFS does not require storage provisioning and is pay-for-use, allowing you to scale services as needed.
EFS file systems can be accessed simultaneously by thousands of EC2 instances within the cloud, or from on-premises servers via VPN or AWS Direct Connect, making EFS a good fit for hybrid solutions.
EFS is designed for low latency with IOPS and throughput that scale with usage and the number of attached instances, meaning that as storage size grows, performance increases.
At peak performance, it offers 10 GB/sec throughput and 500k IOPS.
EFS scales automatically as data is moved in or out, minimizing fears of running out of space or paying for storage you aren’t using.
EFS allows multiple layers of security and relies on your existing security infrastructure.
It can be used with IAM roles as well as VPC security groups and allows you to define individual file permissions using POSIX.
EFS has built-in compliance with common regulatory standards, including PCI DSS, HIPAA, and SOC with the ability to meet others if necessary.
Amazon EFS supports authentication, authorization, and encryption capabilities to help you meet your security and compliance requirements.
Amazon EFS supports two forms of encryption for file systems: encryption in transit and encryption at rest.
Amazon EC2 instances running in multiple Availability Zones within the same AWS Region can access the file system, so that many users can access and share a common data source.
To access your Amazon EFS file system in a VPC, you create one or more mount targets in the VPC.
A mount target provides an IP address for an NFSv4 endpoint at which you can mount an Amazon EFS file system.
You mount your file system using its Domain Name System (DNS) name, which resolves to the IP address of the EFS mount target in the same Availability Zone as your EC2 instance.
Each file system has a DNS name of the following form: file-system-id.efs.aws-region.amazonaws.com
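As a small illustration of the naming pattern, the sketch below assembles a file system's DNS name from its ID and Region. It assumes the standard amazonaws.com suffix; the function name and the sample ID and Region are hypothetical.

```python
def efs_dns_name(file_system_id: str, region: str) -> str:
    """Build the regional DNS name for an EFS file system (illustrative helper)."""
    return f"{file_system_id}.efs.{region}.amazonaws.com"


# Sample values only; substitute your own file system ID and Region.
print(efs_dns_name("fs-12345678", "us-east-1"))
```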
You can create one mount target in each Availability Zone in an AWS Region.
Each mount target has the following properties: the mount target ID, the subnet ID in which it is created, the file system ID for which it is created, an IP address at which the file system may be mounted, VPC security groups, and the mount target state.
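One way to picture these properties is as a simple record. The sketch below is only a mental model, not the EFS API: the class, field names, and sample values are hypothetical, and it assumes "available" is the state in which a mount target can be used.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MountTarget:
    """Illustrative record of the mount target properties listed above."""
    mount_target_id: str
    subnet_id: str
    file_system_id: str
    ip_address: str
    security_groups: List[str] = field(default_factory=list)
    state: str = "creating"  # assumed initial state for a new mount target

    def is_mountable(self) -> bool:
        # Assumption: a mount target is usable only once it is "available".
        return self.state == "available"


target = MountTarget("fsmt-0123", "subnet-0abc", "fs-12345678", "10.0.1.25", ["sg-0def"])
print(target.is_mountable())
```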
You can mount your Amazon EFS file systems on your on-premises data center servers when connected to your Amazon VPC with AWS Direct Connect or AWS VPN.
Keep the following considerations in mind when using Amazon EFS with an on-premises server:
Your on-premises server must have a Linux-based operating system. We recommend Linux kernel version 4.0 or later.
For the sake of simplicity, we recommend mounting an Amazon EFS file system on an on-premises server using a mount target IP address instead of a DNS name.
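Mounting by IP address from an on-premises server might look like the sketch below, which only assembles the command line and does not run it. The IP address and mount point are placeholders, the helper name is hypothetical, and the NFS options are the ones commonly recommended for EFS; check them against the current AWS documentation before relying on them.

```python
import shlex

# Commonly recommended NFS client options for EFS (assumption: verify for your client).
NFS_OPTIONS = "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"


def nfs_mount_command(mount_target_ip: str, mount_point: str) -> list:
    """Return the argument list for mounting EFS over plain NFS by IP address."""
    return ["sudo", "mount", "-t", "nfs4", "-o", NFS_OPTIONS,
            f"{mount_target_ip}:/", mount_point]


# Placeholder IP and path; print the command instead of executing it.
print(shlex.join(nfs_mount_command("10.0.1.25", "/mnt/efs")))
```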
The throughput mode can be either Bursting or Provisioned.
When you mount an Amazon EFS file system with encryption of data in transit enabled, the EFS mount helper starts a client stunnel process and uses TLS 1.2 to encrypt the traffic.
EFS uses a credit system to determine when file systems can burst. Each file system earns credits over time at a baseline rate determined by the size of the file system and spends credits whenever it reads or writes data. If your read/write volume is high relative to the amount of data stored, you may run out of burst capacity and should consider Provisioned Throughput mode.
The encryption of data at rest has to be enabled when the Amazon EFS file system is created.
The encryption of data in transit can be enabled when the file system is mounted in the EC2 instance. Command: sudo mount -t efs -o tls fs-12345678:/ /mnt/efs
To mount an EFS file system on an EC2 instance using encryption in transit, you need to:
Get the EFS file system's ID from the console or programmatically through the Amazon EFS API.
Create mount targets for your EC2 instances.
When the mount helper utility is used, add the encryption option: “-o tls”.
Amazon EFS offers two storage classes:
Standard (EFS Standard): The Standard storage class is used to store frequently accessed files.
Infrequent Access (EFS IA): EFS IA provides price/performance that's cost-optimized for files not accessed every day. By simply enabling EFS Lifecycle Management on your file system, files not accessed according to the lifecycle policy you choose will be automatically and transparently moved into EFS IA. The EFS IA storage class costs only $0.025/GB-month*. We recommend EFS IA storage if you need your full dataset to be readily accessible and want to automatically save on storage costs for files that are less frequently accessed.
While workload patterns vary, customers typically find that 80% of files are infrequently accessed (and suitable for EFS IA), and 20% are actively used (suitable for EFS Standard), resulting in an effective storage cost as low as $0.08/GB-month*.
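The $0.08 figure can be reproduced as a weighted average of the two storage classes. The sketch below assumes an EFS Standard price of $0.30/GB-month, which is not stated above and varies by Region and over time; the function name is hypothetical.

```python
def blended_cost_per_gb(ia_fraction: float, ia_price: float, standard_price: float) -> float:
    """Weighted average storage price per GB-month across both storage classes."""
    return ia_fraction * ia_price + (1.0 - ia_fraction) * standard_price


# 80% of data in EFS IA at $0.025, 20% in EFS Standard at an ASSUMED $0.30.
cost = blended_cost_per_gb(ia_fraction=0.8, ia_price=0.025, standard_price=0.30)
print(f"${cost:.3f}/GB-month")
```

With those inputs the IA share contributes $0.02 and the Standard share $0.06 per GB-month.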
Amazon EFS transparently serves files from both storage classes in a common file system namespace.
EFS supports two throughput modes for your file system: Bursting Throughput and Provisioned Throughput.
With Bursting Throughput mode, throughput on Amazon EFS scales as your file system grows. File-based workloads are typically spiky, driving high levels of throughput for short periods of time and low levels of throughput the rest of the time. To accommodate this, EFS is designed to burst to high throughput levels for periods of time.
With Provisioned Throughput mode, you can instantly provision the throughput of your file system (in MiB/s) independent of the amount of data stored.
Amazon EFS is well suited to support a broad spectrum of use cases from home directories to business-critical applications.
Customers can use EFS to lift-and-shift existing enterprise applications to the AWS Cloud.
Other use cases include: big data analytics, web serving and content management, application development and testing, media and entertainment workflows, database backups, and container storage.
You can mount your EFS file systems on on-premises servers to migrate datasets to EFS, enable cloud bursting scenarios, or back up your on-premises data to EFS.
Move to managed file systems - Move your business critical, Linux-based applications to managed file systems with Amazon EFS.
Analytics & machine learning - For example, data scientists can use EFS to create personalized environments, with home directories storing notebook files, training data, and model artifacts. Amazon SageMaker integrates with EFS for training jobs.
Web serving & content management - Since Amazon EFS adheres to the expected file system directory structure, file naming conventions, and permissions that web developers are accustomed to, it can easily integrate with web applications.
Application testing & development - For example, you can provision, duplicate, scale, or archive your test, development, and production environments with a few clicks.
Media & entertainment - Media workflows like video editing, studio production, broadcast processing, sound design, and rendering often depend on shared storage to manipulate large files.
Database backups - Amazon EFS presents a standard file system that can be easily mounted with NFSv4 from database servers. This provides an ideal platform to create portable database backups using native application tools or enterprise backup applications.
Container storage - Amazon EFS is ideal for container storage providing persistent shared access to a common file repository.
AWS Backup Service (preferred)
AWS Backup is a fully managed service that allows you to create, manage, and automate incremental backups according to a schedule you define through a central location.
This system is PCI and ISO compliant and HIPAA eligible, to ensure that your compliance needs are covered.
It is possible to use AWS Backup whether your system has a cloud-native, hybrid, or on-premises configuration.
This solution is easy to implement, and incremental backups help keep your costs low, but it requires manually pausing the applications and processes being backed up and only allows backups to be stored on EFS.
Amazon EFS always prioritizes file system operations over backup operations.
EFS to EFS Backup
There is no built-in EFS backup, and EFS does not have a native snapshot mechanism.
So, before AWS Backup was released, backups had to be done using a template in AWS CloudFormation. This involves using scripts to access the AWS Data Pipeline, from which you must transfer data between multiple services before finally storing the backed-up data in EFS.
With this process, you are still able to control backup schedules and life cycles as with the AWS Backup Service.
The main downsides of this option are that it does require some programming knowledge and is not easily monitored. Additionally, if you are not careful to change time constraints according to the amount of data you are backing up, your process can fail.
Backing Up to Amazon S3
Backing up your data to S3 is another option you might consider and one that can help decrease your storage costs. This process begins the same as the EFS to EFS backup but rather than moving the incremental backups from S3 to EFS at the end of the process, they are simply left in S3.
Monitor Your EFS Burst Credits
Make sure that you monitor your burst credits either manually or by adding alerts in CloudWatch that will notify you when they run low.
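A CloudWatch alert on burst credits could be set up along the lines of the sketch below. It only builds the parameter dictionary; the helper name, alarm name, threshold, and SNS topic ARN are hypothetical, and it assumes the BurstCreditBalance metric in the AWS/EFS namespace (reported in bytes) with a FileSystemId dimension.

```python
def burst_credit_alarm_params(file_system_id: str, threshold_bytes: int, sns_topic_arn: str) -> dict:
    """Build keyword arguments for a CloudWatch alarm on low EFS burst credits."""
    return {
        "AlarmName": f"efs-{file_system_id}-low-burst-credits",  # hypothetical naming scheme
        "Namespace": "AWS/EFS",
        "MetricName": "BurstCreditBalance",
        "Dimensions": [{"Name": "FileSystemId", "Value": file_system_id}],
        "Statistic": "Minimum",
        "Period": 300,             # evaluate in 5-minute windows
        "EvaluationPeriods": 3,    # alarm after 3 consecutive low windows
        "Threshold": float(threshold_bytes),
        "ComparisonOperator": "LessThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


params = burst_credit_alarm_params(
    "fs-12345678", 100 * 1024**3, "arn:aws:sns:us-east-1:123456789012:efs-alerts"
)
# With boto3 installed and credentials configured, this could be applied as:
#   boto3.client("cloudwatch").put_metric_alarm(**params)
print(params["AlarmName"])
```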
Making sure to schedule backups during periods of low use or during non-working hours will help save your credits for when they are truly needed.
You should keep in mind that EFS volumes start with a 0.5 MB/s transfer rate and only 7.2 minutes of burst credits for throughput up to 100 MB/s, which is probably not enough to meet your needs. You can increase these numbers by increasing the amount of data you’re storing, but there is no way to simply buy additional credits.
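The 7.2-minute figure follows from the ratio of baseline to burst throughput. The sketch below assumes credits accrue at the baseline rate over a full day and are then spent entirely at the burst rate; this is a simplified model, and the function name is hypothetical.

```python
def burst_minutes_per_day(baseline_mb_s: float, burst_mb_s: float) -> float:
    """Minutes of bursting that one day of baseline credit accrual can fund."""
    seconds_per_day = 24 * 60 * 60
    credits_earned_mb = baseline_mb_s * seconds_per_day  # MB of credit earned in a day
    burst_seconds = credits_earned_mb / burst_mb_s       # time to spend it at burst rate
    return burst_seconds / 60


# Baseline 0.5 MB/s and burst 100 MB/s, as quoted above.
print(burst_minutes_per_day(0.5, 100))
```

Doubling the baseline rate (that is, storing more data) doubles the available burst time in this model, which is why growing the file system is the only way to gain credits in Bursting mode.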
Don’t Run Applications from EFS
If you plan to run and host applications from EFS, EBS is a better option for you.
EFS is not meant for the large file read volumes or fast speed that tasks like managing codebases or application deployment require, so you should not try to use it for these tasks. Instead, stick to tools that deploy code to local filesystems or containers.
Use EFS as it is designed: for storage of media assets, exported data files, asynchronous logs, and so on.
Choose the Right Performance Mode
EFS is designed with two different performance mode options: General Purpose and Max I/O.
General Purpose is the default setting and the one that will work best for most users. It focuses on lower throughput levels in exchange for lower latency and is good for tasks like web-hosting and content management.
Max I/O focuses on higher throughput at the cost of higher latency, making it better for tasks such as big data analysis or media processing.
AWS EFS
EFS is fairly easy to learn to use because it mirrors a traditional file hierarchy structure.
Since it allows multiple instances to be connected simultaneously across regions and AZs, it does not require as much redundant storage as other options and allows for easy cloud file sharing and collaboration.
As mentioned, EFS is ideal for use in global content management systems, big data analytics, and media processing workflows.
AWS EBS
EBS uses volumes to provide low-latency block storage for use with EC2 instances.
It can be a bit complicated to set up as it requires that you choose your volume type, according to performance requirements, from the start and it doesn’t automatically scale like other options.
EBS is designed for application hosting and storage and high-volume databases.
AWS S3
S3 provides object storage, without the use of EC2 instances, that is accessible directly through the internet. It is the cheapest option and provides a lot of flexibility in storage, but can require some programming proficiency as it is typically managed through AWS Software Development Kits (SDKs).