AWS Cloud Migration services help address common use cases such as
cloud migration,
disaster recovery,
data center decommission, and
content distribution.
For migrating data from on-premises to AWS, the major aspects to consider are
amount of data and network speed
data security in transit
existing application knowledge for recreation
AWS EC2 VM Import/Export
Enables you to easily import virtual machine images from your existing environment to EC2 instances and export them back to your on-premises environment.
Lets you leverage existing investments in virtual machines built to meet compliance, configuration management, and IT security requirements by bringing those virtual machines into EC2 as ready-to-use instances.
Common usages include
Migrate existing applications and workloads to EC2, preserving the software and settings configured in the existing VMs (a boto3 sketch of an image import follows this list).
Copy Your VM Image Catalog to Amazon EC2
Create a Disaster Recovery Repository for your VM images.
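For illustration, a minimal boto3 sketch of importing a VM disk image that was previously uploaded to S3 (the bucket, key, and description are placeholder assumptions):

import boto3

ec2 = boto3.client("ec2")

# Start an import task for a VM disk image already uploaded to S3.
response = ec2.import_image(
    Description="On-premises web server",
    DiskContainers=[{
        "Format": "vmdk",
        "UserBucket": {"S3Bucket": "my-import-bucket", "S3Key": "vms/webserver.vmdk"},
    }],
)
task_id = response["ImportTaskId"]

# Poll the import task; when it completes, the result is a ready-to-launch AMI.
status = ec2.describe_import_image_tasks(ImportTaskIds=[task_id])
print(status["ImportImageTasks"][0]["Status"])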
AWS Server Migration Service (SMS)
An agentless service that makes it easier and faster to migrate thousands of on-premises workloads to AWS.
Allows you to automate, schedule, and track incremental replications of live server volumes, making it easier to coordinate large-scale server migrations (see the sketch below).
Currently supports migration of virtual machines from VMware vSphere and Windows Hyper-V to AWS
Supports migrating Windows Server 2003, 2008, 2012, and 2016, and Windows 7, 8, and 10; Red Hat Enterprise Linux (RHEL), SUSE/SLES, CentOS, Ubuntu, Oracle Linux, Fedora, and Debian Linux OS
Replicates each server volume and saves it as a new AMI, which can be launched as an EC2 instance
It is a significant enhancement of EC2 VM Import.
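A minimal sketch of scheduling incremental replication with boto3, assuming the SMS connector has already discovered the on-premises servers (the seed time and frequency are placeholder assumptions):

import boto3
from datetime import datetime, timedelta, timezone

sms = boto3.client("sms")

# List servers discovered by the on-premises connector.
servers = sms.get_servers()["serverList"]

# Schedule incremental replication of the first server every 12 hours;
# each replication run produces a new AMI.
job = sms.create_replication_job(
    serverId=servers[0]["serverId"],
    seedReplicationTime=datetime.now(timezone.utc) + timedelta(minutes=30),
    frequency=12,
)
print(job["replicationJobId"])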
AWS Application Discovery Service
Helps enterprise customers plan migration projects by gathering information about their on-premises data centers.
Collects and presents server specification information, performance data, and details of running processes and network connections.
Provides protection for the collected data by encrypting it both in transit to AWS and at rest within the Application Discovery Service data store.
AWS Database Migration Service (DMS)
Helps migrate databases to AWS quickly and securely.
Source database remains fully operational during the migration, minimizing downtime to applications that rely on the database.
Supports homogeneous migrations such as Oracle to Oracle, as well as heterogeneous migrations between different database platforms, such as Oracle or Microsoft SQL Server to Amazon Aurora.
Monitors replication tasks for network or host failures, and automatically provisions a replacement host for failures that can't be repaired.
Supports both one-time data migration into RDS and EC2-based databases and continuous data replication.
Supports continuous, highly available replication, and can consolidate databases into a petabyte-scale data warehouse by streaming data to Amazon Redshift and Amazon S3.
Provides free AWS Schema Conversion Tool (SCT) that automates the conversion of Oracle PL/SQL and SQL Server T-SQL code to equivalent code in the Amazon Aurora / MySQL dialect of SQL or the equivalent PL/pgSQL code in PostgreSQL.
The replication instance should be created from the DMS console; do not provision it yourself as a plain EC2 instance.
If DMS doesn't support the source database, you can export the data to CSV and upload the exported CSV files to S3 first. Then create an S3 source endpoint and a DynamoDB target endpoint in the AWS DMS console. When configuring the S3 source endpoint, add the table mapping rule with a JSON table structure. Finally, create a replication task to move the data from the source endpoint to the target endpoint (a boto3 sketch of this flow follows the rule examples below).
With the "Selection Rule" you can filter the records.
With the "Transformation Rule" you can remove/add columns, change table or column prefixes, and more.
Transformation Rule examples:
// Remove Column
{
  "rules": [{
    "rule-type": "selection",
    "rule-id": "1",
    "rule-name": "1",
    "object-locator": {
      "schema-name": "test",
      "table-name": "%"
    },
    "rule-action": "include"
  }, {
    "rule-type": "transformation",
    "rule-id": "2",
    "rule-name": "2",
    "rule-action": "remove-column",
    "rule-target": "column",
    "object-locator": {
      "schema-name": "test",
      "table-name": "latestnews",
      "column-name": "test%"
    }
  }]
}
// Modify Prefix
{
  "rules": [{
    "rule-type": "selection",
    "rule-id": "1",
    "rule-name": "1",
    "object-locator": {
      "schema-name": "test",
      "table-name": "%"
    },
    "rule-action": "include"
  }, {
    "rule-type": "transformation",
    "rule-id": "2",
    "rule-name": "2",
    "rule-action": "add-prefix",
    "rule-target": "table",
    "object-locator": {
      "schema-name": "test",
      "table-name": "%"
    },
    "value": "DMS_"
  }]
}
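For the CSV-to-DynamoDB flow described above, a minimal boto3 sketch of creating the endpoints and the replication task (all ARNs, bucket names, and identifiers are placeholders, and the replication instance is assumed to exist already):

import json
import boto3

dms = boto3.client("dms")

# S3 source endpoint; the table definition is the JSON table structure
# describing the CSV layout (left as a stub here).
source = dms.create_endpoint(
    EndpointIdentifier="csv-source",
    EndpointType="source",
    EngineName="s3",
    S3Settings={
        "ServiceAccessRoleArn": "arn:aws:iam::123456789012:role/dms-s3-role",
        "BucketName": "my-export-bucket",
        "ExternalTableDefinition": json.dumps({"TableCount": "1", "Tables": []}),
    },
)

# DynamoDB target endpoint.
target = dms.create_endpoint(
    EndpointIdentifier="ddb-target",
    EndpointType="target",
    EngineName="dynamodb",
    DynamoDbSettings={"ServiceAccessRoleArn": "arn:aws:iam::123456789012:role/dms-ddb-role"},
)

# The replication task wires source to target; TableMappings takes the same
# selection/transformation rule JSON shown in the examples above.
task = dms.create_replication_task(
    ReplicationTaskIdentifier="csv-to-ddb",
    SourceEndpointArn=source["Endpoint"]["EndpointArn"],
    TargetEndpointArn=target["Endpoint"]["EndpointArn"],
    ReplicationInstanceArn="arn:aws:dms:us-east-1:123456789012:rep:EXAMPLE",
    MigrationType="full-load",
    TableMappings=json.dumps({"rules": [{
        "rule-type": "selection", "rule-id": "1", "rule-name": "1",
        "object-locator": {"schema-name": "test", "table-name": "%"},
        "rule-action": "include",
    }]}),
)
dms.start_replication_task(
    ReplicationTaskArn=task["ReplicationTask"]["ReplicationTaskArn"],
    StartReplicationTaskType="start-replication",
)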
AWS provides a suite of data transfer services with many methods to migrate your data more effectively.
Data transfer services work both online and offline, and the choice depends on several factors such as the amount of data, time required, frequency, available bandwidth, and cost.
VPN
Utilizes IPsec to establish encrypted network connectivity between the on-premises network and the VPC over the Internet.
This is an online data transfer method.
Connections can be configured in minutes and are a good solution for workloads with an immediate need, low to modest bandwidth requirements, and tolerance for the inherent variability of Internet-based connectivity.
Quick to set up and cost efficient.
Still requires the Internet, and must be configured using a Virtual Private Gateway (VGW) and a Customer Gateway (CGW), as in the sketch below.
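As a sketch, the CGW, VGW, and VPN connection can be created with boto3 (the public IP, ASN, and VPC ID are placeholders):

import boto3

ec2 = boto3.client("ec2")

# Customer Gateway (CGW): describes the on-premises VPN device.
cgw = ec2.create_customer_gateway(
    Type="ipsec.1", PublicIp="203.0.113.12", BgpAsn=65000
)["CustomerGateway"]

# Virtual Private Gateway (VGW): the VPC side of the connection.
vgw = ec2.create_vpn_gateway(Type="ipsec.1")["VpnGateway"]
ec2.attach_vpn_gateway(VpcId="vpc-0123456789abcdef0", VpnGatewayId=vgw["VpnGatewayId"])

# The VPN connection establishes IPsec tunnels between CGW and VGW over the Internet.
vpn = ec2.create_vpn_connection(
    Type="ipsec.1",
    CustomerGatewayId=cgw["CustomerGatewayId"],
    VpnGatewayId=vgw["VpnGatewayId"],
    Options={"StaticRoutesOnly": True},
)
print(vpn["VpnConnection"]["VpnConnectionId"])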
AWS Direct Connect
Provides a dedicated physical connection between the corporate network and AWS Direct Connect location with no data transfer over the Internet.
This is an online data transfer method.
Helps bypass Internet service providers (ISPs) in the network path.
Helps reduce network costs, increase bandwidth throughput, and provide a more consistent network experience than Internet-based connections.
Takes time to set up and involves third parties.
Is not a cost-efficient solution.
A single connection is not redundant; failover requires a second Direct Connect connection or a backup VPN connection.
Security
Provides a dedicated physical connection that does not traverse the Internet.
Can be combined with a VPN for additional security.
AWS Import/Export (upgraded to Snowball)
Accelerates moving large amounts of data into and out of AWS using secure Snowball appliances.
Note: originally there was another method, "AWS Import/Export Disk (Disk)", which was the only service AWS offered for data transfer by mail.
Note: "AWS Import/Export Disk (Disk)" supports two data encryption methods: PIN-code encryption (hardware-based device encryption) and TrueCrypt (software encryption).
Note: "AWS Import/Export Disk (Disk)" can import to S3, import to EBS, and export from S3.
Note: "AWS Import/Export Disk (Disk)" doesn't support Server-Side Encryption on import, has a maximum file size of 5 TB and a maximum device capacity of 16 TB, and with S3 versioning enabled only the latest version is exported.
Note: for Amazon EBS imports with "AWS Import/Export Disk (Disk)", if the storage device is less than or equal to 1 TB, its contents are loaded directly into an EBS snapshot; if its capacity exceeds 1 TB, a device image is stored in the specified S3 bucket.
Using Snowball is cheaper.
AWS transfers the data directly onto and off of storage devices you own using Amazon's high-speed internal network, bypassing the Internet.
Data Migration
For significant data sizes, AWS Import/Export is faster than Internet transfer and more cost-effective than upgrading the connectivity.
If loading the data over the Internet would take a week or more, AWS Import/Export should be considered
Data from appliances can be imported to S3, EBS volumes, and snapshots, and exported from S3.
Not suitable for applications that cannot tolerate offline transfer time
Security
Snowball uses an industry-standard Trusted Platform Module (TPM) that has a dedicated processor designed to detect any unauthorized modifications to the hardware, firmware, or software to physically secure the AWS Snowball device.
AWS Import/Export is ideal for transferring large amounts of data in and out of the AWS cloud, especially in cases where transferring the data over the Internet would be too slow (a week or more) or too costly.
Common use cases include
first time migration – initial data upload to AWS
content distribution or regular data interchange to/from your customers or business associates,
off-site backup – transfer to Amazon S3 or Amazon Glacier for off-site backup and archival storage, and
disaster recovery – quick retrieval (export) of large backups from Amazon S3 or Amazon Glacier
Snow Family
Highly-secure, portable devices to collect and process data at the edge, and migrate data into and out of AWS.
This is an offline data transfer method.
AWS Snowcone
AWS Snowcone is the smallest member of the AWS Snow Family of edge computing, edge storage, and data transfer devices, weighing in at 4.5 pounds (2.1 kg) with 8 terabytes of usable storage. Snowcone is ruggedized, secure, and purpose-built for use outside of a traditional data center.
AWS Snowball
Snowball is a petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of the AWS cloud.
Using Snowball addresses common challenges with large-scale data transfers including high network costs, long transfer times, and security concerns.
Transfers the data to an S3 bucket (import/export)
Transfer times are about a week from start to finish.
Storage options: 50 TB (42 TB usable, US regions only) and 80 TB (72 TB usable).
Ideal for one time large data transfers with limited network bandwidth, long transfer times, and security concerns.
Is simple, fast, and secure.
Can be very cost and time efficient for large data transfer
Are commonly used to ship terabytes or petabytes of analytics data, healthcare and life sciences data, video libraries, image repositories, backups, and archives as part of data center shutdown, tape replacement, or application migration projects.
AWS Snowball Edge devices
Is a petabyte-scale data transfer device with on-board storage and compute capabilities.
Contains slightly larger capacity than Snowball and an embedded computing platform that helps perform simple processing tasks.
Moves large amounts of data into and out of AWS, serves as a temporary storage tier for large local datasets, and supports local workloads in remote or offline locations.
Ideal for one time large data transfers with limited network bandwidth, long transfer times, and security concerns
Is simple, fast, and secure.
Can be very cost and time efficient for large data transfer.
AWS Snowball is a service that allows you to transfer large amounts of data by sending you a highly secure, tamper-proof storage device which you transfer your data to locally and then ship directly to AWS.
AWS Snowball Edge provides compute on the device in addition to storage, so you can perform schema conversions on the data before sending the data to AWS.
Part of the AWS Snow Family, is an edge computing, data migration, and edge storage device that comes in two options. Snowball Edge Storage Optimized devices provide both block storage and Amazon S3-compatible object storage, and 40 vCPUs. They are well suited for local storage and large scale-data transfer. Snowball Edge Compute Optimized devices provide 52 vCPUs, block and object storage, and an optional GPU for use cases like advanced machine learning and full motion video analysis in disconnected environments.
Provides durable local storage of 100 TB (83 TB usable), or 100 TB clustered (45 TB per node).
Can be rack shelved and may also be clustered together, making it simpler to collect and store data in extremely remote locations.
Commonly used in environments with intermittent connectivity (such as manufacturing, industrial, and transportation); or in extremely remote locations (such as military or maritime operations) before shipping them back to AWS data centers.
You can import data into Amazon S3 and export data from Amazon S3.
Delivers serverless computing applications at the network edge using AWS Greengrass and Lambda functions.
Common use cases include capturing IoT (AWS IoT Greengrass) sensor streams, on-the-fly media transcoding, image compression, metrics aggregation and industrial control signaling and alarming.
Supports local compute with AWS Lambda and Amazon EC2 compute instances, can be used in a cluster of devices, and can transfer files through NFS with a GUI.
AWS Snowmobile
Moves up to 100 PB of data (equivalent to 1,250 AWS Snowball devices) in a 45-foot ruggedized shipping container, and is ideal for multi-petabyte or exabyte-scale digital media migrations and data center shutdowns.
Arrives at the customer site and appears as a network-attached data store for more secure, high-speed data transfer. After data is transferred to Snowmobile, it is driven back to an AWS Region where the data is loaded into S3.
Is tamper-resistant, waterproof, and temperature controlled with multiple layers of logical and physical security — including encryption, fire suppression, dedicated security personnel, GPS tracking, alarm monitoring, 24/7 video surveillance, and an escort security vehicle during transit.
1 exabyte can be transported in about 6 months, versus roughly 26 years over a 10 Gbps connection.
AWS Storage Gateway
Connects an on-premises software appliance with cloud-based storage to provide seamless and secure integration between an organization’s on-premises IT environment and the AWS storage infrastructure.
Provides low-latency performance by maintaining frequently accessed data on-premises while securely storing all of the data encrypted in S3 or Glacier.
For disaster recovery scenarios, Storage Gateway, together with EC2, can serve as a cloud-hosted solution that mirrors the entire production environment.
Data Migration
With gateway-cached volumes, S3 can be used to hold primary data while frequently accessed data is cached locally for faster access reducing the need to scale on premises storage infrastructure.
With gateway-stored volumes, entire data is stored locally while asynchronously backing up data to S3.
With gateway-VTL, offline data archiving can be performed by presenting existing backup application with an iSCSI-based VTL consisting of a virtual media changer and virtual tape drives.
Security
Encrypts all data in transit to and from AWS by using SSL/TLS.
All data in AWS Storage Gateway is encrypted at rest using AES-256.
Authentication between the gateway and iSCSI initiators can be secured by using Challenge-Handshake Authentication Protocol (CHAP).
S3
Data Transfer
Files up to 5 GB can be transferred using a single PUT operation.
Multipart uploads can be used to upload files up to 5 TB and speed up data uploads by dividing the file into multiple parts (see the sketch below).
The transfer rate is still limited by the network speed.
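A minimal multipart upload sketch with boto3 (file, bucket, and key names are placeholders); TransferConfig makes boto3 split the file, upload parts in parallel, and retry failed parts automatically:

import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Use multipart for anything over 100 MB, uploading up to 10 parts in parallel.
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,
    multipart_chunksize=100 * 1024 * 1024,
    max_concurrency=10,
)
s3.upload_file(
    "backup.tar.gz",
    "my-migration-bucket",
    "backups/backup.tar.gz",
    Config=config,
    ExtraArgs={"ServerSideEncryption": "aws:kms"},  # SSE-KMS encryption at rest
)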
Security
Data in transit can be secured by using SSL/TLS or client-side encryption.
Encrypt data at-rest by performing server-side encryption using Amazon S3-Managed Keys (SSE-S3), AWS Key Management Service (KMS)-Managed Keys (SSE-KMS), or Customer Provided Keys (SSE-C). Or by performing client-side encryption using AWS KMS–Managed Customer Master Key (CMK) or Client-Side Master Key.
AWS S3 Transfer Acceleration
Makes public Internet transfers to S3 faster.
Helps maximize the available bandwidth regardless of distance or varying Internet conditions, with no special clients or proprietary network protocols; simply change the endpoint used with the S3 bucket and acceleration is applied automatically (see the sketch below).
Ideal for recurring jobs that travel across the globe, such as media uploads, backups, and local data processing tasks that are regularly sent to a central location.
This is an online data transfer method.
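A sketch of enabling acceleration on a bucket and pointing the client at the accelerated endpoint with boto3 (the bucket name is a placeholder):

import boto3
from botocore.config import Config

s3 = boto3.client("s3")

# One-time: enable Transfer Acceleration on the bucket.
s3.put_bucket_accelerate_configuration(
    Bucket="my-global-uploads",
    AccelerateConfiguration={"Status": "Enabled"},
)

# Then switch the client to the accelerated endpoint; all other code is unchanged.
s3_accel = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
s3_accel.upload_file("video.mp4", "my-global-uploads", "uploads/video.mp4")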
AWS DataSync
Automates moving data between on-premises storage and S3 or Elastic File System (Amazon EFS).
An online data transfer service that simplifies, automates, and accelerates copying large amounts of data to and from AWS storage services over the internet or AWS Direct Connect.
Automatically handles many of the tasks related to data transfers that can slow down migrations or burden IT operations, including running your own instances, handling encryption, managing scripts, network optimization, and data integrity validation.
Helps transfer data at speeds up to 10 times faster than open-source tools.
Uses AWS Direct Connect or internet links to AWS and ideal for one-time data migrations, recurring data processing workflows, and automated replication for data protection and recovery.
This is an online data transfer method.
DataSync can copy data between:
Network File System (NFS) or Server Message Block (SMB) file servers,
Amazon S3 buckets,
Amazon EFS file systems,
Amazon FSx for Windows File Server file systems
How it works:
Deploy an agent – Deploy a DataSync agent and associate it to your AWS account via the Management Console or API. The agent will be used to access your NFS server or SMB file share to read data from it or write data to it.
Create a data transfer task – Create a task by specifying the location of your data source and destination, and any options you want to use to configure the transfer, such as the desired task schedule.
Start the transfer – Start the task and monitor data movement in the console or with Amazon CloudWatch.
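A minimal boto3 sketch of these three steps, assuming an agent has already been deployed and activated (hostnames, ARNs, and bucket names are placeholders):

import boto3

datasync = boto3.client("datasync")

# Source location: an on-premises NFS share read through the activated agent.
src = datasync.create_location_nfs(
    ServerHostname="nfs.example.internal",
    Subdirectory="/exports/data",
    OnPremConfig={"AgentArns": ["arn:aws:datasync:us-east-1:123456789012:agent/agent-0abc"]},
)

# Destination location: an S3 bucket accessed through an IAM role DataSync assumes.
dst = datasync.create_location_s3(
    S3BucketArn="arn:aws:s3:::my-migration-bucket",
    S3Config={"BucketAccessRoleArn": "arn:aws:iam::123456789012:role/datasync-s3-role"},
)

# The task ties source and destination together; starting it runs a transfer.
task = datasync.create_task(
    SourceLocationArn=src["LocationArn"],
    DestinationLocationArn=dst["LocationArn"],
    Name="onprem-to-s3",
)
execution = datasync.start_task_execution(TaskArn=task["TaskArn"])
print(execution["TaskExecutionArn"])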
Concepts:
Agent – A virtual machine used to read data from or write data to an on-premises location. Agents need to be activated first using an activation key entered in the AWS console, before you can start using them. You must activate your agent in the same region where your S3 or EFS source/destination resides. You can have more than one DataSync Agent running.
Location – Any source or destination location used in the data transfer.
Sources – Self-managed storage (including NFS shares, SMB shares, object storage, or NFS on your AWS Snowcone device), Amazon S3 (in AWS Regions), Amazon EFS, Amazon FSx for Windows File Server, or Amazon S3 on AWS Outposts.
Destinations – Amazon S3 (in AWS Regions), Amazon EFS, Amazon FSx for Windows File Server, Amazon S3 on AWS Outposts, or self-managed storage (including NFS shares, SMB shares, object storage, or NFS on your AWS Snowcone device).
Task – A task includes two locations (source and destination), and also the configuration of how to transfer the data from one location to the other. Configuration settings can include options such as how to treat metadata, deleted files, and copy permissions.
Task execution – An individual run of a task, which includes options such as start time, end time, bytes written, and status. A task execution has five transition phases and two terminal statuses. If the VerifyMode option is not enabled, a terminal status occurs after the TRANSFERRING phase. Otherwise, it occurs after the VERIFYING phase.
Features:
The service employs an AWS-designed transfer protocol—decoupled from storage protocol—to speed data movement.
The protocol performs optimizations on how, when, and what data is sent over the network.
A single DataSync agent is capable of saturating a 10 Gbps network link.
DataSync auto-scales cloud resources to support higher-volume transfers, and makes it easy to add agents on-premises.
All of your data is encrypted in transit with TLS. DataSync supports using default encryption for S3 buckets using Amazon S3-Managed Encryption Keys (SSE-S3), and Amazon EFS file system encryption of data at rest.
DataSync supports storing data directly into S3 Standard, S3 Intelligent-Tiering, S3 Standard-Infrequent Access (S3 Standard-IA), S3 One Zone-Infrequent Access (S3 One Zone-IA), Amazon S3 Glacier, and S3 Glacier Deep Archive.
You can use AWS DataSync to copy files into EFS and configure EFS Lifecycle Management to migrate files that have not been accessed for a set period of time to the Infrequent Access (IA) storage class.
DataSync ensures that your data arrives intact by performing integrity checks both in transit and at rest.
You can specify an exclude filter, an include filter, or both, to limit which files, folders, or objects get transferred each time a task runs.
Task scheduling enables you to configure a task to execute periodically, detecting and copying changes from the source storage system to the destination (a boto3 sketch combining filters and scheduling follows this list).
DataSync supports VPC endpoints (powered by AWS PrivateLink) in order to move files directly into your Amazon VPC.
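A minimal boto3 sketch of the filter and scheduling options mentioned above, applied to an existing task (the task ARN and patterns are placeholders):

import boto3

datasync = boto3.client("datasync")

# Exclude temp files and scratch folders, and run the task nightly at 02:00 UTC.
datasync.update_task(
    TaskArn="arn:aws:datasync:us-east-1:123456789012:task/task-0abc",
    Excludes=[{"FilterType": "SIMPLE_PATTERN", "Value": "*.tmp|/scratch"}],
    Schedule={"ScheduleExpression": "cron(0 2 * * ? *)"},
)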
Use Cases:
Data migration to Amazon S3, Amazon EFS, or Amazon FSx for Windows File Server.
Data processing for hybrid workloads. If you have on-premises systems generating or using data that needs to move into or out of AWS for processing, you can use DataSync to accelerate and schedule the transfers.
If you have large amounts of cold data stored in expensive on-premises storage systems, you can move this data directly to durable and secure long-term storage such as Amazon S3 Glacier or Amazon S3 Glacier Deep Archive.
If you have large Network Attached Storage (NAS) systems with important files that need to be protected, you can replicate them into S3 using DataSync.
AWS DataSync vs AWS CLI tools:
AWS DataSync fully automates and accelerates moving large active datasets to AWS, up to 10 times faster than command line tools.
DataSync uses a purpose-built network protocol and scale-out architecture to transfer data.
DataSync fully automates the data transfer. It comes with retry and network resiliency mechanisms, network optimizations, built-in task scheduling, and CloudWatch monitoring that provides granular visibility into the transfer process.
DataSync performs data integrity verification both during the transfer and at the end of the transfer.
DataSync provides end to end security, and integrates directly with AWS storage services.
AWS DataSync vs Snowball/Snowball Edge:
AWS DataSync is ideal for online data transfers. AWS Snowball / Snowball Edge is suitable for offline data transfers, for customers who are bandwidth constrained, or for transferring data from remote, disconnected, or austere environments.
AWS DataSync vs AWS Storage Gateway File Gateway:
Use AWS DataSync to migrate existing data to Amazon S3, and then use the File Gateway to retain access to the migrated data and for ongoing updates from your on-premises file-based applications.
AWS DataSync vs Amazon S3 Transfer Acceleration:
If your applications are already integrated with the Amazon S3 API, and you want higher throughput for transferring large files to S3, you can use S3 Transfer Acceleration. If not, you may use AWS DataSync.
AWS DataSync vs AWS Transfer for SFTP:
If you currently use SFTP to exchange data with third parties, you may use AWS Transfer for SFTP to transfer that data directly.
If you want an accelerated and automated data transfer between NFS servers, SMB file shares, Amazon S3, Amazon EFS, and Amazon FSx for Windows File Server, you can use AWS DataSync.
Pricing:
You pay for the amount of data that you copy. Your costs are based on a flat per-gigabyte fee for the use of network acceleration technology, managed cloud infrastructure, data validation, and automation capabilities in DataSync.
You are charged standard request, storage, and data transfer rates to read from and write to AWS services, such as Amazon S3, Amazon EFS, Amazon FSx for Windows File Server, and AWS Key Management Service (KMS).
When copying data from AWS to an on-premises storage system, you pay for AWS Data Transfer at your standard rate. You are also charged standard rates for Amazon CloudWatch Logs, Amazon CloudWatch Events, and Amazon CloudWatch Metrics.
You will be billed by AWS PrivateLink for interface VPC endpoints that you create to manage and control the traffic between your agent(s) and the DataSync service over AWS PrivateLink.