Azure Site Recovery (ASR), Microsoft’s DRaaS solution, was named an industry leader by Gartner in 2016 for its completeness of vision and ability to execute.
This blog post will explore some of ASR’s capabilities, inner workings, and use cases to demonstrate what makes it a world-class DRaaS solution.
Azure Site Recovery (ASR) is a DRaaS offered by Azure for use in cloud and hybrid cloud architectures. As a disaster recovery platform, it makes it possible for Azure Virtual Machines, Hyper-V, physical on-prem systems, and VMware to fail over to a secondary site and successfully fail back once the disaster has been resolved. A near-constant data replication process makes sure copies are in sync.
But its biggest advantage may be its pricing.
The chief benefit of ASR is its cost-effectiveness.
DRaaS solutions in the cloud vary in price, and ASR is cheaper than its industry rivals. The savings extend beyond ASR’s pricing; by providing access to Azure as a secondary site, the costs of building and maintaining a secondary site can be avoided.
Azure Site Recovery offers a low-cost alternative to traditional hosted or self-provisioned DR environments because the only ongoing costs are the storage required to support the application replicas and the desired retention of recovery points, plus the per-machine, per-month service fee.
There are no compute, network infrastructure, facility rental, or software licensing fees required during ongoing protection.
ASR also has the advantage of being easy to use when replicating Hyper-V or VMware VMs, and physical Windows and Linux servers. The Azure ASR console provides a unified view on the replication status of all your different workloads and allows you to carry out maintenance tasks, such as tweaking recovery plans.
The console also integrates with other BCDR solutions such as Oracle Data Guard and SQL Always On. ASR is an effective tool for workload migration.
You can use ASR to migrate workloads from on-premises, AWS, or even other Azure regions. This can also provide a flexible replication option for hybrid environments.
For workload and application protection, ASR integrates with several critical workloads, including Active Directory, DNS, Exchange, SAP, SQL Always On, and Oracle Data Guard.
DRaaS is about recovery, and ASR handles recovery very well. High recovery time objective (RTO) and recovery point objective (RPO) thresholds can be costly to an organization, so ASR provides replication frequency as low as 30 seconds (for Hyper-V) and continuous replication for VMware.
That means ASR users get a low RPO threshold. RTO can be reduced greatly through automation and use of Azure Traffic Manager. ASR’s rich automation and orchestration are provided through PowerShell and production-ready Azure Runbooks, which help make the recovery process consistently accurate and repeatable.
To further prepare your system in case of a failure, ASR can run non-disruptive failover and DR drills. In addition to executing non-planned failovers during production downtime, ASR can carry out test failovers or planned failovers to test DR capabilities and planned outages.
ASR’s customizable recovery plans also allow sequenced failover and recovery of multi-tiered apps like Database and Web Services.
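The sequencing idea behind a recovery plan can be illustrated with a small model. This is a conceptual sketch only, not the ASR API: the `RecoveryGroup` class, the `run_plan` function, and the machine names are hypothetical, and the failover action is a placeholder that does nothing.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RecoveryGroup:
    """One ordered step of a hypothetical recovery plan."""
    name: str
    machines: List[str] = field(default_factory=list)


def run_plan(groups: List[RecoveryGroup],
             fail_over: Callable[[str], None]) -> List[str]:
    """Fail over each group in order; a group starts only after the
    previous group has finished. Returns the order machines were handled."""
    order: List[str] = []
    for group in groups:
        for machine in group.machines:
            fail_over(machine)  # placeholder for the real failover action
            order.append(machine)
    return order


# Hypothetical two-tier app: database tier first, then web tier.
plan = [
    RecoveryGroup("database", ["sql-01", "sql-02"]),
    RecoveryGroup("web", ["web-01", "web-02"]),
]
failover_order = run_plan(plan, fail_over=lambda vm: None)
print(failover_order)  # ['sql-01', 'sql-02', 'web-01', 'web-02']
```

Ordering the groups this way means the web tier never comes up before the database it depends on, which is the point of sequencing a multi-tier failover.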
Here are some of the common ASR components and terminologies:
ASR Service: Azure’s managed service, which is responsible for management and orchestration of whole processes
Config Server: The centralized on-premises appliance coordinating VMware and physical server replication
Process Server(s): Caching, compression, and encryption for VMware and physical server bi-directional replication
Mobility Service: Captures block level changes in memory on each protected VMware or physical machine. Supports filesystem (Linux) and application (Windows) level consistency across multiple servers in a consistency group.
ASR Provider and the ASR Agent: Used for replicating and controlling replication of Hyper-V VMs
HRL Files: Files that are used to track the delta replication changes that occur after the initial replication
Azure Site Recovery is billed based on number of instances protected. Every instance protected with Azure Site Recovery is free for the first 31 days and then billed at $16 per month for recovery to customer owned sites or $54 per month when protected to Azure. You'll also pay for the storage used at rates from a low of $0.024 per GB for locally redundant storage (LRS) up to $0.061 per GB for Read-Access Geo Redundant (RA-GRS) storage. Prices go down as your total storage increases.
Bottom line: you will pay roughly $80 per month to protect a single VM with up to 1TB of disk using locally redundant storage at one Azure data center. That's a far cry from competitors like Quorum onQ Hybrid Cloud Solution that are not only more expensive but also require dedicated on-site hardware.
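The "roughly $80" figure can be reproduced with a few lines of arithmetic. The sketch below uses only the prices quoted above; it assumes 1 TB is counted as 1,024 GB, ignores volume discounts, storage transactions, and data transfer, and the function name `monthly_cost` is made up for illustration. Current Azure prices may differ.

```python
# Figures quoted in the text above; actual Azure prices change over time.
ASR_FEE_TO_AZURE = 54.00   # USD per protected instance per month
LRS_PER_GB = 0.024         # USD per GB per month (lowest LRS rate quoted)


def monthly_cost(instances: int, storage_gb: float,
                 fee_per_instance: float = ASR_FEE_TO_AZURE,
                 storage_rate: float = LRS_PER_GB) -> float:
    """Estimate the steady-state monthly cost after the free first 31 days."""
    return round(instances * fee_per_instance + storage_gb * storage_rate, 2)


# One VM with 1 TB of disk (assumed to be 1,024 GB) on LRS storage.
one_vm_1tb = monthly_cost(instances=1, storage_gb=1024)
print(one_vm_1tb)  # 78.58
```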
Azure Site Recovery is not a strong choice for instantaneous failover. In an unplanned failover, it can start an instance from the most recently synced data in a fairly short time, depending on the complexity of the workload, but that time will still be measured in minutes or potentially longer. Then again, for typical DRaaS use cases, instantaneous failover isn't much of an issue when you're busy fleeing the office. Replication is set to 15 minutes by default. For small workloads involving only a few VMs, the time to spin up in the Azure environment is minimal; in our case it was just over five minutes.
Planned failovers perform a synchronization prior to transferring execution from the local instance to the one running in Azure. I ran a planned failover on a single-VM instance which had been previously synced to Azure, and it required a total of one hour and 24 minutes. This included 39 minutes for data synchronization and 37 minutes to execute the failover.
You can tweak these numbers by managing your configuration and network settings matched against ongoing ASR reporting. ASR has some reporting available through its native management interface, but for best results, customers should use reports obtained through Windows Server management or System Center. Azure Site Recovery provides a cost-effective solution for Microsoft-centric installations, integrating with Microsoft System Center and offering a wide range of automation options for specific applications. For its overall depth of backup reporting features and workload options combined with the lowest price of the bunch, Azure Site Recovery was the clear pick for Editors' Choice.
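As a quick check on the planned-failover figures, the two itemized phases add up to 76 minutes, so about 8 minutes of the reported total are not itemized; the text does not say what they covered. The variable names below are just labels for the reported numbers.

```python
total_minutes = 1 * 60 + 24   # reported total: 1 hour 24 minutes
sync_minutes = 39             # reported data synchronization time
failover_minutes = 37         # reported failover execution time

accounted = sync_minutes + failover_minutes
unaccounted = total_minutes - accounted
print(accounted, unaccounted)  # 76 8
```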
This section will provide a walkthrough for how to replicate data to the cloud using ASR. As with every DRaaS and migration project, your company will first need an agile plan to ensure a successful DRaaS strategy.
There are several factors that govern a DRaaS strategy: RTO and RPO goals, storage (IOPS and storage account), capacity planning, network bandwidth, network reconfiguration, and daily change rate.
Microsoft-provided tools Azure Site Recovery Capacity Planner and Azure Site Recovery Deployment Planner can help you analyze your source environment and compute requirements for the target environment.
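Before running those tools, a back-of-the-envelope estimate can show how daily change rate and network bandwidth relate. The sketch below is not what the Microsoft planners compute; it is simple arithmetic under stated assumptions (decimal gigabytes, a steady link, no protocol overhead), and the function names and sample figures are hypothetical.

```python
def required_mbps(daily_change_gb: float, window_hours: float = 24.0,
                  compression_ratio: float = 1.0) -> float:
    """Average bandwidth (megabits/s) needed to ship one day's changed data
    within the given window. compression_ratio > 1 means the data shrinks."""
    bits = daily_change_gb * 8 * 1000**3 / compression_ratio
    return bits / (window_hours * 3600) / 1e6


def initial_replication_hours(total_gb: float, link_mbps: float) -> float:
    """Time to copy the full data set once over a link of the given speed."""
    bits = total_gb * 8 * 1000**3
    return bits / (link_mbps * 1e6) / 3600


# Hypothetical environment: 100 GB of daily churn, 2 TB to seed, 100 Mbps link.
print(round(required_mbps(100), 2))                   # 9.26
print(round(initial_replication_hours(2000, 100), 2))  # 44.44
```

If the required bandwidth is close to the link's capacity, replication will struggle to keep up during bursts of change, and the RPO goal should be revisited.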
One aspect of ASR to keep in mind at this point is network planning. You have to determine if you want to use a stretched subnet across both sites or if you will use a subnet failover. You will also need to determine the failover IP ranges.
Make sure to review the support matrix to understand the prerequisites for replicating using ASR. It is also prudent to verify the kinds of workloads that can be migrated using ASR.
You can find the full list here.
Pro tip: Look out for limitations like a 4TB limit for individual disks on each protected VM. If workloads are being migrated from AWS, be aware that it is a one-way migration to Azure and the replication cannot be enabled back to AWS.
Also, look out for additional charges for storage account usage, storage transactions, and outbound data transfers when configuring ASR.
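A simple pre-check against the per-disk limit can catch problem machines before replication is configured. This is a minimal sketch: the inventory format and the `oversized_disks` function are hypothetical, and the 4 TB value is the one quoted above, so it should be verified against the current support matrix.

```python
from typing import Dict, List

MAX_DISK_TB = 4.0  # per-disk limit quoted in the text; verify before relying on it


def oversized_disks(inventory: Dict[str, List[float]],
                    limit_tb: float = MAX_DISK_TB) -> Dict[str, List[float]]:
    """Return, per VM, the disks (sizes in TB) that exceed the limit."""
    flagged: Dict[str, List[float]] = {}
    for vm, disks in inventory.items():
        too_big = [size for size in disks if size > limit_tb]
        if too_big:
            flagged[vm] = too_big
    return flagged


# Hypothetical inventory: VM name -> list of disk sizes in TB.
inventory = {"app-01": [0.5, 1.0], "db-01": [2.0, 6.0]}
flagged = oversized_disks(inventory)
print(flagged)  # {'db-01': [6.0]}
```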
Now that we have a solid plan based on source environment analysis and capacity planning, we can start preparing our environments for replication. The first step is to prepare the source.
ASR supports several source environments like VMware (with or without vCenter), Hyper-V VMs (with or without SCVMM), AWS workloads, physical servers, and Azure VMs. It is important to note that there are different requirements based on the source environment.
For example, VMware VMs would require additional resources such as a configuration server, process server, and mobility services to help manage, coordinate, and send the encrypted and compressed data chunks to the Recovery Services destination.
The next step is to prepare the destination environment. The destination or target for ASR replications can be a Hyper-V host, VMware Site, or Azure. No matter which one you choose, the very first thing to do would be to create a Recovery Services Vault in Azure (either through Resource Manager or Classic portal).
The Recovery Services Vault will house the replication settings and manage the replication.
If your target is Azure, you need to create the storage accounts and networks which will house the replicated on-premises machines (note: for the storage accounts you’ll have to decide between standard and premium account types, and set the LRS and GRS replication options based on your RPO).
If you are replicating to a secondary site, you will need to prepare the hosts on the secondary site by installing the configuration components: Azure Site Recovery Provider for all SCVMM servers (in case of Hyper-V hosts) and InMage Scout components for VMware machines or physical hosts.
Lastly, it is time to configure and enable replication. After the source and target have been prepped, you need to create a replication policy that aligns with your RTO and RPO objectives. Then select the virtual machines to be replicated and apply the replication policy that you defined.
Finally, enable the initial replica (note: this process can take quite some time). After the initial replication is complete, ASR replicates data in incremental chunks (changed data) at an interval defined by your replication policy.
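The difference between the initial replica and the later incremental chunks can be shown with a toy model. This is a conceptual illustration of change tracking in general, not ASR's actual mechanism or file format; the `ReplicatedDisk` class and its methods are invented for the example.

```python
from typing import Dict, Set


class ReplicatedDisk:
    """Toy model: a source disk as numbered blocks plus a replica copy."""

    def __init__(self, blocks: Dict[int, bytes]) -> None:
        self.source = dict(blocks)
        self.replica: Dict[int, bytes] = {}
        self.dirty: Set[int] = set()   # blocks changed since the last sync

    def initial_replication(self) -> int:
        """Copy every block once; returns the number of blocks sent."""
        self.replica = dict(self.source)
        self.dirty.clear()
        return len(self.source)

    def write(self, block: int, data: bytes) -> None:
        """Write to the source and remember which block changed."""
        self.source[block] = data
        self.dirty.add(block)

    def sync(self) -> int:
        """Send only the changed blocks; returns how many were sent."""
        sent = len(self.dirty)
        for block in self.dirty:
            self.replica[block] = self.source[block]
        self.dirty.clear()
        return sent


disk = ReplicatedDisk({i: b"\x00" for i in range(1000)})
full = disk.initial_replication()
disk.write(7, b"a")
disk.write(7, b"b")    # same block written twice still counts once
disk.write(42, b"c")
delta = disk.sync()
print(full, delta)  # 1000 2
```

The initial pass moves every block, which is why it can take a long time, while each later sync moves only what changed during the interval.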
Now that you have performed the replication, it is time to validate the setup and determine if and what changes you need to make if you have to execute a failover. There are two ways to try this: a test failover or an actual planned or unplanned failover.
A test failover has no impact on production, but a planned or unplanned failover involves shifting the production site to the replication site such as Azure or another host.
A test failover can be done either through a recovery plan (to orchestrate failover of multiple machines) or manually for each VM through the Azure console.
If you executed a planned failover, don’t forget to reprotect the machines after they have failed over. Once your source site is up, you can fail back the VMs using the process server, master target server, and a failback policy.
Note: In Windows VMs, don’t forget to set the SAN policy to Online All if you want to retain the drive letters after failover.
It is advisable to keep monitoring your replication settings to ensure that your RPO objectives stay aligned. You can tweak replication settings or add scaled out process servers to meet these objectives.
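One way to think about this monitoring is as a periodic comparison between each machine's latest recovery point and the RPO target. The sketch below assumes the timestamps have already been collected from whatever reporting you use; the `rpo_violations` function, the machine names, and the 15-minute target are illustrative assumptions.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def rpo_violations(last_recovery_point: Dict[str, datetime],
                   rpo_target: timedelta,
                   now: datetime) -> List[str]:
    """Return the VMs whose latest recovery point is older than the target."""
    return sorted(vm for vm, ts in last_recovery_point.items()
                  if now - ts > rpo_target)


# Hypothetical snapshot of replication health at a fixed point in time.
now = datetime(2024, 1, 1, 12, 0)
points = {
    "web-01": now - timedelta(minutes=3),
    "sql-01": now - timedelta(minutes=40),
}
late = rpo_violations(points, rpo_target=timedelta(minutes=15), now=now)
print(late)  # ['sql-01']
```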
Apart from providing job alerts on the Azure console, ASR also has its own Event Log Source that can be useful for troubleshooting replication failures. Here is a guide on what event sources and ports need to be looked at while troubleshooting these failures.
ASR is an excellent BCDR tool, and its migration capabilities also deserve a special mention. Not only can you migrate on-premises workloads with ASR, you can also migrate cloud workloads such as AWS VMs and Azure VMs from other regions.
The initial setup for performing such migrations is very similar to replicating physical machines to Azure.
In ASR migration, instead of executing a failover you would migrate by right-clicking the VM on the Azure portal and executing “Complete Migration.” This will completely migrate the workload, stop replication, and stop ASR billing for the machine.
You can find more details on that here.
Azure Site Recovery guarantees application uptime by replicating workloads to a secondary location. In the event of an outage at the primary location, traffic is automatically redirected to the secondary location to make sure the applications remain accessible. It’s important to note that ASR can replicate workloads from on-premises servers in addition to Azure VMs. This means applications that are not hosted in the cloud can still take advantage of one of the major benefits of cloud computing: the ability to quickly provision and scale resources when you need them most.
Another important feature of ASR is the ability to have very low recovery time objectives (RTO) and recovery point objectives (RPO) at a reasonable cost. RTO is how quickly the applications will be online in the secondary site, and RPO is how much data loss would occur in the event of a failover. It’s ideal to have low values for both of these objectives, but that traditionally means higher costs for your disaster recovery plan. Azure Site Recovery can provide RTO and RPO within minutes out of the box, and at a comparatively low cost.
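The two definitions can be made concrete by measuring them on an incident or drill timeline. This is a small illustrative sketch; the function name and the timestamps are made up, and RPO is expressed here as the time window of changes that would be lost.

```python
from datetime import datetime, timedelta
from typing import Tuple


def achieved_rpo_rto(last_recovery_point: datetime,
                     outage_start: datetime,
                     service_restored: datetime) -> Tuple[timedelta, timedelta]:
    """RPO: changes made between the last recovery point and the outage are lost.
    RTO: time from the outage until the application is back online."""
    rpo = outage_start - last_recovery_point
    rto = service_restored - outage_start
    return rpo, rto


# Hypothetical timeline for a single application.
rpo, rto = achieved_rpo_rto(
    last_recovery_point=datetime(2024, 1, 1, 9, 58),
    outage_start=datetime(2024, 1, 1, 10, 0),
    service_restored=datetime(2024, 1, 1, 10, 12),
)
print(rpo, rto)  # 0:02:00 0:12:00
```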
The last feature to highlight here is testing. Having a plan in place is great, but it’s essential to be able to verify that it works as needed. Testing other disaster recovery solutions can be time-consuming. Azure Site Recovery makes it possible to execute and verify a test failover scenario with only a few clicks of the mouse.
More information about additional features that Azure Site Recovery provides can be found here. Microsoft also has an extremely helpful quick start guide that walks through how to get started with Azure Site Recovery. If you have questions about how to implement ASR or your application requires advanced configuration, contact DragonSpears for more info!
Owing to its cost-effectiveness, ease of use, and support for an extensive list of workloads, ASR has established itself as a world leader in BCDR solutions.
However, proper capacity and deployment planning are important steps before starting an ASR migration. It is important to educate yourself about ASR technology through blogs, forums, and real-world examples such as Paul Smith, Dartmouth-Hitchcock and Duro Dakovich.
If you are a current ASR user or planning to become one, and you are wondering how to get your data to Azure, NetApp’s Cloud Volumes ONTAP (formerly ONTAP Cloud) can be a great solution.