VMware Site Recovery Manager 5.8 released

VMware released Site Recovery Manager 5.8 on September 9.

Below is an overview of some of the new features and enhancements; there are quite a few more.

  • Integrated into the vSphere Web Client
  • Can use the built-in vPostgres database
  • Enhanced IP customization
  • Enhanced reporting – This version of vSphere Replication introduces a new reporting dashboard that provides summary reports of vSphere Replication usage. Administrators now have better visibility into their replication environment, simplifying management and reducing downtime.
  • Interoperability with VCHS-DR service and SRM – The prior release, version 5.6, added functionality to support replicating VMs to the vCloud Hybrid Service (VCHS) but was not supported with SRM. This release allows customers to replicate workloads to VCHS while also supporting orchestration of other workloads via SRM. Note that SRM cannot be used with vSphere Replication for orchestrated disaster recovery to VCHS.
  • 5x the scale of protection – IT organizations can set up recovery plans scalable up to 5,000 virtual machines per vCenter Server using array-based replication to enable enterprise-level protection–five times larger than with previous limits.
  • Enhanced self-service – New integrations will offer customers self-service access to provision predefined disaster recovery protection tiers to new VMs via blueprints in vCloud Automation Center when using array-based replication.

Derek Seaman has some more new features listed in his blog.

The Site Recovery Manager 5.8 release notes are here 

Download Site Recovery Manager 5.8 here.
Datasheet here

Combine the best of two worlds: vSphere stretched cluster and Site Recovery Manager

For Business Continuity and Disaster Recovery in a vSphere infrastructure, most customers choose between two options: either use VMware Site Recovery Manager (SRM) or build a vSphere Metro Stretched Cluster.

While both are great options for BC/DR, both also have some disadvantages.

VMware announced at VMworld 2014 they are working on integration of SRM with a vSphere Metro Stretched Cluster.

I am sure these new features are not in the SRM 5.8 release announced during VMworld. Jason Boche made three videos demoing the new release. In the demo of a tech preview some errors were encountered.

In May 2012 I predicted a couple of the enhancements now announced. Good to see they are now becoming a reality.

This is a summary of breakout session BCO1916. You can watch this interesting session yourself here.

So the two options for BC/DR in a vSphere datacenter are:

-option 1: two datacenters, both running production in an active-active configuration with stretched storage and networking. We call this a vSphere Metro Stretched Cluster.

-option 2: two datacenters in active/passive. One runs production, the other test/dev. If the production site fails, VMware Site Recovery Manager is used to perform an orchestrated recovery of the virtual machines in the recovery site. Alternative tools are vSphere Replication or Zerto Virtual Replication.

Option 1 is great for disaster avoidance, balancing of resources and planned maintenance. When IT knows in advance that one of the datacenters might become unavailable, for example because of a hurricane, a power outage or SAN maintenance, virtual machines can be vMotion-ed to the alternate datacenter.

In case of an unplanned event like a fire or earthquake, VMware HA will take care of restarting the virtual machines. The advantage is that up-to-date virtual machine disk files are available in the recovery site, so RPO as well as RTO are low.

However, VMware HA is not designed for large-scale recovery of a complete site. VMware HA does not offer runbooks for an automated recovery. It is not aware of application dependencies, nor is it site-aware. HA does not offer granular control over VM start priority. Also, a failover cannot be tested: we cannot shut down and reboot a VM without taking a production VM down.

srm-stretched-clusters-issues

Another restriction is that, because of the synchronous replication of the storage layer, the distance between the two datacenters is limited to about 100 km. A vSphere Metro Stretched Cluster is typically deployed in a metro area. A hurricane or earthquake is likely to hit a larger area, so both sites might be hurt.

Option 2 does offer orchestration using runbooks aka recovery plans. IT can test a recovery without disturbing the production environment.

Currently, combining a vSphere Metro Stretched Cluster with VMware Site Recovery Manager is not possible.

Quite a few blogposts, VMworld sessions and whitepapers have been written about the advantages and disadvantages of both scenarios. Duncan Epping wrote a blogpost titled SRM vs Stretched Cluster solution about this in 2013. This is a great whitepaper about this topic published by VMware.

 

As said, VMware is working on a tech preview of SRM which enables using SRM in a vSphere Metro Stretched Cluster.

Some of the requirements for such a setup have been announced at VMworld:

  • vSphere 6.0 will enhance long-distance vMotion by supporting a round-trip time of 100 ms. This enables a much bigger distance between two datacenters.
  • vMotion will be possible between two different vCenter Servers. Two vCenter Servers are a requirement for SRM.

In the future SRM will allow organizations using a vSphere Metro Stretched Cluster to orchestrate planned failovers. SRM will use a recovery plan to initiate vMotion of virtual machines, monitor the vMotion progress and report success or failure. A vMotion will not always succeed, for example because of latency issues. In that case SRM allows a rerun of the planned failover runbook. If a vMotion fails, SRM will shut down the VM on the production site and restart it on the recovery/secondary site.
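The planned failover flow described in the session can be sketched roughly as follows. This is only an illustration of the decision logic, not SRM code: the `VM` class, the `planned_failover` function and the `try_vmotion` callback are hypothetical names, and the real product may order the rerun and the shutdown/restart fallback differently.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class VM:
    name: str
    site: str = "protected"
    powered_on: bool = True


def planned_failover(vms: List[VM], try_vmotion: Callable[[VM], bool]) -> Dict[str, str]:
    """Move every VM to the recovery site and report how each one got there."""
    report = {}
    for vm in vms:
        if vm.site == "recovery":
            # Already moved by an earlier run, so a rerun of the plan skips it.
            report[vm.name] = "already at recovery site"
            continue
        if try_vmotion(vm):
            # Live migration succeeded: the VM was never powered off.
            vm.site = "recovery"
            report[vm.name] = "vMotion succeeded"
        else:
            # Fallback: shut down at the production site, restart at the recovery site.
            vm.powered_on = False
            vm.site = "recovery"
            vm.powered_on = True
            report[vm.name] = "vMotion failed, restarted at recovery site"
    return report


# Hypothetical example: the vMotion of "db01" fails, for instance because of latency.
vms = [VM("web01"), VM("db01")]
result = planned_failover(vms, try_vmotion=lambda vm: vm.name != "db01")
print(result)
```

Running the plan a second time against the same list reports every VM as already at the recovery site, which is the property that makes a rerun safe in this sketch.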

For SRM to understand stretched storage, vendors will need to develop new Storage Replication Adapters (SRA).

Storage Profile Protection Groups (SPPG) will be a new component of SRM. The idea is that once a protection group (PG) is created, storage profiles are added to the PG. Any virtual machine or datastore that is part of the storage profile will automatically be protected.
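To make the idea concrete, here is a small sketch of membership by storage profile. The `ProtectionGroup` class, the profile names and the inventory are made up for illustration; the point is only that protection follows the profile, so nobody has to edit the group when a VM is added.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ProtectionGroup:
    name: str
    storage_profiles: Set[str] = field(default_factory=set)

    def protected_vms(self, vm_profiles: Dict[str, str]) -> List[str]:
        """Return the VMs whose storage profile belongs to this group."""
        return sorted(vm for vm, profile in vm_profiles.items()
                      if profile in self.storage_profiles)


# Hypothetical inventory: VM name -> storage profile name.
inventory = {"web01": "gold-stretched", "db01": "gold-stretched", "test01": "bronze-local"}

pg = ProtectionGroup("tier1-apps", {"gold-stretched"})
before = pg.protected_vms(inventory)

# A new VM placed on the profile is protected without editing the group.
inventory["app02"] = "gold-stretched"
after = pg.protected_vms(inventory)
print(before, after)
```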

srm-protectiongroups

In case of an unplanned failover SRM will obviously not use vMotion as the production site is down. SRM will take care of restarting VMs in the recovery site according to the recovery plan.

srm-unplannedfailover

 

One of the new features of combining SRM with a stretched cluster is the ability to perform test failovers. This will not actually perform a vMotion. It will power on the virtual machines in the recovery site using an isolated network.

 

srm-testfailover

 

It is also possible to reprotect. Reprotect makes the recovery site the primary site; the site which was originally the protected site becomes the recovery site. Failback is supported as well.

srm-reprotect

 

VMware did not reveal the release number of the SRM version which will support stretched clusters, nor did they reveal a release date. My guess is that this feature will be in SRM 6.0, to be released near the GA of vSphere 6.0.

 

Microsoft InMage Scout now available for download as part of Azure Site Recovery subscription

Customers of the Azure Site Recovery service are now able to try InMage Scout for free. Last week InMage was acquired by Microsoft.

InMage enables protection of both virtual and physical servers. It does so by installing an agent which intercepts storage traffic and redirects that traffic to an appliance, which then replicates the data to another location.

InMage can now be downloaded by customers wanting to do recovery between two on-premises VMware sites.

As InMage Scout is able to perform V2V conversions, it is likely that in the future VMware customers can perform a recovery on Microsoft Azure.

Between July 1, 2014 and August 1, 2014 InMage Scout software can be used on a trial basis.

InMage Scout can be downloaded by creating an Azure Site Recovery vault. At Setup Recovery, select ‘Between two on-premises VMware sites’.

Customers pay for the instances they are protecting with InMage Scout through the Azure Site Recovery (ASR) SKU available in the Microsoft Enterprise Agreement beginning on August 1, 2014. InMage Scout is a deployment option available to you as part of the Azure Site Recovery service. Scout is not available for purchase via Azure Direct or Azure Enterprise Agreement monetary commitment. Customers should use the true-up process in the Enterprise Agreement to account for additional instances protected with InMage Scout.

The download is 770 MB in size.

Read the Microsoft blog about this news here.

 

Microsoft acquires InMage. Enhanced disaster recovery services for Microsoft Azure

Today Microsoft announced it has acquired InMage. InMage is a US company while software development is done in India. InMage offers software to enable disaster recovery (DR) for mid-market and enterprises.

There are many solutions on the market offering DR. However, InMage is the only one supporting all assets in a datacenter: both physical and virtual servers (VMware vSphere, Microsoft Hyper-V and XenServer). It supports Windows Server, Linux, IBM AIX and Solaris. It supports major enterprise applications like Exchange, SQL, Oracle, SAP and SharePoint.

One of the software solutions of InMage is Scout. Scout is storage agnostic and replicates virtual machines as well as physical servers to a target location. This can be a secondary datacenter, a cloud provider like Azure or a Managed Service Provider datacenter. InMage has many Service Provider customers in the US, for example SunGard. Cisco uses InMage Scout in its blueprints which can be used by partners building DRaaS solutions. InMage partners with HP, Hitachi and Fujitsu, which provide DR services.

The current version of Scout is 7.1.

The solutions are offered in three form factors: software, a hardware appliance and Software as a Service.

Scout will be integrated in the current Microsoft Azure service called ‘Azure Site Recovery’ which is in Preview at the moment.

Besides the support for all major hypervisors, a very interesting feature of InMage Scout is the ability to convert hypervisor virtual machine disk formats. So a VMware customer can protect virtual machines running on vSphere (which uses the VMDK format) to Microsoft Azure, which uses Hyper-V VHD virtual disks.

Also for example an Amazon customer can easily migrate virtual machines to Azure using InMage Scout.

In this blogpost, Takeshi Numoto, Corporate Vice President, Cloud and Enterprise Marketing, states:

This acquisition will accelerate our strategy to provide hybrid cloud business continuity solutions for any customer IT environment, be it Windows or Linux, physical or virtualized on Hyper-V, VMware or others. This will make Azure the ideal destination for disaster recovery for virtually every enterprise server in the world. As VMware customers explore their options to permanently migrate their applications to the cloud, this will also provide a great onramp.

Microsoft has two main goals with the acquisition of InMage:

  1. attract Microsoft customers to Microsoft Azure
  2. attract VMware and other non Hyper-V customers to Microsoft Azure. VMware has a large installed base but not every VMware customer can afford a secondary datacenter. Especially in Europe there are not many Service Providers offering a mature Disaster Recovery as a Service offering. VMware itself only recently introduced its vCHS-DR service.

It will be interesting to see how ‘Azure Site Recovery’ (ASR), currently in Preview, will mature now that InMage has been acquired. ASR support is limited to Hyper-V virtual machines running on-premises. It provides some orchestration features but is limited in out-of-the-box post-processing of a failover of virtual machines. For example, changing IP addresses needs to be scripted. It is not unlikely that development of ASR will change course.

Technology
InMage Scout uses agents which are installed in a source server (physical or virtual). The agent copies every write to disk and sends it to a software appliance called the InMage Scout Server. I understand this can be either a virtual machine (called the Process server) or a hardware appliance.

This appliance has two functions:

  • a backup function. It stores backup data on disk.
  • a disaster recovery function. It replicates data to a secondary site or to the cloud. It does compression and encryption as well.

In the secondary location there is a virtual appliance as well, which is used to process the replicated data. It stores the replicated virtual disks on storage. Replicas of virtual machines do not have to be powered on during the replication. This is very useful as it does not consume compute and memory resources, thus lowering costs.
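The data path described above (agent copies each write, appliance keeps a backup copy and ships compressed data, target rebuilds the disks) can be illustrated with a toy model. Everything here is an assumption for illustration: the class and function names are invented, `zlib` merely stands in for whatever compression Scout uses, and encryption is left out entirely.

```python
import zlib
from typing import Dict, List, Tuple


class ScoutServerSketch:
    """Stand-in for the appliance: keeps a local journal and ships compressed writes."""

    def __init__(self) -> None:
        self.journal: List[Tuple[int, bytes]] = []  # backup function: local copy of writes
        self.sent: List[Tuple[int, bytes]] = []     # DR function: compressed payloads sent off-site

    def receive(self, offset: int, data: bytes) -> None:
        self.journal.append((offset, data))
        self.sent.append((offset, zlib.compress(data)))


class AgentSketch:
    """Stand-in for the in-guest agent: every write hits the disk and is copied out."""

    def __init__(self, server: ScoutServerSketch) -> None:
        self.disk: Dict[int, bytes] = {}
        self.server = server

    def write(self, offset: int, data: bytes) -> None:
        self.disk[offset] = data           # the normal write to the source disk
        self.server.receive(offset, data)  # the copy forwarded to the appliance


def apply_at_target(payloads: List[Tuple[int, bytes]]) -> Dict[int, bytes]:
    """Rebuild the replica disk at the secondary site from the shipped writes."""
    replica: Dict[int, bytes] = {}
    for offset, blob in payloads:
        replica[offset] = zlib.decompress(blob)
    return replica


server = ScoutServerSketch()
agent = AgentSketch(server)
agent.write(0, b"A" * 4096)
agent.write(4096, b"B" * 4096)
agent.write(0, b"C" * 4096)  # overwrite of the first block

replica = apply_at_target(server.sent)
print(replica == agent.disk)
```

Note that the replica is plain data: nothing in this model needs a running virtual machine at the target, which matches the point that replicas stay powered off until a failover.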

At failover or failover testing virtual machines are created and started.

Conclusion

The acquisition of InMage is a very interesting one. Many see Disaster Recovery as a Service as a first step for organizations to embrace cloud computing. Now DRaaS is open to any enterprise, including non-Hyper-V customers. The barrier for using DRaaS has been lowered.

 

Microsoft introduces Microsoft Azure StorSimple, a new virtual appliance and 2 new arrays

Today Microsoft announced some interesting news on StorSimple:

  • introduction of two new StorSimple arrays
  • a new StorSimple Virtual Appliance
  • a new Azure service called ‘Microsoft Azure StorSimple’.

For those unaware of Microsoft StorSimple: it is a hardware storage appliance, available in 4 models, which is placed in an on-premises location. It has a couple of SSD and SAS drives for local storage. Volumes are served to hosts using iSCSI. SMB shares are not supported yet. StorSimple complements Tier 1 storage systems by automatically moving infrequently accessed data to Microsoft Azure storage. Data stored in Azure remains available for access by users and applications. Users will only notice a small delay when accessing data stored in Azure. Think of a single stretched volume which has data located on the StorSimple local storage as well as in Azure.

The main driver to use StorSimple devices is saving on storage costs while its unique disaster recovery features are a nice bonus as well.

StorSimple is limited to serving unstructured data like Office files.

The interesting news of today is that StorSimple will be available as a virtual appliance as well. An Azure virtual machine can run the StorSimple software and perform the same features as the hardware appliance placed on-premises.

This will be available from August 1.

The news was announced in this blog titled Introducing Microsoft Azure StorSimple

The two new arrays are the 8100 and the 8600. The 8100 has a raw local storage capacity of 15 TB while the 8600 has 40 TB. By using compression and deduplication the effective capacity is 75 TB for the 8100 and 200 TB for the 8600.
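As a quick check of these figures, reading them as 75 TB effective for the 8100 and 200 TB effective for the 8600, the data reduction ratio they imply can be worked out as below. The ratio is derived purely from the numbers quoted here; it is not a vendor-stated guarantee.

```python
# Figures quoted in the text, in TB.
models = {
    "8100": {"raw": 15, "effective": 75},
    "8600": {"raw": 40, "effective": 200},
}

# Data reduction ratio implied by each pair of figures.
ratios = {name: spec["effective"] / spec["raw"] for name, spec in models.items()}
for name, ratio in ratios.items():
    print(f"StorSimple {name}: {ratio:.1f}x")
```

Both models come out at the same ratio of 5, so the effective capacities appear to assume one fixed compression and deduplication factor rather than a per-model measurement.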

8100-8600

The new StorSimple virtual appliance enables, for example, a scenario where customers replicate on-premises StorSimple snapshots to Azure-based virtual appliances. Applications running in Azure can then analyze data without disrupting production workloads running on-premises.

The StorSimple virtual appliance is only supported when the StorSimple 8100 and StorSimple 8600 are used. Current models are not supported! Which is a big pity if you ask me.

Another use case is disaster recovery. A StorSimple hardware appliance is a single point of failure. If it breaks down or is destroyed because of fire/flooding/collapse of the datacenter, customers will require a spare StorSimple appliance to be able to recover data. Now recovery can be performed using StorSimple virtual appliances running in Azure.

Recovery using StorSimple does not require waiting until a restore has completed. StorSimple instant recovery works as follows: a cloud-based snapshot is mounted to a StorSimple array or virtual appliance. Then the data is made available to users instantly. A file is only moved from Azure to the appliance when it is accessed. So instead of recovering all files, only files which are accessed are restored, while all files are shown.
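The restore-on-access behaviour can be pictured with a small sketch. The `InstantRecoveryVolume` class and the file names are hypothetical, and a dictionary stands in for the snapshot held in Azure; the real appliance works at volume level rather than on a Python dict.

```python
from typing import Dict, List


class InstantRecoveryVolume:
    """Mounted cloud snapshot: the full listing is visible, data is fetched lazily."""

    def __init__(self, cloud_snapshot: Dict[str, bytes]) -> None:
        self._cloud = cloud_snapshot        # stands in for the snapshot held in Azure
        self._local: Dict[str, bytes] = {}  # data already pulled down to the appliance
        self.downloads = 0

    def list_files(self) -> List[str]:
        # Only metadata is needed, so every file shows up immediately.
        return sorted(self._cloud)

    def read(self, name: str) -> bytes:
        if name not in self._local:
            # First access: copy this one file from the cloud to local storage.
            self._local[name] = self._cloud[name]
            self.downloads += 1
        return self._local[name]


snapshot = {"report.docx": b"q3 numbers", "budget.xlsx": b"2015 plan", "old.pptx": b"archive"}
volume = InstantRecoveryVolume(snapshot)

print(volume.list_files())
volume.read("budget.xlsx")
volume.read("budget.xlsx")  # second read is served locally
print(volume.downloads)
```

Users see all three files right after the mount, yet only the one file that was actually opened has been transferred.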

The Virtual Appliance connects to Azure VMs using a virtual iSCSI Ethernet network and the same platform volume and storage management tools (such as Windows Disk Management) and iSCSI initiators that are used on-premises. That means many of the same system management skills used on-premises are used in Azure to do the same things there. 

This Microsoft blog has some details!

Management of the 8000 series arrays and the Azure StorSimple virtual appliance running in Azure is done using a new Azure service called ‘Azure StorSimple Manager’.
StorSimple Manager provides a central console for monitoring multiple StorSimple devices which are located in, for example, branch offices. It shows whether a device is online or offline and shows the ratio of the provisioned capacity to the maximum capacity of the device. It can also be used to restore a cloud-based backup from, for example, an on-premises StorSimple device to a StorSimple virtual device.

 

Two new hardware appliances will be available: the model 8100 and the 8600. The 8100 is a 2U model while the 8600 is 4U. The specifications are shown below. For a full overview see this page.

8100-vs-8600

The StorSimple appliances are not cheap. The 8600 has a list price of $170,000.

pricing

The pricing overview showing all currently available StorSimple devices is here.