Is Windows Azure Virtual Machines (WAVM) a true IaaS platform? Microsoft drops SLA on single role instances.
November 11, 2012
Microsoft introduced a new service on Windows Azure named Virtual Machines. With this service, which is also advertised as a feature, Windows Azure customers manage and are responsible for the operating system. Virtual Machines allows deploying a virtual machine from a catalog or from an uploaded, self-made VHD file.
This enables developers to run applications on their platform of choice. The PaaS platform that Microsoft offers on Azure gives no choice of the operating system running underneath the development tools.
Windows Azure Virtual Machines is currently available as a Preview version, which can be compared to beta status. At the announcement of the Virtual Machines feature back in June 2012, Microsoft presented two availability SLAs for virtual machines: 99.9 % for single role instances and 99.95 % for multiple role instances. A single role instance is a *single* VM presenting an application. If the VM becomes unavailable (crash of the guest, crash of the host etc.), the application becomes unavailable as well. A multiple role instance has at least two VMs offering the same application. A load balancer distributes application requests over the available VMs.
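To make the two percentages more tangible, here is a minimal sketch that converts an availability figure into the downtime it permits. It assumes a 30-day month as the measurement period, which is my assumption and not something stated in the Microsoft presentations; the function name is made up for this illustration.

```python
# Convert an availability percentage into the downtime it permits.
# Assumption: a 30-day month as measurement period (not taken from Microsoft's SLA text).

MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes in a 30-day month


def allowed_downtime_minutes(availability_percent, period_minutes=MINUTES_PER_MONTH):
    """Return the minutes of downtime a given availability percentage allows."""
    return period_minutes * (1 - availability_percent / 100)


for sla in (99.9, 99.95):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla} % -> {minutes:.1f} minutes per 30-day month")
```

Under that assumption, 99.9 % leaves room for roughly 43 minutes of downtime per month and 99.95 % for roughly 22 minutes.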
The image below shows the two SLAs as presented at various TechEd events in North America, Europe and Australia. See for example Mark Russinovich's presentation on Azure Virtual Machines at TechEd 2012 USA.
Mark explained that both SLAs, for single and multiple role instances, would take effect when Windows Azure Virtual Machines reaches General Availability.
Massimo Re Ferre’ (a VMware employee working in a vCloud Architect role) has an interesting post on the same subject titled Azure Virtual Machines: what sort of cloud beast is it? He writes about the design-for-fail IaaS cloud and the enterprise IaaS cloud. A very good read.
Single instance role SLA dropped
However, in a presentation by Mark Russinovich at the Build conference in Seattle (end of October 2012), only one SLA is mentioned. That explicit SLA is 99.95 % availability for multiple role instances. Mark says there is an implicit SLA for a single role instance, which can be calculated from the 99.95 % for multiple role instances. The implicit SLA (not a documented SLA, but rather the availability customers can expect) is 99.76 %. However, Microsoft does not offer this as an SLA. If a customer wants an SLA, it is 99.95 % for multiple role instances.
So gone is the 99.9 % SLA.
The images below are taken from the Microsoft Build conference in Seattle. You can download the slides or watch the video here. At around 36 minutes into the presentation Mark explains the single SLA.
So this raises the question whether Windows Azure is a true IaaS platform.
Yes, it does let the cloud consumer manage the operating system level. But to get an SLA, the consumer needs at least two virtual machine instances serving the same application. Those two need to be members of the same availability set. The Azure fabric controller will then make sure both VMs run on different Azure hosts located in different racks.
What could be the reason for requiring at least two VMs in an availability set before Microsoft offers an SLA?
I can only guess, but I believe this is because the Azure architecture cannot move running VMs off a host when planned maintenance needs to be done. Azure hosts frequently need updates that bring new functionality or security fixes. Windows Azure hosts do not offer the feature that Hyper-V calls Live Migration. Azure runs a dedicated, Microsoft-developed hypervisor, not Hyper-V.
Azure has been designed for a PaaS role in which an application is served by multiple instances. When a single instance fails, there is no issue. So in the architecture of Azure, a Live Migration feature was not a requirement.
So when an Azure host needs a reboot, the VMs running on that host need to be shut down and will probably be restarted on another host. Hence the 99.9 % availability SLA that Microsoft advertised until recently and has now removed.
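The effect of host reboots without live migration can be illustrated with a small, hypothetical model. This is only a sketch of my guess above, not a description of how the Azure fabric controller actually works; the host names and the function are invented for the example. It assumes hosts are rebooted one at a time and that a VM is down while its host reboots.

```python
# Hypothetical model: hosts are rebooted one at a time during planned maintenance,
# and a VM is down while its host reboots (no live migration).

def available_during_maintenance(vm_hosts, all_hosts):
    """Return True if at least one VM stays up at every step of a rolling reboot.

    vm_hosts  -- hosts that run an instance of the application
    all_hosts -- every host, rebooted one after another
    """
    for rebooting in all_hosts:
        running = [h for h in vm_hosts if h != rebooting]
        if not running:
            return False  # every instance sat on the host being rebooted
    return True


hosts = ["host-a", "host-b", "host-c"]  # made-up host names

print(available_during_maintenance(["host-a"], hosts))            # single instance
print(available_during_maintenance(["host-a", "host-b"], hosts))  # two instances, different hosts
print(available_during_maintenance(["host-a", "host-a"], hosts))  # two instances, same host
```

In this model only the application with two instances on different hosts survives the whole maintenance round, which is why placing both VMs on separate hosts matters.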
Suppose a customer wants to run a standard back-office infrastructure on Azure: file server, print server, application servers etc. All applications will need to run on at least two servers before Microsoft gives any guarantee on availability. That will be an interesting challenge for some applications which cannot be made highly available.
I am not sure if Windows Azure Virtual Machines can be described as a true IaaS. Microsoft states in the presentations ‘If it requires a developer, it’s not IaaS’. So I guess if a cloud consumer needs to call the developer to enable SQL mirroring or otherwise make the application redundant, they are not using IaaS.