This post is part of a series of posts on the VMworld 2013 announcements. See this post for an overview of what has been announced.
At VMworld 2013 VMware announced a new product named Virtual SAN or VSAN. VSAN creates distributed storage out of local storage. It is another entry in the range of hyperconverged infrastructure solutions. Typical for those solutions is that compute, memory, networking and storage are in the same box. These solutions are also called Datacenter in a Box.
Nutanix is the most mature player in the field of Datacenter in a Box. This post will compare Nutanix Virtual Computing Platform with VMware Virtual SAN 1.0.
The end of this long post has a feature comparison matrix showing the most relevant features of both Nutanix VCP and VSAN 1.0.
Introduction to converged storage and compute.
We are living in a world where over 50% of Windows servers are now running as virtual machines. Besides servers we are starting to virtualize desktops as well. More and more business critical, Tier 1 applications are running on VMs. And we are doing Big Data, formerly known by the less sexy name of data mining.
Storage Area Networks (SAN) and NAS are traditionally used to store the data of those VMs. Until recently this was the only way to have enterprise-level shared access to storage across hosts.
However SANs were originally not designed for the number of IOPS we are consuming today. They are also not aware of what is running on storage. Management is not granular at the VM level but at volume level.
So mainstream, established SAN vendors are creating new solutions, like adding SSD storage to the SAN to deliver more IOPS. These are typically quite expensive solutions and they do not solve the issue of hops. Data requests from apps running in the VM need to travel over several hops (switches) to reach the SAN, which increases latency.
Other issues with using a SAN are the costs, complexity, scalability (large increments and thus big investments) and management (often done by isolated teams).
Two trends tackle this performance issue, which is driven by the increased demand for IOPS from VDI and Big Data:
- Move the flash storage to the server. Server-based flash on PCIe and SSD is used as a read and write buffer. Persistent data is still stored on a SAN or NAS. Capacity and performance are decoupled here. PernixData, for example, offers this.
- Use local server-based storage, make it highly available and present it as shared storage to the hypervisor. Storage performance and capacity are in the same chassis as compute.
The second trend is called converged compute and storage. The advantage here is that the CPU is very close to the storage: in the same chassis, on a fast bus connection, without a traditional network of switches and cabling. This delivers much lower latency and more IOPS. Typically these solutions scale in a linear fashion in small increments.
Use cases for datacenters in a box are typically VDI, Big Data like Hadoop and Splunk, Disaster Recovery and server virtualization.
The image below shows latency for various types of storage. It is part of this story.
Just a few vendors have such solutions in their portfolio. All of the solutions are mainly targeted at VDI deployments but can be used for server virtualization as well. Nutanix, SimpliVity and Scale Computing are some examples. Other vendors like Pivot3 and V3sys offer converged storage, but it is not made redundant to survive a node failure. These are only suited for VDI deployments.
A new player in this field is VMware. At VMworld 2013 VMware announced its converged storage and compute solution named Virtual SAN or VSAN. At the moment it is in a public beta. VSAN aggregates local storage and presents a datastore to ESXi. It can tolerate node, network and disk failures.
So what is the difference between VSAN and Nutanix's converged solution named Virtual Computing Platform?
Let's first zoom in on Nutanix. Nutanix is a start-up company founded in September 2009 by three cofounders. Nutanix offers a shared-nothing architecture based on x86 hardware with intelligent software. The architecture principle is the same one Google, Facebook and Twitter use in their datacenters: low cost, commodity servers with local storage and smart software to make the data fault tolerant to component failures.
Nutanix shipped their first system to their first two customers in April 2011. At VMworld 2011 the company came out of stealth mode and soon got more attention and added a second model in December 2012. In 2013 two new models were introduced. So they have been in production now for over 2 years.
Since they came out of stealth mode Nutanix has received a lot of attention in the press and from bloggers, analysts and customers because of their disruptive technology. Venture capital firms saw that too and have invested $71.6 million in the company since July 2010.
As said, Nutanix sells a ‘Datacenter in a box’ solution with intelligent software. The Virtual Computing Platform is available on 8 different hardware models. Each model or node is a SuperMicro (for the NX-2000/3050/60X0) or Quanta Computer (NX-3000) x86-based server with CPU, memory, network and local storage.
Nutanix platform has a couple of advantages:
- Takes less space than blades
- Less power consumption than blades
- Lower costs than traditional blades+switches+SAN
- Easier to manage / quicker to provision than blades+SAN
- High performance because of using SSD and RAM/SSD caching local to compute
- High availability built into the distributed filesystem
- Scaling in small steps
Compared to blades Nutanix offers better density. Their 2U chassis can have 4 servers. HP C7000 is a 10U enclosure with a capacity of 16 servers. Using blades you still need space and power for external storage.
A Nutanix NX-3050 can handle 400 VMs in a VDI deployment on its own and consumes about 1.1 kW. Compare this with a typical midrange SAN array, which needs about 700 W for a single array. You probably need multiple arrays to handle 400 VMs, plus some storage switches.
Storage consists of Seagate SATA disk drives, Intel S3700 SATA SSD flash drives and, in some models (NX-2000, NX-2050, NX-3000), PCIe-based flash memory provided by Fusion-io (past) and Intel (current). The server/node is placed into an enclosure, which Nutanix calls a block. Basically each block is just a metal frame with a redundant power supply and cooling in it. Unlike blade enclosures it does not have a shared backplane inside it, so each node is attached to an external switch using 10 Gbps NICs. Depending on the enclosure, a block can hold 2 or 4 nodes.
Each node runs VMware ESXi or KVM. Support for Hyper-V will be added soon. Depending on the model, either the standard downloadable VMware vSphere ISO or a modified vSphere image made by Nutanix can be used.
Each node runs a Nutanix virtual machine running CentOS Linux. This is called the Nutanix Controller VM, which has the same kind of function as the hardware-based controller used in a SAN. But the controller does a lot more, because in a Nutanix platform data is distributed. The controller is the brain of the Nutanix platform. It runs the Nutanix Operating System (NOS) and uses several open source components, such as a NoSQL database to store metadata and Apache ZooKeeper to avoid split brain and to store configuration data. ZooKeeper is also used by Netflix for its online video service. Google's Snappy algorithm is used for compression. NFS is used to export/present files to the hypervisor. NOS uses the Avahi network discovery software to discover new Nutanix nodes so they can be automatically added to the cluster. Avahi implements the same zero-configuration networking protocols as Apple Bonjour.
So NOS is basically a box of open source components which are very smartly glued together and extended with software developed in-house by Nutanix.
The controller creates a distributed datastore out of the local storage and presents it as NFS to ESXi (or KVM). iSCSI storage is also available for VMs or workloads requiring block storage or RDMs. If the Controller VM fails there is no problem. Controllers running on other nodes will take over without virtual machines and applications noticing anything.
For a great and simple explanation see this video.
Two tiers of persistent storage are offered for storage of persistent VM data: PCIe Flash/SSD and HDD. For models without PCIe Flash, data is stored on SSD and HDD.
The Nutanix Distributed File System (NDFS) is one of the components doing the magic. It was developed by engineers who also developed Google GFS and Exadata. NDFS has been in production for over a year and a half with good success.
To protect against failures of disks, nodes or blocks/enclosures, data is copied to other nodes. VMDK data is fully written locally, with chunks of the data distributed amongst all nodes in the cluster. By default there is always at least one copy of the ‘live’ VMDK available. This RAIN-10 (RAID10 distributed over nodes) is very similar to the Network RAID in HP StoreVirtual storage systems (formerly known as LeftHand).
NDFS offers information lifecycle management (ILM) which automatically moves often requested data blocks (HOT blocks) to the flash or memory tier (heat-optimized tiering).
In the 3050/3051/6050/6070/1050 series Nutanix uses a SATA SSD and several SATA HDDs. The SSD/Flash tier serves as the highest performance tier where information lifecycle management (ILM) is done between the SSD/Flash and HDD tiers. Nutanix also has a read cache which spans both memory as well as SSD/Flash. Another note is that all SSD/Flash and HDD are persistent tiers so depending on the workload and IO type Nutanix software determines the tier placement. For example, random IO goes to SSD/Flash, sequential to HDD, hot data to the read cache, etc.
To make reads even faster, the internal memory of the server is used as a read cache. The default size is 2 GB. It can be extended to around 64 GB, at which point it can deliver 160,000 IOPS of random reads.
Nutanix uses the NoSAN slogan in their marketing. Although many workloads will run fine on Nutanix, it will not replace use cases where physical servers are attached to LUNs presented by a SAN. Also, if synchronous replication of data is a requirement you still need to use a SAN.
The hypervisor boots off an embedded USB device located inside the servers for the NX-3000 models and later.
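The placement rule described above (random IO to SSD/Flash, sequential IO to HDD, hot data to the read cache) can be pictured with a toy function. This is only a sketch of the rule as stated in this post; the names `Tier` and `place_io` are made up for illustration and say nothing about how NOS actually implements tiering.

```python
from enum import Enum


class Tier(Enum):
    """Storage tiers as named in this post (hypothetical enum)."""
    READ_CACHE = "memory/SSD read cache"
    SSD = "SSD/Flash"
    HDD = "HDD"


def place_io(is_random: bool, is_hot_read: bool = False) -> Tier:
    """Toy placement rule; an assumption-based sketch, not Nutanix code."""
    if is_hot_read:
        return Tier.READ_CACHE  # frequently read data is served from the cache
    if is_random:
        return Tier.SSD  # random IO lands on the flash tier
    return Tier.HDD  # sequential IO goes to spinning disk


print(place_io(is_random=True).value)
print(place_io(is_random=False).value)
print(place_io(is_random=False, is_hot_read=True).value)
```

A real implementation would look at access history and IO size rather than two booleans; the point here is only the decision order.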
Data is compressed either online or offline. Policies can be set on the VM or VMDK level. Nutanix comes with built in DR software which is VM centric. So each individual VM can be selected for DR or no DR. Nutanix does not support stretched clusters (single cluster living in two sites). There is asynchronous replication between two clusters in each site.
Nutanix does dedupe in the storage tier as well as in the performance tier to give the highest possible utilization of the cache and SSD tiers. This is also extremely granular, so the amount of data being pulled up to the cache should accurately match the actual bytes being read.
Nutanix uses a replication factor (RF) for the configuration of NFS containers. The RF determines how many replicas/copies of data are stored for redundancy reasons. If a drive or a complete node fails, you do want to have a replica. The default value of RF is 2. This means there is one backup of the live data. This value cannot be changed in the GUI, but it can be changed in the Nutanix Command Line Interface (NCLI). So if RF is set to 1 and a drive is lost, the data is lost!
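To make the effect of the replication factor concrete, here is a small sketch. It assumes a simple round-robin placement of copies over nodes, which is my own simplification and not Nutanix's actual placement logic; the function names and node names are hypothetical.

```python
from typing import List


def place_replicas(block_id: int, nodes: List[str], rf: int = 2) -> List[str]:
    """Pick rf distinct nodes for a block (round-robin; an assumed scheme)."""
    if not 1 <= rf <= len(nodes):
        raise ValueError("rf must be between 1 and the number of nodes")
    start = block_id % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(rf)]


def survives_node_failure(placement: List[str], failed_node: str) -> bool:
    """Data survives if at least one copy lives on a node that did not fail."""
    return any(node != failed_node for node in placement)


nodes = ["node-a", "node-b", "node-c", "node-d"]
rf2 = place_replicas(block_id=5, nodes=nodes, rf=2)
rf1 = place_replicas(block_id=5, nodes=nodes, rf=1)

print(rf2, survives_node_failure(rf2, "node-b"))
print(rf1, survives_node_failure(rf1, "node-b"))
```

With RF 2 the block still has a copy on another node after `node-b` fails; with RF 1 the only copy is gone.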
So what happens if a VM is moved to another node, or is started on another node after an HA event? Will the VMDK files then be located on the same node the VM is running on?
NOS offers a feature called data locality or “follow me” data. For example, a VM running on host A is moved to host B by a vMotion (manual or DRS). Prior to the move all primary copies of the VM’s data sat on host A, with chunks of replicas distributed throughout all nodes in the cluster. When the VM is moved to host B all new writes will occur locally on host B (not going over the network). Reads of data that has not just been written will be forwarded to host A, after which that data is stored locally on host B, allowing subsequent reads to occur locally on host B.
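The “follow me” behaviour can be sketched in a few lines. This is a toy model built only from the description above; the class `ToyCluster` and its methods are hypothetical and ignore replicas, metadata and everything else a real distributed filesystem has to handle.

```python
class ToyCluster:
    """Tracks which host holds a local copy of which block (illustration only)."""

    def __init__(self) -> None:
        self.local_blocks = {}  # host name -> set of block ids stored locally
        self.remote_reads = 0   # how many reads had to cross the network

    def write(self, host: str, block: int) -> None:
        # New writes always land on the host the VM currently runs on.
        self.local_blocks.setdefault(host, set()).add(block)

    def read(self, host: str, block: int) -> str:
        if block in self.local_blocks.get(host, set()):
            return "local"
        # Not local yet: fetch from a host that has it, then keep a local copy.
        for other, blocks in list(self.local_blocks.items()):
            if other != host and block in blocks:
                self.local_blocks.setdefault(host, set()).add(block)
                self.remote_reads += 1
                return "remote"
        raise KeyError(f"block {block} not found on any host")


cluster = ToyCluster()
cluster.write("host-a", 1)              # VM runs on host A and writes block 1
first = cluster.read("host-b", 1)       # after vMotion to host B: forwarded read
second = cluster.read("host-b", 1)      # same block again: now served locally
cluster.write("host-b", 2)              # new writes land on host B
third = cluster.read("host-b", 2)

print(first, second, third, cluster.remote_reads)
```

Only the first read of an old block after the move goes over the network; after that the block has followed the VM.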
Another feature is the Shadow Clone feature. With VDI and linked clones the replica is hosted by a single node, which can be a hot spot and a bottleneck for reads. Nutanix automatically detects a “multi-reader” scenario, dynamically creates distributed clones of that data and allows them to be cached by any controller VM. This allows all reads to occur on the local node directly.
Nutanix is hypervisor agnostic. It supports vSphere ESXi 5.0 or later, KVM and Windows Server 2012 Hyper-V. At the moment the platform is available as a hardware appliance. It could theoretically be delivered as a software appliance which can run on any certified x86 server. However, for support, certification of the components etc. customers are bound to the current 8 models available.
Nutanix nodes have a call home feature. If there is an error the software will automatically contact Nutanix Service Center and query for a match. If there is, Nutanix support will automatically be informed and they will help the customer to solve the issue.
Nutanix offers a nice integration with vSphere. As soon as a NFS container has been created it is automatically made available to all ESXi hosts in the cluster. The vSphere admin does not need to rescan/mount using the vSphere client.
To automate procedures Nutanix offers a REST API.
The Nutanix models come with ESXi pre-installed but customers need to license those installs.
Pricing for a four-node NX-1050 is $90,000 versus $144,000 for an NX-3000. A 2U NX-6000 includes 32 cores, 32 TB of hard drive capacity and either 1.6 TB (NX-6050) or 3.2 TB (NX-6070) of flash. The NX-6050 costs $120,000 for two nodes, while the NX-6070 is priced at $180,000 for two nodes.
The image below shows the NX-6000.
Mind that all the mentioned prices are list prices, not street prices. As with all storage you will probably be able to get some discount.
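To compare the models it helps to break the list prices down per node. The sketch below only uses the figures quoted in this post; the NX-3000 is left out because its node count is not stated explicitly. The dictionary and function names are my own.

```python
# (list price in USD, number of nodes) as quoted in this post
LIST_PRICES = {
    "NX-1050": (90_000, 4),
    "NX-6050": (120_000, 2),
    "NX-6070": (180_000, 2),
}


def price_per_node(model: str) -> float:
    """List price divided by node count; street prices will be lower."""
    price, nodes = LIST_PRICES[model]
    return price / nodes


for model in LIST_PRICES:
    print(model, price_per_node(model))
```

So a node costs roughly $22,500 in the NX-1050, $60,000 in the NX-6050 and $90,000 in the NX-6070, before any discount or support contract.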
Nutanix has a one year warranty. It is possible to purchase support on either Gold or Platinum level which has software updates, 24/7 access to global technical support, providing guidance from the Ethernet port all the way up to the application, next business day hardware replacement shipment and access to online training/certification. Pricing for support starts at 10% of the costs of the Nutanix server.
Nutanix offers a lot of features embedded in software. All except DR and post-process compression are included in the price. When new features are released in NOS customers do not have to pay for those when they are on a support contract.
The value of the Nutanix solution is clearly in the software. The hardware is commodity. I did a rough calculation on the NX-1050 model (the one without expensive PCIe flash) and came to a cost of about $4,000 for the server. You can do your own math to determine the cost of the software, and compare it to the value and cost of VSAN.
Nutanix has a promo running at the moment offering a free NX-1050 node when three nodes are purchased. It kind of proves the margin available for bargaining.
An overview of the specifications of the Nutanix server models can be found in this PDF.
While Nutanix offers a hot product which is technically very advanced and has much to offer right out of the box, there are some caveats. The bean counters are not interested in cool tech but in value and cost reduction.
First there is the ratio between compute and storage. If your organization needs more storage, you will need to buy additional nodes. There is no way you can add HDDs or SSDs yourself to an existing node (I believe there is no space). Those extra nodes give you not just extra storage, but also CPU power. It could well be the case that you do not need that extra CPU power. You will also have to buy additional ESXi licenses to be able to make that needed storage available. This adds to the costs.
To solve this Nutanix recently added a model which is storage heavy.
Another concern is the maximum amount of physical memory. While vSphere 5.5 is now able to support 4 TB, Nutanix has a maximum of 256 GB of internal RAM. I know a couple of large SQL servers running as a VM with 64 GB of virtual memory. So this reduces density.
Also when doing TCO calculations you have to keep in mind that at least half of the available disk capacity is lost because of the distributed RAID10 used to protect data.
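A rough back-of-the-envelope sketch of that capacity loss, and of what it means for the number of nodes you have to buy. It deliberately ignores metadata overhead, compression and dedupe. The 16 TB of raw HDD capacity per node is an assumption derived from the 32 TB quoted for a two-node NX-6000; the 40 TB requirement is just an example.

```python
import math


def usable_capacity_tb(raw_tb: float, replication_factor: int = 2) -> float:
    """Raw capacity divided by the number of copies kept of every block."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return raw_tb / replication_factor


def nodes_for_capacity(usable_tb_needed: float, raw_tb_per_node: float,
                       replication_factor: int = 2) -> int:
    """Nodes needed to reach a usable capacity, rounded up to whole nodes."""
    raw_needed = usable_tb_needed * replication_factor
    return math.ceil(raw_needed / raw_tb_per_node)


print(usable_capacity_tb(32))       # one NX-6000 block with RF 2
print(nodes_for_capacity(40, 16))   # 40 TB usable on assumed 16 TB nodes
```

In this example 40 TB of usable capacity needs 80 TB raw, so five nodes, each of which also brings CPU and an ESXi license you may not need.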
A last warning is that when moving storage from SAN/NAS to servers, the storage admin department will not be happy. They might lose their job and will not be supporters.
If you want to learn more about Nutanix and its distributed filesystem and you are at VMworld USA, see this session on Wednesday August 28.
VAPP6268 – Building Google-like Infrastructure for the Enterprise
Many more details on Nutanix are provided by Marco Broeken here and Cormac Hogan here. If you want to know what is going on under the hood, read this series of postings.
VMware Virtual SAN 1.0
VMware announced the public beta of VSAN 1.0 in August 2013. The product is not yet GA; general availability is expected in the first half of 2014. VSAN aggregates local storage and creates distributed datastores out of it. It is embedded in the ESXi 5.5 hypervisor. Each host must have at least one SSD for read cache and write buffer. It also needs at least one spinning HDD for storage of virtual machine files (VMDK), swap and snapshots.
At VMworld Europe VMware announced that VSAN will support up to 16 nodes in the 1.0 version.
It does not offer the rich features of Nutanix like de-dupe, compression, DR out of the box, multiple tiers for storage, auto tiering etc.
A complete overview of VSAN 1.0 can be found in this post.
This is an overview of all VSAN related breakout sessions at VMworld.
Feature comparison matrix
This image shows the features of both VSAN 1.0 and Nutanix NOS 3.5
Nutanix offers a mature, scalable converged compute and storage platform based on open source software which has proven itself to be very robust. It has a lot of interesting, cost reducing and performance enhancing features which can make the business case of particular VDI-deployments much better than when using traditional SAN.
Nutanix is likely able to replace a SAN in many use cases, especially in greenfield scenarios. As there is not much competition for converged storage/compute the pricing is quite high. You have to do your own cost calculation of traditional SAN/NAS versus Nutanix and include all the features which Nutanix delivers in the calculation. Mind, however, issues like losing half of the raw storage due to the replication factor, the ratio between compute and storage, and the costs for vSphere.
Let's hope that the introduction of VSAN in this market makes prices drop and that the software becomes available as a virtual storage appliance.
VMware VSAN is more a complementary solution to an existing SAN than a replacement for one. At the moment the software is in open beta and VMware recommends using it for VDI, Test/Dev and DR only.
It is a niche solution which can hold VMs in the Tier 2 and Tier 3 range with average IOPS requirements. The features are limited as it is a 1.0 version. Its most attractive selling point is its price and the flexibility to choose any x86 server supported by vSphere.
However the product is still in beta. VMware needs a lot of time to develop a robust, reliable distributed storage architecture, probably from scratch without using much open source. Virtual SAN was already being talked about back in 2012. VMware will most likely need another 2 years to be at the level Nutanix is at right now.