Senthil Rajaram, Senior PM, Microsoft
Taylor Brown, Senior SDET, Microsoft
This was an excellent session about the major storage improvements in Windows Server 2012. These enhancements are foundational to Hyper-V and other roles such as file servers. To say that Microsoft made some advancements in storage is a bit of an understatement. Among the new features, they took a page out of VMware’s book with Offloaded Data Transfer (ODX), and Windows Server 2012 can now also pass unmap requests for freed space down to the SAN array. Your VM volumes can now stay thin on an array that supports hardware thin provisioning and free space reclamation, without resorting to tools like sdelete to zero out free blocks. Now that a server OS supports ‘trim’ when virtualized, I’m sure VMware will support that feature in a future release.
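To make the thin provisioning point concrete, here is a minimal Python sketch of why unmap matters. It is a toy model, not how any particular array works: the `ThinLun` class, the `delete_file` helper, and the block counts are all hypothetical names and numbers chosen for illustration.

```python
class ThinLun:
    """Toy model of a thin-provisioned LUN (hypothetical, for illustration).

    A block consumes physical space on the array only once it has been written.
    """

    def __init__(self, size_blocks):
        self.size_blocks = size_blocks
        self.allocated = set()  # block numbers currently backed by physical space

    def write(self, blocks):
        self.allocated.update(blocks)

    def unmap(self, blocks):
        # The array is told these blocks hold no live data and may be released.
        self.allocated.difference_update(blocks)


def delete_file(lun, file_blocks, send_unmap):
    """Deleting a file only changes file system metadata. The array learns
    about the freed blocks only if an unmap is passed down the stack."""
    if send_unmap:
        lun.unmap(file_blocks)


lun_without_unmap = ThinLun(1000)
lun_with_unmap = ThinLun(1000)
file_blocks = range(0, 400)

for lun in (lun_without_unmap, lun_with_unmap):
    lun.write(file_blocks)

delete_file(lun_without_unmap, file_blocks, send_unmap=False)
delete_file(lun_with_unmap, file_blocks, send_unmap=True)

print(len(lun_without_unmap.allocated))  # blocks still held on the array
print(len(lun_with_unmap.allocated))     # blocks held after reclamation
```

In this model the LUN that never receives an unmap keeps all 400 blocks allocated after the delete, which is the situation that previously had to be worked around by zeroing free space.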
SMB 3.0 also appears to surpass NFS v3, as it supports active/active multi-channel I/O, which is basically MPIO for SMB. As NAS devices start to support SMB 3.0, building Hyper-V clusters will become easier and less expensive. The performance enhancements in SMB 3.0 let it operate at near wire speed with very little overhead, unlike previous versions that were pathetically inefficient and slow.
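The active/active idea can be sketched in a few lines of Python. This is only an illustration of spreading requests across several paths and carrying on when one path fails. It does not reproduce how SMB 3.0 actually schedules I/O, and the `distribute` function and the channel names `nic1` and `nic2` are hypothetical.

```python
from itertools import cycle


def distribute(requests, channels, failed=frozenset()):
    """Spread requests round-robin over the channels that are still healthy.

    Round-robin is an assumption made for this sketch, not the real
    SMB 3.0 path selection logic.
    """
    healthy = [c for c in channels if c not in failed]
    if not healthy:
        raise ConnectionError("no healthy channel left")
    assignment = {c: [] for c in healthy}
    # zip stops when the finite request list is exhausted.
    for request, channel in zip(requests, cycle(healthy)):
        assignment[channel].append(request)
    return assignment


channels = ["nic1", "nic2"]   # hypothetical network paths
requests = list(range(8))     # eight I/O requests, numbered 0..7

both_up = distribute(requests, channels)
one_down = distribute(requests, channels, failed={"nic2"})

print(both_up)   # requests split across both paths
print(one_down)  # all requests carried by the surviving path
```

With both paths up, the load is shared, which is where the bandwidth aggregation comes from. With one path down, the same requests complete over the remaining path and the caller sees no error.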
Check out all of these new features:
- Top impediments to increased virtualization
- Memory 35%
- Network 27%
- Storage 45%
- CPU 0%
- High availability is out, now there’s continuous availability
- More storage options for Hyper-V
- External storage arrays – Nothing new
- Remote file servers using SMB 3.0 – New
- Clustered PCI RAID – New
- External Storage Array enhancements
- Virtual Fibre Channel
- Extends Fibre Channel into the guest VM
- High performance workloads
- Guest Clustering
- Exposes SAN functionality
- Uses NPIV functionality
- Support: Guest – Windows Server 2008 and higher
- Live Migration still works
- VM is assigned two WWPNs: one is active and one is standby. The second address is used during Live Migration. IO queue depth is limited to 1 during migration to ensure consistency.
- HBA can be shared between guests and physical host
- Increased Storage Efficiency – Unmap (trim)
- Storage informed of unused space
- Efficiencies at virtual layer – Allows reuse of unused blocks
- Efficiencies at physical layer – VMs unmap passed to hardware
- Supported on VHDX and passthrough disks (iSCSI and Fibre Channel)
- Optimize-Volume -DriveLetter X -ReTrim
- Offloaded Data Transfer (ODX)
- Traditional data copy model – data is read from the SAN into server memory and then written back to the SAN. Very inefficient and uses CPU time. Increased storage traffic.
- Offloaded data transfer – SAN internally copies data without network traffic
- Reduces time for merge, mirror, and VHDX creation
- Demo: 100GB VHDX creation took 4.5 seconds
- SMB File Storage
- Supports all existing scenarios
- Enables new scenarios
- Shared nothing Live Migration
- Cross-Cluster Live Migration
- Requires SMB 3.0
- Handling Intermittent Network Failure
- Resiliency – Transparently re-establishes network connection – 60 seconds of state saved. Applications see no errors or failures.
- Multichannel – Transparently uses alternate network path (basically MPIO for network)
- Active/Active mode for SMB 3.0 shares and bandwidth aggregation
- Continuous Availability
- Transparently fails over share to different file server
- All state is persisted when the failover occurs. Application notices no error.
- Scale-out file server foundational feature
- Handling Hyper-V Node failure
- Cluster client failover – VMs communicate their identity, which enables quick recovery. No waiting 60s for file handles to expire. No downtime.
- Host based backup and restore
- Volume Shadow Copy Service (VSS) for SMB file shares
- No change in flow for backup
- Performance
- SMB Direct (SMB over RDMA) – Minimal CPU utilization and low latency
- SMB Multi-channel
- Setup and Administration
- Management – A lot of PowerShell cmdlets to manage
- Storage Spaces
- Inbox solution provides
- Pooling
- Resiliency
- Simple space
- Mirror Space
- Parity Space
- Thin provisioning
- Cluster Supports Spaces
- Simple and mirrored
- Shared JBOD SAS array
- Use low latency high bandwidth for cluster network (10G or RDMA)
- Clustered PCI RAID
- Host hardware RAID in a cluster
- Resiliency to node failure – LUN fails over
- Resiliency to disk failure – hardware RAID
- Virtual Storage Stack Improvements
- VHDX
- New default format for virtual disks
- Up to 64TB disk support (VMware has 2TB limit)
- Internal log for enhanced resiliency such as power failures. Protects VHDX metadata and contents.
- Large sector disk support (4K) with no performance issue
- 512e uses read-modify-write – shipping today
- 4K native – Q1 2013, exposes 4K directly to OS
- VHD has sub-optimal format for 512e disks – 30-50% performance hit
- New VHDs are padded to be 4K aligned
- 6000% increase for aligned writes (64KB test)
- VHDX supports native 4K disk projection into the VM
- Enhanced performance by supporting larger block sizes
- Embed custom metadata – User defined metadata
- VHDX performance – Fixed, dynamic disks match passthrough disks
- Reduce downtime – online meta-operations
- Reclaim deleted snapshot space
- Online virtual disk merge
- VM Mobility
- Online virtual disk mirror
- IO Scaling
- IO Throughput was limited: 1 channel per VM, 256 queue depth/SCSI for all devices, fixed vCPU for IO and network interrupts.
- Windows Server 2012
- 1 channel per 16 vCPUs, per SCSI device
- 256 queue depth/device per SCSI device (not the whole adapter)
- IO interrupt handling distributed amongst vCPUs and NUMA aware
- Demo’d 1 million IOPS to a single VM (3x what VMware boasts)
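The 512e read-modify-write item in the list above is easier to see with some arithmetic. The Python sketch below counts, for a single write, how many 4K physical sectors are touched and how many are only partially covered and therefore need a read-modify-write. The `rmw_sectors` function is hypothetical, and the 512-byte shift used in the example is an assumed misalignment chosen for illustration, not a measurement of any real VHD layout.

```python
PHYSICAL_SECTOR = 4096  # bytes per physical sector on a 512e or 4K native disk


def rmw_sectors(offset, length, physical=PHYSICAL_SECTOR):
    """Return (sectors_touched, sectors_needing_rmw) for one write.

    A physical sector needs a read-modify-write when the write covers
    only part of it, at the head or at the tail of the write.
    """
    if length <= 0:
        return (0, 0)
    end = offset + length
    first = offset // physical
    last = (end - 1) // physical
    touched = last - first + 1

    head_partial = offset % physical != 0
    tail_partial = end % physical != 0
    partial = 0
    if head_partial:
        partial += 1
    # Do not count the same sector twice when head and tail share it.
    if tail_partial and not (first == last and head_partial):
        partial += 1
    return (touched, partial)


aligned = rmw_sectors(offset=0, length=64 * 1024)
shifted = rmw_sectors(offset=512, length=64 * 1024)  # assumed 512-byte shift

print(aligned)  # sectors touched and RMW count for the aligned 64KB write
print(shifted)  # same write shifted by 512 bytes
```

Under these assumptions the aligned 64KB write covers 16 whole sectors with no read-modify-write, while the shifted write touches 17 sectors and two of them need a read-modify-write. The sketch shows only the mechanism and makes no attempt to reproduce the performance figures quoted in the session.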
Microsoft has a secret internal course for its PFEs called “Supporting Windows Server 2012”, which covers Hyper-V, Failover Clustering, Storage and Networking in WS2012.