Session STO2496 with Rawlinson Rivera (VMware), Chad Sakac (EMC), and Vaughn Stewart (Pure Storage)
Simplicity is the key to success:
- Use large datastores
- Limit use of RDMs
- Use datastore clusters
- Use array-based automated storage tiering
- Avoid jumbo frames for iSCSI and NFS
- KISS gets you out of jail a lot
- Use VAAI
- Use the pluggable storage architecture (PSA)
Key Storage Best Practice Documents – Use your vendor’s docs first. VMware docs are just general guidelines.
Hybrid arrays – a mix of flash and HDD. Examples are Nimble, Tintri, VNXe, etc.
Host caches such as PernixData, vFRC (vSphere Flash Read Cache), and SanDisk.
Converged infrastructure such as Nutanix and SimpliVity.
All-flash arrays such as SolidFire, XtremIO, and Pure Storage.
Data has active I/O bands – the working set size. Applications tend to overwrite the most active 15% of data.
Benchmark Principles
- Don’t let vendors steer you too much – Best thing is to talk to different customers
- Good benchmarking is NOT easy
- You need to benchmark over time
- Use mixed loads with lots of hosts and VMs
- You can use SLOB or Iometer and configure different I/O sizes
- Don’t use an old version of Iometer; a new build was released about six weeks ago
- Generating more than 20K IOPS out of one workload generator is hard. If you want 200K IOPS, you will likely need 20 workers
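As a rough illustration of the worker math in the last bullet, here is a minimal Python sketch. The 20K-IOPS-per-generator ceiling is the session's rule of thumb; the 50% headroom factor is my assumption for why 200K IOPS maps to roughly 20 workers rather than 10.

```python
import math

def workers_needed(target_iops, iops_per_worker=20_000, headroom=0.5):
    """Estimate how many workload generators are needed to reach a target
    aggregate IOPS, assuming each generator tops out around iops_per_worker
    and you only drive it at (1 - headroom) of that ceiling so the
    generator itself never becomes the bottleneck (headroom is an assumption)."""
    effective = iops_per_worker * (1 - headroom)
    return math.ceil(target_iops / effective)

# 200K IOPS with ~10K effective IOPS per worker -> about 20 workers,
# which matches the session's guidance.
print(workers_needed(200_000))  # 20
```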
Architecture Matters
- Always benchmark under normal operations, near system capacity, during system failures
- Always benchmark resiliency, availability and data management features
- Recommend testing with actual data
- Dedupe can actually increase performance by discarding duplicate data
- Sometimes all-flash array vendors will suggest increasing queue depth to 256, over the default of 32 (see the sketch after this list)
- Queues are everywhere
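The queue-depth advice above follows from Little's Law (sustained throughput ≈ outstanding I/Os ÷ latency). A small Python sketch with illustrative numbers, not figures from the session:

```python
def max_iops(queue_depth, latency_ms):
    """Little's Law: concurrency = throughput * latency, so the most IOPS a
    single queue can sustain is queue_depth / latency (latency in seconds)."""
    return queue_depth / (latency_ms / 1000.0)

# With the default device queue depth of 32 and 1 ms latency, one device
# queue caps out around 32,000 IOPS; raising the depth to 256 lifts that
# ceiling to ~256,000 IOPS -- if the array can actually deliver it.
print(max_iops(32, 1.0))   # 32000.0
print(max_iops(256, 1.0))  # 256000.0
```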
Storage Networking Guidance
- VMFS and NFS provide similar performance
- Always separate guest VM traffic from storage VMkernel network
- Recommendation is NOT to use jumbo frames – 0 to 10% performance gain with them turned on (see the back-of-the-envelope after this list)
- YMMV
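The small jumbo-frame gain is consistent with simple protocol-overhead math. A back-of-the-envelope Python sketch using standard Ethernet/IP/TCP header sizes; it ignores TSO/LRO and per-packet CPU cost, which is where most real-world differences actually come from:

```python
def wire_efficiency(mtu, l2_overhead=38, l3_l4_headers=40):
    """Fraction of bytes on the wire that carry TCP payload.
    l2_overhead = 14 B Ethernet header + 4 B FCS + 8 B preamble + 12 B gap;
    l3_l4_headers = 20 B IP + 20 B TCP (no options)."""
    payload = mtu - l3_l4_headers
    return payload / (mtu + l2_overhead)

print(f"MTU 1500: {wire_efficiency(1500):.1%}")  # ~94.9%
print(f"MTU 9000: {wire_efficiency(9000):.1%}")  # ~99.1%
# Only a few percent of raw wire efficiency separates the two MTUs.
```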
Thin provisioning is not a data reduction technology
Data reduction technologies are the new norm: dedupe block sizes vary by vendor (512 B for Pure, 16 KB for 3PAR, 4 KB for NetApp)
- There is a big operational difference between inline and post-process
Data reduction in virtual disks (better in Hyper-V than vSphere 5.x): T10 UNMAP is still a manual process in vSphere
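For reference, the manual reclaim on vSphere 5.5 is typically kicked off with `esxcli storage vmfs unmap`. The sketch below just wraps that call from a shell with esxcli available; the datastore name is hypothetical and the exact options should be verified against your ESXi build:

```python
import subprocess

def reclaim_space(volume_label, reclaim_unit=200):
    """Run a manual T10 UNMAP pass against a VMFS datastore.
    Assumes esxcli is on PATH (e.g. an ESXi 5.5 shell) and that the
    'storage vmfs unmap' namespace and --reclaim-unit option exist
    for this ESXi build -- verify before use."""
    subprocess.run(
        ["esxcli", "storage", "vmfs", "unmap",
         "--volume-label", volume_label,
         "--reclaim-unit", str(reclaim_unit)],
        check=True,
    )

reclaim_space("Datastore01")  # hypothetical datastore name
```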
Quality of Service
- In general QoS does not ensure performance during storage maintenance or failures
- Many enterprise customers can’t operationalize QoS and do better with all-flash arrays
- QoS can be an important capability in some storage use cases
- With vVols there may be a huge uptick in QoS usage
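One host-side flavor of QoS that can be operationalized today is a per-VMDK IOPS limit. The pyVmomi sketch below is an assumption-laden illustration (vCenter address, credentials, and the VM name "app01" are all hypothetical), not a supported recipe:

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def set_vmdk_iops_limit(si, vm_name, iops_limit):
    """Apply a per-VMDK IOPS limit (host-side QoS) to every disk of a VM."""
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    vm = next(v for v in view.view if v.name == vm_name)

    changes = []
    for dev in vm.config.hardware.device:
        if isinstance(dev, vim.vm.device.VirtualDisk):
            dev.storageIOAllocation = vim.StorageResourceManager.IOAllocationInfo(
                limit=iops_limit)
            changes.append(vim.vm.device.VirtualDeviceSpec(
                operation=vim.vm.device.VirtualDeviceSpec.Operation.edit,
                device=dev))
    vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=changes))

# Hypothetical connection details -- adjust for your environment.
ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="secret", sslContext=ctx)
set_vmdk_iops_limit(si, "app01", iops_limit=1000)
Disconnect(si)
```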
VMware Virtual SAN Integration
- Unparalleled level of integration with vSphere
- Enterprise features: NIOC, vMotion, SvMotion, DRS, HA
- Data protection: Linked clones, snapshots, VDP advanced
- Policy based management driven architecture
- VSAN best practices: 10Gb NICs, Use VDS, NIOC, queue depth of 256, don’t mix disk types in a cluster
- Uses L2 multicast
- 10% of the total capacity should be in the flash tier
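A trivial sketch of the 10% flash sizing rule above, assuming "total capacity" means the capacity the VMs are expected to consume (check the current VSAN sizing guidance before relying on it):

```python
def flash_tier_size_gb(consumed_capacity_gb, flash_ratio=0.10):
    """Rule-of-thumb SSD tier sizing for VSAN: ~10% of the capacity
    the VMs are expected to consume across the cluster."""
    return consumed_capacity_gb * flash_ratio

# 20 TB of anticipated consumed capacity -> ~2 TB of flash in the cluster.
print(flash_tier_size_gb(20_000))  # 2000.0
```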
Automation
- Automate everything that you can
- Do not hard code to any vendor’s specific API
- Do not get hung up on Puppet vs. Chef vs. Ruby, etc.
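One way to follow the "do not hard code to any vendor's specific API" advice above is a thin abstraction layer that your automation codes against. A minimal Python sketch with entirely hypothetical adapter classes:

```python
from abc import ABC, abstractmethod

class DatastoreProvisioner(ABC):
    """Vendor-neutral interface the orchestration code depends on."""

    @abstractmethod
    def create_volume(self, name: str, size_gb: int) -> str:
        """Create a volume/LUN and return an identifier to present to vSphere."""

class PureProvisioner(DatastoreProvisioner):
    def create_volume(self, name, size_gb):
        # Hypothetical: call this array's management API here.
        return f"pure-{name}"

class NetAppProvisioner(DatastoreProvisioner):
    def create_volume(self, name, size_gb):
        # Hypothetical: call this array's management API here.
        return f"netapp-{name}"

def provision_datastore(backend: DatastoreProvisioner, name: str, size_gb: int):
    # Orchestration logic never touches a vendor-specific call directly,
    # so swapping arrays only means swapping the adapter class.
    return backend.create_volume(name, size_gb)

print(provision_datastore(PureProvisioner(), "gold-ds01", 2048))
```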
I'm curious about the "Avoid jumbo frames for iSCSI and NFS" recommendation. Can you expand on this? Is this just for the KISS approach, or is there just not enough performance benefit from it these days to justify the added config? I've been doing jumbo frames on all installs and haven't had any negative experience other than the added setup time, which is pretty minimal on FlexPod-type designs.
-Adam Bergh
@ajbergh http://www.thepartlycloudyblog.com
Yes, the basis of the recommendation is KISS. Human error in the datacenter is a leading cause of downtime. If you have standardized on jumbo frames and your network team won't accidentally disable them, then I would leave them enabled. For customers considering jumbo frames: unless you are bumping up against a CPU utilization bottleneck or maxing out your network links, I would KISS the network and skip jumbo frames.