On the last day of Storage Field Day 12, we got to visit Intel Storage, who gave us some insight into what they are seeing and how they are driving the future of high-performance storage. Just prior to the Intel Storage session, we heard from SNIA’s Mark Carlson about trends and requirements mainly driven by the needs of the Hyperscalers, and Intel’s session detailed some very interesting contributions that line up very neatly with those needs.
One of the big takeaways from the SNIA session was that the industry is looking to deal with the issue of “Tail Latency events”, or as Jonathan Stern (a most engaging presenter) put it, “P99’s”. Tail latency occurs when a device returns data 2x-10x slower than normal for a given I/O request. Surprisingly, SSDs are 3 times more likely to have a tail latency event for a given I/O than spinning media. Working the math out, that means that a RAID stripe of SSDs has a 2.2% chance of experiencing tail latency, and the upper layers of the stack have to deal with that event by either waiting for that data or reconstructing the late data from parity.
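To make the "working the math out" step concrete, here is a minimal sketch of how a per-drive tail probability compounds across a stripe. It assumes tail events on different drives are independent, and the stripe width of 10 is a hypothetical value picked for illustration; the session figures quoted above do not include the stripe width or the per-drive rate. The function names are made up for this example.

```python
def stripe_tail_probability(p_drive: float, drives: int) -> float:
    """Chance that at least one drive in the stripe is slow for a given I/O.

    Assumes each drive hits a tail event independently with probability p_drive.
    """
    return 1.0 - (1.0 - p_drive) ** drives


def per_drive_probability(p_stripe: float, drives: int) -> float:
    """Invert the formula: the per-drive rate implied by a stripe-level rate."""
    return 1.0 - (1.0 - p_stripe) ** (1.0 / drives)


if __name__ == "__main__":
    width = 10  # hypothetical stripe width, not a figure from the session
    implied = per_drive_probability(0.022, width)
    print(f"per-drive rate implied by a 2.2% stripe rate: {implied:.4%}")
    print(f"round trip back to the stripe: {stripe_tail_probability(implied, width):.4%}")
```

Under those assumptions a 2.2% stripe-level rate corresponds to a per-drive rate of roughly 0.2%, which shows why wide stripes amplify what looks like a rare per-device event.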
Now one would think that when you’re dealing with latencies on NVM of 90-150 microseconds, even going to 5x keeps you within 1ms or so. But what the industry (read: Hyperscalers who purchase HALF of all shipped storage bytes) is looking for is CONSISTENCY of latency. They want to provide service levels and be sure that their architectures can deliver rock-solid, stable performance characteristics.
Intel gave us a great deep dive of the Storage Performance Development Kit (SPDK), which is an answer to getting much closer to that lower standard deviation of latency. The main difference in their approach, which is the most interesting development (that could drive OTHER efficiencies in non-storage areas, IMO), is that they have found that isolating a CPU core for storage I/O in NVMe environments provides MUCH better performance consistency, primarily because the dedicated core polls the device for completions and so avoids the costs of context switching and interrupt handling.
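As a rough illustration of the dedicated-core idea, here is a minimal sketch in Python. It is not SPDK's API (SPDK is a C library); `FakeCompletionQueue`, `pin_to_core`, and `run_poll_loop` are hypothetical names made up for this example, and the queue is an in-memory stand-in for a real device queue. The point is only the shape of the loop: one pinned thread repeatedly checks for completions rather than sleeping until it is woken up.

```python
import os
from collections import deque


def pin_to_core(core_id: int) -> bool:
    """Try to pin this process to a single core; returns False if unsupported."""
    if not hasattr(os, "sched_setaffinity"):  # only available on Linux
        return False
    try:
        os.sched_setaffinity(0, {core_id})
        return True
    except OSError:
        return False


class FakeCompletionQueue:
    """Hypothetical in-memory stand-in for a device completion queue."""

    def __init__(self) -> None:
        self._entries = deque()

    def post(self, io_id: int) -> None:
        self._entries.append(io_id)

    def poll(self, max_completions: int) -> list:
        """Return up to max_completions finished I/Os without blocking."""
        done = []
        while self._entries and len(done) < max_completions:
            done.append(self._entries.popleft())
        return done


def run_poll_loop(cq: FakeCompletionQueue, expected: int, max_spins: int = 1_000_000) -> list:
    """Busy-poll until `expected` completions arrive or the spin budget runs out."""
    completed = []
    spins = 0
    while len(completed) < expected and spins < max_spins:
        completed.extend(cq.poll(max_completions=32))
        spins += 1
    return completed


if __name__ == "__main__":
    pin_to_core(0)
    cq = FakeCompletionQueue()
    for io_id in range(100):
        cq.post(io_id)
    print(len(run_poll_loop(cq, expected=100)), "completions reaped")
```

The trade-off is visible in the loop itself: the core stays busy even when the queue is empty, in exchange for never paying a wake-up cost when a completion does arrive.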
The results they showed by using this approach were staggering. By using 100% of ONE CPU core with their USER space driver, they were able to get 3.6 million IOPS with 277ns of overhead per I/O from the transport protocol. Of course that’s added to the latency of the media, but it is a small fraction of what’s seen when using the regular Linux kernel-mode drivers that run across multiple CPUs. We’re talking nearly linear scalability when you add additional NVMe SSDs using that same single core.
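A quick sanity check on those two numbers: if one core is fully occupied servicing 3.6 million I/Os per second, the time it can spend on each I/O is one second divided by 3.6 million. The sketch below works that out and compares the overhead to the 90-150 microsecond media latencies mentioned earlier. Reading the 277ns as the per-I/O time on a saturated core is my interpretation of the slide, not something Intel stated in those words, and the function names are made up for this example.

```python
def per_io_budget_ns(iops: float) -> float:
    """Nanoseconds available per I/O when one core sustains `iops`."""
    return 1e9 / iops


def overhead_fraction(overhead_ns: float, media_latency_us: float) -> float:
    """Share of total I/O latency that comes from software overhead."""
    media_ns = media_latency_us * 1000.0
    return overhead_ns / (overhead_ns + media_ns)


if __name__ == "__main__":
    print(f"budget per I/O at 3.6M IOPS: {per_io_budget_ns(3.6e6):.1f} ns")
    for media_us in (90, 150):
        share = overhead_fraction(277, media_us)
        print(f"overhead share with {media_us} us media: {share:.2%}")
```

The budget lands at just under 278ns, which lines up with the quoted overhead, and against media latency in the 90-150 microsecond range that overhead is well under 1% of the total.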
This is still relatively young, but the approach Intel is taking with the single-core, user space driver is already being seen in the marketplace (E8 Storage comes to mind, though I don’t know whether they are using Intel’s SPDK or their own stuff).
Intel’s approach of stealing a dedicated core may sound somewhat backwards. However, as Intel CPUs get packed with more and more cores, the cores start to become the cheap commodity, and the cost of stealing a core will start to go below the performance cost of context switching and interrupts (it may have already). The media we’re working with now has become so responsive and performant that the storage doesn’t want to wait for the CPU anymore!
This is also consistent with a trend seen across more than a few of the new storage vendors out there, which is to bring the client into the equation with either software (like the SPDK) or a combination of hardware (like RDMA-capable NICs, or RNICs) and software to help achieve both the high performance AND the latency consistency desired by the most demanding of storage consumers.
We may see this trend of dedicating cores become more popular across domains, as CPU speeds aren’t improving but core counts are, and the hardware around the CPU becomes more performant and dense. If you play THAT out long term, virtualized platform architectures such as VMware that (usually) run hypervisors across all cores may get challenged by architectures that simply isolate and manage workloads on dedicated cores. It’s an interesting possibility.
By giving the SPDK away, Intel is sparking (another) storage revolution that storage startups are going to take advantage of quickly, and one that will change the concept of what we consider “high performance” storage.
**NOTE: Article edited to correct Mark Carlson’s name. My apologies.