Intel Storage – Storage Field Day 12

On the last day of Storage Field Day 12, we visited Intel Storage, who gave us some insight into what they are seeing and how they are driving the future of high-performance storage. Just prior to the Intel session, we heard from SNIA's Mark Carlson about trends and requirements driven mainly by the needs of the Hyperscalers, and Intel's session detailed some very interesting contributions that line up neatly with those needs.

One of the big takeaways from the SNIA session was that the industry is looking to deal with the issue of "tail latency events", or as Jonathan Stern (a most engaging presenter) put it, "P99s".  Tail latency occurs when a device returns data 2x-10x slower than normal for a given I/O request.  Surprisingly, SSDs are three times more likely to have a tail latency event for a given I/O than spinning media.  Working the math out, that means a RAID stripe of SSDs has a 2.2% chance of experiencing tail latency on any given striped I/O, and the upper layers of the stack have to deal with that event by either waiting for the slow drive or reconstructing the late data from parity.
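To see how a small per-drive tail probability snowballs across a stripe, here's a minimal sketch of the math. The per-drive probability and stripe width below are assumptions I picked to land near the 2.2% figure, not numbers from the SNIA or Intel presentations:

```python
# Hypothetical inputs for illustration only -- p and n below are assumptions,
# not figures quoted in the presentations.
def stripe_tail_probability(p: float, n: int) -> float:
    """Chance that at least one of n drives in a stripe hits a tail-latency
    event on a striped I/O: 1 minus the chance that none of them do."""
    return 1.0 - (1.0 - p) ** n

p_per_drive = 0.001   # assume a 0.1% per-I/O tail-latency chance per SSD
stripe_width = 22     # assume a 22-drive stripe

print(f"{stripe_tail_probability(p_per_drive, stripe_width):.1%}")  # ~2.2%
```

The point is that the stripe's exposure grows with width: every drive in the stripe gets a chance to be the slow one on every striped I/O.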

Now one would think that when you're dealing with NVM media latencies of 90-150 microseconds, even a 5x outlier keeps you within a millisecond or so.  But what the industry (read: Hyperscalers, who purchase HALF of all shipped storage bytes) is looking for is CONSISTENCY of latency: they want to offer service levels and be sure their architectures can deliver rock-solid, stable performance characteristics.
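A quick way to see why the P99 matters more than the average: the distribution below is synthetic (an assumed 90-150 µs read band with a 2% chance of a 5x outlier), not data from the session, but it shows how a tail that barely moves the mean still dominates the percentile that SLAs are written against.

```python
import random
import statistics

random.seed(42)
# 98% of reads land in the 90-150 us band; 2% hit a 5x (600 us) tail event.
latencies_us = [
    600.0 if random.random() < 0.02 else random.uniform(90, 150)
    for _ in range(100_000)
]

latencies_us.sort()
mean = statistics.fmean(latencies_us)
p99 = latencies_us[int(0.99 * len(latencies_us))]

print(f"mean={mean:.0f}us  p99={p99:.0f}us")
# The mean only creeps up to ~130 us, but the P99 lands squarely on the 600 us tail.
```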

Intel gave us a great deep dive into the Storage Performance Development Kit (SPDK), which is an answer to getting much closer to that lower standard deviation of latency.  The key difference in their approach, and the most interesting development (one that could drive OTHER efficiencies in non-storage areas, IMO), is that they found that isolating a CPU core for storage I/O in NVMe environments delivers MUCH better performance consistency, primarily because the user-space, polled-mode driver eliminates the cost of interrupts and context switching.
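To show the shape of the "steal a core and poll" idea, here is a conceptual sketch. The real SPDK is a C library; this is just an illustration of the pattern, where the queue stands in for an NVMe completion queue and the core number is an arbitrary choice.

```python
import os
import queue
import threading

completions: "queue.Queue[int]" = queue.Queue()  # stand-in for an NVMe completion queue

def poller() -> None:
    # Pin this thread to one isolated core (Linux-only call; core 3 is arbitrary),
    # then spin on the queue instead of sleeping and waiting for an interrupt.
    os.sched_setaffinity(0, {3})
    while True:
        try:
            io_id = completions.get_nowait()  # poll: never block, never context-switch
        except queue.Empty:
            continue                          # nothing completed yet; keep spinning
        if io_id < 0:
            break                             # sentinel value shuts the poller down
        # ...hand the completed I/O back to the application here...

threading.Thread(target=poller, daemon=True).start()
```

The trade is explicit: one core burns at 100% all the time, and in exchange every completion is noticed within the spin loop rather than after an interrupt, a wakeup, and a context switch.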

The results they showed with this approach were staggering.  Using 100% of ONE CPU core with their USER-space driver, they were able to get 3.6 million IOPS with 277 ns of overhead per I/O from the transport protocol.  Of course that overhead is added on top of the media latency, but it is a small fraction of what's seen with the regular Linux kernel-mode drivers that run across multiple CPUs.  We're talking nearly linear scalability as you add additional NVMe SSDs behind that same single core.
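Those two figures are consistent with each other: if a saturated core spends roughly 277 ns of software time on each I/O, its ceiling works out to about 3.6 million I/Os per second.

```python
overhead_ns_per_io = 277                    # per-I/O software overhead quoted above
iops_ceiling = 1e9 / overhead_ns_per_io     # I/Os per second from one fully-busy core
print(f"{iops_ceiling / 1e6:.1f}M IOPS")    # ~3.6M -- matches the single-core result
```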

This is still relatively young, but the single-core, user-space driver approach Intel is taking is already showing up in the marketplace (E8 Storage comes to mind, though it's unknown whether they are using Intel's SPDK or their own implementation).

Intel’s approach of stealing a dedicated core may sound somewhat backwards; however, as Intel CPUs get packed with more and more cores, cores become the cheap commodity, and the cost of stealing one will drop below the performance cost of interrupts and context switching (it may have already).  The media we’re working with has become so responsive and performant that the storage doesn’t want to wait for the CPU anymore!

This is also consistent with a trend seen across more than a few of the new storage vendors out there, which is to bring the client into the equation with either software (like the SPDK) or a combination of hardware (like RDMA-capable NICs, or RNICs) and software, to help achieve both the high performance AND the latency consistency desired by the most demanding storage consumers.

We may see this trend of dedicating cores become more popular across domains, as CPU clock speeds aren’t improving but core counts are, and the hardware around the CPU keeps getting more performant and dense.  If you play THAT out long term, virtualized platform architectures such as VMware that (usually) run hypervisors across all cores may get challenged by architectures that simply isolate and manage workloads on dedicated cores. It’s an interesting possibility.

By giving the SPDK away, Intel is sparking (another) storage revolution that storage startups are going to take advantage of quickly, and change the concept of what we consider “high performance” storage.


NOTE: Article edited to correct Mark Carlson’s name. My apologies.


HPE buys Nimble for $1.09B – Trying to make sense of this

So per IDC, HPE was statistically tied for the #2 spot in the external storage array market in Q3 ’16, with $549M in sales vs. $650M in the prior year’s Q3.  That’s quite a downward trend, and included in that number are the multiple storage offerings HPE owns: 3PAR, LeftHand, its own arrays, etc.

Today we find out that HPE has paid $1.09B for a company with total revenues of around $500M, losing money at a rate of around $10M per quarter, and with no appreciable share of the external array market. Nimble made its initial bet on a hybrid-flash architecture, which became a problem as the market moved to all-flash. Nimble changed course and delivered all-flash, but many other vendors were far ahead there.

So what gives?  How can Nimble fit into a long-term strategy for HPE?

Nimble isn’t really part of a hyperconverged play, so in the context of the recent SimpliVity acquisition, this seems like a parallel move.

There’s InfoSight, which provides predictive analytics and ease of management for Nimble arrays; perhaps HPE sees a platform it can expand to its enterprise customers. But a BILLION dollars for that??

Nimble has a lot of small customers (over 10,000 at this point, based on a recent press release), but 10,000 customers is a pittance compared to HPE’s existing customer base across all customer sizes.

In the short term, this acquisition will bump HPE into the #2 spot alone in external arrays, but not by much, and given the current trajectory of its external storage array revenue, it’s likely HPE won’t hold that spot for long.

When you consider this acquisition from all the angles (and I’m sure I’m missing some and welcome the discussion), I struggle to make sense of it from HPE’s standpoint.  I only see cannibalization of existing storage business, disruption of the storage sales organization, and no added value to HPE’s overall storage offering.  Did HPE simply have $1B lying around with nothing better to do with it?  Guys, call ME next time.
