Storage companies, persistent losses, and architectural decisions

Today I saw Tintri’s stock price take a 17% hit.  Consensus among my various independent storage pals is that they’ve got two quarters of cash left, and prospects are NOT good for their ability to continue forward.  Their IPO was a huge disappointment, but even if it had raised the desired amount of capital, the revenue and forward sales outlook both still pointed to a bleak future for these guys. It’s too bad; they have some cool technology around VM performance insight in their all-flash platform.

Also, the news this week from Barron’s is that Pure Storage is shopping themselves (this comes via Summit Redstone’s Srini Nandury); IF TRUE, that’s a clear indication that they see no independent future that protects their shareholders’ value. Their revenues continue to grow, but they also have yet to produce a single dollar in profit, and it’s doubtful they will in the near future.

HPE’s acquisition of Nimble is an example of how things can work out OK for those who deploy platforms made by persistently negative-income technology firms; HPE is continuing development on the platform, and giving existing users the comfort that their investment and their time-consuming integrations are safe. So Nimble customers can thank their lucky rabbit’s feet that the right acquirer came along.

But what if there is no HPE equivalent to rescue these technology companies? Certainly Pure has built a great customer base, but will anyone want to put out the $3B-$4B that would satisfy the investors? Would Cisco risk alienating its existing partners and go it alone in the converged infrastructure space after the Whiptail fiasco?  Who in their right mind would touch Tintri at this point from an acquisition perspective?

If you’ve deployed these platforms in your environment, you have some thinking to do.

Consider this net income graph for NetApp from their IPO in 1996:

[Chart: NetApp annual net income, 1996-2017 (ntap96-17.png)]

On an annual basis, NetApp has generally run at a profit from the beginning of their public life.  Certainly, for the four years prior, they ran at a net loss, but their product at the time was used for specific application workloads (development/engineering), NOT as a foundational IT component that touched every piece of the enterprise, as it does today quite well.  It would have been unwise for an IT architect to consider using NetApp in 1994 to house all of their firm’s data, as the future wasn’t as certain for the technology at the time.  But NetApp used their IPO in ’96 as a statement that “we’re here, we’re profitable, and we’re ready to make our lasting mark on the industry.”  Which they did and continue to do.

For comparison, let’s look at Pure Storage net income since its IPO:

[Chart: Pure Storage net income since IPO (PSTG.png)]

It’s hard to call the right side of that chart a “turnaround”. It’s more of an equilibrium.

Now Pure Storage has some really good technology. The stuff works, it works well, and it’s relatively easy to implement and manage.  However, Pure does not differentiate itself enough from the other established (and PROFITABLE) competitors in their space to create a new market that they can dominate (they’re not alone in this; NONE of the smaller storage vendors can claim that they do, and that’s the problem).  As is normal for today’s Angel-to-VC-to-IPO culture, Pure used their IPO as an exit strategy for their VCs and to raise more desperately needed cash for their money-losing growth strategy (the net income chart speaks for itself). That strategy is failing.  With the news that they’re shopping, they realize this too.  When prospective clients realize this, it’s really going to get difficult.

So while the tech geek in all of us LOVES new and cool technology, if you’re going to make a decision on a platform that is foundational in nature (meaning it will be the basis of your IT infrastructure and touch everybody in your firm), you’d be well advised to dig deep into the income statement, balance sheet, and cash flow of the firm making that technology, and put those stats right up there with the features/benefits, speeds and feeds.  Otherwise, you may have some explaining to do later.

Bottom line: if your selected storage manufacturer is losing money, has never actually made any money, and doesn’t look like they’re going to make money anytime soon, there’s a relatively good chance you’re going to be forced into an unwelcome decision point in the near-to-medium future. Caveat emptor.

 

 


NetApp gets the OpEx model right

Ever since the Dot-Com Boom, enterprise storage vendors have had “Capacity on Demand” programs that promised a pay-as-you-use consumption model for storage. Most of these programs met with very limited success, as the realities of the back-end financial models meant that the customers didn’t get the financial and operational flexibility to match the marketing terms.

The main cause of the strain was the requirement for some sort of leasing instrument to implement the program; meaning that there was always some baseline minimum consumption commitment, as well as some late-stage penalty payment if the customer failed to use as much storage as was estimated in the beginning of the agreement. This wasn’t “pay-as-you-use” as much as it was “just-pay-us-no-matter-what”.

NetApp has recently taken a novel approach to this problem, by eliminating the need for equipment title to change from NetApp to the financial entity backing the agreement. With the new NetApp OnDemand program, NetApp retains title to the equipment, and simply delivers what’s needed.

An even more interesting feature of this program is that the customer pays NOT for storage, but for capacity within three distinct performance service levels, each defined by a guaranteed amount of IOPS/TB, and each with its own $/GB/month rate.

To determine how much of each service level a given customer needs, NetApp will perform a free “Service Design Workshop” that uses the NetApp OnCommand Insight (OCI) tool to examine each workload and show its IO density (IOPS/TB). From there, NetApp simply delivers storage that is designed to meet those workloads (along with consideration for growth, after consulting with the customer). They include the necessary software tools to monitor the service levels (Workflow Automation, OnCommand Unified Manager, and OCI), as well as Premium support and all of the ONTAP features that are available in their Flash and Premium bundles.
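To make the IO density idea concrete, here is a minimal sketch of how a workload could be mapped to a service level. The level names and their guaranteed IOPS/TB figures are placeholders I made up for illustration; the real service level definitions aren’t given here, and this is not how OCI itself works.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    peak_iops: float     # measured peak IOPS for the workload
    capacity_tb: float   # capacity the workload occupies, in TB

# HYPOTHETICAL service levels: (name, guaranteed IOPS per TB), lowest first.
SERVICE_LEVELS = [("level_a", 500), ("level_b", 2000), ("level_c", 8000)]

def io_density(workload: Workload) -> float:
    """IO density is simply IOPS divided by capacity in TB."""
    return workload.peak_iops / workload.capacity_tb

def pick_service_level(workload: Workload) -> str:
    """Pick the lowest level whose guarantee covers the workload's density."""
    density = io_density(workload)
    for name, guaranteed_iops_per_tb in SERVICE_LEVELS:
        if guaranteed_iops_per_tb >= density:
            return name
    raise ValueError(f"{workload.name}: {density:.0f} IOPS/TB exceeds every level")
```

A 10 TB database peaking at 20,000 IOPS has a density of 2,000 IOPS/TB, while a 50 TB archive doing 1,000 IOPS sits at 20 IOPS/TB, so the two would land in different levels even though the archive is five times larger.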

Customers can start as low as $2k/month, and go up AND DOWN with their usage, paying only for what they use from a storage perspective AFTER efficiencies such as dedupe, compression, and compaction are taken into account. More importantly, the agreement can be month-to-month or annual; the shorter the agreement, of course, the higher the rate. This is America, after all.
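Here is a rough sketch of what “paying after efficiencies” means for the monthly bill. The rates and efficiency ratios are invented numbers, not NetApp pricing, and the function is my own illustration of the arithmetic rather than anything from the program’s terms.

```python
def monthly_bill(logical_gb: dict, efficiency_ratio: dict, rate_per_gb: dict) -> float:
    """Bill each service level on the capacity actually consumed after efficiencies.

    logical_gb       -- data written by the customer, per service level (GB)
    efficiency_ratio -- combined dedupe/compression/compaction ratio (2.0 means 2:1)
    rate_per_gb      -- HYPOTHETICAL $/GB/month for that service level
    """
    total = 0.0
    for level, written_gb in logical_gb.items():
        consumed_gb = written_gb / efficiency_ratio[level]
        total += consumed_gb * rate_per_gb[level]
    return round(total, 2)

# Made-up example: 30 TB at 2:1 and 10 TB at 4:1.
bill = monthly_bill(
    logical_gb={"level_a": 30_000, "level_b": 10_000},
    efficiency_ratio={"level_a": 2.0, "level_b": 4.0},
    rate_per_gb={"level_a": 0.05, "level_b": 0.10},
)
```

The point of the sketch: the better your data reduces, the less you pay for the same amount of data written, and the bill follows usage in both directions.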

The equipment can sit on the customer’s premises or in a co-location facility, even a near-cloud location such as Equinix, making the NetApp Private Storage economics a true match for the cloud compute that will attach to it.

A great use case for NetApp OnDemand is with enterprise data management software, such as Commvault, which can be sold as a subscription as well as by capacity. Since the software is now completely an OpEx, the target storage can be sold with the same financial model, allowing the customer to have a full enterprise data management solution with the economics of SaaS. Further, there would be no need to over-buy storage for large target environments; it would grow automatically as a function of use. This would be the case with any software sold on subscription, making an integrated solution easier to budget for, as there is no need to cross the CapEx/OpEx boundary within the project.

This new consumption methodology creates all sorts of new project options. The cloud revolution is forcing companies such as NetApp to rethink how traditional offerings can be re-spun to fit the new ways of thinking in the front offices of enterprises. In my opinion, NetApp has gotten something very right here.


Intel Storage – Storage Field Day 12

On the last day of Storage Field Day 12, we got to visit Intel Storage, who gave us some insight into what they were seeing and how they are driving the future of high performance storage.  Just prior to the Intel Storage session, we heard from SNIA’s Mark Carlson about trends and requirements mainly driven by the needs of the Hyperscalers, and Intel’s session detailed some very interesting contributions that line up very neatly with those needs.

One of the big takeaways from the SNIA session was that the industry is looking to deal with the issue of “Tail Latency events”, or as Jonathan Stern (a most engaging presenter) put it, “P99’s”.  Tail latency occurs when a device returns data 2x-10x slower than normal for a given I/O request.  Surprisingly, SSDs are 3 times more likely to have a tail latency event for a given I/O than spinning media.  Working the math out, that means that a RAID stripe of SSDs has a 2.2% chance of experiencing tail latency, and the upper layers of the stack have to deal with that event by either waiting for that data or reconstructing the late data via parity.
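The per-drive probability and stripe width behind that 2.2% weren’t part of my notes, so I can’t reproduce the exact figure, but the shape of the math is simple if you assume each drive’s tail events are independent. A minimal sketch, with the function name and inputs being my own:

```python
def stripe_tail_probability(p_drive: float, drives: int) -> float:
    """Chance that at least one drive in a stripe is slow for a given stripe read.

    ASSUMPTION: tail events on different drives are independent, so the stripe
    is "clean" only when every drive is clean: (1 - p) ** n.
    """
    if not 0.0 <= p_drive <= 1.0:
        raise ValueError("p_drive must be a probability")
    if drives < 1:
        raise ValueError("a stripe needs at least one drive")
    return 1.0 - (1.0 - p_drive) ** drives

# Placeholder inputs, NOT the figures from the session:
narrow = stripe_tail_probability(0.002, 5)
wide = stripe_tail_probability(0.002, 20)
```

Whatever the real inputs were, the takeaway holds: a small per-drive probability compounds across the stripe, so wider stripes see tail events more often.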

Now one would think that when you’re dealing with latencies on NVM of 90-150 microseconds, even going to 5x keeps you within 1ms or so.  But what the industry (read: Hyperscalers who purchase HALF of all shipped storage bytes) is looking for is CONSISTENCY of latency- they want to provide service levels and be sure that their architectures can deliver rock-solid, stable performance characteristics.

Intel gave us a great deep dive into the Storage Performance Development Kit (SPDK), which is their answer to getting much closer to that lower standard deviation of latency.  The main difference in their approach, and the most interesting development (one that could drive OTHER efficiencies in non-storage areas, IMO), is that they have found that isolating a CPU core for storage I/O in NVMe environments provides MUCH better performance consistency, primarily because it eliminates the costs of context switching and interrupt handling (the dedicated core polls the devices instead).

The results they showed by using this approach were staggering.  By using 100% of ONE CPU core with their USER space driver, they were able to get 3.6 million IOPS with 277ns of overhead per I/O from the transport protocol.  Of course that’s added to the latency of the media, but it is a small fraction of what’s seen when using the regular Linux kernel-mode drivers that run across multiple CPUs.  We’re talking nearly linear scalability when you add additional NVMe SSDs using that same single core.
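Those two numbers line up nicely. If you read the 277ns as CPU time the core spends on each I/O (my interpretation, not something Intel stated in those words), a quick back-of-the-envelope check shows one core is almost exactly saturated at 3.6 million IOPS:

```python
NANOSECONDS_PER_SECOND = 1_000_000_000

def core_utilization(iops: float, overhead_ns: float) -> float:
    """Fraction of one core consumed if each I/O costs overhead_ns of CPU time."""
    return iops * overhead_ns / NANOSECONDS_PER_SECOND

def iops_ceiling(overhead_ns: float) -> float:
    """Most I/Os per second a single core could drive at that per-I/O cost."""
    return NANOSECONDS_PER_SECOND / overhead_ns

utilization = core_utilization(3.6e6, 277)   # about 0.997, i.e. ~99.7% of one core
ceiling = iops_ceiling(277)                  # about 3.61 million IOPS
```

In other words, the quoted throughput is essentially the ceiling for a single core at that per-I/O cost, which fits the “100% of ONE core” description.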

This is still relatively young, but the approach Intel is taking with the single-core, user space driver is already being seen in the marketplace (E8 Storage comes to mind, though I don’t know whether they are using Intel’s SPDK or their own stuff).

Intel’s approach of stealing a dedicated core may sound somewhat backwards; however, as Intel CPUs get packed with more and more cores, the cores start to become the cheap commodity, and the cost of stealing a core will start to go below the performance cost of context switching and interrupts (it may have already), as the media we’re working with has become so responsive and performant that the storage doesn’t want to wait for the CPU anymore!

This is also consistent with a trend seen across more than a few of the new storage vendors out there, which is to bring the client into the equation with either software (like the SPDK) or a combination of hardware (like R-NICs) and software to help achieve both the high performance AND the latency consistency desired by the most demanding of storage consumers.

We may see this trend of dedicating cores become more popular across domains, as CPU speeds aren’t improving but core counts are, and the hardware around the CPU is becoming more performant and dense.  If you play THAT out long term, virtualized platform architectures such as VMware that run hypervisors across all cores (usually) may get challenged by architectures that simply isolate and manage workloads on dedicated cores. It’s an interesting possibility.

By giving the SPDK away, Intel is sparking (another) storage revolution that storage startups are going to take advantage of quickly, and change the concept of what we consider “high performance” storage.

 

**NOTE: Article edited to correct Mark Carlson’s name. My apologies.


HPE buys Nimble for $1.09B – Trying to make sense of this

So per IDC, HPE was statistically tied for the #2 spot in the External Storage Array market in Q3 ’16, with $549M in sales vs $650M in the prior year’s Q3.  That’s quite a downward trend. Included in that number are the multiple storage offerings that HPE owns: 3PAR, LeftHand, its own arrays, etc.

Today we find out that HPE has paid $1.09B for a company that had total revenues of around $500M, was losing money at a rate of around $10M per quarter, and had no appreciable market share in the external array market. Nimble made its initial bet on hybrid flash architecture, which became a problem as the market moved to all-flash. Nimble changed course and provided all-flash, but many other vendors were far ahead here.

So what gives?  How can Nimble fit into a long-term strategy for HPE?

Nimble isn’t really part of a hyperconverged play, so in the context of the recent SimpliVity acquisition, this seems a parallel move.

There’s InfoSight, which provides predictive analytics and ease of management for Nimble arrays; perhaps HPE sees a platform it can expand to its enterprise customers. But a BILLION dollars for that??

Nimble has a lot of small customers (over 10,000 at this point based on a recent press release), but 10,000 customers is a pittance compared to HPE’s existing customer base across all customer sizes.

In the short term, this acquisition will bump HPE up into the #2 spot alone in external arrays, but not by much, and given the current trajectory of their external storage array revenue, it’s likely they won’t hold that spot for long.

When I consider this acquisition from all the angles (and I’m sure I’m missing some and welcome the discussion), I struggle to make sense of it from HPE’s standpoint.  I only see cannibalization of existing storage business, disruption of the storage sales organization, and no added value to HPE’s overall storage offering.  Did HPE simply have $1B lying around with nothing better to do with it?  Guys, call ME next time.


Do GUIs really make things simpler?

I was at a technical breakout at a NetApp partner event, and heard a question from an SE (from another partner) that implied that most IT folks don’t have the chops for managing/deploying their environment without using GUIs and wizards.  He postulated that in order to further simplify concepts around complex infrastructure, most admins (especially in small-to-medium enterprises) need the wizard-driven spoon-feeding that GUIs provide so they can get back to the business of doing what they normally do…which is typically putting out all the fires they suffer from on a daily basis.

Ironically, you can trace many of the aforementioned daily fires directly to the use of those GUIs!  Notice I used the term “spoon-feeding” before; that was purposeful. GUIs used to deploy and configure resources (by engineers and not consumers) are just like SUGAR.  It’s sweet going down but you’re going to pay for that in many ways later.

Why is that?

When you use a GUI or wizard (or even a non-idempotent script) to deploy something like a VM, server, OS, app, or even a VLAN, the next time you deploy one you are essentially attempting to rebuild a snowflake with no reference to the original.  This introduces all sorts of unpredictable inconsistencies into the environment, which ultimately results in faults.  Also, testing your GUI-based deployment on a test instance beforehand won’t guarantee you won’t have problems later, since you, the GUI/wizard user, are the most unpredictable component of the deployment, and can’t even guarantee that the test instance matched what you’ll end up with in production.

What’s even worse is that the whole idea of using a deployment wizard implies that you don’t need to know how your [insert tech here] fundamentally works in order to get it deployed.  THIS IS WRONG IN SO MANY WAYS.   How will you know if the [insert tech here] is optimized for your environment?  ..that the configuration the wizard chose won’t cause performance or availability issues down the road? ..that you won’t get boxed into a configuration that limits your flexibility later?   I mean, how can you call yourself an architect/engineer/administrator if you don’t actually LEARN the details of the system you’re going to architect/engineer/administer?

If you decide to use a tool like Puppet or Chef, with which you declare what your [insert tech here] should look like, you MUST by definition know how that technology fundamentally works, at least to some point, right?  You have to make all of your choices up front, in the recipe file for instance, and this forces you to understand the available configuration options, etc., and also allows you to apply that configuration to a test instance FIRST, prior to production deployment.  Go try THAT with wizard-based deployment!
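To show what “declare it once, apply it anywhere” buys you, here is a toy sketch in Python. This is NOT Puppet or Chef syntax, and the function names and VLAN settings are invented for illustration; it only demonstrates the idempotent converge-to-desired-state idea those tools are built around.

```python
def plan(desired: dict, actual: dict) -> dict:
    """Return only the settings whose actual value differs from the declaration."""
    return {key: value for key, value in desired.items() if actual.get(key) != value}

def converge(desired: dict, actual: dict) -> dict:
    """Apply the declared state to a target; a second run changes nothing."""
    changes = plan(desired, actual)
    actual.update(changes)
    return changes

# The declaration is written once, with every choice made up front.
DESIRED_VLAN = {"id": 120, "name": "backup", "mtu": 9000}

test_instance = {}                                         # fresh test target
prod_instance = {"id": 120, "name": "bkup", "mtu": 1500}   # hand-built snowflake

converge(DESIRED_VLAN, test_instance)   # try it on test FIRST
converge(DESIRED_VLAN, prod_instance)   # same declaration, same end state
```

After both runs the test and production targets are identical, and running `converge` again returns an empty set of changes. That is exactly what a click-through wizard can’t promise.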

Of course, this is HARDER and it’s MORE WORK.  Up front, at least. It also requires research and knowledge.  You need to learn how the configuration/automation tools work.  It implies testing, which many don’t believe they have the time for (but somehow always find the time to fix stuff when it goes down).  So yes, the first time you do something this way, it WILL take longer.  The first time.  After that, the things you do most often become more and more trivial, and they’re done RIGHT, consistently.  The daily fires start turning into weekly and perhaps monthly fires.  Life starts getting more enjoyable.

And YES…this smacks of “DevOps”.  I’m not talking about this from a development or even an enterprise perspective though.  I’m talking about all of those small-medium sized businesses who have 1-2 IT folks who run the show, who are always running around with their hair on fire.  I’ve worked with those people for over 20 years and I feel awful at how many personal weekends and nights they lose because stuff is down.

I like to think of it this way: Would you drive on a suspension bridge that was architected and engineered by a GUI-driven wizard, with the architect flying through screen by screen guessing if the default choice in every screen is “ok for now” and clicking “next”?  Or would you feel better knowing that an architect designed that bridge meticulously with great forethought using knowledge of bridges and physics in general, and the engineers thoughtfully built that bridge using the plans but also the knowledge of best practices applied to the specific environment?

Don’t we as IT infrastructure architects and engineers owe it to our employers and our teammates to apply the same rigor to our work?


Willful Ignorance?

I recently had the honor of speaking to a large group of storage and network engineers on the topic of DevOps. My segment was squeezed in between some other content that I’m pretty sure was much more important to them, like product announcements, demos, calls to action, etc.

Why do I think that?

Well, during the segment I asked the crowd a question- “How many of you have read The Phoenix Project by Gene Kim [et al]?”

I counted maybe 15 hands out of…a LOT more. I was flabbergasted.

If you haven’t read this book, it’s highly likely that you do not understand your customers’ problems, and therefore do not understand your customers. One day, you’re going to walk into your biggest customers and they’re going to be very sorry to tell you that you’re not needed anymore, as your offerings (and perhaps sales model) don’t align with their new strategy.

The worst part is, you probably think you’re pretty darn good at this IT stuff. You know your tech (for years now!), you’ve got your speeds and feeds down pat, you have gobs of expertise in this technology or that. Your customers (almost all of them IT folks) come to you with their problems. Perhaps you even socialize with many of these people, and consider these relationships completely safe.

It’s not that your customers won’t need the products and services you currently offer. It’s just that the way that they CONSUME these will require an understanding (on your part) of their new (or soon-to-be new) models and processes that will drive their “accelerating acceleration”, and yes, I’m talking DevOps here. Your services need to be updated to align the products with these new ways, which means making automation, scripting, and infrastructure-as-code major core competencies. Show them how your products and services assist or enable their transformative efforts, or somebody else will. “Somebody else” could be another department (app dev, for instance) that will transform their use of technology outside of IT and REALLY put the screws to your offering. If the products you specialize in can’t align with this philosophy, it’s time to focus on obtaining new expertise in the technologies that will replace them.

Just yesterday I visited with two customers, both of which had “Shadow IT” instances turn into permanent business transformations, as the “Shadow IT” folks were able to deliver value to the business within days, where the IT ops folks were taking weeks into months to deliver the same services. Times have changed, people.

I wouldn’t even start down the road of learning automation or Infrastructure-as-Code until you’ve READ THIS BOOK. (There are many others, but this one will be the most entertaining and therefore the most likely to be completed.) You need to know WHY all of this is important, remind yourself WHY we do what we do as IT professionals, and understand the nature of your customers’ desires to transform in order to steer your own efforts, both personal and organizational, in the right direction.


NetApp SolidFire: FlashForward notes

I had the privilege of attending the NetApp SolidFire Analyst day last Thursday, and rather than go through what the company told us (which was all great stuff), I’ll focus on what I heard from SolidFire’s customers while there, which was probably more relevant and important.

I won’t repeat what’s been detailed elsewhere about the new capacity licensing model from NetApp SolidFire, which breaks the storage appliance model by decoupling the software license from the hardware it runs on.  This new model, called “FlashForward”, allows for a flexibility and a return on investment previously unavailable in the enterprise storage market.

The service provider customers that participated in the breakouts unanimously agreed that this new licensing model was going to be a huge win for them.  The most striking point came from one service provider who had different depreciation schedules for software versus hardware, something that couldn’t be taken advantage of with the appliance model.  Since the FlashForward program allows licenses to be moved between hardware instances, the software can now be depreciated over a longer timeframe (in this SP’s case it was 7 years).  Hardware refreshes now come at a much lower cost in years 3-4.  This all results in the ability to provision resources to tenants with a lower incremental cost per resource.

This could also have a major impact on the ability of companies to finance/lease SolidFire solutions. If you’re financing a given set of software over 6-7 years, obviously the monthly bill for that will be less than if you’re amortizing it over 3 years.   Hardware remains at a 36-month lease with its typical residual.  In a given year, that has the possibility of reducing cash out considerably.
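A quick illustration of why the split schedule matters for the monthly number. The prices below are invented and the math ignores interest and residual negotiation entirely; it’s a straight-line sketch of the idea, not a lease quote.

```python
def monthly_straight_line(amount: float, months: int, residual: float = 0.0) -> float:
    """Straight-line monthly cost of an asset over its term, net of any residual."""
    return (amount - residual) / months

def monthly_bundled(hardware: float, software: float, months: int = 36) -> float:
    """Appliance model: hardware and software share one 36-month schedule."""
    return monthly_straight_line(hardware + software, months)

def monthly_split(hardware: float, software: float, hw_months: int = 36,
                  sw_months: int = 84, hw_residual: float = 0.0) -> float:
    """Decoupled model: hardware over 36 months, software over 7 years."""
    return (monthly_straight_line(hardware, hw_months, hw_residual)
            + monthly_straight_line(software, sw_months))

# HYPOTHETICAL deal: $100k of hardware, $200k of software licensing.
bundled = monthly_bundled(100_000, 200_000)
split = monthly_split(100_000, 200_000)
```

With those made-up inputs the split schedule comes out well under the bundled one each month, and the hardware refresh in years 3-4 only restarts the hardware line, since the licenses move to the new gear.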

Enterprise customers weren’t quite as giddy about FlashForward as they were about the SolidFire technology itself; however, the folks there were all IT, not finance.  Service Provider folks tend to be more focused on the economics of resource delivery.  The Enterprise IT folks were all about the “set-it and forget-it” benefits of their SolidFire implementations, with one customer stating that they only had to call support once in two years, for an event that wasn’t even SolidFire-related.  Certainly, we had happy customers at the Analyst event, but their stories were all those of the challenges of choosing a smaller storage vendor (at the time), against major industry headwinds and having to justify that decision with full proof-of-concepts.  Impressive stuff.

Of course SolidFire has made its name in the Service Provider market; their embrace of automation technology and the DevOps philosophy is recognized as leading the market.  This is precisely why we should be keeping a very close eye on these folks, as Enterprise IT looks to become its own Service Provider, and automate its on-premises resources in the same manner in which it automates its cloud resources.  Given this automation advantage and the acquisitional flexibility now offered, SolidFire is going to align very well with enterprises that have historically implemented the straight SAN storage appliance model but are looking to transform and modernize.
