NetApp connects Hadoop to NFS

The link to Val Bercovici’s article is here.

Here’s the gist-

Hadoop natively uses HDFS, a file system designed for node-level redundancy. By default, data is replicated THREE times across nodes in a Hadoop cluster.  The nodes themselves, at least in the “tradition” of Hadoop, do not perform any RAID at all; if a node’s filesystem fails, the data already exists elsewhere and any running MapReduce jobs are simply started over.
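For reference, that three-way copy is just Hadoop’s default replication factor, controlled by the standard `dfs.replication` property in `hdfs-site.xml` (the value shown is Hadoop’s shipped default):

```xml
<!-- hdfs-site.xml: HDFS block replication factor (Hadoop's default is 3) -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
```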

This is great if you have a few thousand nodes and the people you’re crunching data for are at-large consumers who aren’t paying for your service and as such cannot expect service levels of any kind.

Enterprise, however, is a different story.  Once business units start depending on reduced results from Hadoop, they start depending on the timeframe in which those results are delivered as well.  Simply starting jobs over is NOT going to please anyone and could interrupt business processes.  Further, enterprises don’t have the space or budget to stand up Hadoop clusters at the scale the Facebooks and Yahoos do (they also don’t typically have the justifiable use cases). In fact, the enterprises I’m working with are taking a “build it and the use cases will come” approach to Hadoop.

NetApp’s NFS connector for Hadoop significantly lowers the entry point for businesses that want to vet Hadoop and justify use cases.  One of the traditional problems with Hadoop is that one needs to create a siloed architecture of servers, storage, and network at a scale large enough to prove Hadoop’s worth.

Now, businesses can throw compute (physical OR virtual) into a Hadoop cluster and connect to existing NFS datastores – whether they are on NetApp or not!   NetApp has created this connector and thrown it upon the world as open source on GitHub.

This removes a huge barrier to entry for any NetApp (or NFS!) customer who is looking to perform analytics against an existing dataset without moving it or creating duplicate copies.

Great move.


Rant #1: Data-At-Rest Encryption

Subtitled: Data-at-rest encryption compare and contrast, NetApp FAS & VNX2

So every once in a while I run across a situation at a client where I get to see the real differences in approach between technology manufacturers.

The specific focus is on data-at-rest encryption.  Encrypting data that resides on hard drives is a good practice, provided it can be done cost-effectively with minimal effect on performance, availability, and administrative complexity. Probably the best reason I can come up with for implementing data-at-rest encryption is the drive-failure case: you’re expected to send that ‘failed’ hard drive back to the manufacturer, where they will mount it to see just how ‘failed’ it really is.  Point is, you’ve still got data on there. If you’re a service provider, you’ve got your client’s data on there. Not good. Unless you’ve got an industrial-grade degausser, you don’t have many options here.  Some manufacturers offer a special support upgrade that lets you throw away failed drives instead of returning them, but that’s a significant bump up in cost.

OK, so now you’ve decided, sure, I want to do data-at-rest encryption.  Great!  Turn it on!

Not so fast.

The most important object in the world of encryption is the key.  Without the key, all that business-saving data on that expensive enterprise-storage solution is useless.  Therefore, you need to implement a key-management solution to make sure that keys are rotated, backed up, remembered, and, most importantly, available for a secure restore.  The key-management solution, like every other important piece of IT equipment, needs a companion in DR in case the first one takes a dive.

Wait, secure restore?  Well, what’s the point of encrypting data if you make it super-simple to steal the data and then decrypt it?  Most enterprise-grade key-management solutions implement quorums for key-management operations, complete with smart cards, etc.   This helps prevent the all-too-common occurrence of “inside man” data theft.
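To illustrate the quorum idea, here’s a minimal Python sketch of k-of-n secret sharing (Shamir’s scheme, the technique commonly behind such quorums). Real key managers use hardened, audited implementations; the field size and share counts here are purely illustrative:

```python
import random

P = 2**127 - 1  # a Mersenne prime; all arithmetic is over the field GF(P)

def split_secret(secret, n, k):
    """Split `secret` into n shares; any k of them can reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
    # Each share is a point (x, f(x)) on a random degree-(k-1) polynomial
    # whose constant term is the secret.
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def combine_shares(shares):
    """Recover the secret via Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * -xj % P
                den = den * (xi - xj) % P
        # pow(den, P - 2, P) is the modular inverse of den (P is prime).
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret
```

With n = 5 custodians and k = 3, any three smart cards can authorize a key operation, while one or two insiders acting alone recover nothing.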

NetApp’s answer to data-at-rest encryption is the self-encrypting drive.  The drives themselves perform all the encryption work and pass data to the controller as any drive would.  The biggest caveat here is that all drives in a NetApp HA pair must be of the NSE drive type, and you can’t use Flash Pool or MetroCluster.

NetApp partners with a couple of key-management vendors, but OEMs the SafeNet KeySecure solution for key management.  Having the keys stored off-box ensures that if some bad guy wheels away with your entire storage device, they’ve got nuthin’.  It won’t boot without being able to reach the key-management system.  SafeNet’s KeySecure adheres to many (if not all) industry standards and can simultaneously manage keys for other storage and PKI-enabled infrastructure.  I consider this approach strategic because it thinks holistically: one place to manage keys across a wide array of resources.  Scalable administration, high availability, maximum security.  Peachy.

I had the opportunity to contrast this approach against EMC’s VNX2 data-at-rest solution.  EMC took its typical tactical approach, as I will outline next.

Instead of using encrypting drives, EMC chose the path of the encrypting SAS adapter: they use PMC-Sierra’s Tachyon adapter, which performs inline ASIC-based encryption.  It’s important to note that this encryption technology has nothing to actually do with the VNX2 itself. More on this architectural choice later.

Where the encryption is done in a solution is just as important as the key-management portion of the solution; without the keys, you’re toast.  EMC, owner of RSA, took a version of RSA key-management software and implemented it in the storage processors of the VNX2.  This is something they tout as a great feature: “embedded key management.”   The problem is, they have totally missed the point.  Having a key manager on the very box that contains the data you’re encrypting is a little like leaving your keys in your car.  If someone takes the entire array, they have your data.  Doesn’t this go against the very notion of encrypting data? Sure, you’re still protected from someone swiping a drive.  But entire systems get returned off lease all the time.

Now, of course, if you’ve got a VNX2, you’ve got a second one in DR.  That box has its OWN embedded key manager.  Great.  Now I’ve got TWO key managers to back up every time I make a drive-config change to the system (and if I change one, I’m probably changing the other).

What?  You say you don’t like this and you want to use a third-party key manager?  Nope.  VNX2 will NOT support any third-party, industry-standard-compliant key manager.  You’re stuck with the embedded one.  This embedded key manager sounds more like an albatross than a feature.  Quite frankly, I’m very surprised that EMC would limit clients in this way, since the PMC-Sierra encryption technology in the VNX2 DOES support third-party key managers!  Gotta keep RSA competitors away, though; that’s more important than doing right by clients, right?

OK. On to the choice of the SAS encrypting adapter vs. the encrypting hard drives.

Encrypting at the SAS layer has the great advantage of being able to encrypt to any drive available on the market.  That’s a valid architectural advantage from a cost and product choice perspective. That’s where the advantages stop, however.

It should seem obvious that having many devices working in parallel on a split-up data set is much more efficient than having 100% of the data load worked on at one point (I’ll call it the bottleneck!) in the data chain.  The performance-hit figures supplied by the vendors bear this out.  EMC states a <5% performance hit with encryption enabled, but with the caveat that large block operations (>256KB) and high-throughput situations could result in higher performance degradation.  NetApp has no such performance restriction (its back-end ops are always smaller), and the encryption work is being done by many, many spindles at a time, not a single ASIC (even if there are multiples, the same point applies).  I can see how an encrypting SAS adapter would be much easier to get to market quickly, and the allure of encrypt-enabling all existing drives is strong.  But architecturally, it’s way too risky to purposely introduce a single bottleneck that affects an entire system.
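To make the parallelism point concrete, here’s a back-of-the-envelope sketch. The numbers are purely hypothetical (neither vendor publishes per-component encryption rates in these terms); the point is how the two architectures scale:

```python
# Hypothetical throughput figures, for illustration only.
drive_count = 120                 # spindles in the system
per_drive_encrypt_mb_s = 150      # assumed AES engine rate of one self-encrypting drive
asic_encrypt_mb_s = 6_000         # assumed rate of a single inline encrypting ASIC

# Self-encrypting drives: encryption capacity scales with spindle count.
sed_aggregate_mb_s = drive_count * per_drive_encrypt_mb_s

# Inline adapter: every byte funnels through one fixed-rate engine,
# so the ceiling stays flat no matter how many drives you add.
print(f"SED aggregate: {sed_aggregate_mb_s} MB/s vs. ASIC ceiling: {asic_encrypt_mb_s} MB/s")
```

Under these assumptions the per-drive engines already triple the single ASIC’s ceiling, and the gap widens with every shelf of drives added.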

Coming to the end of this rant.  It just never ceases to amaze me that when you dig into the facts and the architecture, you’ll find that some manufacturers always think strategically, and others can’t stop thinking tactically.


Death of the Expert

Last week, I had the pleasure of attending the first #RWDevCon, an iOS Development/Tutorial conference in Washington, DC.  About 200 developers converged on the Liaison Hotel, and for two days engaged in some intensive demo- and lab-oriented learning.  Sessions also included some great inspirational talks from folks that have experienced, survived, and thrived in the coding business.

While I’ve done some coding myself over the years (and secretly wish I could do more of it), I was there for other reasons, primarily accompanying my 15-year-old son who has achieved a level of coding I can’t even dream of attaining.  I figured instead of holing up in my hotel room while my kid received all the golden knowledge, I’d attend the ‘beginner’ sessions and get what I could out of the experience.

One thing I definitely noticed is that programming, er.. coding, has changed.  I remember the days where a programmer would write a program, and would be responsible for all facets of it over time.  This included the user interface, the data flow, communications, etc.  Even in the beginning of the “App” explosion, independent developers were the cool guys to emulate.

The term used to describe this paradigm shift was the “Indiepocalypse” (credit for that goes to the esteemed Ray Wenderlich, the “RW” in RWDevCon).  Projects are now all done by teams.  The ability to work effectively in a development team is now just as, if not more, important than the ability to code.  No more lone wolves, no more genius-in-a-closed-closet writing awesome code nobody else can understand.  Different team members now focus on what they do best: user experience, architectural design, database, etc.   It’s gotten too complex for one person to be “the expert” at all of these concepts.

So what does this have to do with Infrastructure? Infrastructure has gotten so converged that each discipline within IT architecture affects the others in ways the players and managers don’t often recognize.  There are a few lessons we can take from the “Indiepocalypse” and apply to our world:

  • Don’t rely on a single “Expert”

    All of your organization’s talent can’t be locked in one person’s head, no matter how good that person is.  There are obvious “hit by a bus” consequences to this, but it also creates a huge imbalance in the organization: every decision that gets made will have that person’s stamp all over it, at the cost of everyone else’s opinions.
  • Constant learning and team building

    Learning, especially when done as a team, ensures that everyone gets the same opportunities to effectively contribute, and gives the team a common “code base” from which to “code” the future of the infrastructure.
  • Embracing Change

    Developers at this conference were there to embrace Apple’s move from Objective-C to Swift, as well as newer technologies and best practices.  Staying put is not an option for anyone who wants to be needed for the next project.  Why should infrastructure be any different?  Teams need to understand ALL new technologies, whether they’re going to use them or not; otherwise, how are they to effectively provide the business with the best options going forward?  How are they to adequately filter through all of the tech marketing thrown at them and bring real solutions to real problems?

There are plenty more parallels that could be laid out; I’ll leave it to the reader to think those through.  The point is, we need to start treating our IT architecture more like a software-development environment, especially since everything is becoming “Software-Defined.”  There are tons of lessons learned out there, but they’re all “over the wall” in the software-development departments.  Time to tear down those walls.


IT architecture – Tactical vs. Strategic

I decided this should be the primary focus of my writings, or at least the filter through which I discuss the technical and architectural aspects of what I do, which is consult with enterprises of all sizes on their IT architecture. There are plenty of blogs written by folks in my space that give amazing insight into specific technical features and “how-to”s, and I’m sure I’ll throw a few of those in here and there.  What I find lacking a bit is the WHY, and also WHY IT MATTERS.

First, some definitions are in order.

What do I mean by the word “tactical”?  When I refer to an organization as behaving tactically with regard to IT architecture, I mean that it solves problems and tackles projects as they are presented.  Each problem or project has its own set of demands and requirements, and for a variety of reasons the organization is set up to deal with these in an isolated way: perhaps budgets are allocated by project, the political structure is such that people jealously protect their fiefdoms, or it’s just easier and faster to operate this way.   Whatever the reason, each issue that comes up is handled as distinct.  This leads to data silos, multiplied administrative and training requirements, and more than the minimum number of data copies (which presents real risk).

The “Strategic” organization, in contrast, will take a step back and see that most IT problems and projects have much, if not everything, in common.  These firms are most concerned with creating and maintaining a responsive, reliable, measurable, and operational IT ecosystem in which efficiencies are realized, and that can provide management with the confidence that future projects (such as acquisitions or new go-to-market initiatives) can be handled with relative ease.  I’m proud to work with more than a few of these companies, and they are truly enjoying the benefits of this approach.

In my role as a technical evangelist, I find that most often I am evangelizing this latter approach to clients more than I speak about specific technologies.  The technologies I choose to offer clients, and I am lucky enough to have the freedom to choose any technologies I wish, all provide my clients with the tools they need to go down the Strategic Path.  Of course, my clients can continue to use these tools tactically, which while functional, misses the point.  Technology alone can’t transform an organization; it takes patience and discipline from the top down, and usually a few iterations to get it right.  The wrong technologies, however, can BLOCK the Strategic Path, and it’s important to recognize this up front because in the IT industry, you’re typically stuck with your tools for a minimum of three years.

So I’ll conclude this first chapter with this: I’ll make no secret that the core set of products I have recommended to my clients includes NetApp, Commvault, and Riverbed.  There are a few others I add to that mix when appropriate, and other sets of technology I steer clear of, as they tend to block the strategic path.  I’ll try to justify those decisions as well, from a strategic perspective.
