
HANA vs. Exalytics: an Analyst's View

Posted by David Dobrin in Blog on May 6, 2012 10:00:28 PM
Introduction

 

Some people at SAP have asked me to comment on the "HANA vs. Exalytics" controversy from an analyst's point of view.  It's interesting to compare, so I'm happy to go along.  In this piece, I'll try to take you through my thinking about the two products.  I can see that Vishal Sikka just posted a point-by-point commentary on Oracle's claims, so I won't try to go him one better.  Instead, I'll try to make the comparison as simple and as friendly to non-techies as I can.  Note that SAP did not commission this post, nor did it ask to edit it.

 

Exalytics

 

Let's start with something that every analyst knows:  Oracle does a lot of what in the retail trade are called "knockoffs."

 

They think this is good business, and I think they're right.  The pattern is simple.  When someone creates a product and proves a market, Oracle (or, for that matter, SAP) creates a quite similar product.  So, when VMware (and others) prove vSphere, Oracle creates Oracle VM; when Red Hat builds a business model around Linux, Oracle creates "Oracle Linux with Unbreakable Enterprise Kernel."

 

We analysts know this because it's good business for us, too.  The knockoffs--oh, let's be kind and call them OFPs for "Oracle Follow-on Products"--are usually feature-compatible with the originals, but have something, some edge, which (Oracle claims) makes them better.

 

People get confused by these claims, and sometimes, when they get confused, they call us.

 

Like any analyst, I've gotten some of these calls, and I've looked at a couple of the OFPs in detail.  Going in, I usually expect that Oracle is offering pretty much what you see at Penneys or Target, an acceptable substitute for people who don't want to or can't pay a premium, want to limit the number of vendors they work with, aren't placing great demands on the product, etc., etc.

 

I think this, because it's what one would expect in any market.  After all, if you buy an umbrella from a sidewalk vendor when it starts to rain, you're happy to get it, it's a good thing, and it's serviceable, but you don't expect it to last through the ages.

 

Admittedly, software is much more confusing than umbrellas. With a $3 umbrella, you know what you're getting.  With an OFP (or a follow-on product from any company), you don't necessarily know what you're getting.  Maybe the new product is a significant improvement over what's gone before. If the software industry were as transparent as the sidewalk umbrella industry, maybe it would be clear.  But as it is, you have to dig down pretty deep to figure it all out. And then you may still be in trouble, because when you get "down in the weeds," as my customers have sometimes accused me of being, you might be right, but you might also fail to be persuasive.

 

Which brings me to the current controversy.  To me, it has a familiar ring. SAP releases HANA, an in-memory database appliance.  Now Oracle has released Exalytics, an in-memory database appliance.  And I'm getting phone calls.

 

HANA:  the Features that Matter

 

I'm going to try to walk you through the differences here, while avoiding getting down in the weeds. This is going to involve some analogies, as you'll see.  If you find these unpersuasive, feel free to contact me.

 

To do this, I'm going to have to step back from phrases like "in-memory" and "analytics," because now both SAP and Oracle are using this language. I'll look instead at the underlying problem that "in-memory" and "analytics" are trying to solve.

 

This problem is really a pair of problems.  Problem 1.  The traditional row-oriented database is great at getting data in, not so good at getting data out.  Problem 2.  The various "analytics databases," which were designed to solve Problem 1--including, but not limited to the column-oriented database that SAP uses--are great at getting data out, not so good at getting data in.

 

What you'd really like is a column-oriented (analytics) database that is good at getting data in, or else a row-oriented database that is good at getting data out.

 

HANA addresses this problem in a really interesting way.  It is a database that can be treated as either row-oriented or column-oriented.  (There is literally a software switch that you can throw.)  So, if you want to do the very fast and flexible analytic reporting that column-oriented databases are designed to do, you throw the switch and run the reports.  And if you want to do the transaction processing that row-oriented databases are designed to do, you throw the switch back.

 

Underneath, it's the same data; what the switch throws is your mode of access to it.
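To make the row/column distinction concrete, here is a minimal sketch in Python.  It is emphatically not HANA's actual API--there is no literal dictionary-flipping switch--but it shows why one orientation favors inserting whole records and the other favors scanning a single field across all records.

```python
# Illustrative sketch only -- not HANA's actual storage engine or API.
# The same records, held row-oriented (one tuple per record, good for
# inserts) and column-oriented (one array per field, good for scans).

rows = [
    {"id": 1, "region": "EMEA", "amount": 120.0},
    {"id": 2, "region": "APJ",  "amount": 75.5},
    {"id": 3, "region": "EMEA", "amount": 310.2},
]

# "Throwing the switch": derive the columnar view of the same data.
columns = {key: [r[key] for r in rows] for key in rows[0]}

# Transactional access (row-oriented): append one whole record.
rows.append({"id": 4, "region": "AMER", "amount": 99.9})

# Analytic access (column-oriented): scan one field across all records.
total_emea = sum(amount
                 for region, amount in zip(columns["region"], columns["amount"])
                 if region == "EMEA")
```

The point of the sketch is that both views are projections of the same underlying data, which is exactly the property that lets HANA serve both workloads from one store.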

 

In extolling this to me, my old analyst colleague, Adam Thier, now an executive at SAP, said, "In effect, it's a trans-analytic database."  (This is, I'm sure, not official SAP speak.  But it works for me.)  How do they make the database "trans-analytic?"  Well, this is where you get down into the weeds pretty quickly.  Effectively, they use the in-memory capabilities to do the caching and reindexing much more quickly than would have been possible before memory prices fell.

 

There's one other big problem that the in-memory processing solves.  In traditional SQL databases, the only kind of operation you can perform is a SQL operation, which is basically going to be manipulation of rows and fields in rows.  The problem with this is that sometimes you'd like to perform statistical functions on the data:  do a regression analysis, etc., etc.  In a traditional database, though, you're kind of stymied; statistical analysis in a SQL database is complicated and difficult.

 

In HANA, "business functions" (what marketers call statistical analysis routines) are built into the database.  So if you want to do a forecast, you can just run the appropriate statistical function.  It's nowhere near as cumbersome as it would be in a pure SQL database.  And it's very, very fast;  I have personally seen performance improvements of three orders of magnitude.
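Here is a rough picture of what one such business function does.  The function below is my own toy implementation of a least-squares trend forecast--the statistics are standard, but the name and signature are mine, not HANA's Business Function Library.

```python
# Illustrative sketch only -- not the HANA Business Function Library.
# The point: a forecast becomes one function call over the data in
# place, rather than an export-to-a-stats-tool round trip.

def linear_forecast(series, steps_ahead=1):
    """Fit an ordinary least-squares trend line and extrapolate it."""
    n = len(series)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(series) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, series))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + steps_ahead)

# A perfectly linear toy series: the trend continues at +10 per month.
monthly_revenue = [100.0, 110.0, 120.0, 130.0]
next_month = linear_forecast(monthly_revenue)
```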

 

Exalytics:  the Features that Matter

 

Now when I point out that HANA is both row-oriented (for transactions) and column-oriented (so that it can be a good analytics database) and then I point out that it has business functions built-in, I am not yet making any claim about the relative merits of HANA and Exalytics.

 

Why?  Well, it turns out that Exalytics, too, lets you enter data into a row-oriented database and allows you to do reporting on the data from an analytics database.  And in Exalytics, too, you have a business function library.

 

But the way it's done is different.

 

In Exalytics, the transactional, row-oriented capabilities come from an in-memory database (the TimesTen product that Oracle bought back in 2005).  The analytics capabilities come from Essbase (which Oracle bought about 5 years ago), and the business function library is an implementation of the open-source R statistical programming language.

 

So what, Oracle would argue. It has the features that matter.  And, Oracle would argue, it also has an edge, something that makes this combination of databases clearly better.  What makes it better, according to Oracle? In Exalytics, you're getting databases and function libraries that are tested, tried, and true.  TimesTen has been at the heart of Salesforce.com since its inception.  Essbase is at the heart of Hyperion, which is used by much of the Global 2000.  And R is used at every university in the country.

 

Confused?  Well, you should be.  That's when you call the analyst.

 

HANA vs. Exalytics

 

So what is the difference between the two, and does it matter?  If you are a really serious database dweeb, you'll catch it right away:

 

In HANA, all the data is stored in one place. In Exalytics, the data is stored in different places.

 

So, in HANA, if you want to report on data, you throw a switch.  In Exalytics, you extract the data from the TimesTen database, transform it, and load it into the Essbase database.  In HANA, if you want to run a statistical program and store the results, you run the program and store the results.  In Exalytics, you extract the data from, say, TimesTen, push it into an area where R can operate on it, run the program, then push the data back into TimesTen.
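The contrast can be sketched in a few lines of Python.  Everything here is illustrative--the store layouts and the extract/transform/load helpers are mine, not Oracle's or SAP's tooling--but the shape of the two access patterns is the point.

```python
# Hedged sketch of the two patterns contrasted above; all names are
# illustrative, not actual Oracle or SAP tooling.

transactional_store = {"orders": [("2012-05", 120.0), ("2012-05", 75.5)]}

# Pattern A (multiple stores): move the data before you can analyze it.
def extract(store):
    return list(store["orders"])          # copy out of the row store

def transform(records):
    return [amount for _month, amount in records]   # reshape for the cube

def load(cube, values):
    cube["amounts"] = values              # land it in the analytic store

analytic_store = {}
load(analytic_store, transform(extract(transactional_store)))
total_a = sum(analytic_store["amounts"])

# Pattern B (one store): compute where the data already lives.
total_b = sum(amount for _month, amount in transactional_store["orders"])

assert total_a == total_b  # same answer; the difference is the round trip
```

Both patterns produce the same number; what differs is that Pattern A has introduced a second copy of the data that now has to be kept current, sized, and managed.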

 

So why is that a big deal?  Again, if you're a database dweeb, you just kind of get it.  (In doing research for this article, I asked one of those dweeb types about this, and I got your basic shrug-and-roll-of-eye.)

 

I'm not that quick. But I think I sort of get what their objection is.  Moving data takes time.  Since the databases involved are not perfectly compatible, one needs to transform the data as well as move it. (Essbase, notoriously, doesn't handle special characters, or at least didn't use to.)  Because it's different data in each database, one has to manage the timing, and one has to manage the versions.  When you're moving really massive amounts of data around (multi-terabytes), you have to worry about space.  (The 1TB Exalytics machine only has 300 GB of actual memory space, I believe.)

 

One thing you can say for Oracle.  They understand these objections, and in their marketing literature, they do what they can to deprecate them.  "Exalytics," Oracle says, "has InfiniBand pipes" that presumably make data flow quickly between the databases, and "unified management tools" that presumably allow you to keep track of the data. Yes, there may be some issues related to having to move the data around.  But Oracle tries to focus you on the "tried and true" argument. You don't need to worry about having to move the data between containers, not when each of the containers is so good, so proven, and has so much infrastructure already there, ready to go.

 

As long as the multiple databases are in one box, it's OK, they're arguing, especially when our (Oracle's) tools are better and more reliable.

 

Still confused?  Not if you're a database dweeb, obviously.  Otherwise, I can see that you might be.  And I can even imagine that you're a little irritated. "Here this article has been going on for several hundred lines," I can hear you saying, "and you still haven't explained the differences in a way that's easy to understand."

 

HANA:  the Design Idea

 

So how can you think of HANA vs. Exalytics in a way that makes the difference between all-in-one-place and all-in-one-box-with-InfiniBand-pipes-connecting-stuff completely clear?  The right way, it seems to me, is to look at the design idea that's operating in each.

 

Here, I think, there is a very clear difference.  In TimesTen or Essbase or other traditional databases, the design idea is roughly as follows: if you want to process data, move it inside engines designed for that kind of processing. Yes, there's a cost. You might have to do some processing to get the data in, and it takes some time.  But those costs are minor, because once you get it into the container, you get a whole lot of processing that you just couldn't get otherwise.

 

This is a very normal, common design idea.  You saw much the same idea operating in the power tools I used one summer about forty years ago, when I was helping out a carpenter.  His tools were big and expensive and powerful--drill presses and table saws and such like--and they were all the sort of thing where you brought the work to the tool. So if you were building, say, a kitchen, you'd do measuring at the site, then go back to the shop and make what you needed.

 

In HANA, there's a different design idea:  Don't move the data.  Do the work where the data is.  In a sense, it's very much the same idea that now operates in modern carpentry.  Today, the son of the guy I worked for drives up in a truck, unloads a portable table saw and a battery-powered drill, and does everything on site and it's all easier, more convenient, more flexible, and more reliable.

 

So why is bringing the tools to the site so much better in the case of data processing (as well as carpentry)?  Well, you get more flexibility in what you do, and you get to do it a lot faster.

 

To show you what I mean, let me give you an example.  I'll start with a demo I saw a couple of years ago of a relatively light-weight in-memory BI tool.

 

The salesperson/demo guy was pretty dweeby, and he traveled a lot.  So he had downloaded all the wait times at every security gate in every airport in America from the TSA web site.  In the demo, he'd say, "Let's say you're in a cab.  You can fire up the database and get a graph of the wait times at each security checkpoint.  So now you can tell which checkpoint to get out at."

 

The idea was great, and so were the visualization tools.  But at the end of the day, there were definite limitations to what he was doing.  Because the system was basically just drawing data out of the database using SQL, all he was getting was a list of wait times, which were a little difficult to deal with.  What one would really want is the probability that a delay would occur at each of the checkpoints, based on time of day and a couple of other things.  But that wasn't available, not from this system, not in a cab.

 

Perhaps even worse, he wasn't really working with real-time data. If you're sitting in the cab, what you really want to be working with is recent data, but he didn't have that data; his system couldn't really handle an RSS feed.

 

Now, consider what HANA's far more extensive capabilities do for that example.  First of all, in HANA, data can be imported pretty much continuously.  So if he had an RSS feed going, he could be sure the database was up-to-date.  Second, in HANA, he could use the business functions to do some statistical analysis of the gate delay times.  So instead of columns of times, he could get a single, simple output containing the probability of a delay at each checkpoint.  He can do everything he might want to do in one place.  And this gives him better and more reliable information.
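The "single, simple output" step is easy to picture.  The sketch below reduces columns of raw wait times to one delay probability per checkpoint; the numbers and the 20-minute threshold are made up for the illustration, and this is plain Python, not a HANA business function.

```python
# Illustrative only: collapse columns of raw wait times into one
# probability of a long delay per checkpoint.  Sample data and the
# 20-minute threshold are invented for the sketch.

wait_times = {
    "Checkpoint A": [5, 12, 30, 25, 8, 40],   # minutes, recent samples
    "Checkpoint B": [3, 6, 7, 10, 4, 9],
}

THRESHOLD_MINUTES = 20

def delay_probability(samples, threshold=THRESHOLD_MINUTES):
    """Fraction of observed waits longer than the threshold."""
    return sum(1 for t in samples if t > threshold) / len(samples)

probabilities = {gate: delay_probability(ts)
                 for gate, ts in wait_times.items()}
best_gate = min(probabilities, key=probabilities.get)
```

Instead of scanning columns of times in the back of a cab, the rider gets one answer: which checkpoint is least likely to be backed up.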

 

So What Makes It Better?

 

Bear with me.  The core difference between HANA and Exalytics is that in HANA, all the data is in one place.  Is that a material difference?  Well, to some people it will be; to some people, it won't be.  As an analyst, I get to hold off and say, "We'll see."

 

Thus far, though, I think the indications are that it is material.  Here's why.

 

When I see a new design idea--and I think it's safe to say that HANA embodies one of those--I like to apply two tests.  Is it simplifying?  And is it fruitful?

 

Back when I was teaching, I used to illustrate this test with the following story:

 

A hundred years ago or so, cars didn't have batteries or electrical systems.  Each of the things now done by the electrical system was thought of as an entirely separate function, performed in an entirely different way.  To start the car, you used a hand crank.  To illuminate the road in front of the car, you used oil lanterns mounted where the car lights are now.

 

Then along came a new design idea: batteries and wires.  This idea passed both tests with flying colors.  It was simplifying.  You could do lots of different things (starting the car, lighting up the road) with the same apparatus, in an easier and more straightforward way (starting the car or operating the lights from the dashboard).  But it was also fruitful.  Once you had electricity, you could do entirely new things with that same idea, like power a heater motor or operate automatic door locks.

 

So what about HANA?  Simplifying and fruitful?  Well, let's try to compare it with Exalytics. Simplifying?  Admittedly, it's a little mind-bending to be thinking about both rows and columns at the same time.  But when you think about how much simpler it is conceptually to have all the data in one database and think about the complications involved when you have to move data to a new area in order to do other operations on it, it certainly seems simplifying.

 

And fruitful?

 

Believe it or not, it took me a while to figure this one out, but Exalytics really helped me along.  The "Aha!" came when I started comparing the business function library in HANA to the "Advanced Visualization" that Oracle was providing.  When it came to statistics, they were pretty much one-to-one; the HANA developers very self-consciously tried to incorporate the in-database equivalents of the standard statistical functions, and Oracle very self-consciously gave you access to the R function library.

 

But the business function library also does…ta da…business functions, things like depreciation or a year-on-year calculation.  Advanced Visualization doesn't. 

 

This is important not because HANA's business function library has more features than R, but because HANA is using the same design idea (the Business Function Library) to enrich various kinds of database capabilities.  On the analytics side, they're using the statistical functions to enrich analytics capabilities.  On the transaction side, they're using the depreciation calculations to enrich the transaction capabilities.  For either, they're using the same basic enrichment mechanism.
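For concreteness, here is what those two business functions actually compute.  The formulas (straight-line depreciation, year-on-year growth) are standard accounting arithmetic; the function names and signatures are mine, not the Business Function Library's.

```python
# Hedged sketch of two "business functions" as opposed to purely
# statistical ones.  Standard formulas; names are illustrative only.

def straight_line_depreciation(cost, salvage, useful_life_years):
    """Annual depreciation expense under the straight-line method."""
    return (cost - salvage) / useful_life_years

def year_on_year(current, prior):
    """Fractional growth versus the same period a year earlier."""
    return (current - prior) / prior

annual_expense = straight_line_depreciation(10_000.0, 1_000.0, 5)
growth = year_on_year(120.0, 100.0)
```

Neither of these is a statistical routine, which is exactly the author's point: R gives you statistics, while a business function library also gives you the bookkeeping calculations that transactions need.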

 

And that's what Oracle would find hard to match, I think. Sure, they can write depreciation calculation functionality; they've been doing that for years.  But to have that work seamlessly with the TimesTen database, my guess is that they'd have to create a new data storage area in Exalytics, with new pipes and changes in the management tools.

 

Will HANA Have Legs?

 

So what happens when you have two competing design ideas and one is simpler and more fruitful than the other?

 

Let me return to my automobile analogy.

 

Put yourself back a hundred years or so and imagine that some automobile manufacturer or other, caught short by a car with a new electrical system, decides to come to market ASAP with a beautiful hand-made car that does everything that new battery car does, only with proven technology.  It has crisp, brass oil lanterns, mahogany cranks, and a picture of a smiling chauffeur standing next to the car in the magazine ad.

 

The subtext of the ad is roughly as follows. "Why would you want a whole new system, with lots and lots of brand-new failure points, when we have everything they have?  Look, they've got light; we've got light, but ours is reliable and proven.  They've got a starter; we've got a starter, but ours is beautiful, reliable, and proven, one that any chauffeur can operate."

 

I can see that people might well believe them, at least for a while.  But at some point, everybody figures out that the guys with the electrical system have the right design idea.  Maybe it happens when the next version comes out with a heater motor and an interior light.  Maybe it happens when you realize that the chauffeur has gone the way of the farrier. But whenever it happens, you realize that the oil lantern and the crank will eventually fall by the wayside.

 

About David Dobrin

 

I run a small analyst firm in Cambridge, Massachusetts, that does strategy consulting in most areas of enterprise applications.  I am not a database expert, but for the past year, I have been doing a lot of work with SAP related to HANA, so I'm reasonably familiar with it.  I don't work with Oracle, but I know a fair amount about both the TimesTen database and the Essbase database, because I covered both Salesforce (which uses TimesTen) and Hyperion (Essbase) for many years.

 

SAP is a current customer of B2B Analysts, Inc., the firm I run.

https://www.experiencesaphana.com/community/blogs/blog/2012/05/06/hana-vs-exalytics-an-analysts-view 

Posted by AgnesKim