SAP HANA and the strange details of hardware certification

I’ve been pondering SAP’s HANA hardware certification strategy this morning and I have to say, it makes little sense. This may well be because I don’t understand the hardware behind it, so if there are greater minds than me out there, then please correct me.

Let’s break down the SAP HANA hardware stack and discuss what’s required – and then try to understand why so few hardware configurations are supported. The hardware vendors tell me that SAP are very prescriptive as to what hardware can be used – presumably so that SAP HANA appliances are as fast as can be, and perform consistently. But if that’s the case, why do the supported platforms vary so widely?

Server Platform

There are two server platforms supported right now – the Nehalem EX, from Dell, Fujitsu, HP and IBM, and the newer (and 40% faster) Westmere EX platform from Cisco and IBM. This makes sense, but why not support any Nehalem EX or Westmere EX platform? The CPUs are all made by Intel in the end, so they should perform very similarly.

And to add to this, SAP only certified the Intel X7560 (the fastest Nehalem EX CPU). Now that the Westmere EX is out, why not support the whole range rather than just the very expensive E7-x870 parts? All of them are faster than the X7560.

Memory Requirements

There’s an issue of memory volume – some servers like the IBM x3690 (which is supported) only support up to 256GB RAM – and I thought this was the reason Blades weren’t supported: only HP has a 2TB blade, and it has just 40 cores.

Certainly it seems for now that you need 1 CPU (8-10 cores) per 128GB RAM – so why not just make this the standard? It would mean that certain systems could only support a certain amount of capacity, and customers would have to procure systems accordingly. That sort of choice should be a good thing.
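To make the ratio concrete, here is a minimal sketch in Python. Only the 128GB-per-socket and 8-10 cores-per-socket figures come from the discussion above; the rest is illustrative and certainly not an official SAP sizing tool.

# Rough sketch of the 1-CPU-per-128GB rule described above.
def sockets_required(ram_gb: int, gb_per_socket: int = 128) -> int:
    """Number of CPU sockets implied by the 1-socket-per-128GB rule."""
    return -(-ram_gb // gb_per_socket)  # ceiling division

if __name__ == "__main__":
    for ram_gb in (256, 512, 1024, 2048):
        sockets = sockets_required(ram_gb)
        # Nehalem EX / Westmere EX sockets carry 8-10 cores each
        print(f"{ram_gb:>4}GB RAM -> {sockets} sockets "
              f"({8 * sockets}-{10 * sockets} cores)")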

Log Volume

Here’s where things get really weird. There is almost no rhyme or reason to the standards for log volumes for SAP HANA. Basically Log Volumes are somewhere to store transient information so that in the event of a database crash, recent transactions can be replayed.

Due to the large data volumes in SAP HANA, a lot of log can be produced very quickly, and you need a log volume at least as large as your RAM. But log writes are sequential, and sequential writes can be handled quite easily with the right disk subsystem.

It appears that SAP HANA requires something like 600MB/sec sequential write performance. If I were architecting this, I would use a RAID10 SAS array with 8x 146GB disks (for a 512GB appliance, say). It would be cheap and work well – RAID10 is excellent for write performance.
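As a rough sanity check, the arithmetic for such an array looks like this (a Python sketch; the per-spindle figures are my assumptions for 15k SAS drives of this era, not vendor specifications):

# Back-of-the-envelope check that an 8-disk RAID10 SAS array can hit
# ~600MB/sec of sequential writes.
def raid10_seq_write_mb_s(disks: int, per_disk_mb_s: float) -> float:
    """RAID10 mirrors every write, so only half the spindles add throughput."""
    return (disks / 2) * per_disk_mb_s

if __name__ == "__main__":
    for per_disk in (120, 150, 160):  # assumed per-spindle sequential write speeds
        throughput = raid10_seq_write_mb_s(8, per_disk)
        print(f"8 spindles @ {per_disk}MB/s each -> ~{throughput:.0f}MB/s sequential write")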

However, all the appliances use solid-state storage for log writes. Some use one (or two) Fusion-io cards, which cost about $15,000 per 320GB module. Yes – that’s roughly $60,000 for the log storage alone on a 1TB appliance.

These cards provide the required performance, but they are insanely expensive and a poor fit for several reasons. First, their real strength is random I/O – and log volumes, by their nature, don’t need random I/O, so you are paying a premium for a capability that goes unused.

But most of all, the Fusion-io ioDrives that appear to be used in SAP HANA appliances are based on MLC flash, which carries a write limit of 4PB per 320GB module. Given the volume of data a SAP HANA appliance writes, I suspect these will last no time at all and the appliances will start to fail.
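How quickly a 4PB limit is reached depends entirely on the sustained write rate – which, as the comments below show, is exactly where opinions differ. A quick sketch, with hypothetical daily write rates (only the 4PB figure and the 1PB = 1024TB convention come from the IBM notes quoted later):

# How long a 4PB write limit lasts under a sustained log-write load.
PB_IN_TB = 1024

def years_to_wear_out(write_limit_pb: float, tb_written_per_day: float) -> float:
    return (write_limit_pb * PB_IN_TB) / tb_written_per_day / 365

if __name__ == "__main__":
    for tb_per_day in (1, 5, 20):  # hypothetical sustained log-write rates
        years = years_to_wear_out(4, tb_per_day)
        print(f"{tb_per_day:>2}TB/day of log writes -> ~{years:.1f} years to reach 4PB")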

And to add to that, all the 1TB SAP HANA appliances have two 640GB Fusion-io drives configured in RAID-0 – so there is no redundancy, and if one fails you lose the log storage and the appliance, and have to restore from backup. Seriously.

Data Storage

Data storage makes the most sense – although again, the configurations are fairly arbitrary for a given memory size. Basically you need 4x RAM, and performance doesn’t really matter much: the faster the storage subsystem, the faster the appliance starts from cold.

Current storage subsystems are all direct-attached SAS arrays, but they vary from IBM’s 256GB model with 8x 300GB 10k SAS drives to HP, who require 24x 146GB 15k SAS drives for the same 256GB appliance. Such discrepancies make no sense – presumably HP wanted their appliance to start faster: their disk subsystem performs 4-5x faster than IBM’s!
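To put some (assumed) numbers on both the 4x sizing rule and that 4-5x claim, here is a quick sketch – the per-spindle read speeds are my guesses for 10k vs 15k SAS drives of the period, not vendor figures:

# Data volume sizing (4x RAM) and a crude comparison of the two 256GB arrays.
def data_volume_gb(ram_gb: int) -> int:
    """Data volume = 4x RAM, per the rule of thumb above."""
    return 4 * ram_gb

def array_read_mb_s(spindles: int, per_spindle_mb_s: float) -> float:
    """Crude aggregate sequential read throughput: spindles x per-spindle speed."""
    return spindles * per_spindle_mb_s

if __name__ == "__main__":
    print(f"256GB appliance -> {data_volume_gb(256)}GB data volume")
    ibm = array_read_mb_s(8, 110)   # IBM: 8x 300GB 10k SAS (assumed ~110MB/s each)
    hp = array_read_mb_s(24, 150)   # HP: 24x 146GB 15k SAS (assumed ~150MB/s each)
    print(f"IBM ~{ibm:.0f}MB/s vs HP ~{hp:.0f}MB/s -> ~{hp / ibm:.1f}x faster")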

The good news is that architecturally, using SAS-based storage makes sense – at least for a single appliance. It’s cheap and cheerful and works well. But why don’t SAP just issue guidelines for storage performance and let the vendors meet them?

Conclusions

SAP HANA hardware certification is quite new and I’m sure this will bed down. But on the one hand SAP appear to have been prescriptive about what is required for SAP HANA, while at the same time there is huge variation in the configurations provided by different vendors – and therefore, presumably, in the relative performance of different SAP HANA appliances.

And if this is the case, why don’t SAP just create a performance benchmarking tool that runs on Linux? It would measure the size of your main memory and check whether the appliance you built is fast enough to run SAP HANA reasonably. If your system passes, you are supported.
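A minimal sketch of what such a check might look like on Linux – the 600MB/sec threshold is the figure quoted earlier in the article; the file size, file name and pass criteria are purely illustrative:

# Read installed RAM, time a large sequential write, and compare to a threshold.
import os
import time

def installed_ram_gb() -> float:
    """Total physical memory, via POSIX sysconf (Linux)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / (1024 ** 3)

def seq_write_mb_s(path: str = "hana_bench.tmp", total_mb: int = 1024) -> float:
    """Write total_mb of data sequentially and return throughput in MB/sec."""
    chunk = b"\0" * (1024 * 1024)
    start = time.time()
    with open(path, "wb") as f:
        for _ in range(total_mb):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # make sure the data actually reaches the disk
    os.remove(path)
    return total_mb / (time.time() - start)

if __name__ == "__main__":
    ram = installed_ram_gb()
    log_speed = seq_write_mb_s()
    print(f"RAM: {ram:.0f}GB, sequential write: {log_speed:.0f}MB/sec")
    print("Log volume check:", "PASS" if log_speed >= 600 else "FAIL")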

By the way, this is exactly what SAP did for the BWA appliance: when you install it, it measures the system and tells you whether CPU, memory and disk meet the minimum requirements. If they don’t, the installer won’t continue.

Isn’t this certification enough?


5 Responses to SAP HANA and the strange details of hardware certification

  1. John,

    I always wondered about this, even back in the prime days of BWA. What I do know is that Intel have full-time staff who actually sit next to the HANA/BWA dev team. From my discussions with them, the Intel chipsets are precisely calibrated to the memory segments in the various systems, which means they go through rigorous testing of different configurations to make sure performance is absolutely optimal. It also means specific Intel chipsets can support different amounts of overall memory. For example, it used to be that you needed 4GB per core, and then this was pushed up to 8GB and then 16GB. So the integration between OS, application layer, memory segments and chipset is very important.

    I also reckon there are some politics at play here. There were some customers who ran the install script (you used to be able to bypass the hardware check of BWA in the install script, by the way) and got into some serious trouble with SAP over licensing. I assume there are some sensitive pricing agreements between SAP and the HW vendors.

    I get the sense that you believe all of these hardware components are more or less commoditized at this point. From my discussions with dev, it’s not that simple. At least in BWA there were a lot of intricacies that made the solutions different from vendor to vendor (for example, OCFS vs GPFS). I haven’t looked at HANA hardware specs, but I assume there are some application components used by the hardware that differ from vendor to vendor.

    And I think this brings up another bigger argument – what is the competitive advantage to using one HW vendor over the other?

  2. John,

    You’re being very unfair to Fusion-io. The prices you mention are at least double our MSRP (as of the date this article was written), so Fusion-io’s products are not nearly as expensive as you assert.

    Also, when it comes to Endurance, the 4PB written is the rating for the 320GB MLC ioDrive. The 640GB ioDrive is 10PB written, and the 1.28TB ioDrive Duo is 20PB written. These are actually the best endurance ratings in the industry for MLC NAND Flash.

    To give you an idea of what this means, the 4PB written on the 320GB ioDrive means you can fully overwrite the drive 4 times a day for 8 years before it’s exhausted. That’s 1.2TB per day for 8 years. Considering the average lifespan of a system is 3-5 years, we’re well within the useful lifespan of these systems. Don’t forget that if they’re striped (RAID-0), then the endurance is additive – 2 320GB MLC ioDrives would be 8PB written.

    I have yet to even hear of an ioDrive wearing out.

    All those OEM vendors use Fusion-io for one simple reason – it’s the best Flash Memory in the industry, and a good value for the results received.

    If you have any questions, please feel free to contact us directly – we would be happy to answer any questions and provide you with accurate pricing and technical specifications.

    Regards,
    Vince

    • John Appleby says:

      Hi Vince,

      Good to see you engaging. I don’t believe I’m being unfair – this is primary research based on publicly available information, so you need to take it up with IBM and not me: I took both the prices and the specs from their US website on the day I wrote the article. And by the way, I got a quote from Fujitsu and the price is identical – $15,000 for 320GB MLC.

      I have no problem with the Fusion-io products, please don’t get me wrong. However, you do have SLC models with much higher write-endurance ratings, and I believe those should be used for the log drives of in-memory appliances. What’s more, none of the HANA vendors have RAID protection for the log volumes, which means that if one of the drives fails you lose the whole HANA environment and have to restore from backup. 4PB isn’t that much for a system with 1TB of RAM – just 4096 cycles of the memory. You haven’t seen them wear out because nothing has ever loaded an ioDrive like HANA will, and it’s just my prediction/opinion, not fact.

      Here is the information from IBM.com (yes, it is self-inconsistent but I copied and pasted!):

      Note: 640GB MLC DUO, 5985, contains 2 x 320GB MLC memory modules, each with a write limitation of 4PB
      Note: 160GB MLC , 0096, contains 1 x 160GB SLC memory module with a write limitation of 75PB
      Note: 320GB SLC DUO, 0097, contains 2 x 160GB SLC memory modules, each with a write limitation of 75PB
      Note: 320GB MLC, 1649, contains 1 x 320GB MLC memory module, with a write limitation of 4PB
      Note: 1 Peta-Byte = 1024 TeraBytes of writes. Writes are tracked and reported by the adapter’s management utility and may be affected by application writes, data patterns, and maintenance designed to maximize data integrity.

      Regards,

      John

  3. Hi John,

    First off, I forgot to say that I’m speaking my own opinion, and not that of my employer. 🙂 One must be politically correct these days.

    Anyway, I believe that you accidentally picked up pricing on the SLC product. I was unable to locate the prices on the IBM website, but a Shopping search produced the following results for the “IBM High IOPS MS Class SSD PCIe Adapter”:
    http://www.google.com/search?q=IBM+High+IOPS+MS+Class+SSD+PCIe+Adapter&hl=en&tbm=shop&aq=f
    Please note that the prices listed (from 21 stores) are in the $7500 range, in case the link doesn’t come through. This is the MLC product at their usual List Price. Their “MS Class” is MLC, and their “SS Class” is SLC.

    I’m sure if you compare the List price/GB of Flash to that of a high performance EMC or NetApp Enterprise disk array, you’ll find them to be in the same price range, so “very expensive” may not be an accurate assessment in an Enterprise environment.

    You are correct that SLC has a much higher Endurance rating than MLC, however, Fusion-io products have unique technology that extends the life of the Memory. It also alerts (via SNMP and other means) when the Reserves drop below 10%, so they can be replaced proactively, unlike some other SSDs on the market.

    I believe that it’s worth noting that HP, IBM, Cisco, and Fujitsu are selling HANA “Appliances”. This means that they’ve spent some time Engineering these Appliances. Yes, I’m implying that they’ve reviewed these specifications and they deemed that Fusion-io MLC is appropriate for this environment. I have worked for more than one of those vendors, and all tested exhaustively before releasing an Appliance, as there’s nothing worse than buying an Appliance that was poorly engineered. 😉

    So, perhaps your estimation of the write load on the Flash is a little high.

    Just my humble opinion. 😉 (yes, another disclaimer 🙂

    Regards,
    Vince

  4. The FIO 320GB cards aren’t being used; they’re using the 365GB Gen 2 cards, which are half the cost and 3x the speed of the Gen 1 cards.
