Analysis of data warehouse DBMS vendor Exasol.
Comments on Gartner’s 2012 Magic Quadrant for Data Warehouse Database Management Systems — evaluations
To my taste, the most glaring mis-rankings in the 2012/2013 Gartner Magic Quadrant for Data Warehouse Database Management Systems are that it is too positive on Kognitio and too negative on Infobright. Secondarily, it is too negative on HP Vertica, and too positive on ParAccel and Actian/VectorWise. So let’s consider those vendors first.
Gartner seems confused about Kognitio’s products and history alike.
- Gartner calls Kognitio an “in-memory” DBMS, which is not accurate.
- Gartner doesn’t remark on Kognitio’s worst-in-class* compression.
- Gartner gives Kognitio oddly high marks for a late, me-too Hadoop integration strategy.
- Gartner writes as if Kognitio’s next attempt at the US market will be the first one, which is not the case.
- Gartner says that Kognitio pioneered data warehouse SaaS (Software as a Service), which actually has existed since the pre-relational 1970s.
Gartner is correct, however, to note that Kognitio doesn’t sell much stuff overall.
In the cases of HP Vertica, Infobright, ParAccel, and Actian/VectorWise, the facts in the 2012 Gartner Magic Quadrant for Data Warehouse Database Management Systems are fairly accurate, but I dispute Gartner’s evaluation. When it comes to Vertica: Read more
In a call Monday with a prominent company, I was told:
- Teradata, Netezza, Greenplum and Vertica aren’t relational.
- Teradata, Netezza, Greenplum and Vertica are all data warehouse appliances.
That, to put it mildly, is not accurate. So I shall try, yet again, to set the record straight.
In an industry where people often call a DBMS just a “database” — so that a database is something that manages a database! — one may wonder why I bother. Anyhow …
1. The products commonly known as Oracle, Exadata, DB2, Sybase, SQL Server, Teradata, Sybase IQ, Netezza, Vertica, Greenplum, Aster, Infobright, SAND, ParAccel, Exasol, Kognitio et al. all either are or incorporate relational database management systems, aka RDBMS or relational DBMS.
2. In principle, there can be difficulties in judging whether or not a DBMS is “relational”. In practice, those difficulties don’t arise — yet. Every significant DBMS still falls into one of two categories:
- Relational:
  - Was designed to do relational stuff* from the get-go, even if it now does other things too.
  - Supports a lot of SQL.
- Non-relational:
  - Was designed primarily to do non-relational things.*
  - Doesn’t support all that much SQL.
*I expect the distinction to get more confusing soon, at which point I’ll adopt terms more precise than “relational things” and “relational stuff”.
3. There are two chief kinds of relational DBMS: Read more
I’m frequently asked to generalize in some way about in-memory or memory-centric data management. I can start:
- The desire for human real-time interactive response naturally leads to keeping data in RAM.
- Many databases will be ever cheaper to put into RAM over time, thanks to Moore’s Law. (Most) traditional databases will eventually wind up in RAM.
- However, there will be exceptions, mainly on the machine-generated side. Where data creation and RAM data storage are getting cheaper at similar rates … well, the overall cost of RAM storage may not significantly decline.
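The arithmetic behind those last two bullets can be sketched in a few lines. This is only an illustration: the function name `ram_cost_over_time` and every growth and price-decline rate below are made-up assumptions, not figures from any vendor or study.

```python
def ram_cost_over_time(initial_gb, data_growth, price_per_gb, price_decline, years):
    """Cost of holding an entire database in RAM, year by year.

    data_growth and price_decline are annual fractions (0.10 means 10%).
    All inputs are hypothetical, chosen only to illustrate the argument.
    """
    costs = []
    for year in range(years + 1):
        size_gb = initial_gb * (1 + data_growth) ** year
        price = price_per_gb * (1 - price_decline) ** year
        costs.append(size_gb * price)
    return costs


# Assumed "traditional" database: data grows slowly, RAM gets cheaper quickly.
traditional = ram_cost_over_time(1000, 0.10, 10.0, 0.30, 5)

# Assumed machine-generated data: volume grows about as fast as RAM gets cheaper.
machine_generated = ram_cost_over_time(1000, 0.40, 10.0, 0.30, 5)

print([round(c) for c in traditional])
print([round(c) for c in machine_generated])
```

Under these assumed rates the first series shrinks every year, while the second stays nearly flat, because 1.40 × 0.70 = 0.98 per year.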
Getting more specific than that is hard, however, because:
- The possibilities for in-memory data storage are as numerous and varied as those for disk.
- The individual technologies and products for in-memory storage are much less mature than those for disk.
- Solid-state options such as flash just confuse things further.
Consider, for example, some of the in-memory data management ideas kicking around. Read more
This year’s Gartner Magic Quadrant for Data Warehouse Database Management Systems is out.* I shall now comment, just as I did on the 2010, 2009, 2008, 2007, and 2006 Gartner Data Warehouse Database Management System Magic Quadrants, to varying extents. To frame the discussion, let me start by saying:
- In general, I regard Gartner Magic Quadrants as a bad use of good research.
- Illustrating the uselessness of — or at least poor execution on — the overall quadrant metaphor, a large majority of the vendors covered are lined up near the line x = y, each outpacing the one below in both of the quadrant’s dimensions.
- I find fewer specifics to disagree with in this Gartner Magic Quadrant than in previous years’ versions. Two factors jump to mind as possible reasons:
  - This year’s Gartner Magic Quadrant for Data Warehouse Database Management Systems is somewhat less ambitious than others; while it gives as much company detail as its predecessors, it doesn’t add as much discussion of overall trends. So there’s less to (potentially) disagree with.
  - Merv Adrian is now at Gartner.
- Whatever the problems may be with Gartner’s approach, the whole thing comes out better than do Forrester’s failed imitations.
*As of February, 2012 — and surely for many months thereafter — Teradata is graciously paying for a link to the report.
Specific company comments, roughly in line with Gartner’s rough single-dimensional rank ordering, include: Read more
I last wrote about Exasol in 2008. After talking with the team Friday, I’m fixing that now. The general theme was as you’d expect: Since last we talked, Exasol has added some new management, put some effort into sales and marketing, got some customers, kept enhancing the product and so on.
Top-level points included:
- Exasol’s technical philosophy is substantially the same as before, albeit not with as extreme a focus on fitting everything in RAM.
- Exasol believes its flagship DBMS EXASolution has great performance on a load-and-go basis.
- Exasol has 25 EXASolution customers, all in Germany.*
  - 5 of those are “cloud” customers, at hosting providers engaged by Exasol.
- EXASolution database sizes now range from the low 100s of gigabytes up to 30 terabytes.
- Pretty much the whole company is in Nuremberg.
Categories: Benchmarks and POCs, Columnar database management, Data warehousing, Database compression, Exasol, Market share and customer counts, Pricing, Software as a Service (SaaS), Specific users, Sybase, Workload management (1 Comment)
I need to finalize an already-too-long slide deck on how to select an analytic DBMS by late Thursday night. Anybody see something I’m overlooking, or just plain got wrong?
Edit: The slides have now been finalized.
I talk with lots of vendors of MPP data warehouse DBMS. I’ve now heard enough different approaches to MPP architecture that I think it might be interesting to contrast some of the alternatives.
Categories: Aster Data, Calpont, Exasol, Greenplum, Parallelization, Theory and architecture, Vertica Systems (22 Comments)
It took 5 ½ months after my non-technical introduction, but I finally got a briefing from Exasol’s technical folks (specifically, the very helpful Mathias Golombek and Carsten Weidmann). Here are some highlights: Read more
Categories: Analytic technologies, Benchmarks and POCs, Columnar database management, Data warehousing, Exasol, In-memory DBMS, Memory-centric data management, Pricing (1 Comment)
I talked with Exasol today (at 5:00 am!) and of course want to blog about it. For clarity, I’d like to start by comparing/contrasting the fundamental data structures at Vertica, ParAccel, and Exasol. And it feels like that should be a separate post. So here goes.
- Exasol, Vertica, and ParAccel all store data in columnar formats.
- Exasol, Vertica, and ParAccel all compress data heavily.
- Exasol, Vertica, and ParAccel all (perhaps to varying extents) operate on in-memory data in compressed formats.
- ParAccel and Exasol write data to what amounts to the in-memory part of their basic data structures; the data then gets persisted to disk. Vertica, however, has a separate in-memory data structure to accept data and write it to disk.
- Vertica is a disk-centric system that doesn’t rely on there being a lot of RAM.
- ParAccel can be described that way too; however, in some cases (including on the TPC-H benchmarks), ParAccel recommends loading all your data into RAM for maximum performance.
- Exasol is totally optimized for the assumption that queries will be run against data that has already been loaded into RAM.
Beyond the above, I plan to discuss in a separate post how Exasol does MPP shared-nothing software-only columnar data warehouse database management differently than Vertica and ParAccel do shared-nothing software-only columnar data warehouse database management.
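For readers wondering what "operating on in-memory data in compressed formats" can look like, here is a minimal sketch. It uses run-length encoding purely as a stand-in: nothing here is claimed to be the scheme Exasol, Vertica, or ParAccel actually uses, and the function names and sample columns are hypothetical.

```python
from itertools import groupby


def rle_encode(column):
    """Compress a column into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(column)]


def count_equal(runs, target):
    """Count rows equal to target by scanning runs, never expanding them."""
    return sum(length for value, length in runs if value == target)


def sum_column(runs):
    """Sum a numeric column directly from its run-length form."""
    return sum(value * length for value, length in runs)


# Hypothetical sample columns, stored column-wise rather than row-wise.
region_runs = rle_encode(["DE", "DE", "DE", "US", "US", "DE"])
quantity_runs = rle_encode([5, 5, 5, 2, 2, 2])

print(region_runs)                      # three runs instead of six values
print(count_equal(region_runs, "DE"))   # filter evaluated on compressed data
print(sum_column(quantity_runs))        # aggregate evaluated on compressed data
```

The point of the sketch is that both the filter and the aggregate touch one entry per run rather than one entry per row, which is one reason a system might prefer to leave data compressed once it reaches RAM.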
Categories: Columnar database management, Data warehousing, Database compression, Exasol, ParAccel, Vertica Systems (12 Comments)
I had a non-technical introduction today to Exasol, a data warehouse specialist that has gotten a little buzz recently for publishing TPC-H results even faster than ParAccel’s. Here are some highlights:
- Exasol was founded back in 2000.
- Exasol is a German company, with 60 employees. While I didn’t ask, the vast majority are surely German.
- Exasol has two customers. 6-8 more are Coming Real Soon. Most or all of those are in Germany, although one may be in Asia.
- Karstadt (big German retailer) has had Exasol deployed for 3 years. The other deployed customer is the German subsidiary of data provider IMS Health.
- [Redacted for confidentiality] is a strategic investor in and partner of Exasol. [Redacted for confidentiality]’s only competing partnership is with Oracle.
- Exasol’s system is more completely written from scratch than many. E.g., all they use from Linux are some drivers, and maybe a microkernel.
- Exasol runs in-memory. There doesn’t seem to be a disk-centric mode.
- Exasol’s data access methods are sort of like columnar, but not exactly. I look forward to a more technical discussion to sort that out.
- Exasol’s claimed typical compression is 5-7X. As in the Vertica story, database operations are carried out on compressed data.
- Exasol says it has performed a very fast TPC-H in-house at the 30 terabyte level. However, its deployed sites are probably a lot smaller than that. IMS Health is cited in its literature as 145 gigabytes.
- Oracle and Microsoft are listed as Exasol partners, so there may be some kind of plug-compatibility or back-end processing story.