Discussion of Kognitio – formerly Whitecross – and what it dubiously claims is its in-memory analytic DBMS.
I went to Bracknell Wednesday to spend time with the Kognitio team. I think I came away with a better understanding of what the technology is all about, and why certain choices have been made.
Like almost every other contender in the market,* Kognitio WX-2 queries disk-based data in the usual way. Even so, WX-2’s design is very RAM-centric. Data gets on and off disk in mind-numbingly simple ways – table scans only, round-robin partitioning only (as opposed to the more common hash), and no compression. However, once the data is in RAM, WX-2 gets to work, happily redistributing as seems optimal, with little concern about which node retrieved the data in the first place. (I must confess that I don’t yet understand why this strategy doesn’t create ridiculous network bottlenecks.) How serious is Kognitio about RAM? Well, they believe they’re in the process of selling a system that will include 40 terabytes of the stuff. Apparently, the total hardware cost will be in the $4 million range.
*Exasol is the big exception. They basically use disk as a source from which to instantiate in-memory databases.
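The contrast between round-robin placement on disk and redistribution in RAM can be sketched in a few lines of Python. This is only an illustration of the general technique, not Kognitio's implementation: the function names, the CRC32 hash, the sample rows, and the four-node cluster are all assumptions made up for the example.

```python
from zlib import crc32


def round_robin_partition(rows, num_nodes):
    """Place row i on node i % num_nodes, ignoring the row's contents."""
    nodes = [[] for _ in range(num_nodes)]
    for i, row in enumerate(rows):
        nodes[i % num_nodes].append(row)
    return nodes


def hash_node(key, num_nodes):
    """Map a key to a node deterministically (CRC32 is an arbitrary choice)."""
    return crc32(str(key).encode("utf-8")) % num_nodes


def redistribute_by_key(nodes, key_field):
    """Simulate an in-memory reshuffle: send each row to the node chosen by
    hashing key_field, and count how many rows had to cross the network."""
    num_nodes = len(nodes)
    shuffled = [[] for _ in range(num_nodes)]
    moved = 0
    for source, rows in enumerate(nodes):
        for row in rows:
            target = hash_node(row[key_field], num_nodes)
            if target != source:
                moved += 1
            shuffled[target].append(row)
    return shuffled, moved


# Hypothetical data: 20 orders spread over 5 customers, on a 4-node cluster.
rows = [{"order_id": i, "customer": f"c{i % 5}"} for i in range(20)]
on_disk = round_robin_partition(rows, 4)
in_ram, moved = redistribute_by_key(on_disk, "customer")

print("rows per node on disk:", [len(n) for n in on_disk])
print("rows per node in RAM: ", [len(n) for n in in_ram])
print("rows moved between nodes:", moved)
```

Round-robin gives perfectly even placement but no co-location, so any join or aggregation on `customer` needs the reshuffle step. The `moved` counter is a rough stand-in for the network traffic that the parenthetical above worries about.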
Other technical highlights of the Kognitio WX-2 story include: Read more
A year ago, Mike Stonebraker observed that conventional DBMS don’t necessarily do a great job on scientific data, and further pointed out that different kinds of science might call for different data access methods. Even so, some of the largest databases around are scientific ones, and they have to be managed somehow. For example:
- Microsoft just put out an overwrought press release. The substance seems to be that Pan-STARRS — a Jim Gray legacy also discussed in an August, 2008 Computerworld article — is adding 1.4 terabytes of image data per night, and one not so new database adds 15 terabytes per year of some kind of computer simulation output used to analyze protein folding. Both run on SQL Server, of course.
- Kognitio has an astronomical database too, at Cambridge University, adding half a terabyte of data per night.
- Oracle is used for a McGill University proteomics database called CellMapBase. A figure of 50 terabytes of “mass storage” is cited, which excludes tape backup and so on.
- The Large Hadron Collider, once it actually starts functioning, is projected to generate 15 petabytes of data annually, which will be initially stored on tape and then distributed to various computing centers around the world.
- Netezza is proud of its ability to serve images and the like quickly, although off the top of my head I’m not thinking of a major customer it has in that area. (Then again, if you just sell software, your academic discount can approach 100%; if, like Netezza, you have an actual cost of goods sold, that’s not as appealing an option.)
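The growth figures quoted in the list above mix per-night and per-year units, so a quick normalization helps compare them. The sketch below assumes data arrives on all 365 nights of the year (optimistic for a telescope) and uses decimal units (1 PB = 1,000 TB). Both are assumptions for illustration, not figures from the sources.

```python
NIGHTS_PER_YEAR = 365   # assumption: data is added every night of the year
TB_PER_PB = 1000        # assumption: decimal units

# Annual growth in terabytes, from the figures quoted in the list.
growth_tb_per_year = {
    "Pan-STARRS images": 1.4 * NIGHTS_PER_YEAR,
    "Cambridge astronomy (Kognitio)": 0.5 * NIGHTS_PER_YEAR,
    "Protein-folding simulation output": 15.0,
    "Large Hadron Collider (projected)": 15 * TB_PER_PB,
}

# Print the sources from fastest-growing to slowest.
ranked = sorted(growth_tb_per_year.items(), key=lambda kv: kv[1], reverse=True)
for name, tb in ranked:
    print(f"{name:36s} {tb:10.1f} TB/year")
```

On those assumptions Pan-STARRS grows by roughly 511 terabytes a year and the Cambridge database by about 180. That is an order of magnitude more than the protein-folding database, and well over an order of magnitude less than the projected Large Hadron Collider output.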
Long-term, I imagine that the most suitable DBMS for these purposes will be MPP systems with strong datatype extensibility — e.g., DB2, PostgreSQL-based Greenplum, PostgreSQL-based Aster nCluster, or maybe Oracle.
|Categories: Aster Data, Data types, Greenplum, IBM and DB2, Kognitio, Microsoft and SQL*Server, Netezza, Oracle, Parallelization, PostgreSQL, Scientific research||1 Comment|
22% of Netezza’s revenue comes from outside the US, at least if we use last quarter’s figures as a guide. At first blush, that doesn’t sound like much. Indeed, percentage-wise it surely lags behind Teradata, Greenplum (which has sold a lot in Asia/Pacific under Netezza’s former head of that region), and a few smaller competitors headquartered outside the US. But a few conversations I had today suggest a rosier view. Read more
|Categories: Data warehouse appliances, Data warehousing, Greenplum, Kognitio, Market share and customer counts, Netezza, Teradata||Leave a Comment|
There now are four hardware vendors that each offer or seem about to announce two different tiers of data warehouse appliances: Sun, HP, EMC, and Teradata. Specifically:
- Sun partners with both Greenplum and ParAccel.
- HP sells Neoview, and also is partnered with Vertica.
- EMC (together with Dell in North America and Bull in Europe) sells DATAllegro. Now EMC is also entering a partnership with ParAccel.
- Teradata is pretty far down the road toward releasing a low-end product.
|Categories: Analytic technologies, Data warehouse appliances, Data warehousing, DATAllegro, Dataupia, EMC, Greenplum, HP and Neoview, IBM and DB2, Infobright, Kognitio, Microsoft and SQL*Server, Netezza, Oracle, ParAccel, Sybase, Teradata||6 Comments|
After a flurry of recent announcements of database SaaS (Software as a Service), eWeek has published a backlash article. The angle is that database SaaS is too expensive, because you can get decent DBMS for free and per-gig usage charges might be expensive for big databases.
I think that’s missing the point. Most OLTP databases are pretty small. Or, if they’re big, they get that way through a lot of transactions. In the first case, hosted management is cheap. In the second case, hosted management is taking care of a large burden for you. Read more
I had a call today with Kognitio execs Paul Groom and John Thompson. Hopefully I can now clear up some confusion that was created in this comment thread. (Most of what I wrote about Kognitio in October, 2006 still applies.) Here are some highlights. Read more
|Categories: Analytic technologies, Data warehouse appliances, Data warehousing, Kognitio||12 Comments|
There are at least 16 different vendors offering appliances and/or software that do database management primarily for analytic purposes.* That’s a lot to keep up with. So I’ve thrown together a little overview of the analytic data management landscape, liberally salted with links to information about specific vendors, products, or technical issues. In some ways, this is a companion piece to my prior post about data warehouse appliance myths and realities.
*And that’s just the tabular/alphanumeric guys. Add in text search and you run the total a lot higher.
Numerous data warehouse specialists offer traditional row-based relational DBMS architectures, but optimize them for analytic workloads. These include Teradata, Netezza, DATAllegro, Greenplum, Dataupia, and SAS. All of those except SAS are wholly or primarily vendors of MPP/shared-nothing data warehouse appliances. EDIT: See the comment thread for a correction re Kognitio.
Numerous data warehouse specialists offer column-based relational DBMS architectures. These include Sybase (with the Sybase IQ product, originally from Expressway), Vertica, ParAccel, Infobright, Kognitio (formerly White Cross), and Sand. Read more
|Categories: Analytic technologies, Cognos, Data warehouse appliances, Data warehousing, DATAllegro, Dataupia, Greenplum, IBM and DB2, Kognitio, Netezza, Oracle, ParAccel, SAS Institute, Sybase, Teradata, Vertica Systems||11 Comments|
Netezza reported a big October quarter, ahead of expectations. And official guidance for next quarter is essentially flat quarter-over-quarter, suggesting Q3 was indeed surprisingly big. However, Netezza’s year-over-year growth for Q3 was a little under 50%, suggesting the quarter wasn’t so remarkable after all. (Netezza has a January fiscal year.)
Tentative conclusion: Netezza just tends to have big October quarters, perhaps by timing sales cycles to finish soon after the late September user conference. If Netezza’s user conference ever moves to later in the fall, expect Q3 to be weak that year.
Netezza reported 18 new customers, double last year’s figure. Read more
|Categories: Analytic technologies, Data warehouse appliances, Data warehousing, Greenplum, Kognitio, Netezza||3 Comments|
February, 2011 edit: I’ve now commented on Gartner’s 2010 Data Warehouse Database Management System Magic Quadrant as well.
It’s early autumn, the leaves are turning in New England, and Gartner has issued another Magic Quadrant for data warehouse DBMS. (Edit: As of January, 2009, that link is dead but this one works.) The big winners vs. last year are Greenplum and, secondarily, Sybase. Teradata continues to lead. Oracle has also leapfrogged IBM, and there are various other minor adjustments as well, among repeat mentionees Netezza, DATAllegro, Sand, Kognitio, and MySQL. HP isn’t on the radar yet; ditto Vertica. Read more
|Categories: Analytic technologies, Data warehouse appliances, Data warehousing, DATAllegro, Greenplum, HP and Neoview, IBM and DB2, Kognitio, MySQL, Netezza, Oracle, Sybase, Teradata, Vertica Systems||8 Comments|
The fourth Monash Letter is now posted for Monash Advantage members (just 3 pages this time). It’s about forthcoming M&A in data warehouse DBMS, something that seems likely just because of the large number of current players. Some of the observations are:
- Oracle needs to buy somebody, because of its rather dire product problems at the data warehouse high end. And it’s very much in keeping with their recent behavior to do so.
- Teradata could be acquired sooner than people think. While there are tax considerations preventing an outright sale, these should be obviated if all of the current NCR is taken private. What’s more, NCR minus Teradata is exactly the kind of healthy, slow-growth, niche company that private equity loves.
- DATAllegro is a natural merger partner for somebody. Their technical differentiation is almost DBMS-independent, so it could be easy to roll them into a larger overall product strategy. And they have enough market traction to have proved some non-trivial value.
- Kognitio seems desperate these days, with several odd or even underhanded marketing tactics. But they do have MPP bitmap software, something Sybase sorely lacks. So there’s an obvious potential combination between those two.
|Categories: Data warehouse appliances, Data warehousing, DATAllegro, Kognitio, Oracle, Sybase, Teradata||3 Comments|