April 5th, 2008 Curt Monash
There now are four hardware vendors that each offer or seem about to announce two different tiers of data warehouse appliances: Sun, HP, EMC, and Teradata. Specifically:
Read the rest of this entry »
Posted in Analytics and analytic technologies, DATAllegro, Data warehouse appliances, Data warehousing, Dataupia, Greenplum, HP and Neoview, IBM and DB2, Infobright and Brighthouse, Kognitio and WX2, Microsoft and SQL*Server, Netezza, Oracle, ParAccel, Relational database management systems, Sybase, Teradata | 4 Comments »
March 25th, 2008 Curt Monash
While talking with EnterpriseDB about today’s Postgres Plus announcements, I took the chance to clear up a point of confusion. Somebody told Seth Grimes that EnterpriseDB is out to compete with Greenplum, but that person was wrong. EnterpriseDB fondly hopes to manage multi-terabyte data warehouses, just as Oracle and Microsoft do with their respective general-purpose DBMS. However, EnterpriseDB is not going after the databases of tens to hundreds of terabytes that are the province of specialists such as Greenplum, Teradata, Netezza, or columnar DBMS vendors.
Even so, in GridSQL EnterpriseDB does seem to be open-sourcing MPP shared-nothing basics. There’s a lightweight optimizer that does a little (but only a little) more to minimize data movement beyond just optimizing queries on each node. And GridSQL knows how to replicate small tables across each node, a key aspect of many MPP designs. (Partition your facts; replicate your dimensions.)
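For readers who haven't seen "partition your facts; replicate your dimensions" spelled out, here is a minimal Python sketch of the idea. It is not GridSQL code: the table names, the three-node cluster, and every function name are made up for illustration. Fact rows are hash-partitioned across nodes, the small dimension table is copied to each node, and so every join runs locally with only the per-node subtotals moving across the network.

```python
from collections import defaultdict

NUM_NODES = 3  # hypothetical cluster size


def distribute(facts, dimension, key, num_nodes=NUM_NODES):
    """Hash-partition fact rows by `key`; copy the whole dimension to every node."""
    nodes = [{"facts": [], "dim": dict(dimension)} for _ in range(num_nodes)]
    for row in facts:
        nodes[hash(row[key]) % num_nodes]["facts"].append(row)
    return nodes


def local_sales_by_region(node):
    """Join and aggregate using only data that already lives on this node."""
    totals = defaultdict(float)
    for row in node["facts"]:
        region = node["dim"][row["store_id"]]  # local lookup, no data movement
        totals[region] += row["amount"]
    return totals


def sales_by_region(nodes):
    """Coordinator step: merge the small per-node subtotals."""
    merged = defaultdict(float)
    for node in nodes:
        for region, subtotal in local_sales_by_region(node).items():
            merged[region] += subtotal
    return dict(merged)


# Illustrative data: a tiny dimension and six fact rows.
stores = {1: "East", 2: "West"}
sales = [{"sale_id": i, "store_id": 1 + i % 2, "amount": 10.0} for i in range(6)]

cluster = distribute(sales, stores, key="sale_id")
result = sales_by_region(cluster)
```

The answer does not depend on which node a fact row lands on, which is the point: the partitioning key affects only load balance, while replication of the dimension is what keeps the join local.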
Posted in Analytics and analytic technologies, Data warehousing, EnterpriseDB and Postgres Plus, Greenplum, Open source RDBMS, Relational database management systems | No Comments »
March 6th, 2008 Curt Monash
The relational DBMS industry is filled with startups. In some way or other, most of them are based on or make use of the open source project PostgreSQL. (Not all, of course; exceptions include DATAllegro and Infobright, which are based on Ingres and MySQL respectively.) But how they use PostgreSQL varies greatly. Read the rest of this entry »
Posted in EnterpriseDB and Postgres Plus, Greenplum, Open source RDBMS, PostgreSQL, Relational database management systems, Vertica Systems | 9 Comments »
December 14th, 2007 Curt Monash
There are at least 16 different vendors offering appliances and/or software that do database management primarily for analytic purposes.* That’s a lot to keep up with. So I’ve thrown together a little overview of the analytic data management landscape, liberally salted with links to information about specific vendors, products, or technical issues. In some ways, this is a companion piece to my prior post about data warehouse appliance myths and realities.
*And that’s just the tabular/alphanumeric guys. Add in text search and you run the total a lot higher.
Numerous data warehouse specialists offer traditional row-based relational DBMS architectures, but optimize them for analytic workloads. These include Teradata, Netezza, DATAllegro, Greenplum, Dataupia, and SAS. All of those except SAS are wholly or primarily vendors of MPP/shared-nothing data warehouse appliances. EDIT: See the comment thread for a correction re Kognitio.
Numerous data warehouse specialists offer column-based relational DBMS architectures. These include Sybase (with the Sybase IQ product, originally from Expressway), Vertica, ParAccel, Infobright, Kognitio (formerly White Cross), and Sand. Read the rest of this entry »
Posted in Analytics and analytic technologies, Cognos and Applix TM1, DATAllegro, Data warehouse appliances, Data warehousing, Dataupia, Greenplum, IBM and DB2, Kognitio and WX2, Netezza, Oracle, ParAccel, Relational database management systems, SAS Institute, Sybase, Teradata, Vertica Systems | 10 Comments »
November 29th, 2007 Curt Monash
Netezza reported a big October quarter, ahead of expectations. And official guidance for next quarter is essentially flat quarter-over-quarter, suggesting Q3 was indeed surprisingly big. However, Netezza’s year-over-year growth for Q3 was a little under 50%, suggesting the quarter wasn’t so remarkable after all. (Netezza has a January fiscal year.)
Tentative conclusion: Netezza just tends to have big October quarters, perhaps by timing sales cycles to finish soon after the late September user conference. If Netezza’s user conference ever moves to later in the fall, expect Q3 to be weak that year.
Netezza reported 18 new customers, double last year’s figure. Read the rest of this entry »
Posted in Analytics and analytic technologies, Data warehouse appliances, Data warehousing, Greenplum, Kognitio and WX2, Netezza, Relational database management systems | 3 Comments »
October 19th, 2007 Curt Monash
I was at the Business Objects conference this week, and as usual went to very few sessions. But one I did stroll into was on “Managing Rapid Growth With the Right BI Strategy.” This was by Reliance Telecommunications, an outfit in India that is adding telecom subscribers very quickly, and consequently banging 100-150 gigs of data per day into a 35 terabyte warehouse.
The beginning of the talk astonished me, as the presenter seemed to be saying they were doing all this on Oracle. Hah. Oracle is what they moved away from; instead, they got Greenplum. I couldn’t get details; indeed, as a BI guy he was far enough away from DBMS to misspeak and say that Greenplum was brought in by ‘HP’, before quickly correcting himself when prompted. Read the rest of this entry »
Posted in Analytics and analytic technologies, Business Objects, Data warehouse appliances, Data warehousing, Greenplum, Oracle, Specific users | No Comments »
October 19th, 2007 Curt Monash
It’s early autumn, the leaves are turning in New England, and Gartner has issued another Magic Quadrant for data warehouse DBMS. The big winners vs. last year are Greenplum and, secondarily, Sybase. Teradata continues to lead. Oracle has also leapfrogged IBM, and there are various other minor adjustments as well, among repeat mentionees Netezza, DATAllegro, Sand, Kognitio, and MySQL. HP isn’t on the radar yet; ditto Vertica. Read the rest of this entry »
Posted in Analytics and analytic technologies, DATAllegro, Data warehouse appliances, Data warehousing, Greenplum, HP and Neoview, IBM and DB2, Kognitio and WX2, MySQL, Netezza, Oracle, Relational database management systems, Sybase, Teradata, Vertica Systems | 6 Comments »
October 12th, 2007 Curt Monash
I’ve been arguing for a while that Oracle and Microsoft are screwed in high-end data warehousing. The reason is that they’re stuck with SMP (Symmetric Multi-Processing) architectures, while Teradata, Netezza, DATAllegro, and many others enjoy the benefits of MPP (Massively Parallel Processing). Thus, Teradata and DATAllegro boast installations in the hundreds of terabytes each, while Oracle and Microsoft users usually have to perform unnatural acts of hard-coded partitioning even to reach the 10 terabyte level.
That said, there are at least three ways Oracle and/or Microsoft could get out of this technical box:
1. They could buy or just partner with MPP vendors such as Dataupia, who offer plug-compatibility with their respective main DBMS.
2. They could buy whoever they want, plug-compatibility be damned. Presumably, they’d quickly add a light-weight data federation front-end to give the appearance of integration, then merge the products more closely over time.
3. They could develop or buy technology like DATAllegro’s, which essentially federates instances of an ordinary SMP DBMS across nodes of an MPP grid (Greenplum does something similar). I imagine that, for example, ripping Ingres out of DATAllegro and slotting in Oracle instead would be a pretty straightforward exercise; even without dramatic change to any of the optimizations, the resulting port would be something that ran pretty quickly on Day 1.
Bottom line: Oracle and Microsoft are hemorrhaging at the data warehouse high end now. But there are ways they could stanch the bleeding.
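To make option 3 concrete, here is a rough Python sketch of federating ordinary single-node DBMS instances behind a coordinator. It makes no claim about how DATAllegro or Greenplum actually do it: SQLite stands in for the per-node DBMS purely because it ships with Python, and the table, column, and function names are invented for the example.

```python
import sqlite3


def make_node(rows):
    """One node = one independent, ordinary DBMS instance (SQLite as a stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE calls (subscriber INTEGER, minutes INTEGER)")
    conn.executemany("INSERT INTO calls VALUES (?, ?)", rows)
    return conn


def federated_totals(nodes):
    """Scatter the same aggregate to every node, then combine the partial results."""
    partials = [
        conn.execute(
            "SELECT COALESCE(SUM(minutes), 0), COUNT(*) FROM calls"
        ).fetchone()
        for conn in nodes
    ]
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total, count


# Illustrative data, split three ways by subscriber id.
all_rows = [(i, 10 * (i + 1)) for i in range(6)]  # minutes 10, 20, ..., 60
nodes = [make_node([r for r in all_rows if r[0] % 3 == n]) for n in range(3)]

total, count = federated_totals(nodes)
average = total / count
```

Note that the coordinator asks each node for a sum and a count rather than an average, since averaging per-node averages gives the wrong answer whenever the nodes hold different numbers of rows. Swapping the per-node engine is the easy part of such a design; the optimizer work that decides what to push down and what to move is where the real effort goes.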
Posted in Analytics and analytic technologies, DATAllegro, Data warehouse appliances, Data warehousing, Dataupia, Greenplum, Microsoft and SQL*Server, Oracle, Portability, transparency, and plug-compatibility, Relational database management systems, Teradata | 1 Comment »
October 5th, 2007 Curt Monash
I’ve been talking a lot to text mining vendors this week, as per a series of posts over on Text Technologies. Specifically, I’ve focused on the two with exhaustive extraction strategies, namely Attensity and Clarabridge. (Exhaustive extraction is Attensity’s term for separating the linguistic-analysis part of text mining from the DBMS-based BI/analytics part.)
So I asked each of Attensity and Clarabridge the side question as to which data warehouse software or appliances they were seeing. The answers were almost identical — Oracle, Microsoft SQL*Server, Teradata, and Netezza. One also mentioned MySQL and 2 HP prospects — but the HP sites were running NonStop SQL, not NeoView. Amazingly, there were no mentions of DB2. There also weren’t any mentions of the smaller specialist startups, such as DATAllegro, Greenplum, or Vertica.
Posted in Analytics and analytic technologies, Business intelligence, Data warehouse appliances, Data warehousing, Greenplum, HP and Neoview, IBM and DB2, Microsoft and SQL*Server, MySQL, Oracle, Relational database management systems, Teradata | 7 Comments »
July 25th, 2007 Curt Monash
DATAllegro’s Stuart Frost called in for a prebriefing/feedback/consulting session. (I love advising my DBMS vendor clients on how to beat each other’s brains in. This was even more fun in the 1990s, when combat was generally more aggressive. Those were also the days when somebody would change jobs to an arch-rival and immediately explain how everything they’d told me before was utterly false …)
While I had Stuart on the phone, I did manage to extract some stuff I’m at liberty to use immediately. Here are the highlights: Read the rest of this entry »
Posted in DATAllegro, Data warehouse appliances, Data warehousing, Database compression, Greenplum, Netezza, Relational database management systems, Teradata | 4 Comments »
March 16th, 2007 Curt Monash
I talk to a lot of data warehouse software and/or appliance start-ups. Naturally, they’re all gunning for Netezza, and regale me with stories about competitive replacements, competitive wins, benchmark wins, and the like. And there have been a couple of personnel departures too, notably development chief Bill Blake. Netezza insists this is because he got a CEO offer he couldn’t refuse, he’s still friendly with the company, development plans are entirely on track, and news of some sort is coming out in a few weeks. Also, Greenplum brags that its Asia/Pacific manager was snagged from Netezza.
On the other hand, Netezza claims lots of sales momentum, and that’s certainly consistent with what I hear from its competitors. Read the rest of this entry »
Posted in Business Objects, Data warehouse appliances, Data warehousing, Greenplum, Netezza, Relational database management systems | No Comments »
March 13th, 2007 Curt Monash
I talked with Greenplum honchos Bill Cook and Scott Yara yesterday. Bill is the new CEO, formerly head of Sun’s field operations. Scott is president, and in effect the marketing-guy co-founder. I still don’t know whether I really believe their technical story. But I do think I have a feel for what they’re trying to do. Key aspects of the Greenplum strategy include:
- Greenplum rewrote a lot of PostgreSQL to parallelize it, in the correct belief that MPP is the best way to go for high-end data warehousing.
- Indeed, Greenplum claims to have a general solution to DBMS parallelization. Unlike Netezza, DATAllegro, Vertica, and Kognitio, Greenplum offers a row-oriented data store with a fairly full set of indexing techniques. You want star indices or bitmaps? They have them. (They even claimed to be used for some text management when last we talked, although that was for O’Reilly and Mark Logic seems to be O’Reilly’s main text-indexing vendor.)
- Greenplum’s main sales strategy is to be part of Sun’s product line, bundled into Thumper boxes as single-part-number Sun offerings. They certainly could add other hardware OEMs, just like Checkpoint sells firewalls through multiple appliance vendors. But at least for now it’s all about Sun.
Read the rest of this entry »
Posted in Data warehouse appliances, Data warehousing, Greenplum, Open source RDBMS, PostgreSQL, Relational database management systems | 1 Comment »
February 23rd, 2007 Curt Monash
Business Intelligence Lowdown has a well-dugg post listing what it claims are the 10 largest databases in the world. The accuracy leaves much to be desired, as is illustrated by the fact that #10 on the list is only 20 terabytes, while entirely unmentioned is eBay’s 2-petabyte database (mentioned here, and also here). Read the rest of this entry »
Posted in DATAllegro, Data warehouse appliances, Data warehousing, Database theory and practice, Greenplum, IBM and DB2, Netezza, Oracle, SAS Institute, Teradata | 3 Comments »
January 27th, 2007 Curt Monash
Recently, I’ve done extensive research into the hardware strategies of computing appliance vendors, across multiple functional areas. Data warehousing, firewall/unified threat management, antispam, data integration – you name it, I talked to them. Of course, each vendor has a unique twist. But some architectural groupings definitely emerged.
The most common approaches seem to be:
Type 1: Custom assembly from off-the-shelf parts. In this model, the only unusual (but still off-the-shelf) parts are usually in the area of network acceleration (or occasionally encryption). Also, the box may be balanced differently than standard systems, in terms of compute power and/or reliability.
Type 2 (Virtual): We don’t need no stinkin’ custom hardware. In this model, the only “appliancy” features are in the area of easy deployment, custom operating systems, and/or preconfigured hardware.
And of course there are also appliances of Type 0: Custom hardware including proprietary ASICs or FPGAs.
Different markets had different emphases; e.g., firewall appliances are typically Type 1, while antispam devices cluster in Type 2. But the data warehouse appliance market is highly diverse, which maybe shouldn’t be a surprise. After all, the revenue market leader is non-appliance software vendor Oracle, while noisy upstart Netezza is famous for its FPGA.
Read the rest of this entry »
Posted in DATAllegro, Data warehouse appliances, Data warehousing, Greenplum, IBM and DB2, Kognitio and WX2, Netezza, Relational database management systems, Teradata | 4 Comments »
October 5th, 2006 Curt Monash
Kognitio called me for a briefing this morning on their WX-2 product. Technical highlights included:
- Their core technology is MPP/shared-nothing data warehousing.
- Unlike most other vendors (but like Greenplum), they are available software-only.
- Like DATAllegro and Netezza, they have no global indexing.
- Unlike the other MPP players, they don’t hash partition the data and lead with hash joins. Rather, they have local compressed bitmap indices on every node.
- Similarly, they distribute data utterly randomly and have no concept of range partitioning whatsoever.
- Probably for that reason, WX-2 reads data in small 32K blocks. This forfeits the benefit of sequential reads, unless David Aldridge is correct that Linux can take care of that on its own.
- They seem more chip-heavy than DATAllegro and Netezza. A dual-core Opteron blade with 16 or 32 gigabytes of RAM talks to 144, 288, or in some cases 600 gigabytes of disk (before mirroring).
- They position themselves somewhat as being a memory-centric product supplier. While I suspect this is exaggerated, it probably indicates that they’ve put some work into managing RAM as well as disk.
Much like the other “new” MPP data warehouse vendors, Kognitio claims to never have knowingly been outbenchmarked, whether on performance or on TCO factors such as ease of installation.
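The combination of random distribution and local bitmap indices described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not Kognitio's implementation: the class and function names are invented, plain Python integers serve as bit vectors, and the compression step is left out entirely.

```python
import random
from collections import defaultdict


class NodeBitmapIndex:
    """Rows held by one node, plus an uncompressed bitmap per distinct column value."""

    def __init__(self, column):
        self.column = column
        self.rows = []
        self.bitmaps = defaultdict(int)  # value -> int used as a bit vector

    def insert(self, row):
        # Set the bit for this row's position in the bitmap for its value.
        self.bitmaps[row[self.column]] |= 1 << len(self.rows)
        self.rows.append(row)

    def lookup(self, *values):
        """Return local rows whose column equals any of `values` (bitwise OR)."""
        mask = 0
        for value in values:
            mask |= self.bitmaps.get(value, 0)
        return [row for i, row in enumerate(self.rows) if (mask >> i) & 1]


def load_randomly(rows, num_nodes, column, seed=0):
    """Place each row on a randomly chosen node; no hashing, no ranges."""
    rng = random.Random(seed)
    nodes = [NodeBitmapIndex(column) for _ in range(num_nodes)]
    for row in rows:
        rng.choice(nodes).insert(row)
    return nodes


def query(nodes, *values):
    """Every node must be asked, since placement says nothing about content."""
    return [row for node in nodes for row in node.lookup(*values)]


# Illustrative data: nine rows over three plan types.
plans = ["prepaid", "postpaid", "corporate"]
rows = [{"id": i, "plan": plans[i % 3]} for i in range(9)]

nodes = load_randomly(rows, num_nodes=3, column="plan")
hits = query(nodes, "prepaid", "corporate")
```

The tradeoff shows up in `query`: with random placement no node can be skipped, so the local index has to do all the filtering work that partition elimination would otherwise share.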
Read the rest of this entry »
Posted in Data warehouse appliances, Data warehousing, Greenplum, Kognitio and WX2, Memory-centric data management, Relational database management systems | 9 Comments »
September 22nd, 2006 Curt Monash
The last person I spoke with at the Netezza conference on Tuesday was a customer/presenter that the company had picked out for me. One thing he said baffled me — he claimed that Netezza was a real appliance vendor, but DATAllegro wasn’t, presumably due to administrability issues. Now, it wasn’t clear to me that he’d ever evaluated DATAllegro, so I didn’t take this too seriously, but still the exchange brought into focus the great differences between data warehouse products in the area of administration. For example:
- Netezza has no indices at all. And no caches. And the hardware is preconfigured. This all makes administration pretty simple.
- DATAllegro has almost no indices, and also has preconfigured hardware. But it has some partitioning, optionally.
- Teradata also has preconfigured hardware. It does have indices, but rather simple ones. Plus it has join indices. And it has a few more configuration options in other areas (e.g., block size) than the other appliance vendors. (Yes, I count Teradata among the appliances.)
- If you go through all the fuss of installing SAP’s applications and BI technology anyway, the incremental administration of just SAP BI Accelerator is pretty light.
- Oracle and IBM have mammothly complex indexing options, but have put large amounts of work into tools to lessen the resulting administrative burden.
- IBM offers preconfigured hardware units to simplify some installation issues.
- Come to think of it, I don’t really know how hard it is to administer columnar systems (e.g., Sybase IQ).
Posted in DATAllegro, Data warehouse appliances, Data warehousing, Greenplum, IBM and DB2, Netezza, Oracle, Relational database management systems, SAP, BI Accelerator, and MaxDB, Teradata | 2 Comments »
August 12th, 2006 Curt Monash
Netezza relies on FPGAs. DATAllegro essentially uses standard components, but those include Infiniband cards (and there’s a little FPGA action when they do encryption). Greenplum, however, claims to offer a highly competitive data warehouse solution that’s so software-only you can download it from their web site. That said, their main sales mode seems to also be through appliances, specifically ones branded and sold by Sun, combining Greenplum and open source software on a “Thumper” box. And the whole thing supposedly scales even higher than DATAllegro and Netezza, because you can manage over a petabyte if you chain together a dozen of the 100 terabyte racks.
Read the rest of this entry »
Posted in DATAllegro, Data warehouse appliances, Greenplum, Ingres, Netezza, Open source RDBMS, PostgreSQL, Relational database management systems | 3 Comments »