Analytic technologies
Discussion of technologies related to information query and analysis. Related subjects include:
- Business intelligence
- Data warehousing
- (in Text Technologies) Text mining
- (in The Monash Report) Data mining
- (in The Monash Report) General issues in analytic technology
Will Brighthouse become the MySQL data warehouse of choice?
As I’ve previously noted:
- Infobright is about to make more noise about its MySQL-based data warehouse software, Brighthouse.
- Brighthouse has some very interesting technical features.
- A Sun/Infobright partnership would make a lot of sense.
Talking with Infobright today, I was again struck by how close their relationship with MySQL (the company) is. Stay tuned.
| Categories: Analytic technologies, Data warehousing, Infobright, MySQL | Leave a Comment |
Infobright is gearing up for a press push
There’s another TDWI conference coming up, so it’s time for data warehouse-related press rollouts. Infobright (one of my many clients in this area) will be doing one of them, and ran an early version by me. Customer announcements, vendor partnerships, and so on are still being finalized, but anyhow Infobright has 7 revenue-recognized customers and a bunch more that are sold and in the implementation cycle. There’s a Release 3 of Brighthouse coming up. As one would expect, Release 3’s major claims to fame are the general addition of features (including some which elicit a “You didn’t have that already?” reaction), plus huge performance improvements in some queries (i.e., the biggest bottlenecks in Brighthouse Release 2).
On that level, it’s all standard stuff, as is Infobright’s core pitch — ease, simplicity, low cost, etc., and the benefits of same. But drilling down, there are some rather unique technical claims. Read more
| Categories: Analytic technologies, Data warehousing, Infobright | 1 Comment |
MapReduce for data mining? Maybe for variable-schema analytics.
Rich Skrenta is quite a successful entrepreneur, so it’s likely that he doesn’t really mean the more ridiculous parts of this rant on the MapReduce debate. E.g., he cheerfully disregards the fact that the data warehouse appliance vendors have ALREADY disrupted the market he’s focusing on. Index-light row-based and columnar systems are both super fast at data mining extracts.
But let’s go straight to the one interesting thing he said, Read more
| Categories: Analytic technologies, MapReduce, Parallelization, SAS Institute | 2 Comments |
Things could get interesting for Infobright
Of the many new specialty data warehouse DBMS and appliances, Infobright’s Brighthouse is the only leading one based on MySQL. I expect Sun and Infobright to have some interesting conversations now. Conversely, I wouldn’t be optimistic about any partnering discussions Infobright might have with, say, HP.
Of Sun’s existing relationships, the one that would compete most directly with any future Infobright partnership is with ParAccel.
| Categories: Analytic technologies, Data warehousing, Infobright, MySQL, Open source, ParAccel | 2 Comments |
Forrester collects business intelligence buzzwords
Forrester says “It’s time to reinvent your BI strategy.” No argument there. And they have an article, charts, and a white paper to back it up. A lot of the details are quite dubious, like the chart in which they declared that columnar RDBMS aren’t relational. Still, the article is worth surveying to see if you have any “I hadn’t thought of that!” moments.
I particularly like this diagram, which has 27 layers, containing approximately 2 1/2 BI-related buzzphrases each.
| Categories: Analytic technologies, Business intelligence | 3 Comments |
Flash-based data warehousing is getting ever closer
EMC is rolling out solid-state drives later this quarter. The press release mentions the word “terabyte”, so this is for non-trivial systems. And by the way, 100,000 write/erase cycles before something wears out works out to several per hour over a multi-year service life, so that’s a non-problem for data warehousing.
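The “several per hour” figure can be checked with quick arithmetic. This is a minimal sketch, assuming a service life of three to five years (an assumption here, not a figure from EMC) and writes spread evenly across the drive; the function name is purely illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # ignoring leap years


def cycles_per_hour(rated_cycles: int, service_years: float) -> float:
    """Write/erase cycles per hour that would exhaust the rating
    exactly at the end of the assumed service life."""
    return rated_cycles / (service_years * HOURS_PER_YEAR)


if __name__ == "__main__":
    # 100,000 is the rated cycle count quoted in the post;
    # the 3- and 5-year lifetimes are assumptions.
    for years in (3, 5):
        rate = cycles_per_hour(100_000, years)
        print(f"{years}-year life: about {rate:.1f} cycles per hour")
```

Under those assumptions the budget is roughly two to four complete rewrites of the stored data every hour, far more than a typical load-and-query warehouse workload would need.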
ParAccel and SAP already offer RAM-based appliances. I suspect we’ll see appliances based on solid-state drives before long. I also wouldn’t be shocked if a non-appliance vendor such as Oracle suddenly jumped into this area, trying to use it as a way to leapfrog the appliance vendors.
| Categories: Data warehouse appliances, Data warehousing | 1 Comment |
Netezza targets 1 petabyte
Netezza is promising petabyte-scale appliances later this year, up from 100 terabytes. That’s user data (I checked), and assumes 2-3X compression, or a little less than they think is actually likely. I.e., they’re describing their capacity in the same kinds of terms other responsible vendors do. They haven’t actually built and tested any 1 petabyte systems internally yet, but they’ve gone over 100 terabytes.
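To make the user-data framing concrete, here is a rough sketch of what 1 petabyte of user data implies for compressed on-disk volume at the stated 2-3X ratios. It assumes decimal units (1 PB = 1,000 TB) and ignores mirroring, temp space, and other overhead; none of these figures come from Netezza, and the function name is hypothetical.

```python
def compressed_size_tb(user_data_tb: float, compression_ratio: float) -> float:
    """On-disk size of the compressed data, ignoring mirroring and temp space."""
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return user_data_tb / compression_ratio


if __name__ == "__main__":
    user_data_tb = 1_000  # 1 PB of user data, in decimal terabytes (assumption)
    for ratio in (2, 3):
        size = compressed_size_tb(user_data_tb, ratio)
        print(f"{ratio}X compression: about {size:.0f} TB on disk")
```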
Basically, this leaves Netezza’s high-end capability about 10X below Teradata’s. On the other hand, it should leave them capable of handling pretty much every Teradata database in existence. Read more
| Categories: Analytic technologies, Data warehouse appliances, Data warehousing, Netezza, Petabyte-scale data management, Teradata | Leave a Comment |
A quick survey of data warehouse management technology
There are at least 16 different vendors offering appliances and/or software that do database management primarily for analytic purposes.* That’s a lot to keep up with. So I’ve thrown together a little overview of the analytic data management landscape, liberally salted with links to information about specific vendors, products, or technical issues. In some ways, this is a companion piece to my prior post about data warehouse appliance myths and realities.
*And that’s just the tabular/alphanumeric guys. Add in text search and you run the total a lot higher.
Numerous data warehouse specialists offer traditional row-based relational DBMS architectures, but optimize them for analytic workloads. These include Teradata, Netezza, DATAllegro, Greenplum, Dataupia, and SAS. All of those except SAS are wholly or primarily vendors of MPP/shared-nothing data warehouse appliances. EDIT: See the comment thread for a correction re Kognitio.
Numerous data warehouse specialists offer column-based relational DBMS architectures. These include Sybase (with the Sybase IQ product, originally from Expressway), Vertica, ParAccel, Infobright, Kognitio (formerly White Cross), and Sand. Read more
Netezza rolls out its compression story
The proximate cause for today’s flurry of Netezza-related posts is that the company has finally rolled out its compression story. In a nutshell, Netezza has developed its own version of columnar delta compression, slated to ship in May 2008. It compresses 2-5X, with the factor sometimes going up into double digits. Netezza estimates this produces a 2-3X improvement in overall performance, with the core marketing claim being that performance will “double” from compression alone. Read more
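For readers unfamiliar with the general idea, the sketch below shows plain textbook delta encoding of a single integer column. It is not Netezza’s implementation, whose details aren’t given here; the function names and sample values are made up for illustration.

```python
from typing import List


def delta_encode(column: List[int]) -> List[int]:
    """Store the first value as-is, then each value's difference from its predecessor."""
    if not column:
        return []
    deltas = [column[0]]
    for prev, cur in zip(column, column[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas: List[int]) -> List[int]:
    """Rebuild the original column by taking a running sum of the deltas."""
    values = []
    total = 0
    for d in deltas:
        total += d
        values.append(total)
    return values


if __name__ == "__main__":
    # Hypothetical column of slowly increasing order IDs.
    order_ids = [100000, 100001, 100003, 100004, 100010]
    encoded = delta_encode(order_ids)
    print(encoded)
    assert delta_decode(encoded) == order_ids
```

When neighboring values in a column are close together, the deltas are small numbers that fit in far fewer bits than the originals. If a scan is limited by disk bandwidth, reading fewer bytes per row is one plausible route to the kind of speedup being claimed.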
| Categories: Analytic technologies, Data warehouse appliances, Data warehousing, Database compression, Netezza, Theory and architecture | Leave a Comment |
ANALYTIC is the antonym of TRANSACTIONAL
In 1993, Ted Codd introduced the term OLAP (OnLine Analytic Processing) to describe data management that wasn’t optimized for OLTP (OnLine Transaction Processing). Later in the 1990s, Henry Morris of IDC introduced the term analytic applications to describe apps that weren’t transactional. Since then, no better word than “analytic” has emerged to cover the broad class of IT apps and technologies that aren’t focused on transactional processing.
In the latest incarnation, analytic appliances are coming to the fore. Read more
