Data warehousing
Analysis of issues in data warehousing, with extensive coverage of database management systems and data warehouse appliances that are optimized to query large volumes of data.
More Oracle notes
When I went to Oracle in October, the main purpose of the visit was to discuss Exadata. And so my initial post based on the visit was focused accordingly. But there were a number of other interesting points I’ve never gotten around to writing up. Let me now remedy that, at least in part. Read more
Kickfire reports a few customer wins
Kickfire has the kind of blog I emphatically advise my clients to publish even when they don’t have management bandwidth to do something “sexier.” If nothing else, at least they record their customer wins when they can.
The current list of cited customers is two application appliance OEM vendors (unnamed, but with some detail), plus one Web 2.0 company (ditto). They’ve also posted about a Sun partnership.
Categories: Data warehouse appliances, Data warehousing, Kickfire, Market share and customer counts | 1 Comment |
Intelligent Enterprise’s Editors’/Editor’s Choice list
I have a blog on Intelligent Enterprise, which actually amounts to editor Doug Henschen’s selections of a few posts a month from DBMS2 (I still haven’t persuaded him to take anything from Text Technologies). Accordingly, I was asked to contribute thoughts this year for his annual Editors’ Choice article. It’s out now, and as usual is a good piece. Read more
Categories: Analytic technologies, Business intelligence, Data warehousing | 2 Comments |
Database SaaS gains a little visibility
Way back in the 1970s, a huge fraction of analytic database management was done via timesharing, specifically in connection with the RAMIS and FOCUS business-intelligence-precursor fourth-generation languages. (Both were written by Gerry Cohen, who built his company Information Builders around the latter one.) The market for remote-computing business intelligence has never wholly gone away since. Indeed, it’s being revived now, via everything from the analytics part of Salesforce.com to the service category I call data mart outsourcing.
Less successful to date are efforts in the area of pure database software-as-a-service. It seems that if somebody is going for SaaS anyway, they usually want a more complete, integrated offering. The most noteworthy exceptions I can think of to this general rule are Kognitio and Vertica, and they only have a handful of database SaaS customers each. To wit: Read more
Gartner’s 2008 data warehouse database management system Magic Quadrant is out
February, 2011 edit: I’ve now commented on Gartner’s 2010 Data Warehouse Database Management System Magic Quadrant as well.
Gartner’s annual Magic Quadrant for data warehouse DBMS is out. Thankfully, vendors don’t seem to be taking it as seriously as usual, so I didn’t immediately hear about it. (I finally noticed it in a Greenplum pay-per-click ad.) Links to Gartner MQs tend to come and go, but as of now here are two working links to the 2008 Gartner Data Warehouse Database Management System MQ. My posts on the 2007 and 2006 MQs have also been updated with working links. Read more
ParAccel’s market momentum
After my recent blog post, ParAccel is once again angry that I haven’t given it proper credit for its accomplishments. So let me try to redress the failing.
- ParAccel has disclosed the names of two customers, LatiNode and Merkle (presumably as an add-on to Merkle’s Netezza environment). And ParAccel has named two others under NDA. Four disclosed or semi-disclosed customers is actually more than DATAllegro has/had, although I presume DATAllegro’s three known customers are larger, especially in terms of database size.
- ParAccel sports a long list of partners, and has put out quite a few press releases in connection with these partnerships. While I’ve never succeeded in finding a company that took its ParAccel partnership especially seriously, I’ve only asked three or four of them, which is a small fraction of the total number of partners ParAccel has announced, so in no way can I rule out that somebody, somewhere, is actively helping ParAccel try to sell its products.
- ParAccel repeatedly says it has beaten Vertica in numerous proofs-of-concept (POCs), considerably more than the two cases in which it claims to have actually won a deal against Vertica competition.
- ParAccel has elicited favorable commentary from such astute observers as Seth Grimes and Doug Henschen.
- ParAccel has been noted for running TPC-H benchmarks in memory much more quickly than other vendors run them on disk.
Uh, that’s about all I can think of. What else am I forgetting? Surely that can’t be ParAccel’s entire litany of market success!
Categories: Data warehousing, Market share and customer counts, ParAccel | 6 Comments |
ParAccel actually uses relatively little PostgreSQL code
I often find it hard to write about ParAccel’s technology, for a variety of reasons:
- With occasional exceptions, ParAccel is reluctant to share detailed information.
- With occasional exceptions, ParAccel is reluctant to say anything for attribution.
- In ParAccel’s version of an “agile” development approach, product details keep changing, as do plans and schedules. (The gibe that ParAccel’s product plans are whatever their current sales prospect wants them to be — while of course highly exaggerated — isn’t wholly unfounded.)
- ParAccel has sold very few copies of its products, so it’s hard to get information from third parties.
ParAccel is quick, however, to send email if I post anything about them they think is incorrect.
All that said, I did get careless when I neglected to double-check something I already knew. Read more
Categories: Data warehousing, Netezza, ParAccel, PostgreSQL | 3 Comments |
More grist for the column vs. row mill
Daniel Abadi and Sam Madden are at it again, following up on their blog posts of six months ago arguing for the general superiority of column stores over row stores (for analytic query processing). The gist is to recite a number of bases for superiority, beyond the two standard ones of less I/O and better compression, and seems to be based largely on Section 5 of a SIGMOD paper they wrote with Nabil Hachem.
A big part of their argument is that if you carry the processing of columnar and/or compressed data all the way through in memory, you get lots of advantages, especially because everything’s smaller and hence fits better into Level 2 cache. There also is some kind of join algorithm enhancement, which seems to be based on noticing when the result wound up falling into a range according to some dimension, and perhaps using dictionary encoding in a way that will help induce such an outcome.
The main enemy here is row-store vendors who say, in effect, “Oh, it’s easy to shoehorn almost all the benefits of a column-store into a row-based system.” They also take a swipe — for being insufficiently purely columnar — at unnamed columnar Vertica competitors, described in terms that seemingly apply directly to ParAccel.
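To make the "carry compressed columnar data all the way through" idea a bit more concrete, here is a toy Python sketch. It is my own illustration, not Abadi and Madden's algorithm or any vendor's implementation. The function names (`dictionary_encode`, `range_filter`, `sum_at`) and the sample data are made up for the example, and the sorted dictionary is an assumption I chose so that a range predicate on values turns into a range predicate on small integer codes.

```python
from bisect import bisect_left, bisect_right

def dictionary_encode(values):
    """Return (sorted dictionary, integer codes) for one column.

    Hypothetical helper: because the dictionary is sorted, code order
    matches value order, so range predicates can be run on the codes.
    """
    dictionary = sorted(set(values))
    code_of = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [code_of[v] for v in values]

def range_filter(dictionary, codes, low, high):
    """Row positions whose value lies in [low, high].

    The predicate is translated to a code range once; the scan then
    compares small integers and never decompresses the column.
    """
    lo_code = bisect_left(dictionary, low)
    hi_code = bisect_right(dictionary, high) - 1
    return [pos for pos, c in enumerate(codes) if lo_code <= c <= hi_code]

def sum_at(column, positions):
    """Fetch only the needed positions from a second column and sum them."""
    return sum(column[p] for p in positions)

# Two columns of the same (made-up) table, stored separately.
region = ["east", "west", "north", "east", "south", "west"]
revenue = [10, 20, 30, 40, 50, 60]

dictionary, codes = dictionary_encode(region)
hits = range_filter(dictionary, codes, "east", "north")
total = sum_at(revenue, hits)
print(dictionary, codes, hits, total)
```

The point of the sketch is only that the filter touches the compact code array and the aggregate touches just the matching positions of one other column. A real column store would add vectorized execution, real compression schemes, and much more.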
Categories: Columnar database management, Data warehousing, Database compression, ParAccel, Vertica Systems | 2 Comments |
Introduction to SAND Technology
SAND Technology has a confused history. For example:
- SAND has been around in some form or other since 1982, starting out as a Hitachi reseller in Canada.
- In 1992 SAND acquired a columnar DBMS product called Nucleus, which originally was integrated with hardware (in the form of a card). Notwithstanding what development chief Richard Grodin views as various advantages vs. Sybase IQ, SAND has only had limited success in that market.
- Thus, SAND introduced a second, similarly-named product, which could also be viewed as a columnar DBMS. (As best I can tell, both are called SAND/DNA.) But it’s actually focused on archiving, aka the clunkily named “near-line storage.” And it’s evidently not the same code line; e.g., the newer product isn’t bit-mapped, while the older one is.
- The near-line product was originally focused on the SAP market. Now it’s moving beyond.
- Canada-based SAND had offices in Germany and the UK before it did in the US. This leads to an oddity – SAND is less focused on the SAP aftermarket in Germany than it still is in the US.
SAND is publicly traded, so its numbers are on display. It turns out to be doing $7 million in annual revenue, and losing money.
OK. I just wanted to get all that out of the way. My main thoughts about the DBMS archiving market are in a separate post.
Categories: Archiving and information preservation, Columnar database management, Data warehousing, SAND Technology | 6 Comments |
How to buy an analytic DBMS (overview)
I went to London for a couple of days last week, at the behest of Kognitio. Since I was in the neighborhood anyway, I visited their offices for a briefing. But the main driver for the trip was a seminar Thursday at which I was the featured speaker. As promised, the slides have been uploaded here.
The material covered on the first 13 slides should be very familiar to readers of this blog. I touched on database diversity and the disk-speed barrier, after which I zoomed through a quick survey of the data warehouse DBMS market. But then I turned to material I’ve been working on more recently – practical advice directly on the subject of how to buy an analytic DBMS.
I started by proposing a seven-part segmentation self-assessment: Read more
Categories: Buying processes, Data warehousing, Presentations | 10 Comments |