Analytic technologies
Discussion of technologies related to information query and analysis. Related subjects include:
- Business intelligence
- Data warehousing
- (in Text Technologies) Text mining
- (in The Monash Report) Data mining
- (in The Monash Report) General issues in analytic technology
Netezza is finally opening the kimono
I’ve bashed Netezza repeatedly for secrecy and obscurity about its technology and technical plans. Well, they’re getting a lot better. The latest post in a Netezza company blog, by marketing exec Phil Francisco, lays out their story clearly and concisely. And it’s backed up by a white paper that does more of the same. In particular, Page 11 of that white paper spells out possible future directions for enhancement, such as better compression, encryption, join filtering, and Netezza Developer Network stuff. Read more
Data warehouse appliances – fact and fiction
Borrowing the “Fact or fiction?” meme from the sports world:
- Data warehouse appliances have to have specialized hardware. Fiction. Indeed, most contenders except Teradata and Netezza — for example, DATAllegro, Vertica, ParAccel, Greenplum, and Infobright — offer Type 2 appliances. (Dataupia is another exception.)
- Specialized hardware is a dead-end for data warehouse appliances. Fiction. If it were easy for Teradata to replace its specialized switch technology, it would have done so a decade ago. And Netezza’s strategy has a lot of appeal.
- Data warehouse appliances are nothing new, and failed long ago. Fiction, but only because of Teradata. 1980s appliance pioneer Britton-Lee didn’t do so well (it was actually bought by Teradata). IBM and ICL (Britain’s national-champion hardware company) had content-addressable data store technology that went nowhere.
- Since data warehouse appliances failed long ago, they’ll fail now too. Fiction. Shared-nothing MPP is a fundamental advantage of appliances. So are various index-light strategies. Data warehouse appliances are here to stay.
- Data warehouse appliances only make sense if your main database management system can’t handle the job. Fiction. There are dozens of data warehouse appliances managing under 5 terabytes of user data, if not under 1 terabyte. True, some of them are legacy installations, dating back to when Oracle couldn’t handle that much data well itself. But new ones are still going in. Even if Oracle or Microsoft SQL Server can do the job, a data warehouse appliance is often a far superior — cheaper, easier to deploy and keep running, and/or better performing — alternative.
- Data warehouse appliances are just for data marts. For your full enterprise data warehouse, use a conventional DBMS. Part fact, part fiction. It depends on the appliance, and on the complexity of your needs. Teradata systems can do pretty much everything. Netezza and DATAllegro, two of the oldest data warehouse appliance startups, have worked hard on their concurrency issues and now can support fairly large user or reporting loads. They also can handle reasonable volumes of transactional or trickle-feed updates, and probably can support full EDW requirements for decent-sized organizations. Even so, there are some warehouse use cases for which they’re ill-suited. Newer appliance vendors are more limited yet.
- Analytic appliances are just renamed data warehouse appliances. Fact, even if misleading. Netezza is using the term “analytic appliance” to highlight additional things one can do on its boxes beyond answering queries. But those are still operations on a data mart or data warehouse.
- Teradata is the leading data warehouse appliance vendor. More fact than fiction. Some observers say that Teradata systems aren’t data warehouse appliances. But I think they are. Competitors may be superior to Teradata in one characteristic appliance trait or another – e.g., speed of installation – but it’s hard to define “appliances” in an objective way that excludes Teradata.
If you liked this post, you might also like one on text mining fact and fiction.
Netezza has another big October quarter
Netezza reported a big October quarter, ahead of expectations. And official guidance for next quarter is essentially flat quarter-over-quarter, suggesting Q3 was indeed surprisingly big. However, Netezza’s year-over-year growth for Q3 was a little under 50%, suggesting the quarter wasn’t so remarkable after all. (Netezza has a January fiscal year.)
Tentative conclusion: Netezza just tends to have big October quarters, perhaps by timing sales cycles to finish soon after the late September user conference. If Netezza’s user conference ever moves to later in the fall, expect Q3 to be weak that year.
Netezza reported 18 new customers, double last year’s figure. Read more
| Categories: Analytic technologies, Data warehouse appliances, Data warehousing, Greenplum, Kognitio, Netezza | 3 Comments |
One of the coolest visualizations I’ve seen
An obscure little company called Ward Analytics was displaying a Teradata performance management tool at the recent Teradata Partners conference, and I just found the visualization to be very cool. Yes, it’s full-screen, but there’s a LOT of information on the screen — basically, what amounts to about four graphs or charts, each of them complex. Plus there are lots of widgets to adjust what you see. And I actually don’t think full-screen is much of a drawback; you just have to be smart about the simpler elements you put in a portal-based UI that then blow up into complex full-screen ones on demand.
This screenshot doesn’t do the product — called Visual Edge — full justice, but it gives a pretty good taste. The weirdest part is that Ward rolled its own technology to create Visual Edge, feeling there were no generally suitable visualizations out there in the market for it to adopt.
| Categories: Analytic technologies, Business intelligence, Teradata | 5 Comments |
Coral8 highlights some key issues with dashboards
Coral8 today is rolling out the Coral8 Portal, offering some BI basics for CEP (Complex Event Processing) filters and queries. In Release 1, this is primitive compared with other BI portals, and of direct interest only to organizations that have already decided they’re using CEP technology. Even so, it serves as a useful illustration of several important issues in dashboarding.
The simplest is that real-time dashboards require different visualizations than other kinds do. Most obvious is the ever-popular graph marching from right to left across the screen as time advances along the x-axis. There also are differences in style between reports and tables that you actually read and read-outs that you merely watch for flickers of change. (Of course those two examples hardly make for a complete list.)
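To make the “marching graph” idea concrete, here is a minimal sketch of the underlying pattern, with no particular charting library or Coral8 feature implied: keep only the most recent N points in a fixed-size buffer, so old values scroll off the left as new ones arrive on the right. The window size and tick values are made up for illustration.

```python
# Minimal sketch of a right-to-left "marching" time series (illustrative only).
from collections import deque

WINDOW = 300                   # e.g. the last 5 minutes at one point per second
prices = deque(maxlen=WINDOW)  # appending beyond maxlen silently drops the oldest point

def on_new_tick(price: float) -> None:
    prices.append(price)       # the chart would re-draw from this rolling buffer
    # redraw(prices)           # hypothetical redraw hook in the dashboard UI

for tick in [101.2, 101.4, 101.1, 101.7]:
    on_new_tick(tick)
print(list(prices))
```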
More interesting are the flexibility and parameterization. While Coral8 sells to multiple markets, the design point for the portal is clearly financial trading. So, for example, a query may be registered with one ticker symbol, and an end user can easily customize it to slot in another one instead. In a way, this is a step toward the much greater flexibility that dashboards need overall.
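Here is a minimal sketch of that general pattern, not Coral8’s actual API: a query is registered once as a template, and each portal user re-parameterizes it with a different ticker symbol without touching the query itself. The stream and column names are invented for illustration.

```python
# Illustrative sketch of query parameterization (not Coral8's actual API).
from string import Template

# The registered query -- its structure is fixed; only the symbol varies.
REGISTERED_QUERY = Template(
    "SELECT trade_time, last_price FROM trades_stream WHERE symbol = '$symbol'"
)

def personalize(symbol: str) -> str:
    """Return the registered query with the user's ticker symbol slotted in."""
    return REGISTERED_QUERY.substitute(symbol=symbol)

print(personalize("IBM"))   # what the registering analyst set up
print(personalize("AAPL"))  # what another end user customizes it to
```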
Truth be told, if you put all such Coral8 flexibility features together, they’re not yet very impressive. So what’s even more interesting is the overall architecture, which could support much greater flexibility in the future. If dashboards gain the flexibility they need, and queries continue to be done in the conventional manner, query volumes will increase enormously. If, further, dashboards are updated in near real time, that’s another huge increase.
How huge? Well, I can make a case that it could be well over three orders of magnitude: Read more
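The full argument is behind the “Read more” link; purely as an illustrative back-of-envelope (every number below is an assumption, not taken from the post), here is one way the multiplication could stack up:

```python
# Illustrative back-of-envelope only: all numbers are assumptions, not data.
queries_today = 10                  # a handful of fixed charts on a shared dashboard
personalized_queries_per_user = 30  # each user defines or tweaks their own views
users_per_dashboard = 20            # views no longer shared across those users
refresh_multiplier = 60             # e.g. per-minute refresh instead of hourly

future = personalized_queries_per_user * users_per_dashboard * refresh_multiplier
print(f"{future / queries_today:,.0f}x")  # 3,600x -- more than three orders of magnitude
```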
| Categories: Aleri and Coral8, Analytic technologies, Business intelligence, Memory-centric data management, Streaming and complex event processing (CEP) | 3 Comments |
The key problem with dashboard functionality
I keep hinting – or saying outright 🙂 — that I think dashboards need to be revolutionized. It’s probably time to spell that point out a little further.
The key issue, in my opinion, is that dashboards need to be much more personalizable than they are now. This isn’t just me talking. I’ve raised the subject with a lot of users recently, and am getting close to 100% agreement with my viewpoint.
One part of the problem is personalizing what to see, how to visualize it, and how all that’s arranged on the screen. No one product yet fully combines best-of-breed ideas from mainstream BI, specialized visualization tools, and flexible personalized web portals. But that’s not my biggest concern, as I think the BI industry is on a pretty good path in those respects.
Rather, the real issue is that dashboards don’t adequately reflect personal opinions as to what is important. Indeed, that lack is often portrayed as a virtue, because supposedly top management can dictate through a few simple metrics what a whole company of subordinates will think and think about. (Balanced scorecard theology is a particularly silly form of this.) But actually that lack is a serious impediment to dashboard success, or indeed to a general analytic/numerate enterprise culture overall.
“One version of the truth” can be a gross oversimplification. Read more
| Categories: Analytic technologies, Business intelligence, OLTP | 6 Comments |
An interesting claim regarding BI openness
Analyst conference calls about merger announcements are generally pretty boring. Indeed, the companies involved tend to feel they are legally barred from saying anything interesting, by mandate of both the antitrust regulators and the SEC.
Still, such calls are joyful events, full of strategic happy talk. If one is really lucky, there may be a virtuoso tap-dancing exhibition as well. On today’s IBM/Cognos call, Cognos CEO Rob Ashe was asked whether he thought Cognos’ independence or lack thereof was as important today as he said it was after SAP announced its BOBJ takeover. Without missing a beat, he responded that there were two kinds of openness:
- Database openness (not important)
- ERP/business process openness (indeed important)
Hmm. I’m not so sure I agree. To begin with, there aren’t just two major points of potential integration. There’s also a whole lot of middleware: obviously data integration, but also app servers, portals, and query execution acceleration. Read more
| Categories: Analytic technologies, Business intelligence, Business Objects, Cognos, IBM and DB2, Memory-centric data management, ParAccel, SAP AG | 1 Comment |
IBM is buying Cognos – quick reactions
Some quick thoughts in connection with IBM’s just-announced plans to acquire Cognos.
1. Ironically, IBM just put out a press release describing a strong-sounding reseller partnership with Business Objects. The deal specified that
Business Objects will begin distributing and reselling IBM DB2 Warehouse with Business Objects XI and CFO Performance Management solutions. In addition, IBM will include a starter edition of Business Objects XI with DB2 and DB2 Warehouse.
Jeff Jones of IBM told me that they also had a partnership with Cognos, but with different details. I guess Cognos will eventually take over that deal, which is an obvious negative for Business Objects.
2. More generally, I can see where Cognos will now likely gain share at DB2 sites, and IBM/Ascential at Cognos sites. I can’t as easily see why Cognos would now lose share at Oracle or Teradata or Netezza sites, or why Ascential would lose share at SAP/BOBJ sites. So there seem to be some genuine synergies here, albeit perhaps modest ones.
3. Thus, I think the negatives in this deal for the remaining independents (MicroStrategy, Information Builders, Informatica, etc.) will somewhat outweigh the positives.
4. I’m not a big fan of Cognos’ management, former CEO Ron Zambonini and a few other freethinkers excepted. So from that standpoint I don’t think they have a lot to lose being taken over by Big Blue.
5. Obviously, with most of the dominoes now fallen, the big question is about the future of BI as it – potentially – gets integrated into much larger enterprise technology suites. And I think the answer to that depends a lot more on technology than most people seem to realize. More on that subject later, but here’s one hint:
I think fixing the disappointment that is dashboards will involve taking query volumes up by at least 2 to 3 orders of magnitude. So as great as recent innovations in analytic query performance have been, I hope and trust that so far we’ve only seen the tip of the iceberg.
Links:
1. eWeek on the IBM/Business Objects deal.
2. Press release on the IBM/Business Objects deal.
3. Press release on the IBM/Cognos deal.
| Categories: Analytic technologies, Business intelligence, Business Objects, Cognos, IBM and DB2 | 5 Comments |
Vertica update – HP appliance deal, customer information, and more
Vertica quietly announced an appliance bundling deal with HP and Red Hat today. That got me quickly onto the phone with Vertica’s Andy Ellicott, to discuss a few different subjects. Most interesting was the part about Vertica’s customer base, highlights of which included:
- Vertica’s claim to have “50” customers includes a bunch of unpaid licenses, many of them in academia.
- Vertica has about 15 paying customers.
- Based on conversations with mutual prospects, Vertica believes that’s more customers than DATAllegro has. (Of course, each DATAllegro sale is bigger than one of Vertica’s. Even so, I hope Vertica is wrong in its estimate, since DATAllegro told me its customer count was “double digit” quite a while ago.)
- Most Vertica customers manage over 1 terabyte of user data. A couple have bought licenses showing they intend to manage 20 terabytes or so.
- Vertica’s biggest customer/application category – existing customers and sales pipelines alike – is call detail records for telecommunications companies. (Other data warehouse specialists also have activity in the CDR area.) Major applications are billing assurance (getting the inter-carrier charges right) and marketing analysis. Call center uses are still in the future.
- Vertica’s other big market to date is investment research/tick history. Surely not coincidentally, this is a big area of focus for Mike Stonebraker, evidently at both companies for which he’s CTO. (The other, of course, is StreamBase.)
- Runners-up in market activity are clickstream analysis and general consumer analytics. These seem to be present in Vertica’s pipeline more than in the actual customer base.
| Categories: Analytic technologies, Business Objects, Data warehouse appliances, Data warehousing, DATAllegro, HP and Neoview, RDF and graphs, Vertica Systems | 5 Comments |
Clarifying SAS-in-the-DBMS, and other SAS tidbits
I followed up with Keith Collins of SAS today about SAS-in-the-database, expanding on what I learned or thought I did when we talked last month. Here’s the scoop:
SAS users do a lot of data filtering, aka data preparation, in SAS. These filters have WHERE clauses, just like SQL queries do. However, only some of them map to actual SQL WHERE clauses. SAS is now implementing many of the rest as UDFs (User-Defined Functions), one DBMS at a time, starting with Teradata. In addition, SAS users can write custom filters that get registered as UDFs. This capability will be released with SAS 9.2. (The timing on SAS 9.2 is in line with the comment thread to my prior post on SAS-in-the-DBMS.) Read more
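To illustrate what that kind of pushdown can look like at the SQL level, here is a hypothetical sketch; the DSN, table, columns, and UDF name are all invented, and this is not SAS’s or Teradata’s actual syntax. The point is simply that the registered UDF filters rows inside the database, instead of every row being shipped back to the SAS server for client-side filtering.

```python
# Hypothetical sketch of in-database filter pushdown via a registered UDF.
# The DSN, table, columns, and UDF name are invented for illustration.
import pyodbc

conn = pyodbc.connect("DSN=teradata_dw")  # assumes an ODBC DSN for the warehouse
cursor = conn.cursor()

# Without pushdown: pull all rows and filter client-side (lots of data movement).
# With pushdown: the filter runs inside the DBMS as a UDF, so only qualifying
# rows ever leave the database.
cursor.execute("""
    SELECT customer_id, account_balance
    FROM   dw.accounts
    WHERE  dw.sas_filter_high_risk(account_balance, risk_score) = 1
""")
for customer_id, account_balance in cursor.fetchall():
    print(customer_id, account_balance)  # downstream analysis would happen here
```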
