Analytic technologies
Discussion of technologies related to information query and analysis. Related subjects include:
- Business intelligence
- Data warehousing
- (in Text Technologies) Text mining
- (in The Monash Report) Data mining
- (in The Monash Report) General issues in analytic technology
The three principal kinds of analytic business benefit
When I tweaked the slide deck for Thursday’s Investigative Analytics webinar — I’ll post an updated version soon — the part that needed the most work was the section on “What business problems do you solve with this stuff anyway?” I’ve posted about that kind of thing at least five times in the past five years, across three different blogs (linked below). But perhaps this time I can really simplify matters, albeit at the cost of being not quite complete.
A large fraction of all analytic efforts ultimately serve one or more of three purposes:
- Marketing
- Problem and anomaly detection and diagnosis
- Planning and optimization
Those areas obviously overlap. Indeed, it can be argued that everything one does in business amounts to “optimization,” and everything in analysis boils down to noticing and understanding anomalies. Still, I am hopeful that this is an instructive categorization, as per the many examples below. Read more
| Categories: Analytic technologies | 8 Comments |
Investigative analytics: Slide deck and March 10 webinar
As previously noted, I’m doing a webinar on investigative analytics on Thursday, March 10, at 2 pm Eastern time. I’ve now uploaded a late-draft slide deck for same. It’s pretty concise; the deck is in no way a substitute for the webinar itself, which I urge you to attend (or catch a recording of after the fact). But the slides, and in a couple of cases the comments below them, may add some value to the definition of investigative analytics I recently posted.
| Categories: Analytic technologies, Presentations | 4 Comments |
Teradata, Aster Data, and Teradata/Aster
Teradata is acquiring Aster Data. Naturally, the deal is being presented with a Treaty of Tordesillas kind of positioning — Teradata does X, Aster Data does Y, and everybody looks forward to having X and Y in the same product portfolio. That said, my initial positioning and product strategy thoughts on the Teradata/Aster combination go something like this. Read more
| Categories: Analytic technologies, Aster Data, Columnar database management, Data warehouse appliances, Data warehousing, Database compression, RDF and graphs, Specific users, Teradata | 9 Comments |
Terminology: Investigative analytics
In my post on the six useful things you can do with analytic technology, one of the six was
Research, investigate, and analyze in support of future decisions.
I’m calling that investigative analytics, and am hopeful the term will catch on.
I went on to say that the term conflated several disciplines, namely:
- Statistics, data mining, machine learning, and/or predictive analytics. …
- The more research-oriented aspects of business intelligence tools. …
- Analogous technologies as applied to non-tabular data types such as text or graph.
By way of contrast, I don’t regard business activity monitoring (BAM) or other kinds of monitoring-oriented business intelligence (BI) as part of “investigative analytics,” because they don’t seem particularly investigative.
Based on the above, I propose the following simple definition of the investigative analytics activity or process:
Seeking (previously unknown) patterns in data.
| Categories: Analytic technologies, Business intelligence | 22 Comments |
Terminology: Analytic platforms
A few weeks ago, I described the elements of an “analytic computing system” or “analytic platform,” while reserving judgment as to which of the two terms would or should win out. I am now capitulating to the term analytic platform, under the influence of, among others, Sharmila Mulligan (and Aster Data in general), Vertica and a variety of fellow analysts (Merv Adrian, Neil Raden, Seth Grimes, Jim Kobielus, and Colin White). While Google evidence would suggest it’s way too early to make this call, I think it’s time to say “analytic platform” will win.
What’s more, I now think the phrase “analytic platform” should win. While I think the term “platform” is overused to the point of silliness, at least the phrase “analytic platform” is short. Thus, it could be modified in various descriptive or not-so-descriptive ways: “Advanced analytic platform,” “graph analytics platform,” “customer analytics platform,” “social media analytics platform,” “CRM analytics platform,” “text analytics platform,” or whatever. By way of contrast, try doing that with “analytic computing system,” and see if you can keep a straight face.
To take this in the direction of an actual definition, I’ll say that the three essential elements of an analytic platform are: Read more
| Categories: Analytic technologies, Data warehousing | 2 Comments |
Now we know why Vertica has been so weirdly evasive
Communicating with Vertica has been tricky recently. But it has now been announced that HP is buying Vertica, which pretty much forces me to comment about Vertica. 🙂 So I’ll indulge in a little bit of explanation as to what I know about Vertica, whether for publication or under NDA. My analysis of the HP/Vertica combination, and expectations for same, will go into another post. Read more
| Categories: Analytic technologies, Data warehousing, HP and Neoview, Market share and customer counts, Michael Stonebraker, Vertica Systems | 10 Comments |
Upcoming webinar on investigative analytics
I recently coined the phrase investigative analytics to conflate
- Statistics, data mining, machine learning, and/or predictive analytics.
- The more research-oriented aspects of business intelligence tools:
  - Ad-hoc query.
  - Drilldown.
  - Most things done by BI-using “business analysts.”
  - Most things within BI called “data exploration.”
- Analogous technologies as applied to non-tabular data types such as text or graph.
This will be the basis for my part of a webcast on March 10 at 11 am Pacific/2 pm Eastern time. The other main part of the webcast will be a demo by the webcast’s joint sponsors, Aster Data and Tableau Software.
Some of Aster’s verbiage in describing and titling the webinar is so hyperbolic that I do not want to give the impression of endorsing it. But I am very hopeful that the webinar itself will be interesting and informative, and will point people at least somewhat in the direction of the benefits Aster is claiming.
| Categories: Analytic technologies, Aster Data, Business intelligence, Data warehousing, Presentations, Tableau Software | 3 Comments |
Comments on the 2011 Forrester Wave for Enterprise Data Warehouse Platforms
The Forrester Wave: Enterprise Data Warehouse Platforms, Q1 2011 is now out,* hot on the heels of the Gartner Magic Quadrant. Unfortunately, this particular Forrester Wave is riddled with inaccuracy. Read more
| Categories: Analytic technologies, Columnar database management, Data warehousing, EMC, Exadata, Greenplum, Netezza, Oracle, Pricing, SAP AG, Sybase, Teradata, Vertica Systems | 8 Comments |
Columnar compression vs. column storage
I’m getting the increasing impression that certain industry observers, such as Gartner, are really confused about columnar technology. (I further suspect that certain vendors are encouraging this confusion, as vendors commonly do.) So here are some basic points.
A simple way to think about the difference between columnar storage and columnar (or any other kind of) compression is this:
- Columnar storage is a reference to how data is grouped together on disk (or in solid-state memory).
- (Columnar) compression is a reference to whether the actual data is on disk, or whether you save space by storing some smaller substitute for the actual data.
Specifically, if data in a relational table is grouped together according to what row it’s in, then the database manager is called “row-based” or a “row store.” If it’s grouped together according to what column it’s in, then the database management system is called “columnar” or a “column store.” Increasingly, row-based and columnar storage are being hybridized.
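To make the storage distinction concrete, here is a minimal sketch in Python. The three-row table and the function names `row_store_layout` and `column_store_layout` are invented for illustration; a flat list stands in for the order in which values would sit on disk, and no real DBMS lays data out this simply.

```python
# Hypothetical three-row table (name, country, age); values are illustrative only.
rows = [
    ("alice", "US", 30),
    ("bob", "US", 25),
    ("carol", "DE", 30),
]


def row_store_layout(table):
    """Flatten a table so that all values of one row are adjacent."""
    return [value for row in table for value in row]


def column_store_layout(table):
    """Flatten a table so that all values of one column are adjacent."""
    return [value for column in zip(*table) for value in column]


row_layout = row_store_layout(rows)
column_layout = column_store_layout(rows)

print(row_layout)     # values grouped by row
print(column_layout)  # values grouped by column
```

The same nine values appear in both lists; only the grouping differs, which is all that "row store" versus "column store" refers to in this simplified picture.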
There are two main kinds of compression — compression of bit strings and more intelligent compression of actual data values. Compression of actual data values can reasonably be called “columnar,” in that different columns of data can be compressed in different ways, often depending only on the data in that column.* Read more
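The two kinds of compression described above can be sketched roughly as follows. This is an illustration under assumptions, not a description of any vendor's implementation: dictionary encoding and run-length encoding are my choice of examples of value-level compression, `zlib` stands in for generic bit-string compression, and the column contents and helper names are hypothetical.

```python
import zlib


def dictionary_encode(values):
    """Replace each value with a small integer code plus a lookup table."""
    lookup = {}
    codes = []
    for value in values:
        if value not in lookup:
            lookup[value] = len(lookup)
        codes.append(lookup[value])
    return lookup, codes


def run_length_encode(values):
    """Collapse consecutive repeats into [value, count] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


# Hypothetical contents of a single "country" column.
country_column = ["US", "US", "US", "DE", "DE", "US"]

# Value-level compression: it works on the data values of one column.
lookup, codes = dictionary_encode(country_column)
runs = run_length_encode(country_column)

# Bit-string compression: it sees only bytes, with no knowledge of the values.
raw_bytes = ",".join(country_column * 100).encode("utf-8")
compressed_bytes = zlib.compress(raw_bytes)

print(lookup, codes)
print(runs)
print(len(raw_bytes), len(compressed_bytes))
```

Note that each column could use a different scheme (dictionary codes for one, run lengths for another), which is the sense in which value-level compression can be called "columnar" even when the rows themselves are not stored column by column.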
| Categories: Columnar database management, Data warehousing, Database compression, Exadata, Vertica Systems | 21 Comments |
Comments on the Gartner 2010/2011 Data Warehouse Database Management Systems Magic Quadrant
Edit: Comments on the February, 2012 Gartner Magic Quadrant for Data Warehouse Database Management Systems — and on the companies reviewed in it — are now up.
The Gartner 2010 Data Warehouse Database Management Systems Magic Quadrant is out. I shall now comment, just as I did to varying degrees on the 2009, 2008, 2007, and 2006 Gartner Data Warehouse Database Management System Magic Quadrants.
Note: Links to Gartner Magic Quadrants tend to be unstable. Please alert me if any problems arise; I’ll edit accordingly.
In my comments on the 2008 Gartner Data Warehouse Database Management Systems Magic Quadrant, I observed that Gartner’s “completeness of vision” scores were generally pretty reasonable, but their “ability to execute” rankings were somewhat bizarre; the same remains true this year. For example, Gartner ranks Ingres higher by that metric than Vertica, Aster Data, ParAccel, or Infobright. Yet each of those companies is growing nicely and delivering products that meet serious cutting-edge analytic DBMS needs, neither of which has been true of Ingres since about 1987. Read more
