Analytic technologies
Discussion of technologies related to information query and analysis. Related subjects include:
- Business intelligence
- Data warehousing
- (in Text Technologies) Text mining
- (in The Monash Report) Data mining
- (in The Monash Report) General issues in analytic technology
High-performance analytics
For the past few months, I’ve collected a lot of data points to the effect that high-performance analytics – i.e., beyond straightforward query – is becoming increasingly important. And I’ve written about some of them at length. For example:
- MapReduce – controversial or in some cases even disappointing though it may be – has a lot of use cases.
- It’s early days, but Netezza and Teradata (and others) are beefing up their geospatial analytic capabilities.
- Memory-centric analytics is in the spotlight.
Ack. I can’t decide whether “analytics” should be a singular or plural noun. Thoughts?
Another area that’s come up, but which I haven’t blogged about as much, is data mining in the database. Data mining accounts for a large part of data warehouse use. The traditional way to do data mining is to extract data from the database and dump it into SAS. But there are problems with this scenario, including: Read more
| Categories: Aster Data, Data warehousing, EAI, EII, ETL, ELT, ETLT, Greenplum, MapReduce, Netezza, Oracle, Parallelization, SAS Institute, Teradata | 6 Comments |
Beyond query
I sometimes describe database management systems as “big SQL interpreters,” because that’s the core of what they do. But it’s not all they do, which is why I describe them as “electronic file clerks” too. File clerks don’t just store and fetch data; they also put a lot of work into neatening, culling, and generally managing the health of their information hoards.
As far back as 15 years ago, online backup was as big a competitive differentiator in the database wars as any particular SQL execution feature. Security became important in some market segments. Reliability and availability have been important from the get-go. And manageability has been crucial ever since Microsoft lapped Oracle in that regard, back when SQL Server had little else to recommend it except price.*
*Before Oracle10g, the SQL Server vs. Oracle manageability gap was big.
Now data warehousing is demanding the same kinds of infrastructure richness.* Read more
| Categories: Data warehousing, Microsoft and SQL*Server, Oracle | 1 Comment |
The query from hell, and other stories
I write about a lot of products whose core job boils down to “Make queries run fast.” Without exception, their vendors tout stories of remarkable performance gains over conventional/incumbent DBMS (reported improvement is usually at least 50-fold, and commonly 100-500+). They further claim at least 2-3X better performance than their close competitors. In making these claims, vendors usually stress that their results come from live customer benchmarks. In few if any of the cases, I judge, are they lying outright. So what’s going on? Read more
| Categories: Benchmarks and POCs, Data warehousing | Leave a Comment |
Carson Schmidt of Teradata on SSDs
Carson Schmidt is, in essence, Teradata’s VP of product development for everything other than applications and database software. For example, he oversees Teradata’s hardware, storage, and switching technology. So when Teradata Chief Development Officer Scott Gnau didn’t have answers at his fingertips to some questions about SSDs (Solid-State Drives), he bucked me over to Carson. A very interesting discussion about SSDs (and other subjects) ensued.
Highlights included: Read more
| Categories: Data warehousing, Solid-state memory, Storage, Teradata | 1 Comment |
How to tell Teradata’s product lines apart
Once Netezza hit the market, Teradata had a classic “disruptive” price problem – it offered a high-end product, at a high price, sporting lots of features that not all customers needed or were willing to pay for. Teradata has at times slashed prices in competitive situations, but there are obvious risks to that, especially when a customer already has a number of other Teradata systems for which it paid closer to full price.
This year, Teradata has introduced a range of products that flesh out its competitive lineup. There now are three mainstream Teradata offerings, plus two with more specialized applicability. Teradata no longer has to sell Cadillacs to customers on Corolla budgets.
But how do we tell the five Teradata product lines apart? The names are confusing, both in their hardware-vendor product numbers and their data-warehousing-dogma product names, especially since in real life Teradata products’ capabilities overlap. Indeed, Teradata executives freely admit that the Teradata Data Mart Appliance 551 can run smaller data warehouses, while the Teradata Data Warehouse Appliance 2550 is positioned in large part at what Teradata quite reasonably calls data marts.
When one looks past the difficulties of naming, Teradata’s product lineup begins to make more sense. Let’s start by considering the three main Teradata products. Read more
| Categories: Data warehouse appliances, Data warehousing, Netezza, Pricing, Teradata | 14 Comments |
Update on Aster Data Systems and nCluster
On my West Coast swing last week, I spent a few hours at Aster Data, which has now officially put out Version 3 of nCluster. Highlights included: Read more
Introduction to Kickfire
I’ve spent a few hours visiting or otherwise talking with my new clients at Kickfire recently, so I think I have a better feel for their story. A few details are still missing, however, either because I didn’t get around to asking about them, or because an unexplained accident corrupted my notes (and I wasn’t even using Office 2007). Highlights include: Read more
| Categories: Columnar database management, Data warehouse appliances, Data warehousing, Kickfire, MySQL, Theory and architecture | Leave a Comment |
Coral8 proposes CEP as a BI data platform
It used to be that Coral8 and StreamBase were the two complex event/stream processing (CEP) vendors most committed to branching out beyond the super-low-latency algorithmic trading market. But StreamBase seems to have pulled in its horns after a management change, focusing much more on the financial market (and perhaps the defense/intelligence market as well). Aleri, Truviso, and Progress Apama, while each showing signs of branching out, don’t seem to have gone as far as Coral8 yet. And so, though it’s a small company with not all that many dozens of customers, my client Coral8 seems to be the one to look at when seeing whether CEP really is relevant to a broad range of mainstream – no pun intended – applications.
Coral8 today unveiled a new product release – the not-so-concisely named “Coral8 Engine and Portal Release 5.5” – and a new buzzphrase – “Continuous Intelligence.” The interesting part boils down to this:
Coral8 is proposing CEP – excuse me, “Continuous Intelligence” – as a data-store-equivalent for business intelligence.
This includes both operational BI (the current sweet spot) and dashboards (the part with cool, real-time-visualization demos). Read more
Oracle notes
I spent about six hours at Oracle today – talking with Andy Mendelsohn, Ray Roccaforte, Juan Loaiza, Cetin Ozbutun, et al. – and plan to write more later. For now, let me pass along a few quick comments. Read more
| Categories: Data warehousing, Exadata, Oracle, Parallelization, Pricing, Storage, Theory and architecture | 10 Comments |
Teradata’s Petabyte Power Players
As previously hinted, Teradata has now announced 4 of the 5 members of its “Petabyte Power Players” club. These are enterprises with 1+ petabyte of data on Teradata equipment. As is commonly the case when Teradata discusses such figures, there’s some confusion as to how they’re actually counting. But as best I can tell, Teradata is counting: Read more
