Analysis of companies, products, and user strategies in the area of business intelligence. Related subjects include:
Merv Adrian and Doug Henschen both reported more details about Amazon Redshift than I intend to; see also the comments on Doug’s article. I did talk with Rick Glick of ParAccel a bit about the project, and he noted:
- Amazon Redshift is missing parts of ParAccel, notably the extensibility framework.
- ParAccel did some engineering to make its DBMS run better in the cloud.
- Amazon did some engineering in the areas it knows better than ParAccel — cloud provisioning, cloud billing, and so on.
“We didn’t want to do the deal on those terms” comments from other companies suggest ParAccel’s main financial take from the deal is an already-reported venture investment.
The cloud-related engineering was mainly around communications, e.g. strengthening error detection/correction to make up for the lack of dedicated switches. In general, Rick seemed more positive on running in the (Amazon) cloud than analytic RDBMS vendors have been in the past.
So who should and will use Amazon Redshift? For starters, I’d say: Read more
|Categories: Amazon and its cloud, Business intelligence, Cloud computing, Data mart outsourcing, Data warehousing, Infobright, ParAccel, Predictive modeling and advanced analytics, Pricing, Vertica Systems||4 Comments|
Business intelligence dashboards are frequently bashed. I slammed them back in 2006 and 2007. Mark Smith dropped the hammer last August. EIS, the most dashboard-like pre-1990s analytic technology, was also the most reviled. There are reasons for this disdain, but even so dashboards shouldn’t be dismissed entirely.
In essence, I’d say:
- Dashboards are overrated and oversold.
- They are useful even so.
- Their usefulness is ebbing as technology advances.
In particular: Read more
Imagine a website whose purpose is to encourage consumers to take actions — for example to click on an ad, click on the next page, or actually make a purchase. Best practices for such a site include:
- An ever-evolving user experience, informed by — among other factors — creativity, brand identity, the vendor’s evolving product line itself, and …
- … predictive modeling.
- Personalization based on predictive modeling.
Those predictive models themselves will keep changing, because:
- Organizations learn.
- Consumer tastes change.
- More or different kinds of data keep becoming available.
In that situation, what would it mean to offer the website owner a predictive modeling “application”? Read more
I can think of seven major reasons not to use an analytic RDBMS. One is good; but the other six seem pretty questionable, niche circumstances excepted, especially at this time.
The good reason to not have an analytic RDBMS is that most organizations can run perfectly well on some combination of:
- SaaS (Software as a Service).
- A low-volume static website.
- A network focused on office software.
- A single cheap server, likely running a single instance of a general-purpose RDBMS.
Those enterprises, however, are generally not who I write for or about.
The six bad reasons to not have an analytic RDBMS all take the form “Can’t some other technology do the job better?”, namely:
- A data warehouse that’s just another instance of your OLTP (OnLine Transaction Processing) RDBMS. If your problem is that big, it’s likely that a specialized analytic RDBMS will be more cost-effective and generally easier to deal with.
- MOLAP (Multi-Dimensional OnLine Analytic Processing). That ship has sailed … and foundered … and been towed to drydock.
- In-memory BI. QlikView, SAP HANA, Oracle Exalytics, and Platfora are just four examples of many. But few enterprises will want to confine their analytics to such data as fits affordably in RAM.
- Non-tabular* approaches to investigative analytics. There are many examples in the Hadoop world — including the recent wave of SQL add-ons to Hadoop — and some in the graph area as well. But those choices will rarely suffice for the whole job, as most enterprises will want better analytic SQL performance for (big) parts of their workloads.
- Tighter integration of analytics and OLTP (OnLine Transaction Processing). Workday worklets illustrate that business intelligence/OLTP integration is a really good idea. And it’s an idea that Oracle and SAP can be expected to push heavily, when they finally get their product acts together. But again, that’s hardly all the analytics you’re going to want to do.
- Tighter integration of analytics and other short-request processing. An example would be maintaining a casual game’s leaderboard via a NoSQL write-optimized database. Yet again, that’s hardly all the analytics a typical enterprise will want to do.
|Categories: Business intelligence, Data warehousing, Games and virtual worlds, Hadoop, In-memory DBMS, MOLAP||10 Comments|
I recently proposed a 2×2 matrix of BI use cases:
- Is there an operational business process involved?
- Is there a focus on root cause analysis?
Let me now introduce another 2×2 matrix of analytic scenarios:
- Is there a compelling need for super-fresh data?
- Who’s consuming the results — humans or machines?
My point is that there are at least three different cool things people might think about when they want their analytics to be very fast:
- Fast investigative analytics — e.g., business intelligence with great query response.
- Computations on very fresh data, presented to humans — e.g. “heartbeat” graphics monitoring a network.
- Computations on very fresh data, presented back to a machine — e.g., a recommendation engine that includes makes good use of data about a user’s last few seconds of actions.
There’s also one slightly boring one that however drives a lot of important applications: Read more
|Categories: Business intelligence, Complex event processing (CEP), Games and virtual worlds, Log analysis, Predictive modeling and advanced analytics, Splunk, WibiData||4 Comments|
Stuart Frost, of DATAllegro fame, has started a small family of companies, and they’ve become my clients sort of as a group. The first one that I’m choosing to write about is Cirro, for which the basics are:
- Cirro does data federation for analytics.
- Cirro has 10 full-time people plus 4 part-timers.
- Cirro launched its product in June.
- Cirro doesn’t have customers yet, but hopes to fix that soon.
Data federation stories are often hard to understand because, until you drill down, they implausibly sound as if they do anything for everybody. That said, it’s reasonable to think of Cirro as a layer between Hadoop and your BI tool that:
- Helps with data transformations.
- Helps join Hadoop data to relational tables, even if the joins are large ones.
In both cases, Cirro is calling on your data management software for help, RDBMS or Hadoop as the case may be.
More precisely, Cirro’s approach is: Read more
|Categories: Business intelligence, Cirro, Data integration and middleware, Hadoop, MapReduce, Tableau Software||4 Comments|
When I wrote last week that I have at least 5 clients claiming they’re uniquely positioned to support BI over Hadoop (most of whom partner with a 6th client, Tableau) the non-partnering exception I had in mind was Platfora, Ben Werther’s oh-so-stealthy startup that is finally de-stealthing today. Platfora combines:
- An interesting approach to analytic data management.
- Business intelligence tools integrated with that.
The whole thing sounds like a perhaps more general and certainly non-SaaS version of what Metamarkets has been offering for a while.
The Platfora technical story starts: Read more
|Categories: Business intelligence, Columnar database management, Data models and architecture, Data warehousing, Database compression, Hadoop, Memory-centric data management, Platfora||6 Comments|
With Strata/Hadoop World being next week, there is much Hadoop discussion. One theme of the season is BI over Hadoop. I have at least 5 clients claiming they’re uniquely positioned to support that (most of whom partner with a 6th client, Tableau); the first 2 whose offerings I’ve actually written about are Teradata Aster and Hadapt. More generally, I’m hearing “Using Hadoop is hard; we’re here to make it easier for you.”
If enterprises aren’t yet happily running business intelligence against Hadoop, what are they doing with it instead? I took the opportunity to ask Cloudera, whose answers didn’t contradict anything I’m hearing elsewhere. As Cloudera tells it (approximately — this part of the conversation* was rushed): Read more
|Categories: Business intelligence, Cloudera, EAI, EII, ETL, ELT, ETLT, Hadoop, HBase, Health care, Investment research and trading, MapR, Market share and customer counts, Telecommunications, Web analytics||4 Comments|
A number of people and companies are using the term “iterative analytics”. This is confusing, because it can mean at least three different things:
- You analyze something quickly, decide the result is not wholly satisfactory, and try again. Examples might include:
- Aggressive use of drilldown, perhaps via an advanced-interface business intelligence tool such as Tableau or QlikView.
- Any case where you run a query or a model, think about the results, and run another one after that.
- You develop an intermediate analytic result, and using it as input to the next round of analysis. This is roughly equivalent to saying that iterative analytics refers to a multi-step analytic process involving a lot of derived data.
- #1 and #2 conflated/combined. This is roughly equivalent to saying that iterative analytics refers to all of to investigative analytics, sometimes known instead as exploratory analytics.
Based both on my personal conversations and a quick Google check, it’s reasonable to say #1 and #3 seem to be the most common usages, with #2 trailing a little bit behind.
But often it’s hard to be sure which of the various possible meanings somebody has in mind.
Monash’s First and Third Laws of Commercial Semantics state:
|Categories: Analytic technologies, Business intelligence, QlikTech and QlikView, Tableau Software||3 Comments|
I’m not at Oracle OpenWorld, but as usual that won’t keep me from commenting. My bottom line on the first night’s announcements is:
- At many large enterprises, Oracle has a lock on much of their IT efforts. (But not necessarily in the internet or investigative analytics areas.) Tonight’s announcements serve to strengthen that.
- Tonight’s announcements do little to help Oracle in other market segments.
1. At the highest level, my view of Oracle’s strategy is the same as it’s been for several years:
Clayton Christensen’s The Innovator’s Solution teaches us that Oracle should focus on selling a thick stack of technology to its highest-end customers, and that’s exactly what Oracle does focus on.
2. Tonight’s news is closely in line with what Oracle’s Juan Loaiza told me three years ago, especially:
- Oracle thinks flash memory is the most important hardware technology of the decade, one that could lead to Oracle being “bumped off” if they don’t get it right.
- Juan believes the “bulk” of Oracle’s business will move over to Exadata-like technology over the next 5-10 years. Numbers-wise, this seems to be based more on Exadata being a platform for consolidating an enterprise’s many Oracle databases than it is on Exadata running a few Especially Big Honking Database management tasks.
3. Oracle is confusing people with its comments on multi-tenancy. I suspect:
- What Oracle is talking about when it says “multi-tenancy” is more like consolidation than true multi-tenancy.
- Probably there are a couple of true multi-tenancy features as well.
4. SaaS (Software as a Service) vendors don’t want to use Oracle, because they don’t want to pay for it.* This limits the potential impact of Oracle’s true multi-tenancy features. Even so: Read more
|Categories: Business intelligence, Cloud computing, Columnar database management, Data warehouse appliances, Data warehousing, Exadata, Memory-centric data management, Oracle, Software as a Service (SaaS), Solid-state memory, Storage||9 Comments|