August 3, 2015

Data messes

A lot of what I hear and talk about boils down to “data is a mess”. Below is a very partial list of examples.

To a first approximation, one would expect operational data to be rather clean. After all, it drives and/or records business transactions. So if something goes awry, the result can be lost money, disappointed customers, or worse, and those are outcomes to be strenuously avoided. Up to a point, that’s indeed true, at least at businesses large enough to be properly automated. (Unlike, for example — 🙂 — mine.)

Even so, operational data has some canonical problems. First, it could be inaccurate; somebody can just misspell or otherwise botch an entry. Further, there are multiple ways data can be unreachable, typically because it’s:

Inconsistent, in which case humans might not know how to look it up and database JOINs might fail.
Unintegrated, in which case one application might not be able to use data that another happily maintains. (This is the classic data silo problem.)
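To make the failed-JOIN point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample values are all hypothetical, invented purely for illustration. The same customer is spelled two ways, so a plain equality join silently drops one order, while a join on normalized names recovers it.

```python
import sqlite3


def count_matched_orders(normalize: bool) -> int:
    """Join orders to customers by name, optionally normalizing names first.

    All tables and rows here are made-up illustrative data.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT)")
    conn.execute("CREATE TABLE orders (customer_name TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO customers VALUES (?)",
        [("Acme Corp",), ("Globex",)],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [
            ("Acme Corp", 100.0),
            ("ACME corp ", 250.0),  # same customer, inconsistent entry
            ("Globex", 75.0),
        ],
    )
    if normalize:
        # Compare case-insensitively and ignore stray whitespace.
        condition = "lower(trim(o.customer_name)) = lower(trim(c.name))"
    else:
        # Exact string equality, as a naive JOIN would use.
        condition = "o.customer_name = c.name"
    (matched,) = conn.execute(
        f"SELECT count(*) FROM orders o JOIN customers c ON {condition}"
    ).fetchone()
    conn.close()
    return matched


if __name__ == "__main__":
    print("exact join:", count_matched_orders(normalize=False))
    print("normalized join:", count_matched_orders(normalize=True))
```

Trimming and lowercasing only handles the simplest kind of inconsistency; real misspellings or differing identifiers generally need fuzzier matching or a maintained mapping.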

Inconsistency can take multiple forms, including: Read more

Categories: Business intelligence, ClearStory Data, Data integration and middleware, Data warehousing, Databricks, Spark and BDAS, Derived data, EAI, EII, ETL, ELT, ETLT, Gooddata, Greenplum, Hadoop, Log analysis, Streaming and complex event processing (CEP), Web analytics

11 Comments

March 17, 2014

Notes and comments, March 17, 2014

I have ever more business-advice posts up on Strategic Messaging. Recent subjects include pricing and stealth-mode marketing. Other stuff I’ve been up to includes:

The Spark buzz keeps increasing; almost everybody I talk with expects Spark to win big, probably across several use cases.

Disclosure: I’ll soon be in a substantial client relationship with Databricks, hoping to improve their stealth-mode marketing. 😀

The “real-time analytics” gold rush I called out last year continues. A large fraction of the vendors I talk with have some variant of “real-time analytics” as a central message.

Basho had a major change in leadership. A Twitter exchange ensued. 🙂 Joab Jackson offered a more sober — figuratively and literally — take.

Hadapt laid off its sales and marketing folks, and perhaps some engineers as well. In a nutshell, Hadapt’s approach to SQL-on-Hadoop wasn’t selling vs. the many alternatives, and Hadapt is doubling down on poly-structured data*/schema-on-need.

*While Hadapt doesn’t to my knowledge use the term “poly-structured data”, some other vendors do. And so I may start using it more myself, at least when the poly-structured/multi-structured distinction actually seems significant.

WibiData is partnering with DataStax. WibiData is of course pleased to get access to Cassandra’s user base, which gave me the opportunity to ask why they thought Cassandra had beaten HBase in those accounts. The answer was performance and availability, while Cassandra’s traditional lead in geo-distribution wasn’t mentioned at all.

Disclosure: My fingerprints are all over that deal.

In other news, WibiData has had some executive departures as well, but seems to be staying the course on its strategy. I continue to think that WibiData has a really interesting vision about how to do large-data-volume interactive computing, and anybody in that space would do well to talk with them or at least look into the open source projects WibiData sponsors.

I encountered another apparently-popular machine-learning term: bandit model. It seems to be glorified A/B testing. I think the point is that it tries to optimize for just how much you invest in testing unproven (for good or bad) alternatives.
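As a rough illustration of that trade-off, here is a sketch of epsilon-greedy, one of the simplest bandit strategies. It is my choice for the example, not something any vendor described to me, and the function name, parameters, and conversion rates are all hypothetical. A fixed fraction of the traffic (epsilon) goes to a randomly chosen alternative, and the rest goes to whichever alternative has the best observed success rate so far.

```python
import random


def epsilon_greedy(true_rates, rounds=10_000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit over alternatives ("arms").

    true_rates: hypothetical success probability of each arm, unknown to
                the algorithm and used only to simulate outcomes.
    epsilon:    fraction of rounds spent testing a randomly chosen arm.
    Returns (pulls, wins): per-arm counts of trials and successes.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms
    wins = [0] * n_arms
    for _ in range(rounds):
        if rng.random() < epsilon or 0 in pulls:
            # Explore: pick any arm at random. This branch is also taken
            # until every arm has been tried at least once.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best observed success rate.
            arm = max(range(n_arms), key=lambda i: wins[i] / pulls[i])
        pulls[arm] += 1
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
    return pulls, wins


if __name__ == "__main__":
    pulls, wins = epsilon_greedy([0.05, 0.20])
    for arm, (p, w) in enumerate(zip(pulls, wins)):
        print(f"arm {arm}: {p} pulls, {w} wins")
```

The contrast with a classic A/B test is that a fixed 50/50 split keeps spending half the traffic on the weaker alternative for the whole test, whereas here epsilon caps how much is invested in testing.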

I had an awkward set of interactions with Gooddata, including my longest conversations with them since 2009. Gooddata is in the early days of trying to offer an all-things-to-all-people analytic stack via SaaS (Software as a Service). I gather that Hadoop, Vertica, PostgreSQL (a cheaper Vertica alternative), Spark, Shark (as a faster version of Hive) and Cassandra (under the covers) are all in the mix — but please don’t hold me to those details.

I continue to think that computing is moving to a combination of appliances, clusters, and clouds. That said, I recently bought a new gaming-class computer, and spent many hours gaming on it just yesterday.* I.e., there’s room for general-purpose workstations as well. But otherwise, I’m not hearing anything that contradicts my core point.

*The last beta weekend for The Elder Scrolls Online; I loved Morrowind.

Categories: Basho and Riak, Cassandra, Cloud computing, Clustering, Data models and architecture, Data warehouse appliances, Data warehousing, Databricks, Spark and BDAS, DataStax, Gooddata, Hadapt, Hadoop, HBase, NoSQL, PostgreSQL, Predictive modeling and advanced analytics, Schema on need, SQL/Hadoop integration, Vertica Systems, WibiData

12 Comments

January 4, 2012

Some issues in business intelligence

In November I wrote two parts of a planned multi-post series on issues in analytic technology. Then I got caught up in year-end things and didn’t blog for a month. Well … Happy New Year! I’m back. Let’s survey a few BI-related topics.

Mobile business intelligence — real business value or just a snazzy demo?

I discussed some mobile BI use cases in July 2010, but I’m still not convinced the whole area is a legitimate big deal. BI has a long history of snazzy, senior-exec-pleasing demos that have little to do with substantive business value. For now, I think mobile BI is another of those; few people will gain deep analytic insights staring into their iPhones. I don’t see anything coming that’s going to change the situation soon.

BI-centric collaboration — real business value or just a snazzy demo?

I’m more optimistic about collaborative business intelligence. QlikView’s direct sharing of dashboards will, I think, be a feature competitors must and will imitate. Social media BI collaboration is still in the “mainly a demo” phase, but I think it meets a broader and deeper need than does mobile BI. Over the next few years, I expect numerous enterprises to establish strong cultures of analytic chatter (and then give frequent talks about same at industry conferences). Read more

Categories: Business intelligence, Business Objects, Gooddata, PivotLink, Software as a Service (SaaS)

10 Comments

December 27, 2009

Introduction to Gooddata

Around the end of the Cold War, Esther Dyson took it upon herself to go repeatedly to Eastern Europe and do a lot of rah-rah and catalysis, hoping to spark software and other computer entrepreneurs. I don’t know how many people’s lives she significantly affected – I’d guess it’s actually quite a few – but in any case the number is not zero. Roman Stanek, who has built and sold a couple of software businesses, cites her as a key influence setting him on his path.

Roman’s latest venture is business intelligence firm Gooddata. Gooddata was founded in 2007 and has been soliciting and getting attention for a while, so I was surprised to learn that Gooddata officially launched just a few weeks ago. Anyhow, some less technical highlights of the Gooddata story include: Read more

Categories: Amazon and its cloud, Analytic technologies, Business intelligence, Cloud computing, Games and virtual worlds, Gooddata, Jaspersoft, Market share and customer counts, Memory-centric data management, Pricing, Software as a Service (SaaS)

13 Comments

Monash Research blogs

DBMS 2 covers database management, analytics, and related technologies.
Text Technologies covers text mining, search, and social software.
Strategic Messaging analyzes marketing and messaging strategy.
The Monash Report examines technology and public policy issues.
Software Memories recounts the history of the software industry.

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.
