Analytic technologies

Discussion of technologies related to information query and analysis. Related subjects include:

September 22, 2011

Hybrid-columnar soundbites

Busy couple of days talking with reporters. A few notes on hybrid-columnar analytic DBMS, all backed up by yesterday’s post on Teradata columnar:

Edit: The Wall Street Journal got this wrong, writing that Teradata was the first-ever hybrid columnar system. Specifically, they wrote

While columnar technology has been around for years, Teradata says its product is unique because it allows users to include both columns and rows in the same database.

Googling on “Teradata To Unveil New Analytics Product To Speed Business Adoption” might get you around the paywall to see the offending piece.

September 22, 2011

Aster Database Release 5 and Teradata Aster appliance

It was obviously just a matter of time before there would be an Aster appliance from Teradata and some tuned bidirectional Teradata-Aster connectivity. These have now been announced. I didn’t notice anything particularly surprising in the details of either. About the biggest excitement is that Aster is traditionally a Red Hat shop, but for the purposes of appliance delivery has now embraced SUSE Linux.

Along with the announcements comes updated positioning such as:

and of course

Read more

September 22, 2011

Teradata Columnar and Teradata 14 compression

Teradata is pre-announcing Teradata 14, for delivery by the end of this year, where by “Teradata 14” I mean the latest version of the DBMS that drives the classic Teradata product line. Teradata 14’s flagship feature is Teradata Columnar, a hybrid-columnar offering that follows in the footsteps of Greenplum (now part of EMC) and Aster Data (now part of Teradata).

The basic idea of Teradata Columnar is:

Read more

September 20, 2011

XLDB: The one conference I like to attend

I’m not a big fan of conferences, but I really like XLDB. Last year I got a lot out of XLDB, even though I couldn’t stay long (my elder care issues were in full swing). The year before I attended the whole thing — in Lyon, France, no less — and learned a lot more. This year’s XLDB conference is at SLAC — the organization formerly known as the Stanford Linear Accelerator Center — on Sand Hill Road in Menlo Park, October 18-19. As of right now, I plan to be there, at least on the first day. XLDB’s agenda and registration details (inexpensive) can be found on the XLDB conference website.

The only reason I wouldn’t go is if that turned out to be a lousy week for me to travel to California.

The people who go to XLDB tend to be really smart: either research scientists, hardcore database technologists, or others who can hold their own with those folks. Audience participation can be intense; the most talkative members I can recall were Mike Stonebraker, Martin Kersten, Michael McIntire, and myself. Even the vendor folks tend to be smart; past examples include Stephen Brobst, Jeff Hammerbacher, Luke Lonergan, and IBM Fellow Laura Haas. When we had a datageek bash on my last trip to the SF area, several guys said they were planning to attend XLDB as well.

XLDB stands for eXtremely Large DataBases, and those are indeed what gets talked about there. Read more

September 19, 2011

Are there any remaining reasons to put new OLTP applications on disk?

Once again, I’m working with an OLTP SaaS vendor client on the architecture for their next-generation system. Parameters include:

So I’m leaning toward saying: Read more

September 11, 2011

“Big data” has jumped the shark

I frequently observe that no market categorization is ever precise and, in particular, that bad jargon drives out good. But when it comes to “big data” or “big data analytics”, matters are worse yet. The definitive shark-jumping moment may be Forrester Research’s Brian Hopkins’ claim that:

… typical data warehouse appliances, even if they are petascale and parallel, [are] NOT big data solutions.

Nonsense almost as bad can be found in other venues.

Forrester seems to claim that “big data” is characterized by Volume, Velocity, Variety, and Variability. Others, less alliteratively inclined, might put Complexity in the mix. So far, so good; after all, much of what people call “big data” is collections of disparate data streams, all collected somewhere in a big bit bucket. But when people start defining “big data” to include Variety and/or Variability, they’ve gone too far.

Read more

September 8, 2011

Aster Data business trends

Last month, I reviewed with the Aster Data folks which markets they were targeting and selling into, subsequent to acquisition by their new orange overlords. The answers aren’t what they used to be. Aster no longer focuses much on what it used to call frontline (i.e., low-latency, operational) applications; those are of course a key strength for Teradata. Rather, Aster focuses on investigative analytics — they’ve long endorsed my use of the term — and on the batch run/scoring kinds of applications that inform operational systems.

Read more

September 7, 2011

Vertica projections — an overview

Partially at my suggestion, Vertica has blogged a three-part series explaining the “projections” that are central to a Vertica database. This is important, because in Vertica projections play the roles that in many analytic DBMS might be filled by base tables, indexes, AND materialized views. Highlights include:

The blog posts contain a lot more than that, of course, both rah-rah and technical detail, including reminders of other Vertica advantages (compression, no logging, etc.). If you’re interested in analytic DBMS, they’re worth a look.

September 6, 2011

Derived data, progressive enhancement, and schema evolution

The emphasis I’m putting on derived data is leading to a variety of questions, especially about how to tease apart several related concepts:

So let’s dive in.  Read more

August 26, 2011

Virtual data marts in Sybase IQ

I made a few remarks about Sybase IQ 15.3 when it became generally available in July. Now that I’ve had a current briefing, I’ll make a few more.

The key enhancement in Sybase IQ 15.3 is distributed query — what others might call parallel query — aka PlexQ. A Sybase IQ query can now be distributed among many nodes, all talking to the same SAN (Storage-Area Network). Any Sybase IQ node can take the responsibility of being the “leader” for that particular query.

In itself, this isn’t that impressive; all the same things could have been said about pre-Exadata Oracle. But PlexQ goes somewhat further than just removing a bottleneck from Sybase IQ. Notably, Sybase has rolled out a virtual data mart capability. Highlights of the Sybase IQ virtual data mart story include: Read more
