Analytic technologies

Discussion of technologies related to information query and analysis. Related subjects include:

April 16, 2015

Notes on indexes and index-like structures

Indexes are central to database management.

Perhaps it’s time for a round-up post on indexing. :)

1. First, let’s review some basics. Classically:

2. Further:  Read more

April 9, 2015

Which analytic technology problems are important to solve for whom?

I hear much discussion of shortfalls in analytic technology, especially from companies that want to fill in the gaps. But how much do these gaps actually matter? In many cases, that depends on what the analytic technology is being used for. So let’s think about some different kinds of analytic task, and where they each might most stress today’s available technology.

In separating out the task areas, I’ll focus first on the spectrum “To what extent is this supposed to produce novel insights?” and second on the dimension “To what extent is this supposed to be integrated into a production/operational system?” Issues of latency, algorithmic novelty, etc. can follow after those. In particular, let’s consider the tasks: Read more

March 23, 2015

A new logical data layer?

I’m skeptical of data federation. I’m skeptical of all-things-to-all-people claims about logical data layers, and in particular of Gartner’s years-premature “Logical Data Warehouse” buzzphrase. Still, a reasonable number of my clients are stealthily trying to do some kind of data layer middleware, as are other vendors more openly, and I don’t think they’re all crazy.

Here are some thoughts as to why, and also as to challenges that need to be overcome.

There are many things a logical data layer might be trying to facilitate — writing, querying, batch data integration, real-time data integration and more. That said:

Read more

March 15, 2015

BI for NoSQL — some very early comments

Over the past couple years, there have been various quick comments and vague press releases about “BI for NoSQL”. I’ve had trouble, however, imagining what it could amount to that was particularly interesting, with my confusion boiling down to “Just what are you aggregating over what?” Recently I raised the subject with a few leading NoSQL companies. The result is that my confusion was expanded. :) Here’s the small amount that I have actually figured out.

As I noted in a recent post about data models, many databases — in particular SQL and NoSQL ones — can be viewed as collections of <name, value> pairs.

Consequently, a NoSQL database can often be viewed as a table or a collection of tables, except that:

That’s all straightforward to deal with if you’re willing to write scripts to extract the NoSQL data and transform or aggregate it as needed. But things get tricky when you try to insist on some kind of point-and-click. And by the way, that last comment pertains to BI and ETL (Extract/Transform/Load) alike. Indeed, multiple people I talked with on this subject conflated BI and ETL, and they were probably right to do so.

Read more

March 10, 2015

Some stuff on my mind, March 10, 2015

I found yesterday’s news quite unpleasant.

And by the way, a guy died a few day ago snorkeling at the same resort I like to go to, evidently doing less risky things than I on occasion have.

So I want to unclutter my mind a bit. Here goes.

1. There are a couple of stories involving Sam Simon and me that are too juvenile to tell on myself, even now. But I’ll say that I ran for senior class president, in a high school where the main way to campaign was via a single large poster, against a guy with enough cartoon-drawing talent to be one of the creators of the Simpsons. Oops.

2. If one suffers from ulcerative colitis as my mother did, one is at high risk of getting colon cancer, as she also did. Mine isn’t as bad as hers was, due to better tolerance for medication controlling the disease. Still, I’ve already had a double-digit number of colonoscopies in my life. They’re not fun. I need another one soon; in fact, I canceled one due to the blizzards.

Pro-tip — never, ever have a colonoscopy without some kind of anesthesia or sedation. Besides the unpleasantness, the lack of meds increases the risk that the colonoscopy will tear you open and make things worse. I learned that the hard way in New York in the early 1980s.

3. Five years ago I wrote optimistically about the evolution of the information ecosystem, specifically using the example of the IT sector. One could argue that I was right. After all:  Read more

February 28, 2015

Databricks and Spark update

I chatted last night with Ion Stoica, CEO of my client Databricks, for an update both on his company and Spark. Databricks’ actual business is Databricks Cloud, about which I can say:

I do not expect all of the above to remain true as Databricks Cloud matures.

Ion also said that Databricks is over 50 people, and has moved its office from Berkeley to San Francisco. He also offered some Spark numbers, such as: Read more

February 22, 2015

Data models

7-10 years ago, I repeatedly argued the viewpoints:

Since then, however:

So it’s probably best to revisit all that in a somewhat organized way.

Read more

February 1, 2015

Information technology for personal safety

There are numerous ways that technology, now or in the future, can significantly improve personal safety. Three of the biggest areas of application are or will be:

Implications will be dramatic for numerous industries and government activities, including but not limited to law enforcement, automotive manufacturing, infrastructure/construction, health care and insurance. Further, these technologies create a near-certainty that individuals’ movements and status will be electronically monitored in fine detail. Hence their development and eventual deployment constitutes a ticking clock toward a deadline for society deciding what to do about personal privacy.

Theoretically, humans aren’t the only potential kind of tyrants. Science fiction author Jack Williamson postulated a depressing nanny-technology in With Folded Hands, the idea for which was later borrowed by the humorous Star Trek episode I, Mudd.

Of these three areas, crime prevention is the furthest along; in particular, sidewalk cameras, license plate cameras and internet snooping are widely deployed around the world. So let’s consider the other two.

Vehicle accident prevention

Read more

January 19, 2015

Where the innovation is

I hoped to write a reasonable overview of current- to medium-term future IT innovation. Yeah, right. :) But if we abandon any hope that this post could be comprehensive, I can at least say:

1. Back in 2011, I ranted against the term Big Data, but expressed more fondness for the V words — Volume, Velocity, Variety and Variability. That said, when it comes to data management and movement, solutions to the V problems have generally been sketched out.

2. Even so, there’s much room for innovation around data movement and management. I’d start with:

3. As I suggested last year, data transformation is an important area for innovation.  Read more

December 31, 2014

Notes on machine-generated data, year-end 2014

Most IT innovation these days is focused on machine-generated data (sometimes just called “machine data”), rather than human-generated. So as I find myself in the mood for another survey post, I can’t think of any better idea for a unifying theme.

1. There are many kinds of machine-generated data. Important categories include:

That’s far from a complete list, but if you think about those categories you’ll probably capture most of the issues surrounding other kinds of machine-generated data as well.

2. Technology for better information and analysis is also technology for privacy intrusion. Public awareness of privacy issues is focused in a few areas, mainly: Read more

Next Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.