Posts about database and analytic technologies applied to the telecommunications industry, especially in call detail record (CDR) applications. Related subjects include:

May 4, 2012

Notes on graph data management

This post is part of a series on managing and analyzing graph data. Posts to date include:

Interest in graph data models keeps increasing. But it’s tough to discuss them with any generality, because “graph data model” encompasses so many different things. Indeed, just as all data structures can be mapped to relational ones, it is also the case that all data structures can be mapped to graphs.

Formally, a graph is a collection of (node, edge, node) triples. In the simplest case, the edge has no properties other than existence or maybe direction, and the triple can be reduced to a (node, node) pair, unordered or ordered as the case may be. It is common, however, for edges to encapsulate additional properties, the canonical examples of which are:

Many of the graph examples I can think of fit into four groups: Read more

May 1, 2012

Thinking about market segments

It is a reasonable (over)simplification to say that my business boils down to:

One complication that commonly creeps in is that different groups of users have different buying practices and technology needs. Usually, I nod to that point in passing, perhaps by listing different application areas for a company or product. But now let’s address it head on. Whether or not you care about the particulars, I hope the sheer length of this post reminds you that there are many different market segments out there.

Last June I wrote:

In almost any IT decision, there are a number of environmental constraints that need to be acknowledged. Organizations may have standard vendors, favored vendors, or simply vendors who give them particularly deep discounts. Legacy systems are in place, application and system alike, and may or may not be open to replacement. Enterprises may have on-premise or off-premise preferences; SaaS (Software as a Service) vendors probably have multitenancy concerns. Your organization can determine which aspects of your system you’d ideally like to see be tightly integrated with each other, and which you’d prefer to keep only loosely coupled. You may have biases for or against open-source software. You may be pro- or anti-appliance. Some applications have a substantial need for elastic scaling. And some kinds of issues cut across multiple areas, such as budget, timeframe, security, or trained personnel.

I’d further say that it matters whether the buyer:

Now let’s map those considerations (and others) to some specific market segments. Read more

November 28, 2011

Agile predictive analytics – the heart of the matter

I’ve already suggested that several apparent issues in predictive analytic agility can be dismissed by straightforwardly applying best-of-breed technology, for example in analytic data management. At first blush, the same could be said about the actual analysis, which comprises:

Numerous statistical software vendors (or open source projects) help you with the second part; some make strong claims in the first area as well (e.g., my clients at KXEN). Even so, large enterprises typically have statistical silos, commonly featuring expensive annual SAS licenses and seemingly slow-moving SAS programmers.

As I see it, the predictive analytics workflow goes something like this Read more

July 27, 2011

MongoDB users and use cases

I spoke with Eliot Horowitz and Max Schierson of 10gen last month about MongoDB users and use cases. The biggest clusters they came up with weren’t much over 100 nodes, but clusters an order of magnitude bigger were under development. The 100 node one we talked the most about had 33 replica sets, each with about 100 gigabytes of data, so that’s in the 3-4 terabyte range total. In general, the largest MongoDB databases are 20-30 TB; I’d guess those really do use the bulk of available disk space.   Read more

July 22, 2011

McObject and eXtremeDB

I talked with McObject yesterday. McObject has two product lines, both of which are something like in-memory DBMS — eXtremeDB, which is the main one, and Perst. McObject has been around since at least 2003, probably has no venture capital, and probably has a very low double-digit number of employees.*

*I could be wrong in those guesses; as small companies go, McObject is unusually prone to secrecy games.

As best I understand:

My guess three years ago that eXtremeDB might emerge as an alternative to solidDB seems to have been borne out. McObject CEO Steve Graves says that the core of McObject’s business is OEMs, in sectors such as telecom equipment and defense/aerospace. That’s exactly solidDB’s traditional market, except that solidDB got acquired by IBM and deemphasized it.

I’ve said before that if I were starting a SaaS effort — and it wasn’t just focused on analytics — I’d look at using a memory-centric OODBMS. Perhaps eXtremeDB is worth looking at in such scenarios.

July 5, 2011

Eight kinds of analytic database (Part 2)

In Part 1 of this two-part series, I outlined four variants on the traditional enterprise data warehouse/data mart dichotomy, and suggested what kinds of DBMS products you might use for each. In Part 2 I’ll cover four more kinds of analytic database — even newer, for the most part, with a use case/product short list match that is even less clear.  Read more

June 20, 2011

Temporal data, time series, and imprecise predicates

I’ve been confused about temporal data management for a while, because there are several different things going on.

In essence, the point of time series/event series SQL functionality is to do SQL against incomplete, imprecise, or derived data.* Read more

June 20, 2011

Columnar DBMS vendor customer metrics

Last April, I asked some columnar DBMS vendors to share customer metrics. They answered, but it took until now to iron out a couple of details. Overall, the answers are pretty impressive.  Read more

June 14, 2011

Infobright 4.0

Infobright is announcing its 4.0 release, with imminent availability. In marketing and product alike, Infobright is betting the farm on machine-generated data. This hasn’t been Infobright’s strategy from the getgo, but it is these days, with pretty good focus and commitment. While some fraction of Infobright’s customer base is in the Sybase-IQ-like data mart market — and indeed Infobright put out a customer-win press release in that market a few days ago — Infobright’s current customer targets seem to be mainly:

Key aspects of Infobright 4.0 include:  Read more

April 21, 2011

Application areas for SAS HPA

When I talked with SAS about its forthcoming in-memory parallel SAS HPA offering, we talked briefly about application areas. The three SAS cited were:

Meanwhile, in another interview I heard about, SAS emphasized retailers. Indeed, that’s what spawned my recent post about logistic regression.

The mobile communications one is a bit scary. Your cell phone — and hence your cellular company — know where you are, pretty much from moment to moment. Even without advanced analytic technology applied to it, that’s a pretty direct privacy threat. Throw in some analytics, and your cell company might know, for example, who you hang out with (in person), where you shop, and how those things predict your future behavior. And so the government — or just your employer — might know those things too.

← Previous PageNext Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:


Search our blogs and white papers

Warning: include(): php_network_getaddresses: getaddrinfo failed: Name or service not known in /home/dbms2cm/public_html/wp-content/themes/monash/static_sidebar.php on line 29

Warning: include( failed to open stream: php_network_getaddresses: getaddrinfo failed: Name or service not known in /home/dbms2cm/public_html/wp-content/themes/monash/static_sidebar.php on line 29

Warning: include(): Failed opening '' for inclusion (include_path='.:/usr/lib/php:/usr/local/lib/php') in /home/dbms2cm/public_html/wp-content/themes/monash/static_sidebar.php on line 29