Objectivity Infinite Graph
I chatted Wednesday night with Darren Wood, the Australia-based lead developer of Objectivity’s Infinite Graph database product. Background includes:
- Objectivity is a profitable, decades-old object-oriented DBMS vendor with about 50 employees.
- Like some other object-oriented DBMS of its generation, Objectivity is as much a toolkit for building a DBMS as it is a finished DBMS product. Objectivity sales are typically custom deals, in which Objectivity helps with the programming.
- The way Objectivity works is basically:
- You manage objects in memory, in the format of your choice.
- Objectivity bangs them to disk, across a network.
- Objectivity manages the (distributed) pointers to the objects.
- You can, if you choose, hard code exactly which objects are banged to which node.
- Objectivity’s DML for reading data is very different from Objectivity’s DML for writing data. (I think the latter is more like the program code itself, while the former is more like regular DML.)
- The point of Objectivity is not so much to have fast I/O. Rather, it is to minimize the CPU cost of getting the data that comes across the wire into useful form.
- Darren got the idea of putting a generic graph DBMS front-end on Objectivity while doing a relationship analytics project for an Australian intelligence agency.
- Darren redoubled his efforts to sell the project internally at Objectivity after reading what I wrote about relationship analytics back in 2006 or so.
- There is now a team of 5 or so people developing Infinite Graph.
- Infinite Graph is just now going out to beta test.
Infinite Graph is an API or language binding on top of Objectivity that:
- Hides a lot of Objectivity’s complexity.
- Is suitable for graph/relationship analytics.
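To make the pointer-chasing model concrete, here is a toy, self-contained Java sketch of the kind of traversal a graph front-end like Infinite Graph is meant to support: each vertex holds direct references to its neighbors, so following a relationship is pointer chasing rather than a relational join. The class and field names are mine, persistence and distribution are omitted entirely, and none of this is Infinite Graph’s actual API.

```java
import java.util.ArrayList;
import java.util.List;

// Toy, in-memory illustration of the pointer-based graph model. Each
// vertex holds direct references ("pointers") to its neighbors; in the
// real product those would be Objectivity's distributed object pointers,
// persisted to disk across the network.
public class GraphSketch {
    static class Vertex {
        final String name;
        final List<Vertex> knows = new ArrayList<>();
        Vertex(String name) { this.name = name; }
    }

    public static void main(String[] args) {
        Vertex alice = new Vertex("Alice");
        Vertex bob = new Vertex("Bob");
        Vertex carol = new Vertex("Carol");

        // Edges are just stored references.
        alice.knows.add(bob);
        bob.knows.add(carol);

        // Two-hop relationship traversal: follow pointers, no join needed.
        for (Vertex friend : alice.knows)
            for (Vertex fof : friend.knows)
                System.out.println("Alice -> " + friend.name + " -> " + fof.name);
    }
}
```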
Categories: Analytic technologies, Object, Objectivity and Infinite Graph, RDF and graphs, Surveillance and privacy | 10 Comments |
Best practices for analytic DBMS POCs
When you are selecting an analytic DBMS or appliance, most of the evaluation boils down to two questions:
- How quickly and cost-effectively does it execute SQL?
- What analytic functionality, SQL or otherwise, does it do a good job of executing?
And so, in undertaking such a selection, you need to start by addressing three issues:
- What does “speed” mean to you?
- What does “cost” mean to you?
- What analytic functionality do you need anyway?
Categories: Benchmarks and POCs, Data warehousing, Exadata, Netezza, ParAccel, Teradata | 7 Comments |
The underlying technology of QlikView
QlikTech* finally decided both to become a client and, surely not coincidentally, to give me more technical detail about QlikView than it had when last we talked a couple of years ago. Indeed, I got to spend a couple of hours on the phone not just with Anthony Deighton, but also with QlikTech’s Hakan Wolge, who wrote 70-80% of the code in QlikView 1.0, and remains in effect QlikTech’s chief architect to this day.
*Or, as it now appears to be called, Qlik Technologies.
Let’s start with some quick reminders:
- QlikTech makes QlikView, a widely popular business intelligence (BI) tool suite.
- QlikView is distinguished by the flexibility of navigation through its user interface.
- To support this flexibility, QlikView preloads all data you might want to query into memory.
Let’s also dispose of one confusion right up front, namely QlikTech’s use of the word associative: Read more
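For flavor, here is a generic sketch of symbol-table (dictionary) encoding, a standard technique for shrinking data enough that preloading everything into memory becomes affordable: each distinct field value is stored once, and rows keep only small integer indexes into that table. This illustrates the general technique, not QlikView’s actual implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Generic sketch of dictionary (symbol-table) encoding for one column.
// Not QlikView's code; just the general compression idea.
public class DictionaryColumn {
    private final Map<String, Integer> symbolIds = new HashMap<>();
    private final List<String> symbols = new ArrayList<>();  // id -> value
    private final List<Integer> rows = new ArrayList<>();    // per-row ids

    // Store each distinct value once; rows hold only a small index.
    public void append(String value) {
        Integer id = symbolIds.get(value);
        if (id == null) {
            id = symbols.size();
            symbols.add(value);
            symbolIds.put(value, id);
        }
        rows.add(id);
    }

    public String valueAt(int row) {
        return symbols.get(rows.get(row));
    }

    public static void main(String[] args) {
        DictionaryColumn country = new DictionaryColumn();
        for (String c : new String[] {"Sweden", "USA", "Sweden", "USA", "Sweden"})
            country.append(c);
        // Five rows, but only two stored strings.
        System.out.println(country.valueAt(2)); // Sweden
    }
}
```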
Categories: Business intelligence, Database compression, Memory-centric data management, QlikTech and QlikView | 36 Comments |
Kickfire update
A Kickfire competitor tipped me off that he got 3 Kickfire salesmen’s resumes in 24 hours. I ran this by Kickfire CEO Bruce Armstrong, who confirmed that Kickfire has had a layoff, but gave me no further details.
Bruce also told me that Kickfire is now up to 10 paying customers, and that there are repeat deals.
Categories: Data warehouse appliances, Data warehousing, Kickfire, Market share and customer counts | 3 Comments |
Ingres VectorWise technical highlights
After working through problems with travel, cell phones, and so on, Peter Boncz of VectorWise finally caught up with me for a regrettably brief call. Peter gave me the strong impression that what I’d written in the past about VectorWise had been and remained accurate, so I focused on filling in the gaps. Highlights included: Read more
Categories: Actian and Ingres, Analytic technologies, Benchmarks and POCs, Columnar database management, Data warehousing, Database compression, Open source, VectorWise | 2 Comments |
Rainstor update
I was tired and cranky when I talked with my former clients at Rainstor (formerly Clearpace) yesterday, so our call was shorter than it otherwise might have been. Anyhow, there’s a new version called Rainstor 4, the two main themes of which are:
- Compliance-specific features.
- Bottleneck Whack-A-Mole.
The point is that Rainstor is focusing its efforts on enterprises that: Read more
Fun with quotes in the VectorWise press release
Ingres forgot to prebrief me on the VectorWise announcement, and despite valiant efforts hasn’t succeeded in connecting with me since realizing the lapse. Meanwhile, I took a look at the VectorWise press release, and found the quotes to be somewhat amusing.
Read more
Categories: Actian and Ingres, VectorWise | 8 Comments |
Algebraix
I talked Friday with Chris Piedemonte and Gary Sherman, respectively the Cofounder/CTO and Chief Mathematician of Algebraix, who teamed up on this project back in 2003 or 2004. (Algebraix is the company formerly known as XSPRADA.) Algebraix makes an analytic DBMS, somewhat based on the ideas of extended set theory, that runs on SMP (Symmetric MultiProcessing) boxes. Like all analytic DBMS vendors, Algebraix has on some occasions run queries orders of magnitude faster than they ran on the systems users were looking to replace.
Algebraix’s secret sauce is that the DBMS keeps reorganizing and recopying the data on disk, to optimize performance in response to expected query patterns (automatically inferred from queries it’s seen so far). This sounds a lot like the Infobright story, with some of the more obvious differences being: Read more
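To make that concrete, below is a minimal sketch of query-driven reorganization: the store counts which column each query filters on, and when the pattern shifts it recopies (here, simply re-sorts) the data to favor the hottest column. All names and mechanisms here are my own guesses at the general technique, not Algebraix’s actual code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of query-driven data reorganization: track which column
// recent queries filter on, and reorder the data to favor the most
// popular one. My own illustration of the general idea only.
public class AdaptiveStore {
    private final Map<String, Integer> filterCounts = new HashMap<>();
    private List<Map<String, String>> rows = new ArrayList<>();
    private String sortedBy = null;

    public void load(List<Map<String, String>> data) { rows = new ArrayList<>(data); }

    // Record each query's filter column, reorganize if patterns shifted,
    // then scan.
    public List<Map<String, String>> query(String filterCol, String value) {
        filterCounts.merge(filterCol, 1, Integer::sum);
        maybeReorganize();
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> r : rows)
            if (value.equals(r.get(filterCol))) out.add(r);
        return out;
    }

    private void maybeReorganize() {
        String hottest = filterCounts.entrySet().stream()
            .max(Map.Entry.comparingByValue()).get().getKey();
        if (!hottest.equals(sortedBy)) {
            // In a disk-based system this would be the "recopying" step.
            rows.sort(Comparator.comparing((Map<String, String> r) -> r.get(hottest)));
            sortedBy = hottest;
        }
    }

    public static void main(String[] args) {
        AdaptiveStore store = new AdaptiveStore();
        store.load(List.of(
            Map.of("region", "EU", "product", "B"),
            Map.of("region", "US", "product", "A")));
        System.out.println(store.query("region", "EU").size()); // 1
    }
}
```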
Categories: Algebraix, Data warehousing, Database compression, Infobright, Theory and architecture | 3 Comments |
Extended set theory, aka “What is a tuple anyway?”
The Algebraix folks are trying to repopularize David Childs’ idea of “Extended set theory.” In a nutshell, the extended set theory idea is:
A tuple is a set of (field-name, field-value) pairs.
I’ve been fairly negative about the extended set theory concept – but in fairness, that may be because I misunderstood how other people thought of tuples. Any time I’ve had to formalize what I thought of a tuple as being, I came up with something very much like the above, except that if one wants to be relational one needs a requirement like:
In any one tuple, each field-name must be unique.
In line with that definition, I’d say a table is something like: Read more
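Meanwhile, the tuple definition itself is easy to render in code. Below is a toy Java formalization of “a set of (field-name, field-value) pairs, with each field-name unique”; a map from name to value captures both the pair structure and the uniqueness requirement, since a map key can appear only once. It is my own illustration of the definition above, nothing more.

```java
import java.util.HashMap;
import java.util.Map;

// Toy formalization: a tuple as a set of (field-name, field-value)
// pairs, with the relational restriction that field names be unique.
public class Tuple {
    private final Map<String, Object> pairs = new HashMap<>();

    // Enforce the relational restriction: a field name may not repeat.
    public Tuple set(String fieldName, Object fieldValue) {
        if (pairs.containsKey(fieldName))
            throw new IllegalArgumentException("duplicate field: " + fieldName);
        pairs.put(fieldName, fieldValue);
        return this;
    }

    public Object get(String fieldName) { return pairs.get(fieldName); }

    public static void main(String[] args) {
        Tuple t = new Tuple().set("name", "Alice").set("age", 40);
        System.out.println(t.get("name")); // Alice
        // t.set("name", "Bob") would throw: field names must be unique.
    }
}
```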
VoltDB finally launches
VoltDB is finally launching today. As is common for companies in sectors I write about, VoltDB — or just “Volt” — has discovered the virtues of embargoes that end 12:01 am. Let’s go straight to the technical highlights:
- VoltDB is based on the H-Store technology, which I wrote about in February, 2009. Most of what I said about H-Store then applies to VoltDB today.
- VoltDB is a no-apologies ACID relational DBMS, which runs entirely in RAM.
- VoltDB has rather limited SQL. (One example: VoltDB can’t do SUMs in SQL.) However, VoltDB guy Tim Callaghan (Mark Callaghan’s lesser-known but nonetheless smart brother) asserts that if you code up the missing functionality, it’s almost as fast as if it were present in the DBMS to begin with, because there’s no added I/O from the handoff between the DBMS and the procedural code. (The data’s in RAM one way or the other.) A sketch of such a stored procedure appears after this list.
- VoltDB’s Big Conceptual Performance Story is that it does away with most locks, latches, logs, etc., and also most context switching.
- In particular, you’re supposed to partition your data and architect your application so that most transactions execute on a single core. When you can do that, you get VoltDB’s performance benefits. To the extent you can’t, you’re in two-phase-commit performance land. (More precisely, you’re doing 2PC for multi-core writes, which is surely a major reason that multi-core reads are a lot faster in VoltDB than multi-core writes.)
- VoltDB has a little less than one DBMS thread per core. When the data partitioning works as it should, you execute a complete transaction in that single thread. Poof. No context switching.
- A transaction in VoltDB is a Java stored procedure. (The early idea of Ruby on Rails in lieu of the Java/SQL combo didn’t hold up performance-wise.)
- Solid-state memory is not a viable alternative to RAM for VoltDB. Too slow.
- Instead, VoltDB lets you snapshot data to disk at tunable intervals. “Continuous” is one of the options, wherein a new snapshot starts being made as soon as the last one completes.
- In addition, VoltDB will also spool a kind of transaction log to the target of your choice. (Obvious choice: An analytic DBMS such as Vertica, but there’s no such connectivity partnership actually in place at this time.)
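To illustrate the “code up the missing functionality” point: a VoltDB transaction is a Java stored procedure whose run() method queues parameterized SQL and can then post-process the results procedurally. The sketch below computes a SUM in Java over rows the SQL fetched. It follows the general shape of VoltDB’s stored-procedure API as I understand it (VoltProcedure, SQLStmt, voltQueueSQL, voltExecuteSQL), but the table, columns, and partitioning annotation are illustrative assumptions rather than anything taken from VoltDB’s documentation.

```java
import org.voltdb.ProcInfo;
import org.voltdb.SQLStmt;
import org.voltdb.VoltProcedure;
import org.voltdb.VoltTable;

// Sketch of working around missing SQL aggregation by summing in Java
// inside a stored procedure. Table and column names (ORDERS, AMOUNT,
// CUSTOMER_ID) are hypothetical. The partitioning hint is what lets the
// whole transaction run in a single thread on a single core.
@ProcInfo(partitionInfo = "ORDERS.CUSTOMER_ID: 0", singlePartition = true)
public class SumOrderTotals extends VoltProcedure {

    public final SQLStmt selectAmounts =
        new SQLStmt("SELECT AMOUNT FROM ORDERS WHERE CUSTOMER_ID = ?;");

    public long run(long customerId) {
        // Queue and execute the SQL; the data is already in RAM, so the
        // handoff to this procedural code adds no I/O.
        voltQueueSQL(selectAmounts, customerId);
        VoltTable rows = voltExecuteSQL()[0];

        // The "missing" SUM, done procedurally.
        long total = 0;
        while (rows.advanceRow())
            total += rows.getLong(0);
        return total;
    }
}
```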