July 2, 2012

Introduction to Yarcdata

Cray’s strategy these days seems to be:

At the moment, the main diversifications are:

The last of the three is what Cray subsidiary Yarcdata is all about.

“Yarc” = “Cray” spelled backwards.

To a first approximation, Yarcdata is a bunch of Cray guys, with an overlay out of Informatica/Siperian and other database-oriented software companies. Yarcdata’s first effort is to manage graph data, via an appliance product called uRika.* More precisely, uRika manages RDF triples, with SPARQL as the query language. More precisely yet, uRika manages quadruples, with the fourth field being for “subgraph ID”. Having multiple subgraphs sounds like it’s somewhere between having:

A natural way to wind up with multiple subgraphs is to import data from different sources.
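To make the quadruple idea concrete, here is a minimal sketch in plain Python. It is not uRika's data model or API; the `Quad` type, the `load` and `match` helpers, and the sample data and source names are all hypothetical, chosen only to illustrate how a fourth "subgraph ID" field falls out of importing from different sources.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: str  # the fourth field: which subgraph the triple belongs to


def load(source_name, triples):
    """Tag every triple from one source with that source's subgraph ID."""
    return [Quad(s, p, o, source_name) for s, p, o in triples]


# Hypothetical data imported from two hypothetical sources.
store = (
    load("crm", [("alice", "knows", "bob"), ("bob", "worksFor", "acme")])
    + load("phone_logs", [("alice", "called", "carol"), ("carol", "knows", "bob")])
)


def match(quads, s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None, g: Optional[str] = None):
    """Return the quads matching a pattern; None acts as a wildcard."""
    return [
        q for q in quads
        if (s is None or q.subject == s)
        and (p is None or q.predicate == p)
        and (o is None or q.obj == o)
        and (g is None or q.graph == g)
    ]


knows_all = match(store, p="knows")           # across every subgraph
knows_crm = match(store, p="knows", g="crm")  # restricted to one subgraph
```

Leaving the graph field as a wildcard queries the merged data, while pinning it restricts the query to what a single source asserted.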

Yarcdata is still trying to figure out exactly which relationship analytics application areas it is pursuing. Yarcdata’s big multi-year design partner was a large intelligence agency, for an unspecified application that obviously has a lot to do with terrorism and national security. Also mentioned, as is appropriate for a Cray subsidiary, are application areas that feel more scientific or technical (life sciences, financial services). Not mentioned much so far — except perhaps by me — are telecom/influencer-detection and anti-fraud.

The last time Yarcdata gave me a customer count, it was 5, but that was some months ago.

As best I understand, uRika has two tiers of servers. One tier features commodity hardware, and runs a stack of data access software from or at least based on the Apache Jena project. The other tier has classic Cray hardware, running a proprietary data store. This data store is in-memory, except that like most in-memory analytic stores, it can be initialized from disk. Notes on the data store part include:

On the graph analytic functionality, there seems to be less in the way of uRika secret sauce at this time. SPARQL 1.0 and Jena get mentioned, but innovative extensions are discussed less in the present tense than in future or hypothetical terms. Anyhow, I haven’t spent a lot of time looking at what SPARQL can or can’t do, but I gather that if you want to do a straightforward graph query, SPARQL can handle it. For graph analytics such as centrality measures, however, you need tools or extensions.
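The difference between a straightforward graph query and a graph analytic can be sketched in plain Python. This is an illustration only, not anything from uRika or Jena: the triples, the `neighbors_of` helper, and the choice of degree centrality as the example measure are my assumptions. The first function is the kind of pattern match a query language handles directly; the second aggregates over the whole graph, which is where extra tooling tends to come in.

```python
from collections import Counter

# Hypothetical triples: (subject, predicate, object).
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "alice"),
    ("dave", "knows", "alice"),
]


def neighbors_of(triples, node, predicate):
    """Straightforward query: whom does `node` point to via `predicate`?"""
    return sorted(o for s, p, o in triples if s == node and p == predicate)


def degree_centrality(triples):
    """Whole-graph analytic: (in-degree + out-degree) / (n - 1) per node."""
    degree = Counter()
    nodes = set()
    for s, _, o in triples:
        degree[s] += 1
        degree[o] += 1
        nodes.update((s, o))
    scale = max(len(nodes) - 1, 1)  # avoid dividing by zero on tiny graphs
    return {node: degree[node] / scale for node in nodes}


centrality = degree_centrality(triples)
most_central = max(centrality, key=centrality.get)
```

Degree centrality is about the simplest measure there is. Measures such as betweenness or eigenvector centrality need iterative or all-pairs computation, which is harder still to express as a single pattern-matching query.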
