September 12, 2009

Introduction to the XLDB and SciDB projects

Before I write anything else about the overlapping efforts known as XLDB and SciDB, I probably should explain and disambiguate what they are as best I can. XLDB was organized and still is run by guys who want to solve a scientific problem in eXtremely Large DataBase Management, most especially Jacek Becla of SLAC (the organization previously known as Stanford Linear Accelerator Center). Becla’s original motivation was that he needs a DBMS to manage what will be 55 petabytes of raw image data and 100 petabytes of astronomical data total for LSST (Large Synoptic Survey Telescope).

XLDB more or less comprises:

The first result or spin-out from the XLDB effort seems to have been the SciDB project. This is an effort to build an open source DBMS called SciDB that will address some of the needs the XLDB effort is uncovering. (More on that in other posts.) Somewhat confusingly, all the use cases the XLDB group is collecting are currently being posted on SciDB’s website, apparently because it’s glitzier and healthier than, say, the excessively sparse XLDB wiki. Some SciDB development has happened, but no large sugar daddy has yet been found. (It’s a fairly open secret that eBay looked seriously and favorably at funding SciDB before the economic downturn hit.)

Numerous big-name computer scientists are associated with SciDB, indeed more closely (it would seem) than with XLDB. That said, I’m guessing Dave DeWitt’s involvement in the open-source SciDB isn’t what it would be if he hadn’t gone to Microsoft. DeWitt actually skipped XLDB3, although he was in town for VLDB. (XLDB3 was back-to-back with VLDB 2009 in Lyon, France in late August.) Stonebraker just didn’t make the flight for either conference, due to the double-knee “upgrade” he had back in March.

There’s a lot more to be said about the cross-discipline or science-specific requirements that researchers place on data management, but I’ll leave that for later and just get this posted as a start — assuming, of course, that blog outages permit. :(

Comments

2 Responses to “Introduction to the XLDB and SciDB projects”

  1. Fault-tolerant queries | DBMS2 -- DataBase Management System Services on September 13th, 2009 12:36 am

    [...] et al. trumpet query fault-tolerance as one of the virtues of HadoopDB. Some of the scientists at XLDB spoke of query fault-tolerance as being a good reason to leave 100s or 1000s of terabytes of data [...]

  2. Why you should go to XLDB4 | DBMS2 -- DataBase Management System Services on July 1st, 2010 12:23 am

    [...] when Jacek Becla started the XLDB conferences on the premise that scientific and big data analytic challenges have a lot in common, [...]
