October 15, 2015

Cassandra and privacy requirements

For starters:

But when I made that connection and checked in accordingly with my client Patrick McFadin at DataStax, I discovered that I’d been a little confused about how multi-data-center Cassandra works. The basic idea holds water, but the details are not quite what I was envisioning.

The story starts:

In particular, a remote replication factor for Cassandra can = 0. When that happens, then you have data sitting in one geographical location that is absent from another geographical location; i.e., you can be in compliance with laws forbidding the export of certain data. To be clear (and this contradicts what I previously believed and hence also implied in this blog):

The most visible DataStax client using this strategy is apparently ING Bank.

If you have a geo-compliance issue, you’re probably also concerned about security. After all, the whole reason the issue arises is because one country’s government might want to look at another country’s citizens’ or businesses’ data. The DataStax security story is approximately:

While flexible, Cassandra’s multi-data-center features do add some complexity. Tunable-consistency choices are baked into Cassandra programs at each point data is accessed, and more data centers make for more choices. (Default best practice = write if you get a local quorum, running the slight risk of logical data centers being out of sync with each other.)

One way in which the whole thing does seem nice and simple is that you can have different logical data centers running on different kinds of platforms — cloud, colocation, in-house, whatever — without Cassandra caring.

I’m not going to call the DataStax Enterprise approach to geo-compliance the “gold standard”, because some of it seems pretty clunky or otherwise feature-light. On the other hand, I’m not aware of competitors who exceed it, in features or track record, so “silver standard” seems defensible.


3 Responses to “Cassandra and privacy requirements”

  1. clive boulton on October 16th, 2015 1:39 am

    Cassandra’s tunable replication with upcoming cell-level or row-level security ups the ante for Sqrrl / Koverse who add granular security to Accumulo.

    Engineering privacy in “more better” than adding security to keep folks out…

  2. Curt Monash on October 16th, 2015 5:25 am

    Wait a moment. I’m pretty sure that cell-level security is a feature of any version of Accumulo.

    And HBase imitated it, in what I think was the first major use of the co-processor capability.

  3. clive boulton on October 16th, 2015 3:48 pm

    Yes. Cell-level security is a feature of Accumulo, but not the only granular security pattern. https://patents.google.com/?q=sqrrl

Leave a Reply

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:


Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.