This is probably a good time to disclose that I own a chunk of founders’ stock — no, I didn’t pay cash for it — in LiteStack, the start-up sponsoring ZeroVM.
Jordan Novet posted a survey of Hadoop security, and evidently Merv Adrian is making a big deal about the subject as well. But there’s one point I rarely see mentioned which, come to think of it, could apply to relational analytic platforms as well.
A big use of Hadoop and analytic platforms alike is investigative analytics, and specifically experimentation via hastily written code. But untrusted code can, at least in theory, compromise the security of the servers it runs on. And when you run the code on the same servers that manage the data, that could compromise the security of your database as well.
Frankly, in most use cases I doubt this is a big deal. Process isolation would probably avert most “accidental attacks”, and a deliberate attack might be hard to pull off in a reliable manner. As for database corruption, which is also a theoretical danger via the same vector, that danger is much smaller than the risk of bad code being submitted by well-intentioned doofuses.
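To make "process isolation" concrete, here is a minimal Python sketch of running a hastily written snippet in its own OS process rather than inside the process that manages the data. The helper name `run_isolated` and the timeout value are assumptions made up for illustration; nothing here is specific to Hadoop or to any analytic platform. Note that the child still runs with the same OS user and file permissions as its parent, so this guards against accidents (crashes, runaway loops) rather than against a determined attacker.

```python
import subprocess
import sys


def run_isolated(snippet: str, timeout_s: float = 5.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a separate OS process (hypothetical helper).

    The child gets its own address space, so a crash or memory blow-up there
    does not take the parent down. It is NOT a security sandbox: the child
    keeps the parent's OS-level privileges.
    """
    return subprocess.run(
        # -I is Python's isolated mode: it ignores PYTHON* environment
        # variables and the user's site-packages directory.
        [sys.executable, "-I", "-c", snippet],
        capture_output=True,  # collect stdout/stderr instead of sharing ours
        text=True,            # decode the captured output as str
        timeout=timeout_s,    # kill the child and raise TimeoutExpired if it runs too long
    )


if __name__ == "__main__":
    good = run_isolated("print(2 + 2)")
    print("exit code:", good.returncode, "output:", good.stdout.strip())

    # A child that dies abruptly only reports a non-zero exit code to the parent.
    crashed = run_isolated("import os; os._exit(3)")
    print("exit code:", crashed.returncode)

    # A runaway loop is killed once the timeout expires.
    try:
        run_isolated("while True: pass", timeout_s=1.0)
    except subprocess.TimeoutExpired:
        print("runaway snippet was killed")
```

Guarding against the deliberate case would take more than this, for example running the code under a different OS user, in a container, or on servers that do not hold the data.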
Still, I’d like to see a forthright discussion of this threat.