January 20, 2011

Notes, links, and comments January 20, 2011

I haven’t done a pure notes/links/comments post for a while. Let’s fix that now. (A bunch of saved-up links, however, did find their way into my recent privacy threats overview.)

First and foremost, the fourth annual New England Database Summit (née “Day”) is next week, specifically Friday, January 28. As per my posts in previous years, I think well of the event, which has a friendly, gathering-of-the-clan flavor. Registration is free, but the organizers would prefer that you register online by the end of this week, if you would be so kind.

The two things potentially wrong with the New England Database Summit are parking and the rush hour drive home afterwards. I would listen with interest to any suggestions about dinner plans.

One thing I hope to figure out at the Summit or before is what the hell is going on with Vertica’s blog or, for that matter, at Vertica. The recent Mike Stonebraker post that spawned a lot of discussion and commentary has disappeared. Meanwhile, Vertica has had three consecutive heads of marketing leave the company since June, and I don’t know who to talk to there any more.

Speaking of blog problems, we’ve had performance/reliability glitches here again. Melissa Bradshaw determined that the problem was that WP Super Cache, although apparently activated, was not actually caching anything. We should be OK now, so please let me know if there are further difficulties. One interesting step: it turns out that there’s a WordPress plug-in that does automatic EXPLAINs (if you’re the blog administrator).

Another interesting Mike Stonebraker post can be found (at least for now) over on the VoltDB blog. He continued his assault on the CAP Theorem, arguing that availability is an exaggerated concern when there are bug- or other human-error-driven kinds of outages, and also arguing that the concept of “partition tolerance” is misguided. Commenters pushed back, pointing out that in geographically distributed scenarios, the CAP Theorem sense of partitioning is quite a legitimate concern.

When I posted an expansive definition of machine-generated data a few weeks ago, Daniel Abadi shot back advocating a narrower one (see the comment thread, which includes a link to his thoughtful post). The disagreement boils down to conflicting intuitions as to whether the machine-data/true-human-data ratio will keep growing rapidly in hybrid cases such as web logs or social gaming.

Dave McClure recently offered a survey of hot startup investing themes. High on his list were location-based services, which is a reminder to us all that geo-spatial data is becoming much more important. Ray Wang is savvy enough to understand the privacy dangers location-based services cause, but influential though Ray is, his view will probably remain in the minority. Machine-generated data and video each also make appearances on Dave’s list.

And wait! I have even more links for you! Several are taken from Thomas Houston’s choices for The Best Tech Writing of 2010. He chose well. I recommend sampling his list further.


In 2005, the online chess-playing site Playchess.com hosted what it called a “freestyle” chess tournament in which anyone could compete in teams with other players or computers. … The surprise came at the conclusion of the event. The winner was revealed to be not a grandmaster with a state-of-the-art PC but a pair of amateur American chess players using three computers at the same time. Their skill at manipulating and “coaching” their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants. Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.


4 Responses to “Notes, links, and comments January 20, 2011”

  1. Joe on January 20th, 2011 1:40 pm

    January 20, 2010?

  2. Curt Monash on January 20th, 2011 5:14 pm

    Yikes. Will fix typo!

  3. Mike Pilcher on January 27th, 2011 11:24 am


    If you want to speak to someone about Vertica’s marketing you could call me. It appears that the Vertica team are afraid of debate when it comes to what should be in a CDBMS (Columnar Database Management System), which makes me wonder what they’re missing. Text analytics? Right-time search? How do they scale their database for users without GBCC (Generation Based Concurrency Control)? Hmmm. You can read more about the debate that was here on SAND.com. (Though at this point it’s not so much a debate; that takes two, and someone at Vertica blinked and ran for cover.)

  4. Curt Monash on January 29th, 2011 6:00 pm

    I don’t know about that, Mike. For all their failings, they still have a clearer story than you do.
