I haven’t done a pure notes/links/comments post for a while. Let’s fix that now. (A bunch of saved-up links, however, did find their way into my recent privacy threats overview.)
First and foremost, the fourth annual New England Database Summit (nee “Day”) is next week, specifically Friday, January 28. As per my posts in previous years, I think well of the event, which has a friendly, gathering-of-the-clan flavor. Registration is free, but the organizers would prefer that you register online by the end of this week, if you would be so kind.
The two things potentially wrong with the New England Database Summit are parking and the rush hour drive home afterwards. I would listen with interest to any suggestions about dinner plans.
One thing I hope to figure out at the Summit or before is what the hell is going on on Vertica’s blog or, for that matter, at Vertica. The recent Mike Stonebraker post that spawned a lot of discussion and commentary has disappeared. Meanwhile, Vertica has had three consecutive heads of marketing leave the company since June, and I don’t know who to talk to there any more.
Speaking of blog problems, we’ve had performance/reliability glitches here again. Melissa Bradshaw determined that the problem was an apparently activated WP Super Cache not actually caching anything. We should be OK now, so please let me know if there are further difficulties. One interesting step — it turns out that there’s a WordPress plug-in that does automatic EXPLAINs (if you’re the blog administrator).
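For anyone who hasn't watched an automatic EXPLAIN at work, here is a minimal sketch of the idea. It is not the plug-in's code. WordPress runs on MySQL, and I'm swapping in Python's built-in sqlite3 module and SQLite's `EXPLAIN QUERY PLAN` so the example runs anywhere. The `posts` table, its index, and the `explain_queries` helper are all hypothetical names made up for illustration.

```python
import sqlite3


def explain_queries(conn, queries):
    """Return {query: [plan steps]} by running EXPLAIN QUERY PLAN on each query."""
    plans = {}
    for sql in queries:
        rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
        # The last column of each row holds the human-readable plan step.
        plans[sql] = [row[-1] for row in rows]
    return plans


# Hypothetical stand-in for a blog's posts table (not the real WordPress schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author TEXT, title TEXT)")
conn.execute("CREATE INDEX idx_posts_author ON posts (author)")

queries = [
    # Equality on an indexed column: the planner can use idx_posts_author.
    "SELECT title FROM posts WHERE author = 'curt'",
    # Leading-wildcard LIKE on an unindexed column: expect a full table scan.
    "SELECT title FROM posts WHERE title LIKE '%summit%'",
]

plans = explain_queries(conn, queries)
for sql, steps in plans.items():
    print(sql)
    for step in steps:
        print("    " + step)
```

The exact wording of each plan step varies by SQLite version, but the first query should report a search using the index and the second a scan of the whole table. Surfacing that difference on every page load is the sort of help an automatic EXPLAIN can give when a blog is slow.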
Another interesting Mike Stonebraker post can be found (at least for now) over on the VoltDB blog. He continued his assault on the CAP Theorem, arguing that availability is an exaggerated concern when there are bug- or other human-error-driven kinds of outages, and also arguing that the concept of “partition tolerance” is misguided. Commenters pushed back, pointing out that in geographically distributed scenarios, the CAP Theorem sense of partitioning is quite a legitimate concern.
When I posted an expansive definition of machine-generated data a few weeks ago, Daniel Abadi shot back advocating a narrower one (see the comment thread, which includes a link to his thoughtful post). The disagreement boils down to conflicting intuitions as to whether the machine-data/true-human-data ratio will keep growing rapidly in hybrid cases such as web logs or social gaming.
Dave McClure recently offered a survey of hot startup investing themes. High on his list were location-based services, which is a reminder to us all that geo-spatial data is becoming much more important. Ray Wang is savvy enough to understand the privacy dangers location-based services cause, but influential though Ray is, his view will probably remain in the minority. Machine-generated data and video each also make appearances on Dave’s list.
And wait! I have even more links for you! Several are taken from Thomas Houston’s choices for The Best Tech Writing of 2010. He chose well. I recommend sampling his list further.
- In an article about new electronic exchanges, the New York Times shared some numbers: 56% of trading volume in stocks is “high speed,” versus 1/3 or so for domestic futures; a NASDAQ trade takes 0.1 milliseconds, a trade that involves Chicago/NYC communication takes 13 milliseconds, and one that involves NYC/Frankfurt takes 60 milliseconds. Slashdot offers photos and other context.
- James Taylor caught up with once-hot KXEN, and evidently got the impression KXEN was focusing a lot of its efforts on the tedious, time-consuming data-preparation side of modeling.
- Richard Tibbetts is being pretty funny on his blog.
- (Slashdot) The Russian government seems to be getting into open source software in a big way. Well, PostgreSQL is already big in Russia (close to 1 million installations, I was once told), so this might conceivably add some energy to its development.
- In Drupal 7, Drupal now has “a built-in test environment, version upgrade manager, and a database abstraction layer for use with MariaDB, SQL Server, MongoDB, Oracle, MySQL, PostgreSQL, and SQLite.” That may explain how MongoDB can hope to further penetrate the Drupal market.
- The Boston Phoenix argues that government lacks the manpower, budget, and expertise to keep up with its responsibilities in preserving and exposing information. Fixing that problem sounds like a pretty worthy open source development effort to me.
- Clay Shirky reminded us that modern machine learning is what replaced old-style AI.
- Nominally reviewing a book he obviously disdains, Garry Kasparov (in my opinion the most admirable world chess champion ever) surveyed computer chess in a quick, nontechnical way. Even so, the whole thing is a bit wordy, so I’ll quote one part:
In 2005, the online chess-playing site Playchess.com hosted what it called a “freestyle” chess tournament in which anyone could compete in teams with other players or computers. … The surprise came at the conclusion of the event. The winner was revealed to be not a grandmaster with a state-of-the-art PC but a pair of amateur American chess players using three computers at the same time. Their skill at manipulating and “coaching” their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants. Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.