Presentations
Posts focused on live presentations, typically by Curt Monash.
My talk this morning
Netezza’s Enzee Universe conference is now almost over, and I still haven’t figured out what my gig as “conference blogger” entails. More precisely, I’m operating from our unspoken fallback plan, namely “If all else fails, do what you’d do anyway, but do more of it.” For me to live up to that, all Netezza had to do was find interesting things to write about, and as far as I’m concerned, they already did that last Thursday in spades; the five interesting meetings they set up for me with users and partners on Tuesday were just gravy.
Another part of the deal was that I’d give a talk this morning at 9:30 am. And when I give talks, I like to put up posts that cover whatever material I haven’t written up before, while also offering the talk’s listeners convenient links to material I have already covered at length.
| Categories: Analytic technologies, Business intelligence, Data warehousing, Netezza, Presentations | 3 Comments |
Notes on a spate of Netezza-related blog posts
Fearing that last year’s tight travel budgets would hamper attendance, Netezza – like a number of other vendors – decided to forgo a traditional user conference. Instead, it took its Enzee Universe show on the road, essentially spreading the conference across eight cities. I was asked to keynote six of the installments.
After the first one, Netezza Marketing VP Tim Young took me aside for two pieces of constructive criticism. The surprising one* was that he felt I had been INSUFFICIENTLY critical of Netezza. Since then, every other conversation we’ve had about content creation has also featured ringing reassurances that Tim truly wants independent, non-pandering work.
*The unsurprising one was that I’d rushed. Well, duh. After months of telling me I had a 1 hour slot, Netezza cut me to ½ hour a few days beforehand. And my talk had been designed to be high-speed even in the longer time slot …
As a result, I accepted a subsequent gig from Netezza that I would barely consider from most other vendors. Namely, for this year’s Enzee Universe – June 21-23, aka Monday-Wednesday of this week, at the Westin Waterfront Hotel in Boston – I would do some contemporaneous blogging. The parameters we agreed on included: Read more
| Categories: Data warehouse appliances, Data warehousing, Netezza, Presentations | 3 Comments |
Notes and cautions about new analytic technology
As previously noted, I headlined Aster’s Big Data Summit in Washington, DC last Thursday. More than others, that talk did reuse material I’d presented before. I promised the audience that when I got back I’d put up a blog post linking to supporting material for the talk.
Part of the time, I talked about things I’ve written about before. For example: Read more
| Categories: Analytic technologies, Aster Data, Business intelligence, Data warehousing, Presentations | 2 Comments |
I’ll be speaking in Washington, DC on May 6
My clients at Aster Data are putting on a sequence of conferences called “Big Data Summit(s)”, and wanted me to keynote one. I agreed to the one in Washington, DC, on May 6, on the condition that I would be allowed to start with the same liberty and privacy themes I started my New England Database Summit keynote with. Since I already knew Aster to be one of the multiple companies in this industry that is responsibly concerned about the liberty and privacy threats we’re all helping cause, I expected them to agree to that condition immediately, and indeed they did.
On a rough-draft basis, my talk concept is:
Implications of New Analytic Technology in four areas:
- Liberty & privacy
- Data acquisition & retention
- Data exploration
- Operationalized analytics
I haven’t done any work yet on the talk besides coming up with that snippet, and probably won’t until the week before I give it. Suggestions are welcome.
If anybody actually has a link to a clear discussion of legislative and regulatory data retention requirements, that would be cool. I know they’ve exploded, but I don’t have the details.
| Categories: Analytic technologies, Archiving and information preservation, Aster Data, Data warehousing, Liberty and privacy, Presentations | 1 Comment |
Liberty and privacy, once again
I’ve long argued three points:
- It is inevitable* that governments and other constituencies will obtain huge amounts of information, which can be used to drastically restrict everybody’s privacy and freedom.
- To protect against this grave threat, multiple layers of defense are needed, technical and legal/regulatory/social/political alike.
- One particular layer is getting insufficient attention, namely restrictions upon the use (as opposed to the acquisition or retention) of data.
*And indeed in many ways even desirable
I surprised people by leading with the liberty/privacy subject at my New England Database Summit keynote; considerable discussion ensued, largely supportive. I hope for a similar outcome when I keynote the Aster Big Data Summit in Washington, DC in May. And I expect to do even more to advance the liberty/privacy discussion as 2010 unfolds.
Fortunately, I’m not the only one thinking or talking about these liberty/privacy issues. Read more
| Categories: Analytic technologies, Liberty and privacy, Presentations | 9 Comments |
Open issues in database and analytic technology
The last part of my New England Database Summit talk was on open issues in database and analytic technology. This was closely intertwined with the previous section, and also relied on a lot that I’ve posted here. So I’ll just put up a few notes on that part, with lots of linkage to prior discussion of the same points. Read more
Interesting trends in database and analytic technology
My project for the day is blogging based on my “Database and analytic technology: State of the union” talk of a few days ago. (I called it that because of when it was given, because it mixed prescriptive and descriptive elements, and because I wanted to call attention to the fact that I cover the union of database and analytic technologies – the intersection of those two sectors is an area of particular focus, but is far from the whole of my coverage.)
One section covered recent/ongoing/near-future trends that I thought were particularly interesting, including: Read more
Flash, other solid-state memory, and disk
If there’s one subject on which the New England Database Summit changed or at least clarified my thinking,* it’s future storage technologies. Here’s what I now think:
- Solid-state memory will soon be the right storage technology for a large fraction of databases, OLTP and analytic alike. I’m not sure whether the initial cutoff in database size is best thought of as terabytes or 10s of terabytes, but it’s in that range. And it will increase over time, for the usual cheaper-parts reasons.
- That doesn’t necessarily mean flash. PCM (Phase-Change Memory) is coming down the pike, with perhaps 100X the durability of flash, in terms of the total number of writes it can tolerate. On the other hand, PCM has issues in the face of heat. More futuristically, IBM is also high on magnetic racetrack memory. IBM likes the term storage-class memory to cover all this — which I find regrettable, since the acronym SCM is way overloaded already.
- Putting a disk controller in front of solid-state memory is really wasteful. It wreaks havoc on I/O rates.
- Generic PCIe interfaces don’t suffice either, in many analytic use cases. Their I/O is better, but still not good enough. (Doing better yet is where Petascan – the stealth-mode company I keep teasing about – comes in.)
- Disk will long be useful for very large databases. Kryder’s Law, about disk capacity, shows at least as high an annual improvement rate as Moore’s Law does for chip capacity, the disk rotation speed bottleneck notwithstanding. So disk will long be much cheaper than silicon for data storage. And cheaper silicon in sensors will lead to ever more machine-generated data that fills up a lot of disks.
- Disk will long be useful for archiving. Disk is the new tape.
*When the first three people to the question microphone include both Mike Stonebraker and Dave DeWitt, your thinking tends to clarify in a hurry.
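To make the cost reasoning in the list above a bit more concrete, here is a rough sketch of the arithmetic. Every number in it (starting prices, improvement rates, budget) is a hypothetical placeholder rather than a figure from the talk or from any vendor. It illustrates two of the claims: the database size for which solid-state storage fits a fixed budget grows as parts get cheaper, and if disk improves at least as fast as silicon, the price ratio between the two does not shrink.

```python
# Illustrative only: all constants below are made-up assumptions.

def cost_per_tb(start_cost: float, annual_improvement: float, years: int) -> float:
    """Cost per TB after `years`, if capacity per dollar grows by
    `annual_improvement` (e.g. 0.40 for 40%) every year."""
    return start_cost / (1.0 + annual_improvement) ** years


def affordable_tb(budget: float, start_cost: float,
                  annual_improvement: float, years: int) -> float:
    """How many TB a fixed budget buys after `years`."""
    return budget / cost_per_tb(start_cost, annual_improvement, years)


# Hypothetical starting points and rates (assumptions, not measurements).
DISK_START = 100.0      # dollars per TB of disk today
FLASH_START = 2000.0    # dollars per TB of solid-state memory today
DISK_RATE = 0.40        # annual improvement for disk (Kryder's Law stand-in)
FLASH_RATE = 0.40       # annual improvement for silicon (Moore's Law stand-in)
BUDGET = 20000.0        # fixed storage budget in dollars

for year in range(0, 11, 2):
    disk = cost_per_tb(DISK_START, DISK_RATE, year)
    flash = cost_per_tb(FLASH_START, FLASH_RATE, year)
    size = affordable_tb(BUDGET, FLASH_START, FLASH_RATE, year)
    print(f"year {year:2d}: solid-state fits {size:8.1f} TB in budget, "
          f"solid-state/disk price ratio {flash / disk:5.1f}x")
```

With equal improvement rates the price ratio stays constant while the solid-state cutoff size keeps rising; setting `DISK_RATE` above `FLASH_RATE` widens the gap instead. The real rates are of course uncertain, which is why they are left as parameters.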
Related links
- A slide deck by Mohan of IBM similar to the one he presented at the NEDB Summit about storage-class memories.
- A much more detailed IBM presentation on storage-class memories.
- Oracle’s and Teradata’s beliefs about the importance of solid-state memory.
Other posts based on my January, 2010 New England Database Summit keynote address
- Data-based snooping — a huge threat to liberty that we’re all helping make worse
- Interesting trends in database and analytic technology
- Open issues in database and analytic technology
| Categories: Data warehousing, Michael Stonebraker, Presentations, Solid-state memory, Storage, Theory and architecture | 2 Comments |
Data-based snooping — a huge threat to liberty that we’re all helping make worse
Every year or two, I get back on my soapbox to say:
- Database and analytic technology, as they evolve, will pose tremendous danger to individual liberties.
- We in the industry who are creating this problem also have a duty to help fix it.
- Technological solutions alone won’t suffice. Legal changes are needed.
- The core of the needed legal changes is tight restrictions on governmental use of data, because relying on restrictions on data acquisition and retention clearly won’t suffice.
But this time I don’t plan to be so quick to shut up.
My best writing about the subject of liberty to date is probably in a November, 2008 blog post. My best public speaking about the subject was undoubtedly last Thursday, early in my New England Database Summit keynote address; I got a lot of favorable feedback on that part from the academics and technologists in attendance.
My emphasis is on data-based snooping rather than censorship, for several reasons:
- My work and audience are mainly in the database and analytics sectors. Censorship is more a concern for security, networking, and internet-technology folks.
- After censorship, I think data-based snooping is the second-worst technological threat to liberty.
- In the US and other fairly free countries, data-based snooping may well be the #1 threat.
New England Database Summit (January 28, 2010)
New England Database Day has now, in its third year, become a “Summit.” It’s a nice event, providing an opportunity for academics and business folks to mingle. The organizers are basically the local branch of the Mike Stonebraker research tree, with this year’s programming head being Daniel Abadi. It will be on Thursday, January 28, 2010, once again in the Stata Center at MIT. It would be reasonable to park in the venerable 4/5 Cambridge Center parking lot, especially if you’d like to eat at Legal Sea Foods afterwards.
So far there are two confirmed speakers — Raghu Ramakrishnan of Yahoo and me. My talk title will be something like “Database and analytic technology: The state of the union”, with all wordplay intended.
There’s more information at the official New England Database Summit website. There’s also a post with similar information on Daniel Abadi’s DBMS Musings blog.
Edit after the event:
Posts based on my January, 2010 New England Database Summit keynote address
