More on NoSQL and HVSP (or OLRP)
Since posting last Wednesday morning that I’m looking into NoSQL and HVSP, I’ve had a lot of conversations, including with (among others):
- Dwight Merriman of 10gen (MongoDB)
- Damien Katz of Couchio (CouchDB)
- Matt Pfeil of Riptano (Cassandra)
- Todd Lipcon of Cloudera (HBase committer)
- Tony Falco of Basho (Riak)
- John Busch of Schooner
- Ori Herrnstadt of Akiban
| Categories: Akiban, Basho and Riak, Cache, Cassandra, Cloudera, Clustrix, CouchDB, Facebook, HBase, Hadoop, MySQL, NoSQL, OLTP, Object, Open source, Parallelization, Riptano, Schooner, Theory and architecture, Tokutek, memcached | Leave a Comment |
Workday comments on its database architecture
In my discussion of Workday’s technology, I gave an estimate that Workday’s database, if relationally designed, would require “1000s” of tables. That estimate came from Workday, Inc. CTO Stan Swete, in a thoughtful email that made several points about Workday’s database strategy. Workday kindly gave me permission to quote it below.
Read more
| Categories: Data models and architecture, OLTP, Object, Software as a Service (SaaS), Specific users, Theory and architecture, Workday | 2 Comments |
The Workday architecture — a new kind of OLTP software stack
One of my coolest company visits in some time was to SaaS (Software as a Service) vendor Workday, Inc., earlier this month. Reasons included:
- Workday has forward-thinking ideas about SaaS enterprise applications and the integration of business intelligence into same.
- Workday has highly innovative ideas in how it manages data.
- Companies founded by Dave Duffield tend to feature smart, likeable people who talk to one pleasantly and forthrightly. Workday is no exception; CTO Stan Swete and the other Workday folks present were a delight to talk with.
- I’d invited Merv Adrian to come along with me. He asked great questions, and I could gather myself a bit despite how sleep-deprived I was for the first part of that trip.
Workday kindly allowed me to post this Workday slide deck. Otherwise, I’ve split out a quick Workday, Inc. company overview into a separate post.
The biggie for me was the data and object management part. Specifically: Read more
I’m collecting data points on NoSQL and HVSP adoption
I was asked to do a magazine article on NoSQL, where by “NoSQL” is meant “whatever they talk about at NoSQL conferences.” By now the number of publications planning to run the article is up to 2, the deadline is next week and, crucially, it has been agreed that I may talk about HVSP in general, NoSQL and SQL alike.
It also is understood that, realistically, I can’t be expected to know and mention the very latest news for all the many products in the categories. Even so, I think this would be a fine time to check just where NoSQL and HVSP adoption stand. Here is most of what I know, or links to same; it would be great if you guys would contribute additional data in the comment thread.
In the NoSQL area: Read more
Big Data is Watching You!
There’s a boom in large-scale analytics. The subjects of this analysis may be categorized as:
- People
- Financial trades
- Electronic networks
- Everything else
The most varied, interesting, and valuable of those four categories is the first one.
| Categories: Analytic technologies, Aster Data, Data warehousing, Investment research and trading, Log analysis, MapReduce, RDF and graphs, Specific users, Telecommunications, Web analytics | 3 Comments |
Links and observations
I’m back from a trip to the SF Bay area, with a lot of writing ahead of me. I’ll dive in with some quick comments here, then write at greater length about some of these points when I can. From my trip: Read more
Nested data structures keep coming up, especially for log files
Nested data structures have come up several times now, almost always in the context of log files.
- Google has published about a project called Dremel. Per Tasso Argyros, one of Dremel’s key concepts is nested data structures.
- Those arrays that the XLDB/SciDB folks keep talking about are meant to be nested data structures. Scientific data is of course log-oriented. eBay was very interested in that project too.
- Facebook’s log files have a big nested data structure flavor.
I don’t have a grasp yet on what exactly is happening here, but it’s something.
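To make "nested data structure" concrete, here is a minimal sketch of a log record with a repeated nested field, plus the flattening a purely relational layout would need. The record shape and every field name (`user_id`, `session`, `items`, `sku`, `qty`) are hypothetical, invented for illustration; this is not the format Dremel, SciDB, or Facebook actually uses.

```python
# Hypothetical log record: one event carrying a nested object ("session")
# and a repeated nested field ("items"). All names are made up.
record = {
    "user_id": 42,
    "session": {"id": "s-1", "agent": "browser"},
    "items": [
        {"sku": "a", "qty": 1},
        {"sku": "b", "qty": 3},
    ],
}


def flatten(rec):
    """Return one flat row per nested item, repeating the parent fields."""
    base = {"user_id": rec["user_id"], "session_id": rec["session"]["id"]}
    return [{**base, **item} for item in rec["items"]]


rows = flatten(record)
for row in rows:
    print(row)
```

The nested form keeps one record per event, while the flat form repeats the parent fields on every row. That repetition is one plausible reason nested structures keep coming up for log data.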
| Categories: Facebook, Google, Log analysis, Scientific research, Theory and architecture, eBay | 5 Comments |
dbShards — a lot like an MPP OLTP DBMS based on MySQL or PostgreSQL
I talked yesterday with Cory Isaacson, who runs CodeFutures, makers of dbShards. dbShards is a software layer that turns an ordinary DBMS (currently MySQL or PostgreSQL) into an MPP shared-nothing ACID-compliant OLTP DBMS. Technical highlights included: Read more
| Categories: Facebook, MySQL, OLTP, Parallelization, PostgreSQL, dbShards and CodeFutures | 3 Comments |
Sybase SQL Anywhere
After Powersoft acquired Watcom and its famed Fortran compiler, marketing VP Tom Herring told me that the hidden jewel of the acquisition might well be a little DBMS, Watcom SQL. To put it mildly, Tom was right. Watcom SQL became SQL Anywhere; Powersoft was acquired by Sybase; Powersoft’s and Sybase’s main products both fell on hard times; Sybase built a whole mobile technology division around SQL Anywhere; and the whole thing just got sold for billions of dollars to SAP. Chris Kleisath recently briefed me on SQL Anywhere Version 12 (released to manufacturing this month), which seemed like a fine opportunity to catch up on prior developments as well.
The first two things to understand about SQL Anywhere are that there actually are three products:
- Sybase SQL Anywhere, a mid-range relational DBMS.
- Sybase UltraLite, a DBMS for mobile devices.
- Sybase MobiLink, a replication/sync tool.
and also that there are three main deployment/use cases:
- Generic desktop or server computers. This was the original market for SQL Anywhere.
- Laptop/handheld computers. This was the original growth market for SQL Anywhere. In particular, Siebel Systems’ first growth spurt was selling sales force automation software on laptop computers with SQL Anywhere underneath.
- Specialized devices. Earlier this decade, Sybase thought SQL Anywhere’s big growth market was on specialized devices. (I recall a video featuring some kind of automated pill dispensing machine for hospitals.)
| Categories: Mid-range, Progress, Apama, and DataDirect, Specific users, Sybase | Leave a Comment |
Riptano, and Cassandra adoption
Tonight’s Cassandra technology post got plenty long enough on its own, so I’m separating out business and adoption issues here. For starters, known Cassandra users include:
- Facebook, which has said it has 150 or so Cassandra nodes (but see below)
- Twitter, which has said it has 45 or so Cassandra nodes
- Rackspace, which used to be Jonathan Ellis’ employer, and now is backing Cassandra company Riptano
- Digg, which along with Twitter and Rackspace was one of the three major users helping advance the Cassandra project
- OpenX, SimpleGeo, and Digital Reasoning, which Jonathan cited as production users in March
- Cloudkick, as noted and linked in my other post
- Two customers Riptano named at launch (but I’ve forgotten who they were*)
Fetlife, Meebo, and others seem to at least have a healthy interest in Cassandra, based on their level of involvement in a forthcoming Cassandra Summit. That said, the @Fetlife tweetstream features numerous yelps of pain, and I don’t mean the recreational kind. Read more
| Categories: Cassandra, Facebook, Market share, NoSQL, Open source, Parallelization, Pricing, Riptano, Specific users | 3 Comments |
