Theory and architecture
Analysis of design choices in databases and database management systems. Related subjects include:
- Any subcategory
- Database diversity
- Explicit support for specific data types
- (in Text Technologies) Text search
XtremeData update
I talked with Geno Valente of XtremeData tonight. Highlights included:
- XtremeData still hasn’t sold any dbX stuff (they’ve had a side business in generic FPGA-based boards paying the bills for years). Well, there may have been some paid POCs (proofs of concept) or something, but real sales haven’t come through yet.
- XtremeData does have three prospects who have said “Yes”, and expects one order to come through this month.
- XtremeData continues to believe it shines when:
- Data models are complex
- In particular, there are complex joins
- In particular, two large tables have to be joined with each other, under circumstances where no product can avoid doing vast data redistribution
- XtremeData insists that all the nice things Bill Inmon has said about it, including in webinars, have not been for pay or other similar business compensation. That’s quite unusual.
- XtremeData is coming out with a new product, codenamed the Personal Data Warehouse (PDW), which:
- Is ready to go into beta test
- Should be launched in a month and a half or so
- Will have a different name when it is launched
Naming aside, Read more
| Categories: Analytic technologies, Benchmarks and POCs, Data warehouse appliances, Data warehousing, Database compression, Kickfire, Market share, Netezza, Pricing, XtremeData |
Memcached-based company NorthScale launches
NorthScale, a start-up based around memcached, has just launched, two weeks after Todd Hoff’s post arguing that the MySQL/memcached combo is passé. NorthScale wouldn’t necessarily disagree with Todd; its pitch is that what you really should use instead is NorthScale’s combo of memcached and MemBase, a memcached-like DBMS …
… or something like that. I don’t intend to write seriously about NorthScale until I have a better idea of what MemBase is.
In the meantime:
- VentureBeat put up a solid post on NorthScale’s company history and so on
- Om Malik bought into the NorthScale memcached pitch
- TechCrunch has a low-quality post about NorthScale (although it wasn’t as error-riddled as the same author’s post about nStein, which Seth Grimes properly blasted)
| Categories: Cache, Clustering, NoSQL, Parallelization |
Toward a NoSQL taxonomy
I talked Friday with Dwight Merriman, founder of 10gen (the MongoDB company). He more or less convinced me of his definition of NoSQL systems, which in my adaptation goes:
NoSQL = HVSP (High Volume Simple Processing) without joins or explicit transactions
Within that realm, Dwight offered a two-part taxonomy of NoSQL systems, according to their data model and replication/sharding strategy. I’d be happier, however, with at least three parts to the taxonomy:
- How data looks logically on a single node
- How data is stored physically on a single node
- How data is distributed, replicated, and reconciled across multiple nodes, and whether applications have to be aware of how the data is partitioned among nodes/shards. Read more
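The third part of that taxonomy, whether applications have to know how data is partitioned, can be illustrated with a minimal Python sketch. Everything here is hypothetical and for illustration only: the `shard_for` function, the `Router` class, and the MD5-modulo placement scheme are assumptions, not how MongoDB, Cassandra, or any other system mentioned here actually works.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (hypothetical scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class Router:
    """Hides partitioning: callers use put/get and never see shard numbers."""

    def __init__(self, num_shards: int) -> None:
        # Each shard is modeled as an in-memory dict standing in for a node.
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


# Partition-unaware application code: only talks to the router.
router = Router(num_shards=4)
router.put("user:1", {"name": "example"})
value = router.get("user:1")

# Partition-aware application code: computes the shard itself
# and addresses that node directly.
shard_index = shard_for("user:1", 4)
same_value = router.shards[shard_index]["user:1"]
```

In the first style the application is insulated from resharding; in the second it has to change whenever the placement scheme does.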
| Categories: Cassandra, Data models and architecture, NoSQL, Parallelization, RDF and graphs, Structured documents, Theory and architecture | 4 Comments |
The Naming of the Foo
Let’s start from some reasonable premises. Read more
| Categories: Data models and architecture, Database diversity, Hadoop, MapReduce, Mark Logic, NoSQL, OLTP, Theory and architecture | 23 Comments |
Some NoSQL links
I plan to post a few things soon about MongoDB, Cassandra, and NoSQL in general. So I’m poking around a bit reading stuff on the subjects. Here are some links I found. Read more
| Categories: Amazon and its cloud, Cassandra, Continuent, Google, MySQL, NoSQL, Open source, RDF and graphs, Tokutek | 5 Comments |
Cassandra and the NoSQL scalable OLTP argument
Todd Hoff put up a provocative post on High Scalability called MySQL and Memcached: End of an Era? The post itself focuses on observations like:
- Facebook invented and is adopting Cassandra.
- Twitter is adopting Cassandra.
- Digg is adopting Cassandra.
- LinkedIn invented and is adopting Voldemort.
- Gee, it seems as if the super-scalable website biz has moved beyond MySQL/Memcached.
But in addition, he provides a lot of useful links, which DBMS-oriented folks such as myself might have previously overlooked. Read more
| Categories: Cassandra, Data models and architecture, NoSQL, OLTP, Open source, Parallelization, Specific users, Theory and architecture | 11 Comments |
Another reason to expect number-crunching and big-data management to converge
Dan Olds argues that Oracle is likely to pursue commercially substantive high-performance computing (HPC), emphasis mine: Read more
| Categories: Analytic technologies, Data warehousing, Exadata, Oracle, Theory and architecture |
Chris Bird’s blog is brilliant, and update-in-place is increasingly passé
I wouldn’t say every post in Chris Bird’s occasionally updated blog is brilliant. I wouldn’t even say every post is readable. But I’d still recommend his blog to just about anybody who reads here as, at a minimum, a consciousness-raiser.
One of the two posts inspiring me to mention this is a high-level one on “technical debt”, reminding us why things don’t always get done right the first time, and further reminding us that circling back to fix them sooner rather than later is usually wise. The other connects two observations that individually have great merit (at least if you don’t take them to extremes):
- Update-in-place is passé
- So is elaborate up-front database design
Specific points of interest here include: Read more
| Categories: Theory and architecture | 7 Comments |
Vertica 4.0
Vertica briefed me last month on its forthcoming Vertica 4.0 release. I think it’s fair to say that Vertica 4.0 is mainly a cleanup/catchup release, washing away some of the tradeoffs Vertica had previously made in support of its innovative DBMS architecture.
For starters, there’s a lot of new analytic functionality. This isn’t Aster/Netezza-style ambitious. Rather, there’s a lot more SQL-99 functionality, plus some time series extensions of the sort that financial services firms – an important market for Vertica – need and love. Vertica did suggest a couple of these time series extensions are innovative, but I haven’t yet gotten detail about those.
Perhaps even more important, Vertica is cleaning up a lot of its previous SQL optimization and execution weirdnesses. In no particular order, I was told: Read more
| Categories: Analytic technologies, Columnar database management, Data warehousing, Vertica Systems | 2 Comments |
Open issues in database and analytic technology
The last part of my New England Database Summit talk was on open issues in database and analytic technology. This was closely intertwined with the previous section, and also relied on a lot that I’ve posted here. So I’ll just put up a few notes on that part, with lots of linkage to prior discussion of the same points. Read more
