Cassandra and the NoSQL scalable OLTP argument
Todd Hoff put up a provocative post on High Scalability called “MySQL and Memcached: End of an Era?” The post itself focuses on observations like:
- Facebook invented and is adopting Cassandra.
- Twitter is adopting Cassandra.
- Digg is adopting Cassandra.
- LinkedIn invented and is adopting Voldemort.
- Gee, it seems as if the super-scalable website biz has moved beyond MySQL/Memcached.
But in addition, he provides a lot of useful links, which DBMS-oriented folks such as myself might have previously overlooked. Read more
Categories: Cassandra, Data models and architecture, NoSQL, OLTP, Open source, Parallelization, Specific users, Theory and architecture | 16 Comments |
Data exploration vs. data visualization
I’ve tended to conflate data exploration and data visualization, and I’m far from alone in doing so. But a recent Economist article is a useful reminder that they aren’t exactly the same thing. Read more
Categories: Analytic technologies, Business intelligence | 5 Comments |
Another reason to expect number-crunching and big-data management to converge
Dan Olds argues that Oracle is likely to pursue commercially-substantive high performance computing (HPC), emphasis mine: Read more
Categories: Analytic technologies, Data warehousing, Exadata, Oracle, Theory and architecture | Leave a Comment |
Notes on Sybase Adaptive Server Enterprise
It had been a very long time since I was remotely up to speed on Sybase’s main OLTP DBMS, Adaptive Server Enterprise (ASE). Raj Rathee, however, was kind enough to fill me in a few days ago. Highlights of our chat included: Read more
Categories: Cache, In-memory DBMS, Memory-centric data management, Sybase | 1 Comment |
Chris Bird’s blog is brilliant, and update-in-place is increasingly passé
I wouldn’t say every post in Chris Bird’s occasionally updated blog is brilliant. I wouldn’t even say every post is readable. But I’d still recommend his blog to just about anybody who reads here as, at a minimum, a consciousness-raiser.
One of the two posts inspiring me to mention this is a high-level one on “technical debt”, reminding us why things don’t always get done right the first time, and further reminding us that circling back to fix them sooner rather than later is usually wise. The other connects two observations that individually have great merit (at least if you don’t take them to extremes):
- Update-in-place is passé
- So is elaborate up-front database design
Specific points of interest here include: Read more
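For readers unfamiliar with the distinction, here’s a minimal sketch (in Python, with invented names and data) of the append-only alternative to update-in-place: an “update” never overwrites anything; it appends a new version, and reads resolve to the latest one — which also makes historical reads cheap.

```python
import time
from collections import defaultdict

class AppendOnlyStore:
    """Toy append-only store: updates never overwrite; they append versions."""
    def __init__(self):
        self._versions = defaultdict(list)  # key -> list of (timestamp, value)

    def put(self, key, value, ts=None):
        # An "update" is just another appended version.
        self._versions[key].append((ts if ts is not None else time.time(), value))

    def get(self, key, as_of=None):
        # Latest version wins; as_of enables historical ("time travel") reads.
        candidates = [(t, v) for t, v in self._versions[key]
                      if as_of is None or t <= as_of]
        return max(candidates)[1] if candidates else None

store = AppendOnlyStore()
store.put("status", "draft", ts=1)
store.put("status", "published", ts=2)   # no in-place overwrite
print(store.get("status"))               # -> published
print(store.get("status", as_of=1))      # -> draft
```

This is, of course, only the storage-level half of the argument; the schema-design half is a separate discussion.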
Categories: Theory and architecture | 7 Comments |
February 2010 data warehouse DBMS news roundup
February is usually a busy month for data warehouse DBMS product releases, product announcements, and other real or contrived data warehouse DBMS news, and it can get pretty confusing trying to keep those categories of “news” apart.* This year is no exception, although several vendors – including Teradata and Netezza – are taking “rolling thunder” approaches, doing some of their announcements this month while holding others back for March or April.
*I probably have it worse than most people in that regard, because my clients run tentative feature lists and announcement schedules by me well in advance, which may get changed multiple times before the final dates roll around. I also occasionally miss some detail, if it wasn’t in a pre-briefing but gets added at the end.
Anyhow, the three big themes of this month’s announcements are probably:
- Integrating different kinds of analytic processing into databases and DBMS.
- Taking advantage of hardware advances.
- Playing catchup in areas where small vendors’ products weren’t mature yet.
Categories: Analytic technologies, Aster Data, Data warehousing, Netezza, Teradata, Vertica Systems | Leave a Comment |
TwinFin(i) – Netezza’s version of a parallel analytic platform
Much like Aster Data did in Aster 4.0 and now Aster 4.5, Netezza is announcing a general parallel big data analytic platform strategy. It is called Netezza TwinFin(i), it is a chargeable option for the Netezza TwinFin appliance, and many announced details are on the vague side, with Netezza promising more clarity at or before its Enzee Universe conference in June. At a high level, the Aster and Netezza approaches compare/contrast as follows: Read more
Categories: Aster Data, Data warehouse appliances, Data warehousing, Hadoop, MapReduce, Netezza, Predictive modeling and advanced analytics, SAS Institute, Teradata | 10 Comments |
Aster Data nCluster 4.5
Like Vertica, Netezza, and Teradata, Aster is using this week to pre-announce a forthcoming product release, Aster Data nCluster 4.5. Aster is really hanging its identity on “Big Data Analytics” or some variant of that concept, and so the two major named parts of Aster nCluster 4.5 are:
- Aster Data Analytic Foundation, a set of analytic packages prebuilt in Aster’s SQL-MapReduce
- Aster Data Developer Express, an Eclipse-based IDE (Integrated Development Environment) for developing and testing applications built on Aster nCluster, Aster SQL-MapReduce, and Aster Data Analytic Foundation
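To give a flavor of the kind of analytic that SQL-MapReduce packages express — this is a hedged, pure-Python illustration of the underlying pattern, not Aster syntax, and the data is invented — consider sessionization, a classic example: a user-defined function is applied per partition (here, per user) of a table.

```python
from itertools import groupby
from operator import itemgetter

def sessionize(rows, gap=30):
    """Assign a session id to each (user, timestamp) row: a new session
    starts whenever the gap since the user's previous event exceeds
    `gap` seconds. This per-partition logic is what a SQL-MapReduce
    row/partition function would run in parallel inside the database."""
    out = []
    rows = sorted(rows, key=itemgetter(0, 1))
    for user, events in groupby(rows, key=itemgetter(0)):
        session, prev_ts = 0, None
        for _, ts in events:
            if prev_ts is not None and ts - prev_ts > gap:
                session += 1
            out.append((user, ts, session))
            prev_ts = ts
    return out

clicks = [("alice", 0), ("alice", 10), ("alice", 100), ("bob", 5)]
print(sessionize(clicks))
# alice's third click comes >30s after her second, so it opens session 1
```

The point of shipping such things prebuilt, as the Analytic Foundation does, is that the SQL user just invokes them rather than writing the procedural logic above.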
And in other Aster news:
- Along with the development GUI in Aster nCluster 4.5, there is also a new administrative GUI.
- Aster has certified that nCluster works with Fusion I/O boards, because at least one retail industry prospect cares. However, that in no way means that arm’s-length Fusion I/O certification is Aster’s ultimate solid-state memory strategy.
- I had the wrong impression about how far Aster/SAS integration has gotten. So far, it’s just at the connector level.
Aster Data Developer Express evidently does some cool stuff, like providing some sort of parallelism testing right on your desktop. It also generates lots of stub code, saving humans from the tedium of doing that. Useful, obviously.
But mainly, I want to write about the analytic packages. Read more
Categories: Aster Data, Data warehousing, Investment research and trading, Predictive modeling and advanced analytics, RDF and graphs, SAS Institute, Teradata | 9 Comments |
Vertica 4.0
Vertica briefed me last month on its forthcoming Vertica 4.0 release. I think it’s fair to say that Vertica 4.0 is mainly a cleanup/catchup release, washing away some of the tradeoffs Vertica had previously made in support of its innovative DBMS architecture.
For starters, there’s a lot of new analytic functionality. This isn’t Aster/Netezza-style ambitious. Rather, there’s a lot more SQL-99 functionality, plus some time series extensions of the sort that financial services firms – an important market for Vertica – need and love. Vertica did suggest a couple of these time series extensions are innovative, but I haven’t yet gotten detail about those.
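To give a flavor of what “SQL-99 analytic functionality” means in practice — a hedged illustration of the general idea, not Vertica’s syntax — here is the kind of windowed computation such functions express declaratively, sketched in Python: a trailing moving average over a price series, the sort of thing financial services firms run constantly.

```python
def moving_average(series, window=3):
    """Trailing moving average: what a SQL-99 window function like
    AVG(price) OVER (ORDER BY ts ROWS 2 PRECEDING) expresses
    declaratively. Pure-Python sketch with invented data."""
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

prices = [10.0, 12.0, 11.0, 13.0, 15.0]
print(moving_average(prices))  # -> [10.0, 11.0, 11.0, 12.0, 13.0]
```

Time series extensions of the sort mentioned above typically go further, e.g. filling gaps in irregularly spaced data before computations like this one.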
Perhaps even more important, Vertica is cleaning up a lot of its previous SQL optimization and execution weirdnesses. In no particular order, I was told: Read more
Categories: Analytic technologies, Columnar database management, Data warehousing, Vertica Systems | 12 Comments |
Quick thoughts on the StreamBase Component Exchange
StreamBase is announcing something called the StreamBase Component Exchange, for developers to exchange components to be used with the StreamBase engine, presumably on an open source basis. I simultaneously think:
- This is a good idea, and many software vendors should do it if they aren’t already.
- It’s no big deal.
For reasons why, let me quote an email I just sent to an inquiring reporter:
- StreamBase sells mainly to the financial services and intelligence community markets. Neither group will share much in the way of core algorithms.
- But both groups are pretty interested in open source software even so. (I think for both the price and customizability benefits.)
- Open source software commonly gets community contributions for connectors, adapters, and (national) language translations.
- But useful contributions in other areas are much rarer.
- Linden Labs is one of StreamBase’s few significant customers outside its two core markets.
- All of the above are consistent with the press release (which quotes only one StreamBase customer — guess who?).