Theory and architecture
Analysis of design choices in databases and database management systems. Related subjects include:
- Any subcategory
- Database diversity
- Explicit support for specific data types
- (in Text Technologies) Text search
Is the enterprise data warehouse a myth?
An enterprise data warehouse should:
- Manage data to high standards of accuracy, consistency, cleanliness, clarity, and security.
- Manage all the data in your organization.
Pick ONE. Read more
Categories: Data models and architecture, Data warehousing, Database diversity, Teradata, Theory and architecture | 8 Comments |
Thoughts on IBM’s anti-Oracle announcements
IBM is putting out a couple of press releases today that are obviously directed competitively at Oracle/Sun, and more specifically at Oracle’s Exadata-centric strategy. I haven’t been briefed, so I just have those to go on.
On the whole, the releases look pretty lame. Highlights seem to include:
- Maybe a claim of enhanced data compression.
- Otherwise, no obvious new technology except product packaging and bundling.
- Aggressive plans to throw capital at the Sun channel to convert it to selling IBM gear. (A figure of $1/2 billion is mentioned, for financing.)
Disappointingly, IBM shows a lot of confusion between:
- Text data
- Machine-generated data such as that from sensors
While both are highly important, those are very different things. IBM has not in the past shown much impressive technology in either of those two areas, and based on these releases, I presume that trend is continuing.
Edits:
I see from press coverage that at least one new IBM model has some Fusion I/O solid-state memory boards in it. Makes sense.
A Twitter hashtag has a number of observations from the event. Not much substance I could detect except various kinds of Oracle bashing.
Categories: Database compression, Exadata, IBM and DB2, Oracle, Solid-state memory | 14 Comments |
Notes on the evolution of OLTP database management systems
The past few years have seen a spate of startups in the analytic DBMS business. Netezza, Vertica, Greenplum, Aster Data and others are all reasonably prosperous, alongside older specialty product vendors Teradata and Sybase (the Sybase IQ part). OLTP (OnLine Transaction Processing) and general purpose DBMS startups, however, have not yet done as well, with such success as there has been (MySQL, InterSystems Caché, solidDB’s exit, etc.) generally accruing to products that originated in the 20th Century.
Nonetheless, OLTP/general-purpose data management startup activity has recently picked up, targeting what I see as some very real opportunities and needs. So as a jumping-off point for further writing, I thought it might be interesting to collect a few observations about the market in one place. These include:
- Big-brand OLTP/general-purpose DBMS have more “stickiness” than analytic DBMS.
- By number, most of an enterprise’s OLTP/general-purpose databases are low-volume and low-value.
- Most interesting new OLTP/general-purpose data management products are either MySQL-based or NoSQL.
- It’s not yet clear whether MySQL will prevail over MySQL forks, or vice-versa, or whether they will co-exist.
- The era of silicon-centric relational DBMS is coming.
- The emphasis on scale-out and reducing the cost of joins spans the NoSQL and SQL-based worlds.
- Users’ insistence on “free” could be a major problem for OLTP DBMS innovation.
I shall explain. Read more
Pranks, apocryphal and otherwise
I’ve been posting a bit about pranks of various kinds, mainly geeky ones. But so far I’ve only covered real pranks, rather than the much funnier imaginary ones.
The classic of that genre, of course, is a certain database-oriented xkcd comic strip. (If you haven’t instantly guessed what I’m talking about, you must see that strip.) And in a similar vein, I further offer six examples of xkcd’s “My Hobby” strips. (The last two are not for the sexually squeamish, but the others are pretty G-rated.)
One thing I just learned about xkcd — if you mouse over the strip, you get another joke. Some are almost as funny as the main strip. So even if you have already seen the database-classic xkcd linked above, you might want to revisit it. 😉
In a very different vein is Dadhacker’s list of real or imaginary past shenanigans, (Edit: The original link is fried, but here’s a partial replacement) which starts:
I am not permitted to replace a coworker’s reference books (including his Knuth, Sedgewick, and C++ reference manuals) with several linear feet of steamy hardback romance novels.
I will not name my variables after nasty tropical diseases, or executives who are under indictment for fraud.
Elevators are not toys, nor should they ever be wired into the corporate net.
Funny and vaguely prankish (and not for the language-squeamish) is this non-xkcd comic about NoSQL. And finally (definitely also for the non-squeamish), see the first long comment in this Reddit thread, which seems to have successfully pranked a whole lot of readers.
Categories: Fun stuff, Humor, NoSQL | 3 Comments |
Quick news, links, comments, etc.
Some notes based on what I’ve been reading recently: Read more
Three kinds of software innovation, and whether patents could possibly work for them
In connection with an attempt to articulate my views on software patents (more on those below), I was thinking about the different ways in which software development can be innovative. And it turns out that most forms of software innovation can, at their core, be assigned to one or more of three overlapping categories: Read more
Categories: Analytic technologies, Business intelligence, Cloud computing, Data warehousing, Parallelization, Software as a Service (SaaS), Theory and architecture | 5 Comments |
Vertica update
I caught up with Jerry Held (Chairman) and Dave Menninger (VP Marketing) of Vertica for a chat yesterday. The immediate reason for the call was that a competitor had tipped me off to the departure of Vertica CEO Ralph Breslauer, which of course raises a host of questions. Highlights of the call included:
- Vertica had a “killer” Q4 and is doing very well in Q1 again.
- Vertica burned hardly any cash last year; i.e., it was close to cash-flow neutral in 2009.
- Vertica is hiring aggressively, e.g., in sales.
- Vertica is well down the path with several CEO candidates whom Jerry regards as outstanding. He is hopeful there will be a new CEO in April. (But I bet that would be late April, given what Jerry mentioned about his own travel plans.)
- Absent a full-time CEO, Jerry and Andy Palmer are spending a lot more time with Vertica.
- One Vertica customer is approaching a petabyte of user data. The last time Vertica had checked, that customer had been more in the ¼ petabyte range.
- Other multi-hundred terabyte Vertica databases were mentioned, including one where Vertica claims to have beaten Teradata and perhaps other competitors in a head-to-head competition (it sounds like that one’s too recent to be deployed yet).
- Vertica sees Aster and Greenplum competitively more often than it sees ParAccel.
- Vertica sees Sybase IQ competitively a lot in financial services (in new-name accounts for Sybase as well as where some kind of Sybase DBMS is an incumbent), and more occasionally in other sectors.
NDA parts of the conversation also gave me the impression that Vertica is moving forward just as eagerly as its peers. I.e., I didn’t uncover any reason to think that Ralph’s departure is a sign of trouble, of the company being shopped, etc. Read more
Categories: Analytic technologies, Data warehousing, Investment research and trading, Market share and customer counts, ParAccel, Petabyte-scale data management, Sybase, Vertica Systems | 6 Comments |
Infobright blog update
I often offer that, if a company puts up a sufficiently good blog post, I’ll link to it. Well, I just noticed that Infobright CEO Mark Burton (somewhere along the way he seems to have dropped the “interim”) put up an excellent post last month.
Highlights on the market share/sector side include: Read more
Categories: Columnar database management, Data mart outsourcing, Data warehousing, Infobright, Log analysis, Market share and customer counts, Open source, Web analytics | 1 Comment |
XtremeData update
I talked with Geno Valente of XtremeData tonight. Highlights included:
- XtremeData still hasn’t sold any dbX stuff (they’ve had a side business in generic FPGA-based boards paying the bills for years). Well, there may have been some paid POCs (proofs of concept) or something, but real sales haven’t come through yet.
- XtremeData does have three prospects who have said “Yes”, and expects one order to come through this month.
- XtremeData continues to believe it shines when:
  - Data models are complex
  - In particular, there are complex joins
  - In particular, two large tables have to be joined with each other, under circumstances where no product can avoid doing vast data redistribution
- XtremeData insists that all the nice things Bill Inmon has said about it, including in webinars, have not been for pay or other similar business compensation. That’s quite unusual.
- XtremeData is coming out with a new product, codenamed the Personal Data Warehouse (PDW), which:
  - Is ready to go into beta test
  - Should be launched in a month and a half or so
  - Will have a different name when it is launched
Naming aside, Read more
Memcached-based company NorthScale launches
NorthScale, a start-up based around memcached, has just launched, two weeks after Todd Hoff’s post arguing that the MySQL/memcached combo is passé. NorthScale wouldn’t necessarily disagree with Todd; its pitch is that what you really should use instead is NorthScale’s combo of memcached and Membase, a memcached-like DBMS …
… or something like that. I don’t intend to write seriously about NorthScale until I have a better idea of what Membase is.
In the meantime,
- VentureBeat put up a solid post on NorthScale’s company history and so on
- Om Malik bought into the NorthScale memcached pitch
- TechCrunch has a low-quality post about NorthScale (although it wasn’t as error-riddled as the same author’s post about nStein, which Seth Grimes properly blasted)
Categories: Cache, Clustering, Couchbase, memcached, NoSQL, Parallelization