My NoSQL article is finally posted; I hope it lives up to all the foreshadowing. It is being run online at Intelligent Enterprise/Information Week, as per the link above, where Doug Henschen edited it with an admirably light touch.
Below please find three excerpts* that convey the essence of my thinking on NoSQL. For much more detail, please see the article itself.
*Notwithstanding my admiration for Doug’s editing, the excerpts are taken from my final pre-editing submission, not from the published article itself.
My quasi-definition of “NoSQL” wound up being:
NoSQL DBMS start from three design premises:
- Transaction semantics are unimportant, and locking is downright annoying.
- Joins are also unimportant, especially joins of any complexity.
- There are some benefits to having a DBMS even so.
NoSQL DBMS further incorporate one or more of three assumptions:
- The database will be big enough that it should be scaled across multiple servers.
- The application should run well if the database is replicated across multiple geographically distributed data centers, even if the connection between them is temporarily lost.
- The application should run well if the database is replicated across a host server and a bunch of occasionally-connected mobile devices.
In addition, NoSQL advocates commonly favor the idea that a database should have no fixed schema, other than whatever emerges as a byproduct of the application-writing process.
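To make those premises concrete, here is a minimal Python sketch of what a store built on them might look like. It is purely illustrative: the class `TinyKeyValueStore` and its methods are hypothetical names, not the API of any actual NoSQL product. It shows access by key only (no joins), last-write-wins updates (no transactions or locks), and a "schema" that is just whatever fields the application happens to write.

```python
from typing import Any, Dict, Optional, Set


class TinyKeyValueStore:
    """Hypothetical in-memory key-value store; not modeled on any real product."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, Any]] = {}

    def put(self, key: str, value: Dict[str, Any]) -> None:
        # No schema check, no lock, no transaction: the last write simply wins.
        self._data[key] = dict(value)

    def get(self, key: str) -> Optional[Dict[str, Any]]:
        # Lookup is by key only; there is no join and no query language.
        record = self._data.get(key)
        return dict(record) if record is not None else None

    def fields_in_use(self) -> Set[str]:
        # The only "schema" is the set of fields the application has written.
        return {field for record in self._data.values() for field in record}


store = TinyKeyValueStore()
# Two records under the same logical "table" with different fields.
store.put("user:1", {"name": "Ada", "email": "ada@example.com"})
store.put("user:2", {"name": "Grace", "twitter": "@grace", "tags": ["admin"]})
```

Scaling across servers or replicating to disconnected devices is deliberately left out of the sketch; those are the assumptions that make real products much harder to build than this.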
I subdivided the space by saying:
If not SQL, then what? A number of possibilities have been tried, with the four main groups being:
- Simple key-value store.
- Fully SQL/tabular.
DBMS based on graphical data models are also sometimes suggested to be part of NoSQL, as are the file systems that underlie many MapReduce implementations. But as a general rule, those data models are most effective for analytic use cases somewhat apart from the NoSQL mainstream.
My conclusion was:
So should you adopt NoSQL technology? Key considerations include:
- Immaturity. The very term “NoSQL” has only been around since 2009. Most NoSQL “products” are open source projects backed by a company of fewer than 20 employees.
- Open source. Many NoSQL adopters are constrained, by money or ideology, to avoid closed-source products. Conversely, it is difficult to deal with NoSQL products’ immaturity unless you’re comfortable with the rough-and-tumble of open source software development.
- Internet orientation. A large fraction of initial NoSQL implementations are for web or other internet (e.g., mobile application) projects.
- Schema mutability. If you like the idea of being able to have different schemas for different parts of the same “table,” NoSQL may be for you. If you like the database reusability guarantees of the relational model, NoSQL may be a poor fit.
- Project size. For a large (and suitable) project, the advantages of NoSQL technology may be large enough to outweigh its disadvantages. For a small, ultimately disposable project, the disadvantages of NoSQL may be minor. In between those extremes, you may be better off with SQL.
- SQL DBMS diversity. The choice of SQL DBMS goes far beyond the “Big 3-4” of Oracle, IBM DB2, Microsoft SQL Server, and SAP/Sybase Adaptive Server Enterprise. MySQL, PostgreSQL, and other mid-range SQL DBMS – open source or otherwise – might meet your needs. So might some of the scale-out-oriented startups cited above. Or if your needs are more analytic, there’s a whole range of powerful and cost-effective specialized products, from vendors such as Netezza, Vertica, Aster Data, or EMC/Greenplum.
Bottom line: For cutting-edge applications – often but not only internet-centric – NoSQL technology can make sense today. In other use cases, its drawbacks are likely to outweigh its advantages.