Two subjects in one post, because they were too hard to separate from each other.
Any sufficiently complex software is developed in modules and subsystems. DBMS are no exception; the core trinity of parser, optimizer/planner, and execution engine merely starts the discussion. But increasingly, database technology is layered in a more fundamental way as well, to the extent that different parts of what would seem to be an integrated DBMS can sometimes be developed by separate vendors.
Major examples of this trend — where by “major” I mean “spanning a lot of different vendors or projects” — include:
- The object/relational, aka universal, extensibility features developed in the 1990s for Oracle, DB2, Informix, Illustra, and Postgres. The most successful extensions probably have been:
  - Geospatial indexing via ESRI.
  - Full-text indexing, notwithstanding questionable features and performance.
- MySQL storage engines.
- MPP (Massively Parallel Processing) analytic RDBMS relying on single-node PostgreSQL, Ingres, and/or Microsoft SQL Server — e.g. Greenplum (especially early on), Aster (ditto), DATAllegro, DATAllegro’s offspring Microsoft PDW (Parallel Data Warehouse), or Hadapt.
- Splits in which a DBMS has serious processing both in a “database” layer and in a predicate-pushdown “storage” layer — most famously Oracle Exadata, but also MarkLogic, InfiniDB, and others.
- SQL-on-HDFS — Hive, Impala, Stinger, Shark and so on (including Hadapt).
Other examples on my mind include:
- Data manipulation APIs being added to key-value stores such as Couchbase and Aerospike.
- TokuMX, the Tokutek/MongoDB hybrid I just blogged about.
- NuoDB’s willing reliance on third-party key-value stores (or HDFS in the role of one).
- FoundationDB’s strategy, and specifically its acquisition of Akiban.
And there are several others I hope to blog about soon, e.g. current-day PostgreSQL.
In an overlapping trend, DBMS increasingly have multiple data manipulation APIs. Examples include:
Jaikumar Vijayan of Computerworld did a story based on my reporting on the JPMorgan Chase Oracle outage. He did a good job, getting me to simplify some of what I said before. He also added a quote from Chase to the effect that
the “long recovery process” was caused by a corruption of systems data that disabled the bank’s “ability to process customer log-ins to chase.com”
While that’s true, and indeed is the reason I first referred to this as an “authentication” problem, I believe it to be incomplete. For example, the $132 million in missed ACH payments weren’t directly driven by log-ins; they were to be done on schedule, perhaps based on previous log-ins. Or as Jai and I put it in the guts of his story:
After posting my speculation about the JPMorgan Chase database outage, I was contacted by – well, by somebody who wants to be referred to as “a credible source close to the situation.” We chatted for a long time; I think it is very likely that this person is indeed what s/he claims to be; and I am honoring his/her requests to obfuscate many identifying details. However, I need a shorter phrase than “a credible source close to the situation,” so I’ll refer to him/her as “Deep Packet.”
According to Deep Packet,
- The JPMorgan Chase database outage was caused by corruption in an Oracle database.
- This Oracle database stored user profiles, which are more than just authentication data.
- Applications that went down include but may not be limited to:
  - The main JPMorgan Chase portal.
  - JPMorgan Chase’s ability to use the ACH (Automated Clearing House).
  - Loan applications.
  - Private client trading portfolio access.
- The Oracle database was back up by 1:12 Wednesday morning. But on Wednesday a second problem occurred, namely an overwhelming number of web requests. This turned out to be a cascade of retries in the face of – and of course exacerbating – poor response time. While there was no direct connection to the database outage, Deep Packet is sympathetic to my suggestions that:
  - Network/app server traffic was bound to be particularly high as people tried to get caught up after the Tuesday outage, or just see what was going on in their accounts.
  - Given that Deep Packet said there was a definite operator-error contributing cause, perhaps the error would not have happened if people weren’t so exhausted from dealing with the database outage.
Deep Packet stressed the opinion that the Oracle outage was not the fault of JPMorgan Chase (the Wednesday slowdown is a different matter), and rather can be blamed on an Oracle bug.
Edit: Subsequent to making this post, I obtained more detail about the JPMorgan Chase database outage.
I was just contacted for comment about the Chase database outage, about which they’ve released remarkably little information (they’ve even apologized for their terseness). About all Chase has said is:
A third-party database company’s software caused a corruption of systems information, disabling our ability to process customer log-ins to chase.com. This resulted in a long recovery process,
and even that quote is a bit hard to find. From other reporting, we know that ATMs, bank branches, and call centers continued to work, but various web and mobile access applications were disabled.
Of course, that quote is pretty ambiguous. My thoughts on it include:
In recent conversations with various analytic DBMS vendors, a fairly consistent picture has emerged.
- Business is strong. Multiple vendors claim to be going gangbusters, with the happy sounds coming out of Vertica and Infobright being echoed by several competitors. Hearsay suggests some other companies in related businesses are doing well too. Depending on who you talk to, the business pickup dates back to Q4, give or take a quarter.
- Oracle Exadata has become a formidable competitor, on the strength of Exadata 2. Exadata 2’s positioning and perception among Oracle users seem to be pretty much in line with what Oracle portrayed to me.
- Teradata is portrayed as a weak competitor. Competitors don’t worry about Teradata nearly as much as they do about Oracle. That said, I suspect a bit of wishful thinking; Teradata is clearly still getting a lot of business the other vendors would dearly love to have.
- HP Neoview is reeling. (Almost) nobody sees Neoview competitively. The Walmart Neoview installation is said to have stayed small at best. JPMorgan Chase is said to have completely thrown Neoview out (and a bunch of HP engineers with it).
- (Almost) nobody mentions competing against DB2 either. This continues to baffle me.