Emulation, transparency, portability
Analysis of products that support the emulation of market-leading database management systems.
When I grumbled about the conference-related rush of Hadoop announcements, one example of many was Teradata Aster’s SQL-H. Still, it’s an interesting idea, and a good hook for my first shot at writing about HCatalog. Indeed, other than the Talend integration bundled into Hortonworks’ HDP 1, Teradata SQL-H is the first real use of HCatalog I’m aware of.
The Teradata SQL-H idea is:
- Register your Hadoop data to HCatalog. I’ll confess to being unclear about the details of how that works, for example in the case of data that just doesn’t fit well into flat relational tables. Stay tuned for future posts. For now, I’ll just note that:
  - HCatalog is closely based on Hive’s metadata management. If you’ve run Hive against the data, HCatalog should already know about it.
  - HCatalog can handle Pig and HBase data as well.
- Write SQL DDL (Data Definition Language) so that your Aster cluster knows about the data.
- Write any Teradata Aster SQL/MR against that data. Some of the execution will be done on the Hadoop cluster, but pulling data back into Aster may well be necessary.
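To make the three steps concrete, here is a hedged sketch. The Hive DDL is standard syntax, but the Aster-side statement is my illustrative guess at what such DDL might look like, not actual Teradata syntax, and all the object names (weblogs, hcat_server, and so on) are hypothetical. SESSIONIZE is one of Aster's stock SQL/MR functions, though I'm paraphrasing its invocation.

```python
# Hedged sketch of the SQL-H flow. The statements below are illustrative,
# not verbatim Teradata Aster SQL; names like "weblogs" and "hcat_server"
# are hypothetical.

# Step 1: the data is presumed already registered in HCatalog, e.g. because
# Hive was run against it with DDL along these lines:
hive_ddl = """
CREATE EXTERNAL TABLE weblogs (ts STRING, user_id BIGINT, url STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'
LOCATION '/data/weblogs';
"""

# Step 2: DDL on the Aster side so the cluster knows about the HCatalog
# entry (illustrative syntax only):
aster_ddl = """
CREATE FOREIGN TABLE weblogs_remote
SERVER hcat_server
OPTIONS (database 'default', table 'weblogs');
"""

# Step 3: ordinary Aster SQL/MR against that table; SESSIONIZE is a stock
# Aster SQL/MR function (invocation paraphrased):
query = """
SELECT * FROM sessionize(
  ON weblogs_remote PARTITION BY user_id ORDER BY ts
  TIMECOLUMN('ts') TIMEOUT(1800)
);
"""

for stmt in (hive_ddl, aster_ddl, query):
    print(stmt.strip().splitlines()[0])
```

The point of the sketch is the division of labor: steps 1 and 2 are one-time registration, after which step 3 is, from the analyst's chair, just SQL/MR, with the system deciding how much work runs on the Hadoop side.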
At least in theory, Teradata SQL-H lets you use a full set of analytic tools against your Hadoop data, with little limitation except price and/or performance. Teradata thinks the performance of all this can be much better than if you just use Hadoop (35X was mentioned in one particularly favorable example), but perhaps much worse than if you just copy/extract the data to an Aster cluster in the first place.
So what might the use cases be for something like SQL-H? Offhand, I’d say:
- SQL-H use cases are probably focused in areas where copying the data to Aster in advance doesn’t make a lot of sense. So presumably …
- … the Hadoop clusters involved would hold a lot more data than you’d want to pay for storing in Teradata Aster. E.g., think of cases where Hadoop is used as a big bit bucket or archival data store.
- There could be a kind of investigative workflow. First you play around with the Hadoop data via SQL-H. Then when you think you’re onto something, you set up ETL (Extract/Transform/Load) to get the data into Aster and ratchet up the effort.
By way of contrast, the whole thing makes less sense for dashboarding kinds of uses, unless the dashboard users are very patient when they want to drill down.
|Categories: Aster Data, Data integration and middleware, Data warehousing, EAI, EII, ETL, ELT, ETLT, Emulation, transparency, portability, Hadoop, MapReduce, SQL/Hadoop integration, Teradata||10 Comments|
Last Friday I stopped by Oracle for my first conversation since January 2010, in this case for a chat with Andy Mendelsohn, Mark Townsend, Tim Shetler, and George Lumpkin, covering Exadata and the Oracle DBMS. Key points included:
Edit: ANTs Software seems to have subsequently collapsed, which may be why some of these links broke too.
Jeff Pryslak of Sybase put up a post insulting ANTs Software and the general idea of ANTs-aided Sybase-to-DB2 migration. CEO Joe Kozak of ANTs hit back with a rambling diatribe, which came to my attention because he mentioned my name in it, making some rather fanciful remarks about the “long” relationship I used to have with ANTs Software. (I do recall at least one briefing, plus some attempts from them to buy my services under the condition that I agree to a ridiculous NDA, which I refused to sign.)
This piqued my interest, so — recalling that ANTs is a public company — I decided to take a look at just how successful their software products business is. Well, for the quarter ended March 31, 2010, ANTs’ 10-Q filing says (emphasis mine):
EnterpriseDB has some deplorable business practices (my stories of being screwed by EnterpriseDB have been met by “Well, you’re hardly the only one”). But a couple of more successful DBMS vendors have happily partnered with EnterpriseDB even so, to help pick off Oracle users. IBM’s approach was in the vein of an EnterpriseDB-infused version of SQL handling within DB2.* Netezza just announced an EnterpriseDB-based Netezza Migrator that is rather different.
*The comment threads are the most informative parts of those posts.
I’m a little unclear as to the Netezza Migrator details, not least because Netezza folks don’t seem to care too much about Netezza Migrator themselves. That said, the core ideas of Netezza Migrator are:
|Categories: Data integration and middleware, Data warehousing, Emulation, transparency, portability, EnterpriseDB and Postgres Plus, Netezza, Oracle||19 Comments|
After my recent post, the Clustrix guys raised their hands and briefed me. Takeaways included:
|Categories: Application areas, Clustrix, Emulation, transparency, portability, Games and virtual worlds, MySQL, NoSQL, OLTP, Parallelization, Solid-state memory||8 Comments|
I talked with Robert Nagle of Intersystems last week, and it went better than at least one other Intersystems briefing I’ve had. Intersystems’ main product is Caché, an object-oriented DBMS introduced in 1997 (before that Intersystems was focused on the fourth-generation programming language M, renamed from MUMPS). Unlike most other OODBMS, Caché is used for a lot of stuff one would think an RDBMS would be used for, across all sorts of industries. That said, there’s a distinct health-care focus to Intersystems, in that:
- MUMPS, the original Intersystems technology, was focused on health care.
- The reasons Intersystems went object-oriented have a lot to do with the structure of health-care records.
- Intersystems’ biggest and most visible ISVs are in the health-care area.
- Intersystems is actually beginning to sell an electronic health records system called TrakCare around the world (but not in the US, where it has lots of large competitive VARs).
Note: Intersystems Caché is sold mainly through VARs (Value-Added Resellers), aka ISVs/OEMs. I.e., it’s sold by people who write applications on top of it.
So far as I understand – and this is still pretty vague and apt to be partially erroneous – the Intersystems Caché technical story goes something like this:
|Categories: Data models and architecture, Emulation, transparency, portability, Health care, Intersystems and Cache', Mid-range, Object, OLTP, Sybase, Theory and architecture||8 Comments|
Dataupia marketing VP Samantha Stone — who by the way has been one heck of a trouper through Dataupia’s troubles — is joining the exodus from the company. General graciousness aside, the heart of Samantha’s farewell email reads:
Unfortunately, we have had to reduce our burn rate as we seek an acquirer for our technology.
We have a group of loyal employees remaining on staff focused on current production customers and the acquisition efforts.
As part of the most recent staff reductions I will be leaving Dataupia.
Two years ago I wrote:
[Dataupia would] make a great acquisition for a BI company or DBMS vendor who could then say “Oh, no, this isn’t a DBMS appliance – it’s merely a data warehouse accelerator.” When you look at it that way, their chances of prospering look distinctly higher.
But at this point I think there probably would be more appealing ways for those vendors to meet the same needs.
|Categories: Data warehouse appliances, Data warehousing, Dataupia, Emulation, transparency, portability||14 Comments|
I’ve been beating my head against the wall trying to convince startups of two well-established truisms:
- Experience consistently shows that the demand for transparency/emulation features isn’t as great as entrepreneurs hope.
- If a startup’s competitors sell directly to enterprises, an indirect sales strategy rarely succeeds.
Maybe one startup or another will learn from Dataupia’s example.
Todd Fin pointed me yesterday to an article by Wade Roush that confirmed in detail layoffs and other troubles at Dataupia. The article quotes Dataupia marketing VP Samantha Stone as saying Dataupia is down to 23 employees, and that some of the layoffs were in engineering. This is consistent with what I’d been hearing for a while, namely that other analytic DBMS vendors were seeing a flood of Dataupia resumes, especially technical ones.
The article goes on to discuss difficulties Dataupia has had in raising another round of financing. During Dataupia’s very long CEO search — which I kept hearing about from people who’d been approached for the job — it was obvious money wouldn’t come in until a CEO was found. But it seems that even with a new CEO, existing investors are reluctant to re-up without a new investor as well, and that new investment is slow in happening.
On the plus side, the article quotes Samantha as saying founder Foster Hinshaw is recovering well from his heart surgery.
|Categories: Data warehouse appliances, Data warehousing, Dataupia, Emulation, transparency, portability||3 Comments|
Every vendor needs developer-facing web resources, and Teradata turns out to have been working on a new umbrella site for them. It’s called Teradata Developer Exchange — DevX for short. Teradata DevX seems to be in a low-volume beta now, with a press release/bigger roll-out coming next week or so. Major elements are about what one would expect:
- Surprisingly, so far as I can tell, no forums
If you’re a Teradata user, you absolutely should check out Teradata DevX. If you just research Teradata — my situation — there are some aspects that might be of interest anyway. In particular, I found Teradata’s downloads instructive, most particularly those in the area of extensibility. Mainly, these are UDFs (User-Defined Functions), in areas such as:
- Geospatial data
- Imitating Oracle or DB2 UDFs (as migration aids)
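To give a feel for what these migration-aid UDFs provide, here is a sketch, in Python purely for illustration (the real Teradata UDFs are written in compiled languages such as C), of the semantics of two Oracle built-ins, NVL and DECODE, that such a library needs to reproduce:

```python
# Illustration only: Python renditions of two Oracle built-ins that
# migration-aid UDFs reimplement on Teradata. Real Teradata UDFs are not
# written in Python; this just shows the semantics being imitated.

def nvl(value, default):
    """Oracle NVL: return default when value is NULL (None here)."""
    return default if value is None else value

def decode(expr, *args):
    """Oracle DECODE: compare expr to search values pairwise, returning
    the matching result, else the optional trailing default. Oracle's
    DECODE treats NULL as equal to NULL, unlike ordinary SQL equality."""
    pairs, default = args, None
    if len(args) % 2 == 1:          # odd count: last argument is the default
        pairs, default = args[:-1], args[-1]
    for search, result in zip(pairs[0::2], pairs[1::2]):
        if expr == search or (expr is None and search is None):
            return result
    return default

print(nvl(None, 0))                      # 0
print(decode('B', 'A', 1, 'B', 2, -1))   # 2
```

DECODE is the interesting case: because Oracle treats two NULLs as equal inside DECODE, a faithful migration-aid UDF has to special-case NULL comparison rather than lean on ordinary equality, as the sketch does.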
Also of potential interest is a custom-portlet framework for Teradata’s management tool Viewpoint. A straightforward use would be to plunk some Viewpoint data into a more general system management dashboard. A yet cooler use — and I couldn’t get a clear sense of whether anybody’s ever done this yet — would be to offer end users some insight as to how long their queries are apt to run.
|Categories: Database compression, Emulation, transparency, portability, GIS and geospatial, Teradata||2 Comments|