Peter Batty on Netezza Spatial
As previously noted, I’m not up to speed on Netezza Spatial. Phil Francisco of Netezza has promised we’ll fix that ASAP. In the meantime, I found a blog by a guy named Peter Batty, who evidently:
- Knows a lot about geospatial data and its uses
- Is consulting to Netezza
- Is smart
Batty offers a lot of detail in two recent posts, intermixed with some gollygeewhiz about Netezza in general. If you’re interested in this stuff, Batty’s blog is well worth checking out.
| Categories: Analytic technologies, Data warehousing, GIS and geospatial, Netezza, Telecommunications | 2 Comments |
The essence of the Oracle Amazon cloud offering
OK. The press release adds color to what I previously posted about Oracle’s new Amazon cloud offering.
| Categories: Cloud computing, Oracle | Leave a Comment |
Oracle announces an Amazon cloud offering
Per the Amazon Web Services Blog, Oracle announced that Oracle can be run in the Amazon cloud (i.e., on EC2, with EBS for persistent storage). Clustering is probably weak, however; e.g., there’s no RAC support, as per Oracle’s well-written FAQ. Perhaps not coincidentally, the FAQ seems to suggest that the primary use case at this time is for backup, and backup is generally a major point of emphasis on Oracle’s cloud computing page.
Another use case could be development, but that depends in part on pricing. And whether Oracle’s offering seems attractively priced compared with, for example, a similar one from EnterpriseDB and Elastra depends a lot on whether you’ve already negotiated an unlimited-use license for Oracle.
James Kobielus, who presumably was pre-briefed, has more to say.
| Categories: Amazon and its cloud, Cloud computing, Oracle | 1 Comment |
Database compression is heavily affected by the kind of data
I’ve written often of how different kinds or brands of data warehouse DBMS get very different compression figures. But I haven’t focused enough on how much compression figures can vary among different kinds of data. This was really brought home to me when Vertica told me that web analytics/clickstream data can often be compressed 60X in Vertica, while at the other extreme — some kind of floating point data, whose details I forget for now — they could only do 2.5X. Edit: Vertica has now posted much more accurate versions of those numbers. Infobright’s 30X compression reference at TradeDoubler seems to be for a clickstream-type app. Greenplum’s customer getting 7.5X — high for a row-based system — is managing clickstream data and related stuff. Bottom line:
When evaluating compression ratios — especially large ones — it is wise to inquire about the nature of the data.
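To make the point concrete, here is a minimal sketch. It is not how Vertica, Infobright, or Greenplum compress anything. It assumes general-purpose zlib compression from the Python standard library, a made-up clickstream row layout, and random 64-bit floats as a stand-in for hard-to-compress numeric data. All names and values in it are hypothetical.

```python
import random
import struct
import zlib


def compression_ratio(raw: bytes) -> float:
    """Return uncompressed size divided by zlib-compressed size."""
    return len(raw) / len(zlib.compress(raw, 9))


def make_clickstream(n_rows: int, rng: random.Random) -> bytes:
    """Build hypothetical clickstream rows drawn from a few repeated values."""
    pages = ["/home", "/search", "/cart", "/checkout"]
    agents = ["Mozilla/5.0 (Windows)", "Mozilla/5.0 (Macintosh)"]
    rows = [
        f"2008-01-01,{rng.choice(pages)},{rng.choice(agents)},200"
        for _ in range(n_rows)
    ]
    return "\n".join(rows).encode("utf-8")


def make_floats(n_values: int, rng: random.Random) -> bytes:
    """Pack random 64-bit floats, which contain few repeated byte patterns."""
    values = (rng.random() for _ in range(n_values))
    return struct.pack(f"{n_values}d", *values)


# Fixed seed so repeated runs compress exactly the same bytes.
rng = random.Random(0)
clickstream_ratio = compression_ratio(make_clickstream(20_000, rng))
float_ratio = compression_ratio(make_floats(20_000, rng))

print(f"clickstream-like text: {clickstream_ratio:.1f}X")
print(f"random floats:         {float_ratio:.1f}X")
```

The same compressor does far better on the repetitive, low-cardinality rows than on the random floats. The absolute figures say nothing about any vendor’s product, but the gap is the reason to ask what kind of data sits behind a quoted ratio.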
| Categories: Data warehousing, Database compression, Greenplum, Infobright, Vertica Systems, Web analytics | 4 Comments |
Web analytics — clickstream and network event data
It should surprise nobody that web analytics, and specifically clickstream data, is one of the biggest areas for high-end data warehousing. For example:
- I believe that both of the previously mentioned petabyte+ databases on Greenplum will feature clickstream data.
- Aster Data’s largest disclosed database, by almost two orders of magnitude, is at MySpace.
- Clickstream analytics is a big application area for Vertica Systems.
- Clickstream analytics is a big application area for Netezza.
- Infobright’s customer success stories appear to be concentrated in clickstream analytics.
- Coral8 tells me that CEP is also being used for clickstream data, although I suspect that a lot of Coral8’s evidence in that regard comes from a single flagship account. Edit: Actually, Coral8 has a bunch of clickstream customers.
| Categories: Aleri and Coral8, Aster Data, Greenplum, Infobright, Netezza, Streaming and complex event processing (CEP), Vertica Systems, Web analytics | 2 Comments |
More Oracle announcement speculation
Like many other folks, Chris Mellor had a go at speculating about Oracle’s announcements this week. Some of his points are sloppy (e.g., he thinks compression necessarily requires hardware assistance), but he did make one interesting observation: Tea leaves suggest HP has a prominent role in something Oracle is announcing. But then, if you’ve been reading along, you already suspected that.
| Categories: Oracle | Leave a Comment |
How intrinsically numerate are you?
The NY Times reports on research that shows a correlation between mathematical ability and a form of mental reflexes. A Johns Hopkins newsletter article on the same research is here.
Where it gets fun is that the NYT included a link to a version of the test. Blue and yellow dots of diverse sizes flash on the screen, and you have 0.2 seconds to determine which color predominates (in number, not total area). I got 14 correct in 15 trials.
| Categories: Fun stuff | 3 Comments |
When BI, CEP, BAM, and Gartner meet together
Doug Henschen has two good articles based on Gartner’s Event Processing conference, on the theme of BI/event processing integration — an overview, and a detailed interview with Roy Schulte. And as I note elsewhere, Seth Grimes has a good article based on the conference too.
I have my own thoughts on these subjects, but I’m not ready to post them at the moment. In the meantime, I recommend the articles linked above.
| Categories: Analytic technologies, Business intelligence, Streaming and complex event processing (CEP) | Leave a Comment |
Oracle announcements next week, data warehouse appliance, 11g R2 or otherwise
Eric Lai and Chris Kanaracus put up an article on Oracle’s announcements next week. Much of the speculation revolved around generic grid/clustering, with more detail than I posted yesterday. Most interesting to me was the last section of the article, which sounds as if it could be talking about the same thing Luke Lonergan referred to in a comment thread when he said:
Oracle is about to unveil a secret project that uses HP DL185 servers as storage devices with some predicate pushdowns to implement a data warehouse “appliance”.
| Categories: Data warehouse appliances, Data warehousing, Oracle | 1 Comment |
Wikipedia needs some urgent help in the database area
One or more people are going around clobbering Wikipedia’s coverage of analytic DBMS vendors. Netezza’s article has been gutted, and is marked for deletion. Aster Data’s and Dataupia’s articles are marked for deletion, although it seems that at least Aster’s will survive. Greenplum’s article is already gone, as is DATAllegro’s. I can’t immediately tell whether there ever was one on Infobright or ParAccel.
Vertica’s, by way of contrast, is in good shape. (But then, the Vertica guys are a little sharper about internet marketing than most of their peers.) Teradata’s isn’t in danger of deletion, but definitely could use some sprucing up.
| Categories: Data warehousing | 7 Comments |
