Gear6 seems to have failed in the memcached market too
As previously noted, I’ve briefly cut back on blogging (and research) due to some family health issues. The first casualty was a post about memcached. One of the two companies to be featured was Northscale, a new client of mine. The other was Gear6. What they had in common was:
- Both Northscale and Gear6 offered distributions of memcached.
- Both Northscale and Gear6 also wanted to sell persistent versions of memcached — in essence, simple DBMS with the memcached API in place of a substantial DML (Data Manipulation Language).
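To make "a simple DBMS with the memcached API in place of a DML" concrete, here is a minimal sketch. It is purely illustrative and is not how Northscale or Gear6 built anything: the class name `PersistentCache` and the choice of SQLite as the backing store are assumptions made for the example. The point is only that the whole "language" is three verbs on a key.

```python
import sqlite3


class PersistentCache:
    """Toy key-value store (hypothetical): memcached-style verbs over SQLite."""

    def __init__(self, path=":memory:"):
        # Pass a file path instead of ":memory:" to survive restarts.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v BLOB)"
        )

    def set(self, key, value):
        # Unconditional overwrite, in the spirit of memcached's "set".
        with self.db:
            self.db.execute(
                "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
            )

    def get(self, key):
        # Returns None on a miss, as a cache client typically would.
        row = self.db.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def delete(self, key):
        # True if a row was actually removed.
        with self.db:
            cur = self.db.execute("DELETE FROM kv WHERE k = ?", (key,))
        return cur.rowcount > 0
```

There are no joins, predicates, or updates-in-place here, which is exactly what "in place of a substantial DML" implies.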
Categories: Clustering, Couchbase, memcached, NoSQL | 1 Comment |
ITA Software and Needlebase
Rumors are flying that Google may acquire ITA Software. I know nothing of their validity, but I have known about ITA Software for a while. Random notes include:
- ITA Software builds huge OLTP systems that it runs itself on behalf of airlines.
- Very, very unusually, ITA Software builds these huge OLTP systems in LISP.
- ITA Software is an Oracle shop (see Dan Weinreb’s comment).
- ITA Software is run by a techie (again, see Dan Weinreb’s comment).
- ITA Software has an interesting screen-scraping/web ETL project called Needlebase.
ITA’s software does both price/reservation lookup/checking and reservation-making. I’ve had trouble keeping it straight, but I think the lookup is ITA’s actual business, and the reservation-making is ITA’s Next Big Thing. This is one of the ultimate federated-transaction-processing applications, because it involves coordinating huge OLTP systems run, in some cases, by companies that are bitter competitors with each other. Network latencies have to allow for intercontinental travel of the data itself.
Indeed, airline reservation systems are pretty much the OLTP ultimate in themselves. As the story goes, transaction monitors were pretty much invented for airline reservation systems in the 1960s.
A really small project for ITA Software is Needlebase. I stopped by ITA to look at Needlebase in January, and it is a very smart and hence interesting screen-scraping system. The idea is that people publish database information to the web, and you may want to look at their web pages and recover the database records those pages are based on. Applications of this to the airline industry, which has 100s of 1000s of price changes per day (and I may be too low by one or two orders of magnitude when I say that), should be fairly obvious. ITA Software has aspirations of applying Needlebase to other sectors as well, or more precisely having users who do so. Last I looked, ITA hadn’t put significant resources behind stimulating Needlebase adoption, but Google might well change that.
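The basic idea of recovering records from a published page can be sketched in a few lines. This is a deliberately naive illustration using only Python's standard library, not a description of how Needlebase works (which I would expect to be far smarter about messy pages). The names `TableScraper`, `scrape_records`, and `PAGE`, and the routes and fares in the sample page, are all made up for the example.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collects the text of each table cell, row by row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_records(html):
    """Treat the first table row as column names and the rest as records."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Invented sample page, for illustration only.
PAGE = """
<table>
  <tr><th>route</th><th>fare</th></tr>
  <tr><td>BOS-SFO</td><td>329</td></tr>
  <tr><td>BOS-LHR</td><td>611</td></tr>
</table>
"""

records = scrape_records(PAGE)
```

Real pages rarely come as one clean table, which is presumably where the "very smart" part comes in.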
Edit: I just re-found an old characterization of (some of) what ITA Software does by — who else? — Dan Weinreb:
I am working on our new product, an airline reservation system. It’s an online transaction-processing system that must be up 99.99% of the time, maintaining maximum response time (e.g. on www.aircanada.com). It’s a very, very complicated system. The presentation layer is written in Java using conventional techniques. The business rule layer is written in Common Lisp; about 500,000 lines of code (plus another 100,000 or so of open source libraries). The database layer is Oracle RAC. We operate our own data centers, some here in Massachusetts and a disaster-recovery site in Canada (separate power grid).
Related links
- ITA Software and Needlebase websites
- More about LISP 🙂
Categories: Data integration and middleware, EAI, EII, ETL, ELT, ETLT, Google, OLTP, Oracle | 5 Comments |
Big Brother watching our parents?
Life as an elderly person can have Kafkaesque aspects. For example, whether you are allowed to continue to live independently in your own apartment can depend upon whether you are trusted to follow orders for your own good in areas such as:
- Taking medication
- Walking with proper care
- Keeping your feet elevated to let various medical conditions heal
Similarly, it can depend upon whether you are deemed likely, for whatever reason, to fall.
Note: All these examples are taken directly from my family’s very recent experience, although at the immediate time we have bigger problems than that.
This raises the subject of how the elderly can be provided with precious additional months or years of independent living when constantly attentive in-home nursing assistance isn’t affordable. Well, it won’t be long before technology can monitor all of those subjects and more, via a variety of video, audio, tactile, or motion-detecting sensors. In other words, an utter Big Brother set-up is what may allow the elderly some continued freedom.
Putting it that way illustrates that there are huge reasons to invent and commercialize this kind of technology. But clearly, once invented and deployed, that technology would be horrifically easy to abuse. That’s just one more reason we really, really need to get our collective liberty and privacy act together.
Related links
- A 2003 post speculating about multiple uses to which home monitoring technology could be put.
- A couple of academic papers about home/health monitoring kinds of technology
Categories: Surveillance and privacy | 2 Comments |
I’ll be speaking in Washington, DC on May 6
My clients at Aster Data are putting on a sequence of conferences called “Big Data Summit(s)”, and wanted me to keynote one. I agreed to the one in Washington, DC, on May 6, on the condition that I would be allowed to start with the same liberty and privacy themes I started my New England Database Summit keynote with. Since I already knew Aster to be one of the multiple companies in this industry that is responsibly concerned about the liberty and privacy threats we’re all helping cause, I expected them to agree to that condition immediately, and indeed they did.
On a rough-draft basis, my talk concept is:
Implications of New Analytic Technology in four areas:
- Liberty & privacy
- Data acquisition & retention
- Data exploration
- Operationalized analytics
I haven’t done any work yet on the talk besides coming up with that snippet, and probably won’t until the week before I give it. Suggestions are welcome.
If anybody actually has a link to a clear discussion of legislative and regulatory data retention requirements, that would be cool. I know they’ve exploded, but I don’t have the details.
Categories: Analytic technologies, Archiving and information preservation, Aster Data, Data warehousing, Presentations, Surveillance and privacy | 1 Comment |
Greenplum et alia’s BigDataNews.com site
Greenplum recently started a website BigDataNews.com, and quickly signed up Aster Data as a co-sponsor. (Edit: As per a comment below, the decision to sign up additional sponsors was made by the site’s independent publisher.) It’s actually being run by Brett Sheppard, a former Gartner/DataQuest analyst who now gets involved in this kind of thing. (Brett and I may be working on another project soon, with Greenplum funding.)
The heart of the site is feeds* from a variety of high-profile blogs (DBMS2, Daniel Abadi’s, Joe Hellerstein’s, James Kobielus’, et al.), plus some additional posts written by Brett (primarily) or Greenplum folks. Highlights of Brett’s posts include:
- What I am told was an unauthorized revelation that Greenplum Chorus is built on CouchDB and Erlang.
- An impassioned defense of the integrity of Gartner’s analysis.
*At least in my case, that’s just a post title or snippet, plus a link back to the main post. The same goes for mapreduce.org, actually.
Categories: Analytic technologies, Data warehousing, Greenplum, NoSQL | 2 Comments |
Aster Data’s mapreduce.org site
Aster Data has started a site mapreduce.org, which purports to compile “the best information about MapReduce.” At the moment, mapreduce.org highlights include:
- A feed of MapReduce-related posts from several blogs, including this one.
- A calendar of MapReduce-related events, not necessarily Aster-specific, integrated with a feed combining …
- … Aster MapReduce-related press releases and also …
- … not necessarily Aster-specific MapReduce-related press articles.
- Links to a lot of Aster Data MapReduce-related collateral. Some of that stuff is quite good.*
- A sycophantic introduction from Colin White praising the value of the mapreduce.org “independent forum.”
*I did a couple of MapReduce-related webinars for Aster late last year. 🙂 But seriously — Aster does a good job of writing clear and informative collateral.
Categories: Analytic technologies, Aster Data, MapReduce | 3 Comments |
Introduction to Datameer
Elder care issues have flared up with a vengeance, so I’m not going to be blogging much for a while, and surely not at any length. That said, my first post about Datameer was never going to be very long, so let’s get right to it:
- Datameer offers a business intelligence and analytics stack that runs on any distribution of Hadoop.
- Datameer is still building a lot of features that it talks about, for target release in (I think) the fall.
- Datameer’s pride and joy is its user interface. Very laudably for a software start-up, Datameer claims to have spent considerable time with professional user interface designers.
- Datameer’s core user interface metaphor is formula definition via a spreadsheet.
- Datameer includes 124 functions one can use in these formulae, ranging from math stuff to text tokenization.
- Datameer does some straight BI, with 4 kinds of “visualization” headed for 20 kinds later. But if you want to do hard-core BI, use Datameer to dump data into an RDBMS and then use the BI tool of your choice. (Datameer’s messaging does tend to obscure or even contradict that point.)
- Rather, Datameer seems to be designed for the classic MapReduce use cases of ETL and heavy data crunching.
- Datameer’s messaging includes a bit about “Datameer is real-time, even though Hadoop is generally thought of as batch.” So far as I can tell, what that boils down to is …
- … Datameer will let you examine sample and/or partial query results before a full Hadoop run is over. Apparently, there are three different ways Datameer lets you do this:
  - You can truly query against a sample of the data set.
  - You can query against intermediate results, when only some stages of the Hadoop process have already been run.
  - You can drill down into a “distributed index,” whatever the heck that means when Datameer says it.
- Datameer will let you import data from 15 or so different kinds of sources, SQL, NoSQL, and file system alike.
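Of the three preview options listed above, querying a sample is the easiest to picture. The sketch below is a generic illustration of that technique and says nothing about how Datameer actually implements it: run the aggregation on a uniform random sample and scale the result up to the full row count. The function names and the `(key, amount)` row shape are assumptions for the example.

```python
import random


def sample_rows(rows, fraction, seed=0):
    """Return a uniform random sample holding roughly `fraction` of the rows."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so a preview is repeatable
    k = min(len(rows), max(1, round(len(rows) * fraction)))
    return rng.sample(rows, k)


def total_by_key(rows):
    """The 'full' query: sum the amount for each key."""
    totals = {}
    for key, amount in rows:
        totals[key] = totals.get(key, 0) + amount
    return totals


def preview_totals(rows, fraction, seed=0):
    """Estimate per-key totals from a sample, scaled to the full data size."""
    if not rows:
        return {}
    sample = sample_rows(rows, fraction, seed)
    scale = len(rows) / len(sample)
    return {k: v * scale for k, v in total_by_key(sample).items()}
```

The estimate is only as good as the sample, so a preview like this is useful for sanity-checking a formula before the full run finishes, not for final numbers.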
Categories: Analytic technologies, Business intelligence, Datameer, EAI, EII, ETL, ELT, ETLT, Hadoop, MapReduce | 3 Comments |
Story of an analytic DBMS evaluation
One of our readers was kind enough to walk me through his analytic DBMS evaluation process. The story is:
- The X Company (XCo) has a <1 TB database.
- 100s of XCo’s customers log in at once to run reports. 50-200 concurrent queries is a good target number.
- XCo had been “suffering” with Oracle and wanted to upgrade.
- XCo didn’t have a lot of money to spend. Netezza pulled out of the sales cycle early due to budget (and this was recently enough that Netezza Skimmer could have been bid).
- Greenplum didn’t offer any references that approached the desired number of concurrent users.
- Ultimately the evaluation came down to Vertica and ParAccel.
- Vertica won.
Notes on the Vertica vs. ParAccel selection include: Read more
Categories: Analytic technologies, Benchmarks and POCs, Buying processes, Data warehousing, Greenplum, Netezza, Oracle, ParAccel, Vertica Systems | 7 Comments |
Greenplum Chorus and Greenplum 4.0
Greenplum is making two product announcements this morning. Greenplum 4.0 is a revision of the core Greenplum database technology. In addition, Greenplum is announcing Greenplum Chorus, which is the first product release instantiating last year’s EDC (Enterprise Data Cloud) vision statement and marketing campaign.
Greenplum 4.0 highlights and related observations include: Read more
Is the enterprise data warehouse a myth?
An enterprise data warehouse should:
- Manage data to high standards of accuracy, consistency, cleanliness, clarity, and security.
- Manage all the data in your organization.
Pick ONE. Read more