Data warehousing

Analysis of issues in data warehousing, with extensive coverage of database management systems and data warehouse appliances that are optimized to query large volumes of data. Related subjects include:

June 11, 2010

Kickfire update

A Kickfire competitor tipped me off that he got 3 Kickfire salesmen’s resumes in 24 hours. I ran this by Kickfire CEO Bruce Armstrong, who confirmed that Kickfire has had a layoff, but gave me no further details.

Bruce also told me that Kickfire is now up to 10 paying customers, and that there are repeat deals.

Categories: Data warehouse appliances, Data warehousing, Kickfire, Market share and customer counts

3 Comments

June 11, 2010

Ingres VectorWise technical highlights

After working through problems with travel, cell phones, and so on, Peter Boncz of VectorWise finally caught up with me for a regrettably brief call. Peter gave me the strong impression that what I’d written in the past about VectorWise had been and remained accurate, so I focused on filling in the gaps. Highlights included: Read more

Categories: Actian and Ingres, Analytic technologies, Benchmarks and POCs, Columnar database management, Data warehousing, Database compression, Open source, VectorWise

2 Comments

June 5, 2010

Algebraix

I talked Friday with Chris Piedemonte and Gary Sherman, respectively the Cofounder/CTO and Chief Mathematician of Algebraix, who hooked up together for this project back in 2003 or 2004. (Algebraix is the company formerly known as XSPRADA.) Algebraix makes an analytic DBMS, somewhat based on the ideas of extended set theory, that runs on SMP (Symmetric MultiProcessing) boxes. Like all analytic DBMS vendors, Algebraix has on some occasions run some queries orders of magnitude faster than they ran on the systems users were looking to replace.

Algebraix’s secret sauce is that the DBMS keeps reorganizing and recopying the data on disk, to optimize performance in response to expected query patterns (automatically inferred from queries it’s seen so far). This sounds a lot like the Infobright story, with some of the more obvious differences being: Read more

Categories: Algebraix, Data warehousing, Database compression, Infobright, Theory and architecture

3 Comments

May 23, 2010

Various quick notes

As you might imagine, there are a lot of blog posts I’d like to write that I never seem to get around to, and things I’d like to comment on that I don’t want to bother writing a full post about. In some cases I just tweet a comment or link and leave it at that.

And it’s not going to get any better. Next week = the oft-postponed elder care trip. Then I’m back for a short week. Then I’m off on my quarterly visit to the SF area. Soon thereafter I’ll have a lot to do in connection with Enzee Universe. And at that point another month will have gone by.

Anyhow: Read more

Categories: Analytic technologies, Business intelligence, Data warehousing, Exadata, GIS and geospatial, Google, IBM and DB2, Netezza, Oracle, Parallelization, SAP AG, SAS Institute

3 Comments

May 23, 2010

More on Sybase IQ, including Version 15.2

Back in March, Sybase was kind enough to give me permission to post a slide deck about Sybase IQ. Well, I’m finally getting around to doing so. Highlights include but are not limited to:

- Slide 2 has some market success figures and so on. (>3100 copies at >1800 users, >200 sales last year)
- Slides 6-11 give more detail on Sybase’s indexing and data access methods than I put into my recent technical basics of Sybase IQ post.
- Slide 16 reminds us that in-database data mining is quite competitive with what SAS has actually delivered with its DBMS partners, even if it doesn’t have the nice architectural approach of Aster or Netezza. (I.e., Sybase IQ’s more-than-SQL advanced analytics story relies on C++ UDFs, or User Defined Functions, running in-process with the DBMS.) In particular, there’s a data mining/predictive analytics library (modeling and scoring both) licensed from a small third party.
- A number of the other later slides also have quite a bit of technical crunch. (More on some of those points below too.)

Sybase IQ may have a bit of a funky architecture (e.g., no MPP), but the age of the product and the substantial revenue it generates have allowed Sybase to put in a bunch of product features that newer vendors haven’t gotten around to yet.

More recently, Sybase volunteered permission for me to preannounce Sybase IQ Version 15.2 by a few days (it’s scheduled to come out this week). Read more

Categories: Analytic technologies, Application areas, Columnar database management, Data mart outsourcing, Data warehousing, Database compression, Investment research and trading, Market share and customer counts, Petabyte-scale data management, Sybase, Telecommunications, Text

1 Comment

May 22, 2010

Notes on SciDB and scientific data management

I firmly believe that, as a community, we should look for ways to support scientific data management and related analytics. That’s why, for example, I went to XLDB3 in Lyon, France at my own expense. Eight months ago, I wrote about issues in scientific data management. Here’s some of what has transpired since then.

The main new activity I know of has been in the open source SciDB project. Read more

Categories: Analytic technologies, Data warehousing, eBay, GIS and geospatial, Microsoft and SQL*Server, SciDB, Scientific research, Web analytics

5 Comments

May 17, 2010

Technical basics of Sybase IQ

The Sybase IQ folks had been rather slow about briefing me, at least with respect to crunch. They finally fixed that in February. Since then, I’ve been slow about posting based on those briefings. But what with Sybase being acquired by SAP, Sybase having an analyst meeting this week, and other reasons – well, this seems like a good time to post about Sybase IQ. 🙂

For starters, Sybase IQ is not just a bitmapped system, but it’s also not all that closely akin to C-Store or Vertica. In particular,

- Sybase IQ stores data in columns – like, for example, Vertica.
- Sybase IQ relies on indexes to retrieve data – unlike, for example, Vertica, in which the column pretty much is the index.
- However, columns themselves can be used as indexes in the usual Vertica-like way.
- Most of Sybase IQ’s indexes are bitmaps, or a lot like bitmaps, à la the original IQ product.
- Some of Sybase IQ’s indexes are not at all like bitmaps, but more like B-trees.
- In general, Sybase recommends that you put multiple indexes on each column because, what the heck, each one of them is pretty small. (In particular, the bitmap-like indexes are highly compressible.) Together, indexes tend to take up <10% of Sybase IQ storage space.
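
For readers who haven’t worked with bitmap indexes, here is a minimal Python sketch of the general idea: one bitmap per distinct value in a column, with predicates combined by bitwise operations. It is an illustration only, not Sybase IQ’s actual implementation; the function names (build_bitmap_index, rows_from_bitmap) and the toy region/status columns are made up for the example, and real products add compression and many other refinements.

```python
from collections import defaultdict


def build_bitmap_index(column):
    """Map each distinct value to an int whose bit i is set when column[i] == value."""
    index = defaultdict(int)
    for row_id, value in enumerate(column):
        index[value] |= 1 << row_id
    return dict(index)


def rows_from_bitmap(bitmap):
    """Return the row ids (in ascending order) whose bits are set in the bitmap."""
    rows = []
    row_id = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row_id)
        bitmap >>= 1
        row_id += 1
    return rows


# Hypothetical column-oriented table: one Python list per column.
region = ["east", "west", "east", "north", "west", "east"]
status = ["open", "open", "closed", "open", "closed", "open"]

# One index per column; each index holds one bitmap per distinct value.
region_index = build_bitmap_index(region)
status_index = build_bitmap_index(status)

# WHERE region = 'east' AND status = 'open' becomes a bitwise AND of two bitmaps.
matches = rows_from_bitmap(region_index["east"] & status_index["open"])
print(matches)
```

With low-cardinality columns each bitmap consists mostly of long runs of identical bits, which is one intuition for why such indexes compress well and why putting several on a column can be cheap.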

Categories: Columnar database management, Data warehousing, Database compression, Sybase, Theory and architecture

3 Comments

May 13, 2010

SAP believes in database proliferation

For as long as we’ve had the concept of database management, there’s been a debate as to whether it is realistic for large enterprises to have a single Grand Unified Enterprise Storehouse Of All Information, or whether database proliferation actually makes sense. This argument has been particularly intense in the area of data warehouse/data marts. I’m generally on the side of data mart proliferation.

4 1/2 years ago, I noted that SAP believed strongly in database proliferation: Read more

Categories: Data warehousing, SAP AG, Theory and architecture

3 Comments

May 12, 2010

Quick reactions to SAP acquiring Sybase

SAP is acquiring Sybase. On the conference call SAP said Sybase would be run as a separate division of SAP (no surprise). Most of the focus was on Sybase’s mobile technology, which is forecast at >$400 million in 2010 revenues (which would be 30%ish of the total). My quick reactions include: Read more

Categories: Analytic technologies, ANTs Software, Business intelligence, Business Objects, Columnar database management, Data warehousing, In-memory DBMS, Memory-centric data management, OLTP, ParAccel, SAP AG, Sybase, Theory and architecture, Vertica Systems

13 Comments

May 8, 2010

8 not very technical problems with analytic technology

In a couple of talks, including last Thursday’s, I’ve rattled off a list of eight serious problems with analytic technology, all of them human or organizational much more than purely technical. At best, these problems stand in the way of analytic success, and at least one is a lot worse than that.

The bulleted list in my notes is:

Individual-human
- Expense of expertise
- Limited numeracy
Organizational
- Limited budgets
- Legacy systems
- General inertia
Political
- Obsolete systems
- Clueless lawmakers
- Obsolete legal framework

I shall explain. Read more

Categories: Analytic technologies, Business intelligence, Data integration and middleware, Data warehousing, EAI, EII, ETL, ELT, ETLT, Surveillance and privacy

Monash Research blogs

DBMS 2 covers database management, analytics, and related technologies.
Text Technologies covers text mining, search, and social software.
Strategic Messaging analyzes marketing and messaging strategy.
The Monash Report examines technology and public policy issues.
Software Memories recounts the history of the software industry.

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.
