May 17, 2010

Technical basics of Sybase IQ

The Sybase IQ folks had been rather slow about briefing me, at least with respect to crunch. They finally fixed that in February. Since then, I’ve been slow about posting based on those briefings. But what with Sybase being acquired by SAP, Sybase having an analyst meeting this week, and other reasons – well, this seems like a good time to post about Sybase IQ. :)

For starters, Sybase IQ is not just a bitmapped system, but it’s also not all that closely akin to C-Store or Vertica. In particular,

Sybase IQ is not immune to Sybase’s confusing choices in version numbering. Thus:

Now let’s get down to some technical particulars.

Sybase IQ columns are always stored in RowID order. However, RowIDs are logical and not physical, and hence take up little disk space. A small amount of per-page metadata lets you find the specific cell you want. (Cells are commonly fixed-width, in which case finding the cell of choice is a simple calculation.) So RowIDs are not much of an I/O overhead issue, although I’m not sure at what point they get unpacked and start needing to be carried around as the data travels through silicon.

Sybase IQ has 9 or so kinds of indexes. The choice of index has a lot to do with cardinality. In the extreme low-cardinality case, a simple bitmap might do. With intermediate cardinality, you might go to a modified kind of bitmap – e.g., if there there are 2^16 possible values, you can represent a value in 16 bits, and bitmap operations are approximately 16 times as costly as if the number of possible values were only 2^1. For very high cardinality, there’s a B-tree-like index called “High Group”.

Note: Surely every Sybase index name, at some time, made sense to at least one engineer.

Sybase IQ’s execution engine does seem to rely quite a bit on bitmaps. E.g., intermediate query results are stored as bitmaps, which helps them play nicely with each other and with many of the indexes. Sybase claims that Sybase IQ’s bitmap orientation often makes WHERE clauses execute very quickly. Sybase IQ reoptimizes queries after WHERE clauses are evaluated. Complex expressions are, when possible, evaluated once per unique value, not once per row.

Speaking of unique values – Sybase IQ’s compression story doesn’t currently match that of some other columnar products, but it seems to stack up pretty well against row-based systems. In particular:

Sybase IQ boasts pipelining, in two senses. First, IQ tries to read pages for multiple queries at the same time. Second, Sybase IQ tries to prefetch pages into cache before they’re needed. Sybase points out that these prefetched pages have the WHERE clauses already executed, and that no extra baggage is being dragged into cache that doesn’t need to be there.

Highlights of Sybase IQ’s update and load story include:

I’ve lost track a little bit as to which “advanced analytics” functionality is in Sybase IQ 15.1, which will be in 15.2, and what’s a future beyond that, which is a great excuse for me to leave it out of what has already become a rather long post. But anyhow, except perhaps for the future stuff and/or some time series functionality, none of it seems terribly advanced. Sybase IQ does have two stored procedure languages, namely the ones for Sybase ASE (T-SQL) and for Sybase Anywhere or Adaptive Server Anywhere or whatever it’s called this week (Watcom SQL, which Sybase asserts is similar to the ANSI SQL stored procedure language).

Similarly, I’ll leave a lot of other stuff out as well, and for now stop here.

Related links

Comments

3 Responses to “Technical basics of Sybase IQ”

  1. More on Sybase IQ, including Version 15.2 | DBMS2 -- DataBase Management System Services on May 23rd, 2010 4:34 am

    [...] 6-11 give more detail on Sybase’s indexing and data access methods than I put into my recent technical basics of Sybase IQ [...]

  2. Dweller on October 27th, 2010 3:31 pm

    You are referring/comparing Sybase IQ with Vertica. Actually it should be the other way around as the original Sybase IQ technology has been around for about 18+ years. I also need to sensitize you on some of the facts. Your statement… ‘Sybase IQ relies on indexes to retrieve data’ is not entirely true as the data is already stored in a highly optimized index which means when you query the data it will be ready the index rather than data as there is not flat data stored somewhere.

    Having said that, there are so many players entering the market the last 3 year and those ‘not so new’ technologies are doing well. I’m a firm believer in competition and that’s what the Sybase’s, Teradata’s, Netezza’s need to keep them ahead. Have you noticed I did not mention Oracle or IBM DB2 UDB or MS SQL Server as they are the worst performers in the analytical db space. I’ve implemented analytical DW’s for the past 11 years and none of the latter could cope with the massive high volumes of data. Even the Open Source guys are making waves and passing the Big Red and Blue when it gets to VVLDB’s (Very Very Large Databases)or MTDB’s (multi terabyte > 20TB)

    Watch the Open Source players they are catching up fast and is way more cost effective than the old boys of databases.

    Thanks for your reviews.

  3. Curt Monash on October 27th, 2010 9:16 pm

    Dweller,

    If you’re saying that Sybase IQ can keep all the database’s information in its indexes … well, you have a point.

    But that’s not always the implementation choice made.

Leave a Reply




Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.