August 4, 2008

QlikTech/QlikView update

I talked with Anthony Deighton of memory-centric BI vendor QlikTech for an hour and a half this afternoon. QlikTech is quite the success story, with disclosed 2007 revenue of $80 million, up 80% year over year, and confidential year-to-date 2008 figures that do not disappoint as a follow-on. And a look at QlikTech’s QlikView product makes it easy to understand how this success might have come about.

Let me start by reviewing QlikTech’s technology, as best I understand it.

Unfortunately, we did not have time for a serious discussion of data loading performance, nor of any resulting compromises in data freshness. However, to the extent QlikView is positioned against MOLAP (as opposed to straightforward relational BI) alternatives — well, MOLAP isn’t exactly great at sub-hour latency either.

You can see how the QlikView UI works in multiple examples at demo.qlikview.com. However, I’d caution against trying to use the AJAX versions in Firefox (things worked much better for me when I switched to the IE rendering engine).

What Anthony told me about customer and business trends is pretty much in line with what QlikTech has said before, and with general industry trends. Highlights included:

Related links

Comments

17 Responses to “QlikTech/QlikView update”

  1. Scott on August 5th, 2008 3:03 pm

    On the whole associative thing:

    Rather than pre-calculating totals as the data is loaded into a disk-based repository to gain end-user query speed (as traditional OLAP-based BI does), QlikView pre-associates elements of information to gain data model flexibility without compromising speed. It does so by linking every piece of loaded data to every other piece of relevant data using a mechanism called Associative Query Logic (AQL). While moving data into its repository, QlikView does not replicate an existing bit of information (such as a specific part number), but rather associates it with existing records using a pointer. This design lets enterprises analyze the business along any possible dimension or combination of dimensions, a powerful capability considering that many companies do not know for sure what tomorrow’s questions will be. (Or else they would build a cube for it, ahead of time.)

    Possibly contrary to expectation, this flexibility doesn’t come with a query speed penalty. By loading only nonduplicate data and compact pointers, the system is able to load the entire repository into memory — rather than onto slower disk — ensuring that response time is lightning fast for end-users. In some cases, data brought in can range to a billion rows or more. (Yes, with a “b”)

    Said another way: why precalculate limited dimensional views in cubes which contain a very small number of available dimensions, using fixed, pre-defined drill-down paths, with (usually) aggregated or summarized data, when you can load transactional level data of essentially unlimited dimensions and perform whatever calculations are needed by the user at that moment they need something analyzed (“click”), at RAM speed?
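The contrast with precalculated cubes can be sketched in a few lines. This is only an illustration of aggregating transaction-level rows on demand along whatever dimensions the user picks; the data, the `aggregate` function, and the field names are hypothetical and say nothing about how QlikView is actually implemented.

```python
from collections import defaultdict

# Hypothetical transaction-level rows held in memory (illustrative data only).
ROWS = [
    {"region": "East", "product": "A", "year": 2007, "amount": 100},
    {"region": "East", "product": "B", "year": 2007, "amount": 50},
    {"region": "West", "product": "A", "year": 2008, "amount": 70},
    {"region": "West", "product": "A", "year": 2007, "amount": 30},
]


def aggregate(rows, dimensions, measure="amount"):
    """Sum `measure` grouped by any combination of dimensions, at query time."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# No cube was built ahead of time; each call scans the detail rows.
by_region = aggregate(ROWS, ["region"])
by_product_year = aggregate(ROWS, ["product", "year"])
```

The trade-off is that every query scans detail rows, which is only attractive when those rows sit in RAM.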

    For more background, see here:

    http://customerexperiencematrix.blogspot.com/2007/08/what-makes-qliktech-so-good.html

  2. Curt Monash on August 5th, 2008 5:48 pm

    Scott,

    That link says, in effect, that this is just a fancy name for SQL joins without the joiner having to explicitly cite the table and column names. If so, then you would seem to be offering something more truly relational than SQL quasi-relational DBMS, or something like that.

    The bit with the pointers driving an essentially columnar, in-memory relational DBMS reminds me of the transrelational idea, which I’ve written about at length here (see the category list to the right, bottom entry). Is that a fair comparison?

    Best,

    CAM

  3. Curt Monash on August 5th, 2008 5:49 pm

    I should also add that this seems to depend on columns happening to have the same names if they mean the same thing. Am I understanding that part correctly?

    Thanks,

    CAM

  4. David Raab on August 5th, 2008 11:03 pm

    Hi Curt,

    As the author of the link in question, I can assure you it was not intended to say that QlikView is “a fancy name for SQL joins without the joiner having to explicitly cite the table and column names”. In fact, I tried to make clear that QlikView maintains relationships among records WITHOUT any kind of join being set up, either by the end user or in advance by a system administrator.

    If establishing relations based on shared keys is a join, then, yes, that’s what QlikView does. So does any other database that links related records. But, to me, the term “SQL join” implies a process executed during a query that compares the key fields on two records and links the records where those fields match. As best I can tell, QlikView doesn’t do this, presumably because those relationships were already stored in pointers when the database is built. Thus, query-time processing is nil. I am quite certain that QlikView is not a conventional SQL database engine—whatever that might be—wrapped in a clever package. It just doesn’t perform that way.

    Nor is QlikView a columnar database. That case is actually a bit clearer, because I can give a very specific definition of a columnar database: one where the data for each column is stored contiguously, so it can all be read in sequence. (Similarly, a row-oriented database has the data for each row stored contiguously.) Since QlikView is an in-memory database, it doesn’t quite make sense to talk about the physical organization of the data. But insofar as you can talk about it, I believe the pointers are what is stored contiguously. This is what allows QlikView to easily traverse the relationships among the data elements within the system.
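The definition of columnar versus row-oriented storage given above can be illustrated with a toy layout. This is a minimal sketch with a hypothetical three-column table; it shows only the general storage idea and makes no claim about QlikView's internals.

```python
# Hypothetical three-column table used only for illustration.
COLUMNS = ("customer", "product", "amount")
RECORDS = [("Acme", "Widget", 10), ("Acme", "Gadget", 20), ("Bolt", "Widget", 5)]


def row_layout(records):
    """Row-oriented: all fields of one record sit next to each other."""
    flat = []
    for record in records:
        flat.extend(record)
    return flat


def column_layout(records):
    """Column-oriented: all values of one column sit next to each other."""
    flat = []
    for i in range(len(COLUMNS)):
        flat.extend(record[i] for record in records)
    return flat


row_store = row_layout(RECORDS)
column_store = column_layout(RECORDS)

# Summing "amount" touches every third cell in the row layout,
# but one contiguous run at the end of the column layout.
amount_from_rows = sum(row_store[2::3])
amount_from_columns = sum(column_store[-len(RECORDS):])
```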

    Although the QlikView people have been maddeningly vague about the technical details of their system, it is not the only pointer-based product ever developed. You may remember Digital Archaeology from some years ago. Among current systems, illuminate (www.i-lluminate.com) also uses pointers. Neither product is technically identical with QlikView, but my point is that there are known alternatives to row- and column-based organization.

    Incidentally, QlikView’s claim of 10x compression needs to be parsed quite carefully. The amount of disk space occupied by data stored in the QlikView format can indeed be 1/10th the size of the input. But the data expands when it’s loaded back into memory, often to nearly the same volume as the original. If you consider compression as evidence that QlikView is “really” a columnar system, then cross that off your list.

    David Raab

  5. Curt Monash on August 6th, 2008 2:21 am

    David,

    Thanks for commenting at such insightful length!

    A star index pre-joins, yet it is used to execute what reasonably can be called “joins”. And query-time processing is, I suspect, not exactly “nil”. So that’s probably not where the crux of the matter lies.

    I agree with you that the QlikView folks are maddeningly vague. Indeed, I’d say that vagueness of this kind is almost always misleading, whether or not deliberately. A pity.

    I don’t understand what you mean by data expanding when it is loaded “back into memory”. Where was it just before?

    iLuminate apparently is “often” sold as a back end to QlikView, which suggests they’re doing different things on some level, as per http://www.dbms2.com/2008/03/26/illuminate-iluminate-correlation-associative/

    I think it makes perfect sense to talk about the physical data model of an in-memory data management system. I do it all the time. :)

    But if you’re right that pointers are stored contiguously, then perhaps we’re talking about an inverted index of some kind. Today, those are used most commonly in text management systems, but right before the relational era they were the best way of handling other kinds of data as well, as in products such as ADABAS, Datacom-DB, and Model 204.

    That said, the differences between inverted-list and columnar architectures often are small. SAP’s in-memory BI Accelerator is a modification of TREX, a classic inverted-index text indexing system. But it’s fairly described as a columnar system.
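A minimal sketch of why inverted lists and column storage end up close together: an inverted index maps each distinct value to the row ids that contain it, and the original column can be rebuilt from those lists. The data and function names here are hypothetical and do not describe TREX, BI Accelerator, or QlikView specifically.

```python
from collections import defaultdict

# Hypothetical column of values, one per row (row id = list position).
PRODUCT_COLUMN = ["Widget", "Gadget", "Widget", "Sprocket", "Gadget", "Widget"]


def build_inverted_index(column):
    """Map each distinct value to the ascending list of row ids containing it."""
    index = defaultdict(list)
    for row_id, value in enumerate(column):
        index[value].append(row_id)
    return dict(index)


def rebuild_column(index, n_rows):
    """Recover the original column from the inverted lists."""
    column = [None] * n_rows
    for value, row_ids in index.items():
        for row_id in row_ids:
            column[row_id] = value
    return column


inverted = build_inverted_index(PRODUCT_COLUMN)
```

Both structures carry the same information per column; they differ mainly in which lookups are cheap.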

    CAM

  6. Scott on August 6th, 2008 10:21 am

    The associative aspect is really more meaningful in describing the end user experience, in that you see visually what is associated and is not associated with any particular selection or drilldown. This is particularly useful in that it can give you answers to questions you didn’t think to ask: “Why hasn’t that customer bought any of this product?” (since it is greyed out). It’s also particularly useful, again, in enabling thought-based analysis, rather than traditional “this is the way IT preconstructed it, so analyze it this way” analysis.
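The greyed-out behavior described above can be sketched as a simple set computation. This is a hypothetical illustration of the user-visible result only (which values are associated with a selection and which are not); the data and the `product_states` function are assumptions, not QlikView code.

```python
# Hypothetical order rows: (customer, product). Illustrative data only.
ORDERS = [
    ("Acme", "Widget"),
    ("Acme", "Gadget"),
    ("Bolt", "Widget"),
    ("Cogs", "Sprocket"),
]


def product_states(orders, selected_customer):
    """Split products into those associated with the selection and those excluded."""
    all_products = {product for _, product in orders}
    associated = {p for c, p in orders if c == selected_customer}
    excluded = all_products - associated  # what a UI would grey out
    return sorted(associated), sorted(excluded)


associated, greyed_out = product_states(ORDERS, "Bolt")
```

Here selecting "Bolt" leaves "Gadget" and "Sprocket" in the excluded set, which is the prompt for the "why hasn't that customer bought this?" question.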

    There are three patents protecting the uniqueness of Qlikview’s data model, and a company has to balance a technologist’s or analyst’s “need to know” with revealing IP to competitors. There’s an understanding this could create some curiosity on the technologist’s or analyst’s part, but 8,000+ customers are thrilled with the value/uniqueness and don’t care so much about the technology’s inner workings.

    Any sharp data person can download Qlikview’s fully functional developer client and give it a try. It’s perhaps the best way to “get it”.

  7. David Raab on August 6th, 2008 10:22 am

    Hi Curt,

    After a good night’s sleep, I was going to add this morning that what I’m calling “pointers” are actually quite similar to a join index, but you have beaten me to the punch with your comment about star index pre-joins. Either way, the critical distinction is between joins that are executed at the time of the query, and joins/pointers that are executed in advance and stored with the data. Clearly QlikView does the latter, and that is what gives it great speed.
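The distinction between joining at query time and resolving the links in advance can be sketched as follows. The tables and function names are hypothetical, and the pointer list is only one simple way a join index might be stored; it is not a description of QlikView's actual structures.

```python
# Hypothetical tables: orders reference customers by key.
CUSTOMERS = [{"id": 1, "name": "Acme"}, {"id": 2, "name": "Bolt"}]
ORDERS = [
    {"customer_id": 2, "amount": 5},
    {"customer_id": 1, "amount": 10},
    {"customer_id": 1, "amount": 20},
]


def join_at_query_time(orders, customers):
    """Compare keys for every query (a nested-loop join)."""
    return [
        (c["name"], o["amount"])
        for o in orders
        for c in customers
        if o["customer_id"] == c["id"]
    ]


def build_pointers(orders, customers):
    """Done once at load time: store, per order, the position of its customer."""
    position_by_id = {c["id"]: pos for pos, c in enumerate(customers)}
    return [position_by_id[o["customer_id"]] for o in orders]


def join_with_pointers(orders, customers, pointers):
    """At query time, follow the stored pointers; no key comparison is needed."""
    return [(customers[p]["name"], o["amount"]) for o, p in zip(orders, pointers)]


POINTERS = build_pointers(ORDERS, CUSTOMERS)
```

Both functions return the same pairs; the second simply moves the key matching from query time to load time.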

    As to the data expansion, it was on disk before it was in memory. It’s the disk file that may be 1/10th the size of the original data. The compression in QlikView comes primarily from tokenization (that is, if the same string occurs many times, QlikView stores the actual value just once). Apparently, QlikView returns the data to its original format when it loads it back into memory.
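Tokenization in the sense described above can be sketched as dictionary encoding: each distinct value is stored once and every row keeps only a small integer token. The function names and sample part numbers are hypothetical, and whether QlikView expands tokens on load exactly like `detokenize` is an assumption based on the description here.

```python
def tokenize(values):
    """Store each distinct value once; represent rows as integer tokens."""
    dictionary = []   # distinct values, in first-seen order
    token_of = {}     # value -> token
    tokens = []
    for value in values:
        if value not in token_of:
            token_of[value] = len(dictionary)
            dictionary.append(value)
        tokens.append(token_of[value])
    return dictionary, tokens


def detokenize(dictionary, tokens):
    """Expand tokens back into the original values."""
    return [dictionary[t] for t in tokens]


# Hypothetical column with heavy repetition, where tokenization pays off.
PART_NUMBERS = ["PN-1001", "PN-1002", "PN-1001", "PN-1001", "PN-1002"]
dictionary, tokens = tokenize(PART_NUMBERS)
```

The saving depends on how often values repeat; a column of unique strings gains nothing from this scheme.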

    illuminate definitely is a different technology from QlikView. See my own post on the topic at http://customerexperiencematrix.blogspot.com/2008/04/illuminate-systems-iluminate-may-be.html. I purposely didn’t mention the illuminate/QlikView connection because it only confuses matters. But, since you brought it up, you should realize that the relationship is technically limited: illuminate exports a data set that is then loaded into a QlikView database, and QlikView reports against it. That is, QlikView does not query the illuminate database directly.

    I empathize with your desire to fit QlikView into familiar categories, but try to resist. It’s not a columnar or inverted index system. The salient performance characteristics of those technologies are that response time increases (a) when you add more columns (because you have to read more data) and (b) when you add more joins (because processing is required). Neither of those parameters increases response time in QlikView so far as I’ve ever noticed. What does increase response time is calculations within reports, but that’s another topic entirely…

    The other point I had intended to add to yesterday’s comment was a clarification about how QlikView joins differ from SQL joins. QlikView reads the data “in place”, rather than creating a result set like SQL. The specific advantage comes with many-to-many joins. Imagine two tables with three rows, each having the same key value. A traditional SQL join would create nine rows in the result set: that is, you get three records from matching the first record in table A with all three records in table B; another three from matching the second record in table A against all of table B, and yet another three from matching the third record in table A against table B. This means that if you, say, added the values from table B records, they would be triple-counted. QlikView would recognize that the records all match, but still reads table B directly. Therefore no redundant records are created, and a sum of fields from table B would be correct. Better still, a report that showed the sum data from table A and the sum of data from table B would give the correct answer regardless of whether you had selected one, two or three rows from table A.

    (Apologies if this is too abstract. One practical example is calculating response rates to a promotion. Table A has each response, and table B has a single record with the audience quantity. If you do a SQL join of table A to table B, all the records in table A match table B, so the result set has one record for each response, and every record has the audience quantity on it. Calculating response rate by dividing the sum of the responses into the sum of the audience quantities would therefore give the wrong result. This doesn’t happen in QlikView.)
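Both the three-by-three case and the response-rate case can be reproduced with a small sketch. The tables, field names, and `sql_style_join` helper are hypothetical; the "in place" calculation simply sums each source table once, which is the behavior attributed to QlikView above rather than a description of its implementation.

```python
# Hypothetical data: three responses and one audience record sharing a key.
RESPONSES = [{"promo": "P1", "responses": 1} for _ in range(3)]
AUDIENCE = [{"promo": "P1", "audience": 1000}]


def sql_style_join(left, right, key):
    """Materialize one output row per matching (left, right) pair."""
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in right
        if left_row[key] == right_row[key]
    ]


joined = sql_style_join(RESPONSES, AUDIENCE, "promo")

# Summing over the joined rows repeats the audience once per response.
naive_rate = sum(r["responses"] for r in joined) / sum(r["audience"] for r in joined)

# Summing each table in place counts every source record exactly once.
in_place_rate = sum(r["responses"] for r in RESPONSES) / sum(
    a["audience"] for a in AUDIENCE
)

# Many-to-many case from the text: three rows per table, all with the same key.
TABLE_A = [{"k": 1, "a": 1} for _ in range(3)]
TABLE_B = [{"k": 1, "b": 10} for _ in range(3)]
fan_out = sql_style_join(TABLE_A, TABLE_B, "k")
sum_b_joined = sum(r["b"] for r in fan_out)    # counted three times over
sum_b_in_place = sum(r["b"] for r in TABLE_B)  # counted once
```

In SQL the usual workaround is to aggregate each table before joining, which is the same idea as summing in place.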

    David

  8. Curt Monash on August 6th, 2008 11:55 am

    Scott,

    Re “The associative aspect is really more meaningful in describing the end user experience, in that you see visually what is associated and is not associated with any particular selection or drilldown.”

    Thank you for admitting that clearly!!! It wastes a fair amount of analysts’ time when your company pretends otherwise.

    Not explaining your technology is a legitimate business decision. (And your financial results suggest that, for now, it’s been a successful choice.) Pretending to be explaining technology when you’re not, however, is needlessly annoying.

    That said, I would caution you that the choice not to explain will probably not always seem to be as happy a one as it is today. Pretending that your technology is more exotic or innovative than it really is can pay dividends for a while. But I think you’ll learn that it has downside too.

    CAM

  9. Extensive QlikView coverage from a big fan and reseller | DBMS2 -- DataBase Management System Services on August 6th, 2008 3:00 pm

    [...] is positive enough to have been recommended by the company itself.  Specifically, it was cited in the comment thread to my recent post on QlikTech, where David himself also addressed some of my [...]

  10. I’m not the only one who thinks vendors underdisclose | Strategic Messaging on August 6th, 2008 3:58 pm

    [...] in what was basically a highly favorable write-up of QlikTech/QlikView, I grew so frustrated as to finally say in the comment thread: Thank you for admitting that [...]

  11. Jay Jakosky on August 6th, 2008 11:58 pm

    I just wanted to add that I’ve been working with QlikView for 7 years and I’ve never heard of iLuminate. QlikView does not need a backend other than your existing transaction or warehouse database and is never sold with a backend. There is no bundling with other products unless an independent partner chooses to do so for a targeted solution.

  12. Jay Jakosky on August 7th, 2008 12:14 am

    I appreciate that you want to know the internal structure. But it’s just not important. QlikView is not competing with the Verticas and Netezzas of the world. QlikView still has an enforced limit of 2 billion unique values in any one column. Clearly it’s not trying to be a terabyte player.

    Though QlikView is being dragged into the Enterprise market because it’s really good at rapid design, test and rollout, QlikTech has stated over and over that they want to move broader, not higher, in the market. So they focus on collaboration features and Microsoft Office integration, for example.

    Performance gains have come from immediate adoption of 64-bit and parallel calculation of report elements on multi-core machines. In the 7 years that I’ve been working with the product and the 4-5 years that I’ve attended the user conferences, no one has complained about the speed of the database. Satisfaction levels among customers are through the roof.

    So, if you are analyzing for the enterprise market (and QlikView has had big success stories there) then QlikView is quickly eliminated from terabyte plays. In the mid-size, small-business or enterprise-division markets, it is still impossible to evaluate QlikView the database without QlikView the frontend. And that’s not to mention QlikView the ETL tool.

  13. Curt Monash on August 7th, 2008 2:50 am

    Jay,

    iLuminate has not, historically, been too active outside of Spain. There’s little reason for you to have heard of them if you don’t happen to do work there.

    CAM

  14. Curt Monash on August 7th, 2008 3:08 am

    Jay,

    QlikView offers some cool UI features rare or missing in other BI products. But it also is missing some that other products have. So speeds-and-feeds matter a lot, not so much directly, but rather due to their effect on deployment and management hassle. (Great speed w/o much administrative effort means, well, it means that you don’t have to put in much administrative effort.)

    So yes, architecture DOES matter.

    By the way — congrats on getting on the QlikView horse impressively early.

    CAM

  15. Infology.Ru » Blog Archive » Latest news about QlikTech/QlikView on October 12th, 2008 8:45 am

    [...] Author: Curt Monash Original publication date: 2008-08-04 Translation: Олег Кузьменко Source: Curt Monash’s blog [...]

  16. Mech88 on October 21st, 2008 2:05 pm

    As a real neophyte, I’d like to ask how QlikTech compares to Panaroma – both up and comers in the BI space. I think both use an OLAP “front end”.

    Second, understanding that neither is a true “enterprise” play like Cognos/IBM, Oracle, SAP/Business Objects, etc., can these SME (small-medium enterprise) focused companies compete successfully against the big boys, or is it simply a matter of time before the big boys start to focus on the small-to-mid-sized customer and either put these guys out of business or acquire them outright?

    Thanks

  17. Hari on November 25th, 2012 8:40 pm

    @Jay

    I hope that in Qlikview 11 SR2 the restriction of 2 billion unique values in a column is removed using DIRECT DATA DISCOVERY. Some of your values can be on an external hard disk and some can be in memory.
