Kickfire’s basic value proposition is that, if you have a data warehouse in the hundreds of gigabytes, they’ll sell you – for $32,000, per the Kickfire spec sheet – a tiny box that solves all your query performance problems. And Kickfire backs that up with a pretty cool product design. However, thanks in no small part to what was heretofore Kickfire’s penchant for self-defeating secrecy, the Kickfire story is not widely appreciated.
Fortunately, Kickfire is getting over its secrecy kick. And so, here are some Kickfire technical basics.
- Kickfire is MySQL-based, with all the SQL functionality and lack of functionality that entails.
- The Kickfire/MySQL DBMS is columnar, with the usual benefits in compression and I/O reduction.
- Kickfire is based on FPGAs (Field-Programmable Gate Arrays).
- The Kickfire DBMS is ACID-compliant.
- Kickfire runs only as a single-box appliance.
- While Kickfire earlier estimated that, at least for data sets that compressed well, a Kickfire box could hold 3-10 terabytes of user data, more recent figures I’ve heard from Kickfire have been in the 1½-terabyte range. (Edit: Karl Van Der Bergh subsequently wrote in to say that the 1½ TB is a raw disk figure, not user data.)
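The columnar bullet above deserves a word of illustration. This is not Kickfire’s actual storage format – which they haven’t detailed – just the textbook reason columnar stores compress well, sketched with simple run-length encoding:

```python
# Generic sketch of why columnar storage compresses well -- NOT
# Kickfire's actual format, just the standard idea.

def rle_encode(column):
    """Run-length encode a column; low-cardinality columns shrink a lot."""
    runs = []
    for value in column:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs

# A sorted, low-cardinality column (e.g. a "region" dimension):
region = ["EAST"] * 500 + ["WEST"] * 500
encoded = rle_encode(region)
print(len(region), "values stored as", len(encoded), "runs")
```

A query that scans only this column reads two runs instead of 1,000 row-sized records – that’s the compression and I/O-reduction benefit in miniature.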
The new information there is that Kickfire relies on an FPGA; Kickfire had long been artfully vague on the subject of FPGA vs. custom silicon. This had the unfortunate effect that people believed Kickfire relied on a proprietary chip, with all the negative implications for future R&D effectiveness that proprietary silicon is believed to imply. But in fact Kickfire just relies on standard chips, even if — like Netezza and XtremeData — Kickfire does rely on less programmer-friendly FPGAs to do some of what most rival vendors do on Intel-compatible CPUs.
In terms of how it uses the FPGA, Kickfire is more like XtremeData than like Netezza. That is, large fractions of actual SQL processing seem to be done on the FPGA, not just projections and restrictions. Pipelining is a key concept, in that data is shunted among various “processing engines” without, unless absolutely necessary, being sent back into RAM. If I understood Kickfire founder Raj Cherabuddi correctly:
- There are three kinds of on-FPGA Kickfire “processing engines”.
- Each Kickfire processing engine can do any of about half a dozen different basic things.
- When data finishes at one engine it is sent straight to another engine if at all possible.
- One of the Kickfire optimizer’s main responsibilities is to ensure that this will be possible as often as – well, as often as possible.
Raj says that there are two main reasons data can ever be sent back to memory mid-query. First, the optimizer might sadly fail to find a “networking solution” that allows for perfect pipelining. Second, a query might be so complex that several passes through the pipeline are needed to get it done.
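As a loose software analogy – with hypothetical engine names, since Kickfire hasn’t said what its three engine types actually do – the pipelining idea resembles chained generators, where each operator streams rows straight to the next rather than materializing intermediate results:

```python
# Loose software analogy for operator pipelining -- hypothetical engine
# names, NOT Kickfire's actual design. Each "engine" is a generator
# that streams rows onward instead of materializing results.

def scan_engine(table):
    for row in table:
        yield row

def filter_engine(rows, predicate):
    for row in rows:
        if predicate(row):
            yield row

def project_engine(rows, columns):
    for row in rows:
        yield {c: row[c] for c in columns}

table = [{"id": 1, "amt": 50}, {"id": 2, "amt": 500}, {"id": 3, "amt": 700}]

# Rows flow scan -> filter -> project one at a time; no intermediate
# result set is ever built -- the software analogue of data moving
# engine-to-engine without a round trip through RAM.
pipeline = project_engine(
    filter_engine(scan_engine(table), lambda r: r["amt"] > 100),
    ["id"],
)
print(list(pipeline))  # [{'id': 2}, {'id': 3}]
```

The two failure cases Raj describes map onto this analogy too: a plan where the operators can’t be chained end-to-end, or a query so complex that one pass through the chain isn’t enough.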
That’s one of Kickfire’s top two performance strategies. I ran out of time on my last visit before I properly understood the other one, which is something that Kickfire calls “deep indexing,” but which sounds a lot like an inverted list. (Key point: If you have an inverted list already created, joins can be very fast.) When/how exactly that’s used, and what Kickfire does in “hardware” to support it, is a subject for another time.
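Why a pre-built inverted list speeds up joins can be illustrated generically – this is the textbook structure, not necessarily what Kickfire’s “deep indexing” actually is:

```python
# Generic inverted-list join sketch -- the textbook idea, not
# necessarily what Kickfire's "deep indexing" actually does.
from collections import defaultdict

orders = [
    {"order_id": 1, "cust_id": "A"},
    {"order_id": 2, "cust_id": "B"},
    {"order_id": 3, "cust_id": "A"},
]

# Pre-built inverted list: join-key value -> positions of rows holding it.
inverted = defaultdict(list)
for pos, row in enumerate(orders):
    inverted[row["cust_id"]].append(pos)

customers = [{"cust_id": "A", "name": "Acme"}]

# With the list already built, the join is a lookup per outer row
# rather than a scan (or sort, or hash build) of the inner table.
joined = [
    {**cust, "order_id": orders[pos]["order_id"]}
    for cust in customers
    for pos in inverted[cust["cust_id"]]
]
print(joined)
```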
On the negative side: To get good update/trickle-feed performance, columnar vendors have to do something or other clever. That’s still a future for Kickfire, with the specifics of the roadmap being NDA. I imagine Kickfire also has performance weaknesses in areas where it relies on MySQL for things that MySQL doesn’t happen to be very good at.
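For context on what “something or other clever” usually means: a common pattern among columnar vendors is a row-oriented delta store that absorbs trickle-feed writes cheaply and gets merged into the columns later. Kickfire’s actual (NDA’d) roadmap may look nothing like this; the sketch below is just the generic pattern:

```python
# Generic delta-store sketch of how columnar systems often handle
# trickle updates: buffer inserts row-wise, merge into columns later.
# NOT Kickfire's approach, which remains under NDA.

class ColumnarTable:
    def __init__(self, columns):
        # Read-optimized columnar store: column name -> list of values.
        self.columns = {c: [] for c in columns}
        # Write-optimized delta: a plain list of row dicts.
        self.delta = []

    def insert(self, row):
        self.delta.append(row)  # cheap; no column rewrite per insert

    def merge_delta(self):
        # Periodic background step: fold buffered rows into the columns.
        for row in self.delta:
            for c, vals in self.columns.items():
                vals.append(row[c])
        self.delta = []

    def scan(self, column):
        # Queries see the columnar data plus the unmerged delta.
        return self.columns[column] + [r[column] for r in self.delta]

t = ColumnarTable(["id", "amt"])
t.insert({"id": 1, "amt": 10})
print(t.scan("amt"))  # [10] -- visible even before any merge
t.merge_delta()
t.insert({"id": 2, "amt": 20})
print(t.scan("amt"))  # [10, 20]
```

The trade-off is that every query pays a small merge cost at read time, which is why getting trickle-feed performance right in a columnar system takes real engineering.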