May 13th, 2008 Curt Monash
McObject — vendor of memory-centric DBMS eXtremeDB — is a tiny, tiny company, without a development team of the size one would think needed to turn out one or more highly-reliable DBMS. So I haven’t spent a lot of time thinking about whether it’s a serious alternative to solidDB for embedded DBMS, e.g. in telecom equipment. However:
- IBM’s acquisition of Solid seems to suggest a focus on DB2 caching rather than the embedded market
- McObject actually has built up something of a customer list, as per the boilerplate on any of its press releases.
And they do seem to have some nice features, including Patricia tries (like solidDB), R-trees (for geospatial), and some kind of hybrid disk-centric/memory-centric operation.
Posted in GIS and geospatial, McObject and eXtremeDB, Memory-centric data management, solidDB | 2 Comments »
May 13th, 2008 Curt Monash
I may have gotten confused again as to an embargo date, but if so, then this time I had it late rather than early. Anyhow, the TDWI-timed news is that Vertica is now available in the Amazon cloud. Of course, the new Vertica cloud offering is:
- Super-easy to set up
- Pay-as-you-go.
Slightly less obviously:
- Vertica insists its software was designed for grid computing from the ground up, and hence doesn’t need Elastra’s administrative aids for starting, stopping, and/or provisioning instances.
- This is a natural fit for new or existing Vertica customers in data mart outsourcing.
Other coverage:
Posted in Analytics and analytic technologies, Cloud computing, Data warehousing, Relational database management systems, SaaS, Vertica Systems | No Comments »
May 8th, 2008 Curt Monash
Another TDWI conference approaches. Not coincidentally, I had another Vertica briefing. Primary subjects included some embargoed stuff, plus (at my instigation) outsourced data marts. But I also had the opportunity to follow up on a couple of points from February’s briefing, namely:
Vertica has about 35 paying customers. That doesn’t sound like a lot more than they had a quarter ago, but first quarters can be slow.
Vertica’s list price is $150K/terabyte of user data. That sounds very high versus the competition. On the other hand, if you do the math versus what they told me a few months ago — average initial selling price $250K or less, multi-terabyte sites — it’s obvious that discounting is rampant, so I wouldn’t actually assume that Vertica is a high-priced alternative.
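To make the "do the math" step concrete, here is a rough sketch. The $150K/TB list price and the $250K average initial sale are the figures quoted above. The site sizes of 2, 3, and 5 TB are purely illustrative assumptions for "multi-terabyte", not numbers Vertica supplied.

```python
# Figures quoted above, in USD.
LIST_PRICE_PER_TB = 150_000
AVG_INITIAL_SALE = 250_000  # "or less", so treat this as an upper bound


def implied_discount(user_data_tb: float) -> float:
    """Fraction off list implied by a $250K sale covering user_data_tb of data."""
    list_total = LIST_PRICE_PER_TB * user_data_tb
    return 1 - AVG_INITIAL_SALE / list_total


# Hypothetical site sizes; the source only says "multi-terabyte".
for tb in (2, 3, 5):
    list_total = LIST_PRICE_PER_TB * tb
    print(f"{tb} TB: list ${list_total:,}, implied discount {implied_discount(tb):.0%}")
```

Because $250K is an upper bound on the average sale, the discounts this prints are lower bounds for each assumed size.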
Vertica does stress several reasons for thinking its TCO is competitive. First, with all that compression and performance, they think their hardware costs are very modest. Second, with the self-tuning, they think their DBA costs are modest too. Finally, they charge only for deployed data; the software that stores copies of data for development and test is free.
Posted in Analytics and analytic technologies, Columnar architectures, Data warehousing, Database compression, Vertica Systems | 4 Comments »
May 8th, 2008 Curt Monash
In which we bring you another instantiation of Monash’s First Law of Commercial Semantics: Bad jargon drives out good.
When EnterpriseDB announced a partnership with Truviso for a “blade,” I naturally assumed they were using the term in a more-or-less standard way, and hence believed that it was more than a “Barney” press release.* Silly me. Rather than referring to something closely akin to “datablade,” EnterpriseDB’s “blade” program turns out to be just a catchall set of partnerships.
*A “Barney” announcement is one whose entire content boils down to “I love you; you love me.”
According to EnterpriseDB CTO Bob Zurek, the main features of the “blade” program include:
Read the rest of this entry »
Posted in Data types, EnterpriseDB and Postgres Plus, Open source RDBMS, Portability, transparency, and plug-compatibility, PostgreSQL, Relational database management systems, Specialized data management in general | 3 Comments »
May 8th, 2008 Curt Monash
Call me slow on the uptake if you like, but it’s finally dawned on me that outsourced data marts are a nontrivial segment of the analytics business. For example:
- I was just briefed by Vertica, and got the impression that data mart outsourcers may be Vertica’s #3 vertical market, after financial services and telecom. Certainly it seems like they are Vertica’s #3 market if you bundle together data mart outsourcers and more conventional OEMs.
- When Netezza started out, a bunch of its early customers were credit data-based analytics outsourcers like Acxiom.
- After nagging DATAllegro for a production reference, I finally got a good one: TEOCO. TEOCO specializes in figuring out whether inter-carrier telecom bills are correct. While there’s certainly a transactional invoice-processing aspect to this, the business seems to hinge mainly around doing calculations to figure out correct charges.
- I was talking with Pervasive about Pervasive Datarush, a beta product that lets you do super-fast analytics on data even if you never load it into a DBMS in the first place. I challenged them for use cases. One user turns out to be an insurance claims rule-checking outsourcer.
- One of Infobright’s references is a French CRM analytics outsourcer, 1024 Degres.
- 1010data has built up a client base of 50-60, including a number of financial and retail blue-chippers, with a soup-to-nuts BI/analysis/columnar database stack.
- I haven’t heard much about Verix in a while, but their niche was combining internal sales figures with external point-of-sale/prescription data to assess retail (especially pharma) microtrends.
To a first approximation, here’s what I think is going on. Read the rest of this entry »
Posted in 1010data, Analytics and analytic technologies, Business intelligence, Cloud computing, Data warehousing, Infobright and Brighthouse, Netezza, Pervasive Software, SaaS, Specific users, TEOCO, Vertica Systems | 2 Comments »
April 29th, 2008 Curt Monash
Truviso and EnterpriseDB announced today that there’s a Truviso “blade” for Postgres Plus. By email, EnterpriseDB’s Bob Zurek endorsed my tentative summary of what this means technically, namely:
- There’s data being managed transactionally by EnterpriseDB.
- Truviso’s DML has all along included ways to talk to a persistent Postgres data store.
- If, in addition, one wants to do stream processing things on the same data, that’s now possible, using Truviso’s usual DML.
Read the rest of this entry »
Posted in Analytics and analytic technologies, Business intelligence, Complex event/stream processing (CEP), Data types, EnterpriseDB and Postgres Plus, Games and virtual worlds, Memory-centric data management, Open source RDBMS, PostgreSQL, Specialized data management in general, Truviso | 1 Comment »
April 29th, 2008 Curt Monash
Mark Logic* has an interesting, complex story. They sell a technology stack based on an XML DBMS with text search designed in from the get go. They usually want to be known as a “content” technology provider rather than a DBMS vendor, but not quite always.
*Note: Product name = MarkLogic, company name = Mark Logic.
I’ve agreed to do a white paper and webcast for Mark Logic (sponsored, of course). But before I start serious work on those, I want to blog based on what I know. As always, feedback is warmly encouraged.
Some of the big differences between MarkLogic and other DBMS are:
- MarkLogic’s primary DML/DDL (Data Manipulation/Description Language) is XQuery. Indeed, Mark Logic is in many ways the chief standard-bearer for pure XQuery, as opposed to SQL/XQuery hybrids.
- MarkLogic’s XML processing is much faster than many alternatives. A client told me last year that – in an application that had nothing to do with MarkLogic’s traditional strength of text search – MarkLogic’s performance beat IBM DB2/Viper’s by “an order of magnitude.” And I think they were using the phrase correctly (i.e., 10X or so).
- MarkLogic indexes all kinds of entities and facts, automagically, without any schema-prebuilding. (Nor, I gather, do they depend on individual documents carrying proper DTDs.) So there actually isn’t a lot of DDL. (Mark Logic claims in one test MarkLogic had more or less 0 DDL, vs. 20,000 lines in DB2/Viper.) What MarkLogic indexes includes, as Mark Logic puts it:
- As opposed to most extended-relational DBMS, MarkLogic indexes all kinds of information in a single, tightly integrated index. Mark Logic claims this is part of the reason for MarkLogic’s good performance, and asserts that competitors’ lack of full integration often causes overhead and/or gets in the way of optimal query plans. (For example, Mark Logic claims that Microsoft SQL Server’s optimizer is so FUBARed that it always does the text part of a search first.) Interestingly, InterSystems’ object-oriented Caché does pretty much the same thing.
- MarkLogic is proud of its text search extensions to XQuery. I’ve neglected to ask how that relates to the XQuery standards process. (For example, text search wasn’t integrated into the SQL standard until SQL3.)
Other architectural highlights include:
Read the rest of this entry »
Posted in Data types, IBM and DB2, Mark Logic, Native XML | 1 Comment »
April 25th, 2008 Curt Monash
I made a round of queries about data warehouse software or appliance pricing, and am posting the results as I get them. Earlier installments featured Teradata and Netezza. Now ParAccel is up.
ParAccel’s software license fees are actually very simple — $50K per server or $100K per terabyte, whichever is less. (If you’re wondering how the per-TB fee can ever be the smaller one, please recall that ParAccel offers a memory-centric approach to sub-TB databases.)
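For the curious, the "whichever is less" rule can be sketched in a few lines. The two rates are the ones quoted above. Reading the rule as a comparison of totals (servers times $50K versus terabytes times $100K) is my assumption, and the function name is made up for illustration.

```python
PER_SERVER_FEE = 50_000  # USD per server, as quoted
PER_TB_FEE = 100_000     # USD per terabyte, as quoted


def paraccel_license_fee(servers: int, terabytes: float) -> float:
    """Lesser of the per-server total and the per-terabyte total (assumed reading)."""
    return min(PER_SERVER_FEE * servers, PER_TB_FEE * terabytes)


# Hypothetical configurations, not ParAccel reference systems.
print(paraccel_license_fee(servers=4, terabytes=0.5))  # small memory-centric database
print(paraccel_license_fee(servers=4, terabytes=10))   # larger disk-based database
```

Under that reading, the per-TB fee wins only when the data per server is under half a terabyte, which fits the memory-centric, sub-TB case mentioned above.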
Details about how much data fits on a node are hard to come by, as is clarity about maintenance costs. Even so, pricing turns out to be one of the rare subjects on which ParAccel is more forthcoming than most competitors.
Posted in Analytics and analytic technologies, Data warehousing, ParAccel, Relational database management systems | 3 Comments »
April 21st, 2008 Curt Monash
It took a lot of patient nagging, but DATAllegro finally has a blog. Based on the first post, I predict:
- DATAllegro’s blog will live up to CEO Stuart Frost’s talent for clear, interesting writing.
- Like a number of other vendor blogs — e.g., Netezza’s — DATAllegro’s will have infrequent but usually long posts.
The crunchiest part of the first post is probably
Another very important aspect of performance is ensuring sequential reads under a complex workload. Traditional databases do not do a good job in this area - even though some of the management tools might tell you that they are! What we typically see is that the combination of RAID arrays and intervening storage infrastructure conspires to break even large reads by the database into very small reads against each disk. The end result is that most large DW installations have very large arrays of expensive, high-speed disks behind them - and still suffer from poor performance.
I’ve pounded the table about sequential reads multiple times — including in a (DATAllegro-sponsored) white paper — but the point about misleading management tools is new to me.
Now if I could just get a production DATAllegro reference, I’d be completely happy …
Posted in Analytics and analytic technologies, DATAllegro, Data warehouse appliances, Data warehousing, Relational database management systems | No Comments »
April 21st, 2008 Curt Monash
In connection with the announcement of the Teradata 2500, I asked some Teradata competitors about pricing. Netezza’s response amounted to “We don’t disclose list pricing, but our cheapest system handles about 3 1/4 TB and sells for under $200K.” So Netezza’s actual pricing is well below the list price of the Teradata 2500.
Posted in Data warehouse appliances, Data warehousing, Netezza, Teradata | 6 Comments »
April 21st, 2008 Curt Monash
After months of leaks, Teradata has unveiled its new lines of data warehouse appliances, raising the total number either from 1 to 3 (my view) or 0 to 2 (what you believe if you think Teradata wasn’t previously an appliance vendor). Most significant is the new Teradata 2500 series, meant to compete directly with the smaller data warehouse specialists. Highlights include:
- An oddly precise estimated capacity of “6.12 terabytes”/node (user data). This estimate is based on 30% compression, which is low by industry standards, and surely explains part of the price umbrella the Teradata 2500 is offering other vendors.
- $125K/TB of user data. Obviously, list pricing and actual pricing aren’t the same thing, and many vendors don’t even bother to disclose official price lists. But the Teradata 2500 seems more expensive than most smaller-vendor alternatives.
- Scalability up to 24 nodes (>140 TB).
- Full Teradata application-facing functionality. Some of Teradata’s rivals are still working on getting all of their certifications with tier-1 and tier-2 business intelligence tools. Teradata has a rich application ecosystem.
- What will be controversial performance, until customer-benchmark trends clearly emerge.
The Teradata 2500 is coming out of the chute with two customers – a new-customer retailer buying a single cabinet (i.e., 6.12 TB), and an existing customer for whom fewer details seem available. So far as I can tell, the sales force has had the product since late January, although the first leaks I got incorrectly suggested the system would only scale to a limited number of nodes.
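The arithmetic behind those capacity and price figures is easy to check. The sketch below uses only numbers quoted in these posts: 6.12 TB/node, 24 nodes, and $125K/TB for the Teradata 2500, plus Netezza's "about 3 1/4 TB for under $200K". Keep in mind that the Teradata number is a list price and the Netezza number is a ceiling on an actual price, so the comparison is loose at best.

```python
# Teradata 2500 figures quoted in the post (list prices, user data).
TB_PER_NODE = 6.12
MAX_NODES = 24
LIST_PRICE_PER_TB = 125_000  # USD

max_capacity_tb = round(TB_PER_NODE * MAX_NODES, 2)
list_price_per_node = round(TB_PER_NODE * LIST_PRICE_PER_TB)

# Netezza's entry system, per its own statement: about 3 1/4 TB for under $200K.
NETEZZA_TB = 3.25
NETEZZA_PRICE_CEILING = 200_000  # USD
netezza_ceiling_per_tb = NETEZZA_PRICE_CEILING / NETEZZA_TB

print(f"Teradata 2500 maximum capacity: {max_capacity_tb} TB")
print(f"Teradata 2500 list price per full node: ${list_price_per_node:,}")
print(f"Netezza entry system: under ${netezza_ceiling_per_tb:,.0f}/TB")
```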
Other products in the announcement included:
- The Teradata 5550, a routine annual upgrade to the Teradata 5500.
- The Teradata 550. This is a low-end, single-server SMP box introduced 9 or so months ago, originally meant for application development and testing. But some customers have been using it for deployment, and Teradata is now officially acknowledging that. It only scales to 2-3 TB of user data.
The Teradata 2500’s performance should be below the Teradata 5550’s for three reasons:
The same considerations apply to a comparison between the Teradata 2500 and the older Teradata 5000, but in that case they’re offset by a year of Moore’s Law benefit.
Read the rest of this entry »
Posted in Analytics and analytic technologies, Data warehouse appliances, Data warehousing, Database compression, Relational database management systems, Teradata | 1 Comment »
April 18th, 2008 Curt Monash
I chatted with Raj Cherabuddi and others on the Kickfire (formerly C2) team for over an hour on Monday, and now have a better sense of their story. There are some very basic questions I still don’t have answers to; I’ll fill those in when I can.
Highlights of what I have and haven’t figured out so far include:
- Kickfire’s technology has two main parts: A SQL co-processor chip and a MySQL storage engine.
- Kickfire makes a Type 0 appliance. If I understood correctly, it contains the chip, a couple of standard CPU cores, and 64 gigs of RAM. Or else it contains just the chip, and is meant to be hooked up to a 2U box with 64 gigs of RAM. I’m confused.
- The Kickfire box can handle up to 3 terabytes of user data. The disk required for that is 4-5 terabytes without redundancy, 2X with. Based on that formulation and other clues, I’m guessing Kickfire, unlike other appliance vendors, doesn’t build in storage itself.
- I don’t know whether the Kickfire chip is true custom silicon or an FPGA emulation.
- The essential idea of the chip is dataflow programming for SQL, with pipelining between operations. This eliminates the overhead of registers and context switching. I don’t know what the trade-offs are, if any.
- Kickfire’s database software is columnar, operating on compressed data even in RAM. In that, Kickfire’s story is most similar to Vertica’s, although I’m guessing Exasol may do something similar as well. Like Vertica, Kickfire uses multiple compression methods (they’re reluctant to give detail, but agreed it would be fair to say they use both something like dictionary/token and something like delta compression).
- Kickfire’s software is ACID-compliant. You can do incremental loads or trickle feeds. Bulk load speed is 100 Gb/hour. Kickfire’s solution for the traditional problem of updating column stores is called “snapshots.” Without giving details, they position that as similar to the Vertica solution.
- Like other MySQL storage engines, Kickfire inherits whatever data connectivity, stored procedure capabilities, user-defined functions ability, etc. that MySQL has.
- Kickfire has no paying customers, but does have a slide showing many logos of “prospects and beta customers.”
- Kickfire has no MPP capabilities at this time, but says adding those is “on the roadmap” and will be “easy.”
- Kickfire submitted a 100 GB TPC-H result, in which it beat the previous leaders (Exasol, ParAccel, and Microsoft) on price-performance, and lagged only Exasol and ParAccel on absolute performance. Kickfire is extremely proud of this. Indeed, I don’t recall another vendor ascribing that much weight to a benchmark result in the entire history of TPCs.* Kickfire seems unfazed by the fact that its result is for a system listed with a ship date 6 months in the future (I’m guessing that’s the latest the TPC will allow), while the other results are for systems available today.
*Somebody – perhaps adman extraordinaire Rick Bennett? — may want to check my memory on this, but I think Oracle’s famed “Gentlemen, start your snails” ad in the early 1990s was about PC World tests, not TPCs. Oracle also had an ad about WW1-style planes nosediving, but I don’t think those referenced TPCs either.
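Since Kickfire wouldn’t go beyond “something like dictionary/token and something like delta compression,” here is a minimal sketch of the generic textbook versions of those two techniques. None of this is Kickfire’s actual implementation; the function names and sample data are made up for illustration.

```python
def dictionary_encode(values):
    """Replace each distinct value with a small integer token."""
    dictionary = {}
    tokens = []
    for value in values:
        if value not in dictionary:
            dictionary[value] = len(dictionary)
        tokens.append(dictionary[value])
    return dictionary, tokens


def count_equal(dictionary, tokens, value):
    """Count matching rows by comparing tokens, without decoding the column."""
    token = dictionary.get(value)
    return 0 if token is None else tokens.count(token)


def delta_encode(numbers):
    """Keep the first number, then store only neighbor-to-neighbor differences."""
    if not numbers:
        return []
    return [numbers[0]] + [b - a for a, b in zip(numbers, numbers[1:])]


def delta_decode(deltas):
    """Rebuild the original numbers with a running sum."""
    numbers, total = [], 0
    for delta in deltas:
        total += delta
        numbers.append(total)
    return numbers


states = ["NY", "CA", "NY", "TX", "CA"]           # hypothetical low-cardinality column
dictionary, tokens = dictionary_encode(states)
print(tokens, count_equal(dictionary, tokens, "NY"))

order_ids = [1000, 1003, 1004, 1010]              # hypothetical sorted numeric column
print(delta_encode(order_ids))
```

The `count_equal` helper is the toy version of “operating on compressed data”: the predicate is evaluated against tokens, and the original strings are never rebuilt.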
Posted in Analytics and analytic technologies, Columnar architectures, Data warehouse appliances, Data warehousing, Database compression, Database theory and practice, Kickfire, Open source RDBMS, Relational database management systems | 3 Comments »
April 13th, 2008 Curt Monash
I just put up a long post about a small development-stage company, ScaleDB. The punchline is that ScaleDB has a data access method — an extension of Patricia tries — that gives referential integrity and updatable views for free.
People who think current “relational” DBMS aren’t relational enough often suggest that’s the kind of foundation DBMS should have. And unlike Required Technologies’ TransRelational (TM) shtick, ScaleDB’s really is an OLTP-oriented approach.
Please subscribe to our feed!
Posted in Database theory and practice, MySQL, Relational database management systems, TransRelational | No Comments »
April 13th, 2008 Curt Monash
The MySQL user conference is upon us, and hence so are MySQL-related product announcements, including storage engines. One such is Kickfire. ScaleDB — smaller and earlier-stage — is another.
In a nutshell, ScaleDB’s proposition is:
- Innovative approach to indexing relational DBMS, providing performance advantages.
- Shared-everything scale-up that ScaleDB believes will leapfrog the MySQL engine competition already in Release 1. (In my opinion, this is the least plausible part of the ScaleDB story.)
- State-of-the-art me-too facilities for locking, logging, replication/fail-over, etc., also already in Release 1.
Like many software companies with non-US roots, ScaleDB seems to have started with a single custom project, using a Patricia trie indexing system. Then they decided Patricia tries might be really useful for relational OLTP as well. The ScaleDB team now features four developers, plus half-time or so “Chief Architect” involvement from Vern Watts. Watts seems to pretty much have been Mr. IMS for the past four decades, and thus surely knows a whole lot about pointer-based database management systems; presumably, he’s responsible for the generic DBMS design features that are being added to the innovative indexing scheme. On ScaleDB’s advisory board is PeopleSoft veteran Rick Bergquist, about whom I’ve had fond thoughts ever since he talked me into focusing on consulting as the core of my business.*
*More precisely, Rick pretty much tricked me into doing a day of consulting for $15K, then revealed that’s what he’d done, expressing the thought that he’d very much gotten his money’s worth. But I digress …
ScaleDB has no customers to date, but hopes to be in beta by the end of this year. Angels and a small VC firm have provided bridge loans; otherwise, ScaleDB has no outside investment. ScaleDB’s business model thoughts include:
Read the rest of this entry »
Posted in Mid-range DBMS, MySQL, OLTP database management, Open source RDBMS, Relational database management systems, ScaleDB | No Comments »
April 10th, 2008 Curt Monash
As previously announced, I did a webcast this afternoon, discussing database diversity. The title of the talk was taken directly from a post – What leading DBMS vendors don’t want you to realize — that argued mid-range DBMS are suitable for a broad variety of tasks. The overriding theme was a Clayton Christensen-style “disruption” narrative.
The sponsor was EnterpriseDB, which is fitting. While not the biggest DBMS industry disrupter in terms of revenue or visible impact (MySQL and Netezza say “Hi”), the Postgres family in general and EnterpriseDB in particular epitomize the disruption threat like nobody else, because of how broadly they substitute for market-leading database managers.
As I promised on the call, below is a post with links to further research backing up the points made. They’re numbered to match some of the presentation slides, which you can find at this link.
3. Much of the discussion of database diversity comes from a series of posts I coordinated with Mike Stonebraker.
4. At various times, starting on Slide 4, I made reference to datatype extensibility, a key feature of Oracle and DB2 – and a key advantage of Postgres over MySQL.
10. Capping off the database diversity discussion, Slide 10 mirrors this 11-point version of a data management software taxonomy.
13-14. I’ve posted many times about data warehousing DBMS and related technologies, including this overview of major analytic DBMS products, another recent overview of data warehouse specialty technologies, and an attempt to distinguish between data warehouse appliance myths and realities. Of particular interest for further research may be our sections on data warehouse appliances and columnar DBMS.
15. I do most of my posting about text search over on Text Technologies, specifically in the search category. Vendors I specifically mentioned as blending search with other kinds of data retrieval were Mark Logic and Attivio.
16. There’s a section here on native XML database management.
17. We also have a section on managing RDF and other graphical data models.
18. Ditto complex event/stream processing.
19. The only embeddable DBMS I’ve written much about recently is solidDB. And frankly, even in that case I’ve focused more on mid-tier caching uses, the now-canceled MySQL relationship, or general technology than I did specifically on embedded uses.
22-24. Back in February, 2007 I made what is probably still my clearest post explaining why I think market-leading DBMS vendors are in the process of getting disrupted.
Posted in EnterpriseDB and Postgres Plus, Mid-range DBMS, MySQL, Open source RDBMS, Oracle, PostgreSQL, Relational database management systems | No Comments »
April 8th, 2008 Curt Monash
Kickfire, the renamed C2, is doing one of those buzz-building rollouts in which they make sure the first word comes from people on their payroll golly-gee-whizzing. You can see those at Xarpb and Diamond Notes, as well as a forthcoming article in MySQL magazine. Farhan Mashraqi also appears to be involved. Kickfire is also sponsoring the MySQL user conference next week.
I plan to write more after I get some substance, but a few things seem clear:
1. Kickfire’s product is an appliance that functions as a MySQL storage engine.
2. There’s a custom chip involved.
3. Kickfire plans to throw around the “stream processing” buzzphrase a lot.
Now, “stream processing” means a lot of different things to different people. E.g., Netezza uses the phrase just because their FPGA throws away a lot of data before ever routing it to more conventional SQL processing. But pending a briefing, I’m guessing that Kickfire’s sense is similar to what underlies the case for using CEP in BI.
Edit: Here’s an update after an actual Kickfire briefing.
Posted in Analytics and analytic technologies, Data warehouse appliances, Data warehousing, Kickfire, MySQL, Relational database management systems | 6 Comments »
April 5th, 2008 Curt Monash
There now are four hardware vendors that each offer or seem about to announce two different tiers of data warehouse appliances: Sun, HP, EMC, and Teradata. Specifically:
Read the rest of this entry »
Posted in Analytics and analytic technologies, DATAllegro, Data warehouse appliances, Data warehousing, Dataupia, Greenplum, HP and Neoview, IBM and DB2, Infobright and Brighthouse, Kognitio and WX2, Microsoft and SQL*Server, Netezza, Oracle, ParAccel, Relational database management systems, Sybase, Teradata | 4 Comments »
April 5th, 2008 Curt Monash
A talk about a ParAccel/EMC partnership has been promised for a forthcoming EMC user conference. Otherwise, ParAccel is exposing no useful information on the matter.*
*So what else is new?
The talk is called Highly Scalable Analytic Appliance Powered by EMC and ParAccel, and the abstract says: Read the rest of this entry »
Posted in Analytics and analytic technologies, Data warehouse appliances, Data warehousing, ParAccel, Relational database management systems | No Comments »
April 2nd, 2008 Curt Monash
Once or twice a year, EnterpriseDB sponsors a webcast for me. The last two were super well-attended. And most people stayed to the end, which is generally an encouraging sign!
The emphasis this time is on alternatives to the market-leading DBMS. I’ll highlight the advantages of both data warehousing specialists and general-purpose mid-range DBMS (naturally focusing on the latter, given who the sponsor is). The provocative title is taken from a January, 2008 post — What leading DBMS vendors don’t want you to realize. If you read every word of this blog, there probably won’t be much new for you.
But I’d love to have you listen in and perhaps ask a question anyway!
You can register on EnterpriseDB’s webcast page, which also has an archived webcast I did for them in October, 2007.
Posted in Database diversity, EnterpriseDB and Postgres Plus, Mid-range DBMS | No Comments »
April 1st, 2008 Curt Monash
Short and cute. Even makes a genuine marketing point (low power consumption), and ties into past marketing gimmicks (they’ve played Pimp My SPU in the past, with dramatic paint jobs).
Netezza Corporation (NYSE Arca: NZ), the global leader in data warehouse and analytic appliances, today introduced a limited-edition range of its award-winning Netezza system. Expected to become an instant industry collectible, the systems can now be purchased in a variety of color finishes – pink, blue, red or silver. The standard gun-metal gray unit will continue to be the default option for orders requiring eight or more units, to ensure availability.
Affectionately known as ‘the Netezza’ by customers and partners, the systems not only offer unparalleled processing performance, but the secret sauce of its innovative design is also leading the way in effective power and cooling management – making it a truly green option for any data center.
Not earth-shaking — even if it purports to be earth-saving — but unless I’ve overlooked a biggie, there isn’t much competition this rather lame April Fool’s year.
Posted in Data warehouse appliances, Data warehousing, Netezza | 2 Comments »
March 28th, 2008 Curt Monash
Simon Sabin makes an interesting point: If you can have 30,000 columns in a table without sparsity management blowing up, you can handle entities with lots of different kinds of attributes. (And in SQL Server you can now do just that.) The example he uses is products — different products can have different sets of possible colors, different kinds of sizes, and so on. An example I’ve used in the past is marketing information — different prospects can reveal different kinds of information, which may have been gathered via non-comparable marketing programs.
I’ve suggested this kind of variability as a reason to actually go XML — you’re constantly adding not just new information, but new kinds of information, so your fixed schema is never up to date. But I haven’t detected many actual application designers who agree with me …
Posted in Database theory and practice, MySQL, Native XML | 2 Comments »
March 27th, 2008 Curt Monash
If you want to know more about illuminate’s data warehouse offerings, CTO Joe Foley has a blog. A good starting point might be the post on value-based storage. Two key points seem to be:
The VBS also provides some data access features that can not be duplicated in any other structure. A search can be executed starting with a data value in the pool. By going from the value pool back to the index, it is possible to quickly locate every use of the value wherever is may be used in the logical record structures.
which makes sense, and
This structure also enables our incremental query capability. As the result of a query, the database returns a set of instance identifiers rather than a set of records. This is because there are no records, only pointers and values. With the response being a set of pointers, it is a simple matter to perform the next query step and then get the union or difference between the two sets of pointers for the result of the second query step. This process can be continued indefinitely with the result set shrinking or growing as the new results are merged with the old.
which still sounds like gobbledygook to me. Read the rest of this entry »
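One possible reading of the second quote is plain set arithmetic on instance identifiers. The sketch below is only that reading, not illuminate’s implementation; the identifiers and the three query steps are hypothetical.

```python
# Hypothetical results of three query steps, each a set of instance identifiers.
in_spain = {101, 102, 105, 109}
bought_widgets = {102, 103, 105}
returned_item = {105}

# Steps 1 and 2: the result grows when the new pointers are merged in (union).
result = in_spain | bought_widgets

# Step 3: the result shrinks when pointers are removed (difference).
result = result - returned_item

print(sorted(result))
```

If that is all “incremental query” means, then each step works on pointer sets alone, and values would only need to be fetched for whatever survives the last step.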
Posted in Analytics and analytic technologies, Business intelligence, Data warehousing, illuminate Solutions and iLuminate | No Comments »
March 26th, 2008 Curt Monash
illuminate Solutions (small “i”) is an interesting little company, still rough around the edges. (E.g., the Press Release Archive page at i-lluminate.com says, in its entirety, “We are in the process of loading our historical press releases. Please check back the second week in March!” And I only got that much when I corrected an obvious typo in the URL in the menu bar.) According to CTO Joe Foley, illuminate has 37 or so employees, and 40+ customers, ¾ of whom are in their home country of Spain and ½ the rest of whom are in Latin America. Now they’re entering the US.
illuminate’s basic idea is one I’ve heard before, but mainly from companies with more of a search orientation*, such as Attivio: Take a collection of tables, create a big inverted index on all the values in all columns at once, and do queries on that. This, illuminate claims, obviates all sorts of database design problems and similar hassles you otherwise might have. illuminate’s buzzword for all this is “CDBMS”, where the “C” stands for correlation. The actual CDBMS product is called iLuminate; related business intelligence tools are called iCorrelate and iAnalyze. What iLuminate actually indexes is a token that holds four pieces of information: Instance identifier, table identifier, column identifier, and value.
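A toy version of the “one big inverted index over all values in all columns” idea might look like the following. The four token fields (instance, table, column, value) come from the description above; the table contents, the use of row position as the instance identifier, and the function name are assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical toy tables; names and rows are made up.
tables = {
    "customers": [
        {"name": "Acme", "city": "Madrid"},
        {"name": "Globex", "city": "Lima"},
    ],
    "orders": [
        {"customer": "Acme", "ship_city": "Lima"},
        {"customer": "Acme", "ship_city": "Madrid"},
    ],
}


def build_index(tables):
    """Map every value to the (instance, table, column) spots where it occurs."""
    index = defaultdict(set)
    for table, rows in tables.items():
        for instance, row in enumerate(rows):  # row position stands in for an instance id
            for column, value in row.items():
                index[value].add((instance, table, column))
    return index


index = build_index(tables)

# One lookup finds "Madrid" in every table and column at once.
print(sorted(index["Madrid"]))
```

Note that the lookup needs no knowledge of which table or column to search, which is presumably the point of the “no database design hassles” claim.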
Read the rest of this entry »
Posted in Analytics and analytic technologies, Business intelligence, Data warehousing, illuminate Solutions and iLuminate | No Comments »
March 26th, 2008 Curt Monash
I blogged recently about Cast Iron Systems, a simplicity-oriented data integration appliance vendor that is increasingly focusing on the SaaS market. Well, Pervasive Software is doing something similar.
Via Data Integrator, Pervasive is a leader in the low-cost integration market, with revenue split about 50/25/25 between direct sales, ISVs, and SaaS. Pervasive fondly believes that its products cost half as much as Cast Iron’s, and wind up taking no more installation effort when you factor in Pervasive’s broader capabilities in areas such as workflow. However, there’s some doubt as to whether this is apples-to-apples. Cast Iron does include hardware, after all, and as Pervasive itself points out, Cast Iron will bundle some professional services into a sale if you ask nicely.
Two things are new. Read the rest of this entry »
Posted in Cloud computing, EII, ETL, and/or EAI, Pervasive Software, SaaS | 4 Comments »
March 25th, 2008 Curt Monash
At Elastra’s request, I didn’t write further about them back when I was interested in doing so. But you can go find out about them yourself. Basically, their secret sauce is that they write deployment instructions in a few hundred lines of two proprietary markup languages. They have ambitions beyond DBMS, and beyond the Amazon cloud.
According to their slides, they have 13 paying customers.
Posted in Cloud computing, Elastra | No Comments »