Introduction to Clearpace
Clearpace is a UK-based startup in much the same market SAND Technology has moved into – DBMS archiving, with a strong focus on compression and general cost-effectiveness. Clearpace launched its product NParchive a couple of quarters ago, and says it now has 25 people and roughly $1 million in revenue. NParchive has a number of technical highlights.
Categories: Archiving and information preservation, Rainstor | 1 Comment |
Introduction to SAND Technology
SAND Technology has a confused history. For example:
- SAND has been around in some form or other since 1982, starting out as a Hitachi reseller in Canada.
- In 1992 SAND acquired a columnar DBMS product called Nucleus, which originally was integrated with hardware (in the form of a card). Notwithstanding what development chief Richard Grodin views as various advantages vs. Sybase IQ, SAND has only had limited success in that market.
- Thus, SAND introduced a second, similarly-named product, which could also be viewed as a columnar DBMS. (As best I can tell, both are called SAND/DNA.) But it’s actually focused on archiving, aka the clunkily named “near-line storage.” And it’s evidently not the same code line; e.g., the newer product isn’t bit-mapped, while the older one is.
- The near-line product was originally focused on the SAP market. Now it’s moving beyond.
- Canada-based SAND had offices in Germany and the UK before it did in the US. This leads to an oddity – SAND is less focused on the SAP aftermarket in Germany than it still is in the US.
SAND is publicly traded, so its numbers are on display. It turns out to be doing $7 million in annual revenue, and losing money.
OK. I just wanted to get all that out of the way. My main thoughts about the DBMS archiving market are in a separate post.
Categories: Archiving and information preservation, Columnar database management, Data warehousing, SAND Technology | 6 Comments |
Kognitio and WX-2 update
I went to Bracknell Wednesday to spend time with the Kognitio team. I think I came away with a better understanding of what the technology is all about, and why certain choices have been made.
Like almost every other contender in the market,* Kognitio WX-2 queries disk-based data in the usual way. Even so, WX-2’s design is very RAM-centric. Data gets on and off disk in mind-numbingly simple ways – table scans only, round-robin partitioning only (as opposed to the more common hash), and no compression. However, once the data is in RAM, WX-2 gets to work, happily redistributing as seems optimal, with little concern about which node retrieved the data in the first place. (I must confess that I don’t yet understand why this strategy doesn’t create ridiculous network bottlenecks.) How serious is Kognitio about RAM? Well, they believe they’re in the process of selling a system that will include 40 terabytes of the stuff. Apparently, the total hardware cost will be in the $4 million range.
*Exasol is the big exception. They basically use disk as a source from which to instantiate in-memory databases.
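To make the round-robin vs. hash contrast concrete, here's a toy sketch – mine, not Kognitio's code – of the two row-placement strategies. Round-robin just deals rows out to nodes in turn, so load is trivially balanced but a given key can land anywhere; hash partitioning routes each row by a key, so equal keys are co-located and joins on that key can skip redistribution:

```python
# Toy illustration of two data warehouse partitioning strategies.
# My own sketch, not anybody's actual product code.

import itertools
from collections import defaultdict

NUM_NODES = 4

def round_robin_partition(rows):
    """Deal rows out to nodes in turn, ignoring their contents.

    Perfectly balanced, but rows with the same key may be scattered,
    so joins/aggregations require redistributing data at query time."""
    nodes = defaultdict(list)
    for node_id, row in zip(itertools.cycle(range(NUM_NODES)), rows):
        nodes[node_id].append(row)
    return dict(nodes)

def hash_partition(rows, key_index=0):
    """Route each row by hashing a key column.

    All rows with the same key land on the same node, letting
    co-located joins on that key avoid network traffic."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[hash(row[key_index]) % NUM_NODES].append(row)
    return dict(nodes)

rows = [("alice", 10), ("bob", 20), ("alice", 30), ("carol", 40)]
print(round_robin_partition(rows))  # alice's rows may be split across nodes
print(hash_partition(rows))         # alice's rows are guaranteed co-located
```

Round-robin plus table-scans-only keeps the disk layer dumb; the cleverness (and, presumably, the network traffic) all happens after the data reaches RAM.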
The Kognitio WX-2 story has other technical highlights as well.
Categories: Application areas, Data warehousing, Kognitio, Scientific research | 2 Comments |
Data warehouse load speeds in the spotlight
Syncsort and Vertica combined to devise and run a benchmark in which a data warehouse got loaded at 5 ½ terabytes per hour, which is several times faster than the figures cited in any other vendor’s similar press releases in the past. Takeaways include:
- Syncsort isn’t just a mainframe sort utility company, but also does data integration. Who knew?
- Vertica’s design for overcoming the traditionally slow load speeds of columnar DBMS works.
The latter is unsurprising. Back in February, I wrote at length about how Vertica makes rapid columnar updates. I don’t have a lot of subsequent new detail, but the explanation made sense then, and it still does.
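For those who missed the February post: the standard trick for loading columnar stores quickly is to buffer inserts in a small write-optimized structure and fold them into the compressed, sorted column files in large batches. (Vertica’s terms for the two structures are WOS and ROS.) The sketch below is my own simplification of that general pattern, not Vertica’s actual code:

```python
# Toy sketch of the write-optimized-store / read-optimized-store pattern
# often used to make columnar loads fast. My simplification, not Vertica's.

class ColumnStore:
    def __init__(self, columns, merge_threshold=1000):
        self.columns = columns
        self.wos = []                        # write-optimized: row-format, in memory
        self.ros = {c: [] for c in columns}  # read-optimized: column-format
        self.merge_threshold = merge_threshold

    def insert(self, row):
        """Inserts are cheap appends to the in-memory row buffer."""
        self.wos.append(row)
        if len(self.wos) >= self.merge_threshold:
            self._merge()

    def _merge(self):
        """Convert buffered rows to columnar form in one batch, amortizing
        the cost of sorting (and, in real systems, compressing)."""
        for row in sorted(self.wos):
            for col, value in zip(self.columns, row):
                self.ros[col].append(value)
        self.wos.clear()

    def scan(self, column):
        """Queries read the columnar store plus any not-yet-merged rows."""
        idx = self.columns.index(column)
        return self.ros[column] + [r[idx] for r in self.wos]

store = ColumnStore(["customer", "amount"], merge_threshold=2)
store.insert(("alice", 10))
store.insert(("bob", 20))    # second insert triggers a merge into columnar form
print(store.scan("amount"))  # [10, 20]
```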
The Teradata Accelerate program
An article in Intelligent Enterprise clued me in that Teradata has announced the Teradata Accelerate program. A little poking around revealed a press release in which — lo and behold — I am quoted,* to wit:
“The Teradata Accelerate program is a great idea. There’s no safer choice than Teradata technology plus Teradata consulting, bundled in a fixed-cost offering,” said Curt Monash, president of Monash Research. “The Teradata Purpose Built Platform Family members are optimized for a broad range of business intelligence and analytic uses.”
Categories: Data warehousing, Pricing, Teradata | Leave a Comment |
High-end MySQL use
To a large extent, MySQL lives in two alternate universes from most other DBMS. One is for low-end, simple database applications. For example, of all the DBMS I write about, MySQL is the one I actually use in my own business — because MySQL sits underneath WordPress, and WordPress is what runs my blogs. My largest database (the one for DBMS2) contains 12 megabytes of data in 11 tables, none of which has yet reached 5000 rows in size.
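Incidentally, figures like those are easy to check for yourself; MySQL exposes per-table row counts and data sizes in information_schema. A quick sketch, using the mysql-connector-python package – the connection details and the “wordpress” schema name are placeholders:

```python
# Report per-table row counts and data sizes for a MySQL database.
# Connection details and schema name below are placeholders.

import mysql.connector

conn = mysql.connector.connect(
    host="localhost", user="blog", password="...", database="wordpress"
)
cur = conn.cursor()
cur.execute(
    """
    SELECT table_name,
           table_rows,                          -- approximate for InnoDB
           ROUND(data_length / 1024 / 1024, 1) AS data_mb
    FROM information_schema.tables
    WHERE table_schema = %s
    ORDER BY data_length DESC
    """,
    ("wordpress",),
)
for table_name, table_rows, data_mb in cur:
    print(f"{table_name}: {table_rows} rows, {data_mb} MB")
conn.close()
```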
Categories: Google, MySQL, OLTP, Open source, Parallelization | 1 Comment |
MySQL Query Analyzer
Given how the product’s rollout has been handled, it seems necessary to comment on MySQL’s recently released MySQL Query Analyzer without actually having much information on the subject. Mark Callaghan offers a good take — he’s generally very favorable, but notes that MySQL has some limitations that Query Analyzer has trouble getting around.
Categories: MySQL | 2 Comments |
Silly website tricks
Vertica’s marketing is usually good-to-outstanding, but they made a funny misstep this time. If you go to the Vertica home page, you’ll see seasonal art suggesting that their product is a turkey and/or that it’s terrified it’s about to get the ax.
Live by the pun, die by the pun.
Categories: Humor, Vertica Systems | 6 Comments |
High-performance analytics
For the past few months, I’ve collected a lot of data points to the effect that high-performance analytics – i.e., beyond straightforward query – is becoming increasingly important. And I’ve written about some of them at length. For example:
- MapReduce – controversial or in some cases even disappointing though it may be – has a lot of use cases.
- It’s early days, but Netezza and Teradata (and others) are beefing up their geospatial analytic capabilities.
- Memory-centric analytics is in the spotlight.
Ack. I can’t decide whether “analytics” should be a singular or plural noun. Thoughts?
Another area that’s come up, which I haven’t blogged about as much, is data mining in the database. Data mining accounts for a large part of data warehouse use. The traditional way to do data mining is to extract data from the database and dump it into SAS. But there are problems with this scenario.
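To clarify what “in the database” means here: once a model has been trained, scoring it can often be compiled into plain SQL and run where the data lives, rather than hauling every row out to an external tool. A hedged sketch – the model, table, and column names below are invented for illustration:

```python
# Sketch of in-database model scoring: instead of extracting rows to an
# external mining tool, compile the fitted model into SQL and let the
# (parallel) DBMS evaluate it. Coefficients, table, and columns are invented.

coefficients = {"intercept": -1.5, "recency_days": -0.02, "order_count": 0.3}

def logistic_scoring_sql(coefs, table):
    """Emit SQL computing P(churn) = 1 / (1 + exp(-(w0 + w.x))) per row."""
    linear = " + ".join(
        f"{weight} * {col}" for col, weight in coefs.items() if col != "intercept"
    )
    return (
        "SELECT customer_id, "
        f"1 / (1 + EXP(-({coefs['intercept']} + {linear}))) AS churn_score "
        f"FROM {table}"
    )

print(logistic_scoring_sql(coefficients, "customer_features"))
# The resulting query scans once, in parallel, close to the data,
# avoiding the export/reload cycle the extract-into-SAS approach requires.
```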
Categories: Aster Data, Data warehousing, EAI, EII, ETL, ELT, ETLT, Greenplum, MapReduce, Netezza, Oracle, Parallelization, SAS Institute, Teradata | 6 Comments |
Beyond query
I sometimes describe database management systems as “big SQL interpreters,” because that’s the core of what they do. But it’s not all they do, which is why I describe them as “electronic file clerks” too. File clerks don’t just store and fetch data; they also put a lot of work into neatening, culling, and generally managing the health of their information hoards.
Fifteen years ago, online backup was already as big a competitive differentiator in the database wars as any particular SQL execution feature. Security became important in some market segments. Reliability and availability have been important from the get-go. And manageability has been crucial ever since Microsoft lapped Oracle in that regard, back when SQL Server had little else to recommend it except price.*
*Before Oracle 10g, the SQL Server vs. Oracle manageability gap was big.
Now data warehousing is demanding the same kinds of infrastructure richness.*
Categories: Data warehousing, Microsoft and SQL*Server, Oracle | 1 Comment |