Analysis of SnapLogic and its data integration products, such as SnapCenter, SnapReduce, and the snaps in the SnapStore.
I talked with the SnapLogic team last week, in connection with their Hadoop-oriented SnapReduce offering. This gave me an opportunity to catch up on what SnapLogic is up to overall. SnapLogic is a data integration/ETL (Extract/Transform/Load) company with a good pedigree: Informatica founder Gaurav Dhillon invested in and now runs SnapLogic, and VC Ben Horowitz is involved. SnapLogic company basics include:
- SnapLogic has raised about $18 million from Gaurav Dhillon and Andreessen Horowitz.
- SnapLogic has almost 60 people.
- SnapLogic has around 150 customers.
- Based in San Mateo, SnapLogic has an office in the UK and is growing its European business.
- SnapLogic has both SaaS (Software as a Service) and on-premise availability, but either way you pay on a subscription basis.
- Typical SnapLogic deal size is under $20K/year. Accordingly, SnapLogic sells over the telephone.
- SnapReduce is in beta with about a dozen customers, and slated for release by year-end.
SnapLogic’s core/hub product is called SnapCenter. In addition, for any particular kind of data one might want to connect, there are “snaps” which connect to — i.e. snap into — SnapCenter.
Categories: Cloud computing, Data integration and middleware, EAI, EII, ETL, ELT, ETLT, SnapLogic, Software as a Service (SaaS)
There have been many recent announcements about how data integration/ETL (Extract/Transform/Load) vendors are going to work with MapReduce. Most of what they say boils down to one or more of a few things:
- Hadoop generally stores data in HDFS (Hadoop Distributed File System). ETL vendors want to be able to extract data from or load it into HDFS.
- ETL vendors have development environments that let you specify/script/whatever ETL jobs. ETL vendors want their development tools to develop ETL processes executed via MapReduce/Hadoop.
- In particular, this allows ETL vendors to exploit the parallel-processing capabilities of MapReduce.
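To make the second and third points concrete, here is a minimal sketch of what "an ETL process executed via MapReduce" can look like. It is not any vendor's generated code and does not use Hadoop itself; it is plain Python that mimics the map, shuffle, and reduce phases in a single process. The record layout, the sample data, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are all assumptions made up for illustration.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical input records, standing in for lines read out of HDFS.
RAW_LINES = [
    "2011-06-01,acme,120.50",
    "2011-06-01,globex,80.00",
    "2011-06-02,acme,30.25",
    "bad line",
]


def map_phase(line):
    """Extract/transform step: parse one line, emit (customer, amount) pairs."""
    parts = line.split(",")
    if len(parts) != 3:
        return []  # drop malformed records
    _date, customer, amount = parts
    try:
        return [(customer, float(amount))]
    except ValueError:
        return []  # drop records whose amount is not numeric


def shuffle(pairs):
    """Group values by key, as the framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Aggregate step: total amount per customer, ready to load elsewhere."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence on one machine."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(run_job(RAW_LINES))
```

The parallelism comes from the fact that each call to `map_phase` depends only on its own line, and each call to `reduce_phase` depends only on its own key, so a framework is free to spread those calls across many nodes.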
Some additional twists include:
- Pentaho announced business intelligence and ETL for Hadoop last year.
- Syncsort thinks different sort algorithms should be usable with Hadoop. Consequently, it plans to contribute technology to the community to make sort pluggable into Hadoop. (However, Syncsort is keeping its own sort technology proprietary.)
- Syncsort is considering replicating some Hive functionality, starting with joins, hopefully running much faster. (However, Syncsort’s basic Hadoop support is a quarter or three away, so any more advanced functionality would probably come out in 2012 or beyond.)
- SnapLogic fondly thinks that its generation of MapReduce jobs is particularly intelligent.
Finally, my former clients at Pervasive, who haven’t briefed me for a while, seem to have told Doug Henschen that they have pointed DataRush at MapReduce.* However, I couldn’t find evidence of same on the Pervasive DataRush website beyond some help in using all the cores on any one Hadoop node.
*Also see that article because it names a bunch of ETL vendors doing Hadoop-related things.
Categories: Data integration and middleware, EAI, EII, ETL, ELT, ETLT, Hadoop, MapReduce, Parallelization, Pentaho, Pervasive Software, SnapLogic, Syncsort
It occurs to me that there are three reasons why Federal Express, aka FedEx, is a great metaphor for data integration.