November 1, 2011

MarkLogic 5, and why you might care

MarkLogic is releasing MarkLogic 5. Key elements of the announcement are:

More-of-the-same in line with MarkLogic’s core positioning.
A new bi-directional Hadoop connector.
A free MarkLogic Express edition, limited in license terms more than in actual features, as per Slide 27 of the deck MarkLogic graciously supplied for me to post.

Also, MarkLogic is early with a feature that most serious DBMS vendors will soon have – support for tiered storage, with writes going first to solid-state storage, then being flushed to disk via a caching-style algorithm.* And as befits a sometime search-engine-substitute, MarkLogic has finally licensed a large set of document filters, from an Australian company called Isys. Apparently, the special virtue of the Isys filters is that they’re good at extracting not only text, but metadata as well.

*If there’s a caching algorithm that doesn’t contain a major element of LRU (Least Recently Used), I don’t recall ever hearing about it.
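To make the tiered-storage idea concrete, here is a minimal sketch of writes landing in a small fast tier and being flushed to a slow tier in least-recently-used order. It is a toy under stated assumptions, not MarkLogic's actual implementation: the `TieredStore` class, its method names, and the use of in-memory dictionaries to stand in for solid-state storage and disk are all hypothetical.

```python
from collections import OrderedDict


class TieredStore:
    """Toy two-tier store: a small fast tier in front of a large slow tier.

    Hypothetical illustration only; both tiers are plain in-memory dicts.
    """

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()  # stands in for solid-state storage
        self.slow = {}             # stands in for disk

    def write(self, key, value):
        # Writes always land in the fast tier first.
        self.fast[key] = value
        self.fast.move_to_end(key)  # mark as most recently used
        self._flush_if_full()

    def read(self, key):
        # The fast tier is checked first, so it always wins over an
        # older copy of the same key that was flushed earlier.
        if key in self.fast:
            self.fast.move_to_end(key)
            return self.fast[key]
        return self.slow[key]  # raises KeyError if the key is unknown

    def _flush_if_full(self):
        # Flush least recently used entries to the slow tier until
        # the fast tier is back within its capacity.
        while len(self.fast) > self.fast_capacity:
            old_key, old_value = self.fast.popitem(last=False)
            self.slow[old_key] = old_value
```

With a fast tier of capacity 2, writing `a` and `b`, reading `a`, and then writing `c` flushes `b` to the slow tier, because `b` is the entry that was touched least recently.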

MarkLogic seems to have settled on a positioning that, although distressingly buzzword-heavy, is at least partly based upon reality. The real part includes:

MarkLogic is a serious, enterprise-class DBMS (see for example Slide 12 of the MarkLogic deck) …
… which has been optimized from the get-go for poly-structured data.
MarkLogic can and does scale out to handle large amounts of data.
MarkLogic is a general-purpose DBMS, suitable for both short-request and analytic tasks.
MarkLogic is particularly well suited for analyses with long chains of “progressive enhancement” (MarkLogic’s favorite term when talking about derived data).
MarkLogic often plays the role of a content assembler and/or search engine, and the people who use MarkLogic in those ways are commonly doing things that can be described as research and analysis.

Based on that reality, MarkLogic talks a lot about Volume, Velocity, Variety, Big Data, unstructured data, semi-structured data, and big data analytics.

My November 2010 overview of MarkLogic technology remains pretty relevant. One correction, however: Node heterogeneity configurations, in which “data” and “evaluation” nodes reside on separate servers, are the exception rather than the rule.

Like Vertica, MarkLogic has laudably said that true academic researchers can get MarkLogic for free without the severe license restrictions. Free MarkLogic should be of particular interest to researchers who:

Are studying natural networks or graphs, such as social networks or biological pathways. (This might be a fit in the social or biological sciences.)
Are managing metadata for, say, a variety of disparate kinds of experimental files. (This might be a fit anywhere in the natural sciences.)
Are managing actual documents, images, videos, etc., or data about such things. (This might be a fit in the humanities or social sciences.)

MarkLogic provided some disclosable financial substance by email, which I shall quote verbatim:

MarkLogic has 45% revenue growth and 55-60% license growth year over year.
We expect to finish this year with over $85 million in revenue, up from $55 million last year.

Arithmetical purists might note that 85/55 is more than 145%, but I’m just going to settle for the information I got and move on.
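For the purists, the check can be sketched as below. It takes the quoted figures at face value, treating "over $85 million" as exactly 85 and "last year" as exactly 55, which is an assumption. The helper name `growth_pct` is made up for illustration.

```python
def growth_pct(previous, current):
    """Year-over-year growth, as a percentage of the previous figure."""
    return (current / previous - 1) * 100


last_year = 55.0  # $ millions, as quoted
this_year = 85.0  # $ millions, treating "over $85 million" as exactly 85

# Growth implied by the two revenue figures.
implied_growth = growth_pct(last_year, this_year)

# Revenue that the stated 45% growth rate would have produced.
revenue_at_45_pct = last_year * 1.45

print(f"Implied growth: {implied_growth:.1f}%")
print(f"Revenue at 45% growth: ${revenue_at_45_pct:.2f} million")
```

So the two revenue figures imply growth of roughly 54.5%, whereas 45% growth from $55 million would give about $79.75 million.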

As for that Hadoop connector – stay tuned for a short follow-up post, as writing about it now would not be convenient. (My backup discipline isn’t what it should be, and the only copy of my notes about that product is on a heavy tower computer in a house that doesn’t have working power.) Edit: I have since posted separately about the MarkLogic Hadoop connector.

Categories: Hadoop, Market share and customer counts, MarkLogic, Scientific research, Solid-state memory, Structured documents, Text


Comments

One Response to “MarkLogic 5, and why you might care”

MarkLogic’s Hadoop connector | DBMS 2 : DataBase Management System Services on November 3rd, 2011 8:58 pm

[…] time to circle back to a subject I skipped when I otherwise wrote about MarkLogic 5: MarkLogic’s new Hadoop […]
