My new clients at Aerospike have a range of minor news to announce:
- A company and product name change (they used to be Citrusleaf).
- Some new people and funding.
- Some unspecified future technical plans, in association with the acqui-hire of AlchemyDB guy Russ Sullivan.
- A community edition (Aerospike, née Citrusleaf, is closed-source).
Mainly, however, they want to call your attention to the fact that they’ve been selling a fast, reliable key-value store, with a number of production references, and want to suggest that other organizations should perhaps buy it as well.
- Aerospike has a key-value data model.
- Secondary indexes and so on are still futures.
- Aerospike is clustered, of course.
- Two hardware/storage choices are encouraged:
  - Spinning disk, but you keep all your data in RAM.
  - Solid-state disk.
Aerospike’s three core marketing claims are performance, consistent performance, and uninterrupted operations.
- Aerospike’s performance claims are supported by a variety of blazing internal benchmarks.
- Aerospike’s consistent performance claims are along the lines of sub-millisecond latency, with 99.9% of responses being within 5 milliseconds, and even a node outage only borking performance for some tens of milliseconds.
- Uninterrupted operation is a core Aerospike design goal, and the company says that to date, no Aerospike production cluster has ever gone down.
Aerospike technical details start with the expected:
- Many more logical data partitions than physical ones (default is 4000, but you can double that a few times if you want).
- Synchronous replication within a data center; asynchronous for disaster recovery or other geographical distribution.
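To make the logical-versus-physical partition idea concrete, here is a minimal sketch in Python. It is not Aerospike's actual scheme: the SHA-1 hash, the round-robin placement, and the names `partition_id`, `build_partition_map`, and `nodes_for_key` are all assumptions chosen for illustration. Only the default of 4000 logical partitions comes from the description above.

```python
import hashlib

NUM_PARTITIONS = 4000  # default figure cited above


def partition_id(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Hash a key to a logical partition ID (SHA-1 is an illustrative choice)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def build_partition_map(nodes, num_partitions=NUM_PARTITIONS, replicas=2):
    """Hypothetical round-robin placement: partition -> [master, replica, ...]."""
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        p: [nodes[(p + r) % len(nodes)] for r in range(replicas)]
        for p in range(num_partitions)
    }


def nodes_for_key(key, partition_map):
    """Look up which nodes hold the partition that a key hashes to."""
    return partition_map[partition_id(key, len(partition_map))]


nodes = ["node-a", "node-b", "node-c"]
pmap = build_partition_map(nodes)
owners = nodes_for_key("user:42", pmap)  # [master, replica] for this key
```

Because there are far more logical partitions than nodes, adding or losing a node only means reassigning some partitions, not rehashing every key.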
Further technical details include:
- Aerospike is divided into three layers: Client, distribution, and data. The client layer lives with the application; the distribution and data layers live with each other. The distribution layer does the main mapping, but every node of any kind has a full map of the partitions.
- Aerospike is written in C (hence no garbage collection).
- Aerospike finds data in two steps: Keys are hashed to partition IDs; each partition travels with an index that is used to find data within it. The index is a red-black tree rather than a B-tree.
- Those indexes carry expiry information and so on, so data is invalidated rather than being deleted in place. Actual deletion only occurs via a defrag/compaction operation.
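The invalidate-then-compact behavior in the last two bullets can be sketched roughly as follows. This is an illustration under stated assumptions, not Aerospike's implementation: a Python dict stands in for the red-black tree, a list stands in for on-disk storage, and the `Partition` class with its `put`, `get`, and `compact` methods is hypothetical.

```python
import time


class Partition:
    """One partition: an index with expiry info plus append-only storage."""

    def __init__(self):
        self.index = {}    # key -> (slot, expires_at); dict stands in for a red-black tree
        self.storage = []  # append-only record values; stale slots linger until compact()

    def put(self, key, value, ttl=None, now=None):
        now = time.time() if now is None else now
        self.storage.append(value)
        expires_at = None if ttl is None else now + ttl
        # An overwrite just repoints the index; the old slot becomes garbage.
        self.index[key] = (len(self.storage) - 1, expires_at)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self.index.get(key)
        if entry is None:
            return None
        slot, expires_at = entry
        if expires_at is not None and now >= expires_at:
            return None  # invalidated via the index; the data itself is still stored
        return self.storage[slot]

    def compact(self, now=None):
        """Rewrite storage with live records only; return slots reclaimed."""
        now = time.time() if now is None else now
        live_storage, live_index = [], {}
        for key, (slot, expires_at) in self.index.items():
            if expires_at is not None and now >= expires_at:
                continue
            live_storage.append(self.storage[slot])
            live_index[key] = (len(live_storage) - 1, expires_at)
        reclaimed = len(self.storage) - len(live_storage)
        self.storage, self.index = live_storage, live_index
        return reclaimed
```

In this sketch, reads never touch expired data, yet space is only recovered when `compact` runs, which mirrors the split between invalidation and actual deletion described above.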
For business metrics and so on, the following is edited from an email sent over by Aerospike marketing VP Monica Pal. (The original, naturally, had a lot more marketing-speak.)
- Headcount – 30 and hiring.
- # of production customers – mid double digits, all paying.
- Biggest database – 12TB and doubling.
- Most customers are at 1-4TB of unique data; most replicate at least x2; many also replicate across data centers.
- (After saying about three times that it’s OK for Aerospike clusters to be small, because they can do so much work on each node.) We have hundreds of servers in our own test lab to exercise the clustered architecture.
- Pricing – per terabyte and per datacenter, unlimited nodes per cluster, unlimited number of clusters, pay only for unique data, not replicas. Most start at $50k.