In the Neo4j world, we consider large datasets to be those which are substantially larger than main memory. With such large datasets, a Neo4j instance cannot cache the whole database in RAM, and so cannot provide extremely rapid traversals of the graph, since we'll eventually have to hit the disk. In those situations we've previously recommended scaling vertically by opting for solid-state storage, which provides consistently low seek times for data on disk (avoiding the high seek penalty incurred by spinning disks). While the SSD approach provides a substantial performance boost in most scenarios, even the fastest SSD isn't a replacement for RAM.
I wrote previously that partitioning graphs across physical instances is a notoriously difficult way to scale graph data. Yet we still want the ability to service large workloads and host large data sets in Neo4j. Until the recent 1.2 release I think this was a weak point for the database, but with the advent of Neo4j High Availability (HA) I don’t think it is anymore. In fact Neo4j HA gives us considerably more options for designing solutions for availability, scalability and very large data sets.
One pattern that I've seen using Neo4j HA for large deployments is what we're calling "Cache Sharding", used to maintain high performance with a dataset that far exceeds main memory space. Cache sharding isn't sharding in the traditional sense, since we expect the full dataset to be present on each database instance. To implement cache sharding, we partition the workload undertaken by each database instance, to increase the likelihood of hitting a warm cache for a given request – and warm caches in Neo4j are ridiculously high performance.
The solution architecture for this setup is shown below. We move from the hard problem of graph sharding to the simpler problem of consistent routing, something which high-volume Web farms have been doing for ages.
The strategy we use to implement consistent routing will vary by domain. Sometimes it's good enough to have sticky sessions; other times we'll want to route based on the characteristics of the dataset. A simple strategy is to have the instance that first serves requests for a particular user continue to serve subsequent requests for that user, giving a good chance that each request will be processed by a warm cache. Other domain-specific approaches will also work: for example, in a geographical data system we can route requests about particular locations to specific database instances, which will be warm for those locations. Either way, we're increasing the likelihood of the required nodes and relationships already being cached in RAM, and therefore quick to access and process.
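As a sketch of the user-based variant of this routing, a load balancer in front of the cluster can hash a stable request attribute (here a user ID) to pick an instance. The instance URLs and helper name below are illustrative, not part of Neo4j:

```python
import hashlib

# Hypothetical pool of Neo4j HA instances sitting behind the router.
INSTANCES = [
    "http://neo4j-1:7474",
    "http://neo4j-2:7474",
    "http://neo4j-3:7474",
]

def route_for_user(user_id: str, instances=INSTANCES) -> str:
    """Consistently map a user to one instance, so that repeat
    requests for that user land on the same (warm) cache."""
    digest = hashlib.sha1(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]
```

Because the mapping is a pure function of the user ID, every request for the same user is served by the same instance; a geographical deployment would simply hash a location key instead.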
Of course reading from the database is only half of the story. If we’re going to run a number of servers to exploit their large aggregate caching capability, we need to keep those servers in sync. This is where the Neo4j HA software becomes particularly useful. A Neo4j HA deployment effectively creates a multi-master cluster.
Writes to any instance will result in all other instances eventually receiving that write through the HA protocol. Writing to the elected master in the cluster causes the data to be persisted (strictly ACID, always), and the changes are then propagated to the slaves through the HA protocol for eventual consistency.
If a write operation is processed by a slave, it enrols the elected master in a transaction and both instances persist the results (again strictly ACID). Other slaves then catch up through the HA protocol.
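The write paths described above can be modelled as a toy simulation – this is a conceptual sketch of the flow, not Neo4j's actual HA implementation, and the class and method names are invented for illustration:

```python
class Instance:
    """Toy model of a Neo4j HA instance: just a name and a store."""
    def __init__(self, name):
        self.name = name
        self.store = {}

class Cluster:
    """Toy model of the HA write flow: the master always persists a
    write, a slave handling a write persists it too, and the
    remaining slaves catch up later (eventual consistency)."""
    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves

    def write(self, via, key, value):
        if via is self.master:
            # Write to the master: persist there; slaves catch up later.
            self.master.store[key] = value
        else:
            # Write to a slave: it enrols the master in the transaction,
            # and both instances persist before the write completes.
            self.master.store[key] = value
            via.store[key] = value

    def catch_up(self):
        # The other slaves eventually pull the master's changes.
        for slave in self.slaves:
            slave.store.update(self.master.store)
```

A write routed through a slave is therefore immediately durable on two instances (slave and master), while the rest of the cluster converges when it next catches up.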
By using this pattern, we can treat a Neo4j HA cluster as a performant database for reads and writes, knowing that with a good routing strategy we're going to keep traversals in memory and extremely fast – fast enough for very demanding applications.