Towards a Horizontally Scalable Graph Database


This paper is discusses the problems of sharding graphs with the open source graph database Neo4j. Based on the input from real users of the database, a pattern called ‘cache-sharding’ for efficiently spreading read-load amongst a cluster of database machines is presented along with empirically derived guidance for when cache sharding should be used, and when simpler mechanisms can be equally effective. Based on this experience, we end the paper with an outline discussion on our next steps towards creating a statistically balanced graph database for very large deployments.

Proceedings of the European Data Forum, June 6-7 2012, Copenhagen, Denmark.