Each Neo4j node stores all of the data, and it doesn't scale writes horizontally well.
OrientDB tries to scale writes (I'll be testing this in a few months), but still stores all of the data everywhere.
This looks like it shards the data automagically. If it works well, I might be able to bang on it a bit, but I'm guessing that it gives shit performance for complex graph questions.
Titan exposes graph data over a machine cluster. It is an OLTP system that allows you to do local neighborhood graph traversals in sub-second time. For OLAP processing (e.g. global graph algorithms), Aurelius will be releasing two projects named Faunus and Fulgora in the coming months. These provide Hadoop connectivity and compressed in-memory representations of "graph slices." We will be publishing our talk slides tonight; they discuss this ecosystem of graph technologies. See http://titanbiggraphdata.eventbrite.com/
I really wish that I could attend. We ask a lot of questions like:
Find all nodes in a tree that have a given property
Find all leaves L under those nodes
Find all annotations in a DAG for those leaves
Collapse similar DAG entries by backtracking up the graph based on edge weights
Writes are bulk loaded. Right now we are just trying to push all of the graph work offline, but that approach has limitations, and we could really improve our accuracy if we could run these queries quickly.
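To make the four query steps above concrete, here is a rough in-memory sketch in Python. Everything in it is an assumption for illustration: the toy data, the names (`TREE`, `PROPS`, `ANNOTATIONS`, `DAG_PARENTS`, `run_query`), and especially the collapse rule, which I'm guessing means "walk up to the heaviest parent while the edge weight stays above a threshold." It is not tied to Titan, Neo4j, or OrientDB.

```python
from collections import defaultdict

# Hypothetical toy data; none of these names come from a real schema.
TREE = {  # node -> children
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
    "a1": [], "a2": [], "b1": [],
}
PROPS = {"a": {"flag": True}, "b": {"flag": False}}
ANNOTATIONS = {"a1": ["t3"], "a2": ["t4"], "b1": ["t5"]}  # leaf -> DAG terms
DAG_PARENTS = {  # term -> [(parent term, edge weight)]
    "t3": [("t1", 0.9)],
    "t4": [("t1", 0.8), ("t2", 0.3)],
    "t5": [("t2", 0.9)],
    "t1": [], "t2": [],
}


def nodes_with_property(tree, props, key, value):
    """Step 1: tree nodes whose property `key` equals `value`."""
    return [n for n in tree if props.get(n, {}).get(key) == value]


def leaves_under(tree, node):
    """Step 2: all leaves reachable from `node` (iterative DFS)."""
    stack, leaves = [node], []
    while stack:
        current = stack.pop()
        children = tree.get(current, [])
        if children:
            stack.extend(children)
        else:
            leaves.append(current)
    return leaves


def collapse(term, dag_parents, min_weight):
    """Step 4 (assumed rule): climb to the heaviest parent while its
    edge weight is at least `min_weight`; return where we stop."""
    while True:
        strong = [p for p in dag_parents.get(term, []) if p[1] >= min_weight]
        if not strong:
            return term
        term = max(strong, key=lambda p: p[1])[0]


def run_query(key, value, min_weight=0.5):
    """Run steps 1-4 and group the original terms by collapsed ancestor."""
    groups = defaultdict(set)
    for node in nodes_with_property(TREE, PROPS, key, value):
        for leaf in leaves_under(TREE, node):
            for term in ANNOTATIONS.get(leaf, []):  # step 3
                groups[collapse(term, DAG_PARENTS, min_weight)].add(term)
    return {ancestor: sorted(terms) for ancestor, terms in groups.items()}


print(run_query("flag", True))
```

On a single machine this is trivial; the pain is that each step fans out across the graph, so once the data is sharded every hop can turn into a network round trip.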
Also, Aurelius will be publishing a blog post in ~2 weeks in which we simulate Twitter. We replay Twitter from day 1 to June 2009 and slam it with ~10,000 concurrent users writing and reading follow relationships, tweets, and stream constructions, ultimately growing (what we think will be) a 3 billion edge graph by the time the simulation is complete.