In Part 1, we walked through why the graph database ecosystem is so expensive. The pricing is hostile, the deployment is brutal, and the managed alternatives don't solve the underlying problem.
The underlying problem is the server. Every graph database assumes you need a dedicated server process with the graph in memory. That assumption made sense in 2010. It doesn't in 2026.
Your graph database should be a file.
# The server is the problem
Neo4j runs as a JVM process. It loads the graph into memory (or a memory-mapped page cache). It listens on a port. It serves queries. When nobody is querying it, it's still running. Still consuming RAM. Still costing money.
This is the same architecture as Postgres, MySQL, Redis — a server process managing state. It works well when you have one instance per application. It breaks when you need one instance per workload.
The problem compounds:
- 1 graph: $65/mo. Fine.
- 10 graphs: $650/mo. Painful.
- 100 graphs: $6,500/mo. Prohibitive.
- 1,000 graphs: Not happening.
And you can't share a single instance because Neo4j doesn't have real multi-tenant isolation. Labels and properties aren't security boundaries.
# SQLite proved this already
SQLite is the most deployed database in the world. Not Postgres. Not MySQL. SQLite. A file.
SQLite doesn't run a server process. There's no daemon to manage, no port to configure, no memory to tune. Your application opens a file, reads and writes SQL, and closes the file. The “database server” is a library linked into your process.
This gives SQLite properties that server databases don't have: zero cost when idle, instant creation, trivial isolation (each database is a separate file), trivial cleanup (delete the file), and the ability to run millions per machine.
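The whole lifecycle described above fits in a few lines with Python's built-in `sqlite3` module. A minimal sketch (the table schema and file name are invented for illustration):

```python
# The SQLite lifecycle: a database is created by opening a path,
# isolated by being its own file, and cleaned up by deleting that file.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "tenant_42.db")  # instant creation
conn = sqlite3.connect(path)                             # no server, no port
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
conn.commit()
rows = conn.execute("SELECT name FROM users").fetchall()
conn.close()                                             # zero cost when idle
os.remove(path)                                          # trivial cleanup
print(rows)  # [('alice',)]
```

There is nothing left running after the last line: no process to stop, no instance to bill.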
Turso and Neon proved this scales to the cloud. Turso runs SQLite on the edge with libSQL. Neon separates Postgres compute from storage. Both showed that file-backed databases can serve real production workloads.
The same insight applies to graphs. A graph database doesn't need to be a server. It can be a file.
# Embedded engines are already faster
The performance argument isn't theoretical. Embedded graph engines already exist — and they're dramatically faster than Neo4j.
Kuzu (now archived after Apple acquired the team in October 2025) proved this decisively. Benchmarked on 100K nodes and 2.4M edges:
| Query type | Neo4j | Kuzu | Speedup |
|---|---|---|---|
| Second-degree path counting | 3.45s | 0.019s | 180x |
| Filtered path counting | 4.27s | 0.023s | 189x |
| Filtered state aggregation | 0.163s | 0.007s | 24x |
| Top follower aggregation | 1.89s | 0.119s | 16x |
| Multi-hop city filter | 0.044s | 0.008s | 5.4x |
| Simple lookups | 0.694s | 0.126s | 5.5x |
Source: The Data Quarry benchmark, 100K nodes / 2.4M edges, MacBook Pro M2. Every query was faster in Kuzu.
Why is an embedded engine faster than a server-based one? Five architectural advantages that compound:
Columnar storage + CSR indices. Neo4j uses row-oriented fixed-size records with linked-list pointers — great for single-hop OLTP, terrible for cache locality on multi-hop queries. Columnar storage keeps related data contiguous in memory.
Vectorized execution. Neo4j processes tuples one at a time. Kuzu processes vectors of 2,048 tuples at a time. Same reason DuckDB destroys PostgreSQL on analytics — batch processing exploits CPU pipelines.
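The batching idea can be sketched in plain Python. This only shows the execution structure, not the speedup; the real win comes from CPU pipelines operating on contiguous columnar memory, which Python loops don't capture. Data and batch size are illustrative:

```python
# Toy sketch of batch execution: the same filtered sum computed one tuple
# at a time versus in fixed-size vectors of 2,048 values (Kuzu's vector size).
import array

values = array.array("d", range(100_000))

# Tuple-at-a-time: one value flows through the "operator" per iteration.
total_scalar = 0.0
for v in values:
    if v % 2 == 0:
        total_scalar += v

# Vectorized: each operator call consumes a whole batch.
BATCH = 2048
total_vectorized = 0.0
for i in range(0, len(values), BATCH):
    batch = values[i : i + BATCH]
    total_vectorized += sum(v for v in batch if v % 2 == 0)

print(total_scalar == total_vectorized)  # True: same result, batched flow
```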
Factorized execution. This is the killer. On multi-hop many-to-many traversals, intermediate results explode combinatorially. Neo4j materializes the full Cartesian product. Kuzu's factorized representation compresses intermediate results 50–100x. This alone explains the 180x gap on path queries.
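A toy example makes the factorization concrete: counting two-hop paths without materializing every (source, middle, target) tuple. The adjacency data here is invented; a real engine factorizes over compressed adjacency lists, but the counting identity is the same:

```python
# Counting 2-hop FOLLOWS paths two ways.
follows = {            # adjacency: who each user follows
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["d"],
    "d": [],
}

# Materialized: enumerate every concrete path (Cartesian-style blowup).
materialized = [(s, m, t) for s in follows for m in follows[s] for t in follows[m]]

# Factorized: keep one in-degree per middle node and multiply by its
# out-degree. Intermediate state is O(nodes), not O(paths).
in_deg = {}
for s in follows:
    for m in follows[s]:
        in_deg[m] = in_deg.get(m, 0) + 1
factorized_count = sum(in_deg.get(m, 0) * len(follows[m]) for m in follows)

print(len(materialized), factorized_count)  # 4 4
```

On a dense social graph the materialized list explodes combinatorially while the factorized counters stay tiny; that is the asymmetry behind the path-query numbers in the table above.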
No JVM. Neo4j runs on the JVM. Full GC events cause stop-the-world pauses that can reach minutes on large heaps. Native C++ engines have no garbage collector and deterministic memory management.
In-process execution. No Bolt protocol serialization, no network hop, no connection pooling overhead. The query engine runs in your process.
Kuzu was archived, but two active forks continue the work: LadybugDB (same engine, MIT license, targeting regulated industries) and RyuGraph (same engine, adding vector search and full-text search for AI/RAG workloads). Both inherit every performance characteristic.
The embedded graph engine ecosystem proves that the server isn't just expensive — it's slower. The file-backed approach wins on both economics AND performance.
# Applying file-backed architecture to graphs
What does a file-backed graph database look like?
The graph (nodes, relationships, properties, indexes) lives in a file on disk. Not a server process. A file. When a query comes in:
1. The file is opened (or already memory-mapped if recently active).
2. Hot data lives in a RAM buffer — the nodes and relationships being actively traversed.
3. Warm data lives on NVMe — the rest of the graph, accessible in microseconds.
4. Cold data (archived graphs) lives on cloud storage — loadable in milliseconds when needed.
The query engine is embedded — it runs in the process that opens the file, not in a separate server. This is the SQLite model applied to graph traversal.
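The read path above can be sketched as a tiered lookup with promotion on access. Everything here is hypothetical: the dicts stand in for a memory-mapped file on NVMe and for object storage, and the class name is invented for illustration:

```python
# Hypothetical sketch of the hot/warm/cold read path.
class TieredGraphStore:
    def __init__(self, warm, cold):
        self.hot = {}        # RAM buffer: actively traversed nodes
        self.warm = warm     # stands in for the NVMe-resident graph file
        self.cold = cold     # stands in for archived graphs in cloud storage

    def get_node(self, node_id):
        if node_id in self.hot:      # hit in RAM: nanoseconds
            return self.hot[node_id]
        if node_id in self.warm:     # hit on NVMe: microseconds
            node = self.warm[node_id]
        else:                        # restore from cold: milliseconds
            node = self.cold[node_id]
            self.warm[node_id] = node
        self.hot[node_id] = node     # promote on access
        return node

store = TieredGraphStore(warm={1: {"name": "alice"}}, cold={2: {"name": "bob"}})
print(store.get_node(2)["name"])  # bob: restored from cold, promoted to hot
```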
The Bolt protocol (Neo4j's wire protocol) is just a transport layer. You can implement it over any backend. The client sends Cypher queries over Bolt; a thin Bolt-compatible endpoint parses the Cypher, executes it against the file-backed graph, and streams results back. Your existing Neo4j drivers work unchanged.
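Because the wire protocol is unchanged, client code written for Neo4j needs no modification. A minimal sketch with the official `neo4j` Python driver; the URI, credentials, and the Person/FOLLOWS schema are placeholders:

```python
# Sketch: any Bolt-speaking backend works with the stock neo4j driver.
# A two-hop traversal, the query shape the benchmarks above measure.
SECOND_DEGREE_QUERY = (
    "MATCH (p:Person {id: $id})-[:FOLLOWS]->()-[:FOLLOWS]->(f) "
    "RETURN count(DISTINCT f) AS n"
)

def count_second_degree(uri, user, password, person_id):
    from neo4j import GraphDatabase  # pip install neo4j
    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        with driver.session() as session:
            record = session.run(SECOND_DEGREE_QUERY, id=person_id).single()
            return record["n"]

# Point the same code at Neo4j or at a Bolt-compatible file-backed engine:
# count_second_degree("bolt://localhost:7687", "neo4j", "password", 42)
```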
What you lose: some of Neo4j's Enterprise features (causal clustering, specific monitoring integrations). What you gain: 10–20x better economics, instant creation, scale to zero, per-workload isolation, and the ability to run thousands of graphs on one machine.
# What Cinch Graph looks like
Protocol: Bolt (Neo4j wire protocol). Any Neo4j driver works — Python, JavaScript, Java, Go, .NET, Rust.
Query language: Cypher. Same queries you'd write for Neo4j.
Storage: File-backed with three tiers — Hot (RAM buffer for actively-traversed nodes), Warm (NVMe, microsecond access), Cold (cloud storage for archived graphs, millisecond restore).
Isolation: Every graph is a separate file. No shared state. No label-based filtering hacks. Real isolation.
Create via API in milliseconds. Use with any Neo4j client library. Auto-stop when idle. Wake on first query. Fork to test changes. Delete when done.
# The economics
| Provider | 1GB graph | 10GB graph | 100 graphs (1GB each) | Scale to zero |
|---|---|---|---|---|
| Neo4j Aura Pro | ~$65/mo | ~$500+/mo | ~$6,500/mo | No |
| Amazon Neptune | ~$115/mo | ~$400+/mo | ~$11,500/mo | Partial |
| Memgraph Cloud | ~$50/mo | ~$300+/mo | ~$5,000/mo | No |
| Cinch Graph | ~$3/mo | ~$25/mo | ~$300/mo* | Yes |
*Assumes 80% of graphs idle at any time. Active graphs: ~$3/mo. Idle graphs: ~$0.10/mo.
The 100-graph column is the killer number. This is where the file-backed architecture absolutely destroys server-based pricing. Neo4j can't offer this because each graph is a server. Cinch can because each graph is a file. Most of those files are idle. Idle files cost almost nothing.
For GraphRAG, agent knowledge bases, per-tenant graphs — the multi-graph scenario is the common case, not the edge case. And it's exactly where the economics diverge by 20x.
# When this doesn't work
We're honest about limitations:
- One massive graph. If you have a single graph with billions of nodes queried constantly with complex multi-hop traversals, Neo4j in-memory is still faster. Cinch Graph optimizes for many small-to-medium graphs, not one giant one.
- Neo4j Enterprise features. Causal clustering across regions, enterprise monitoring integrations, Neo4j Bloom visualization — those are Neo4j-specific.
- Extremely deep traversals. If your traversals are 10+ hops across millions of nodes, the difference between RAM and NVMe becomes noticeable. For most GraphRAG and agent workloads (2–4 hop traversals), it's invisible.
- Battle-testing. Cinch Graph is early. Neo4j has 15 years of production hardening. If you need guaranteed SLAs on a mission-critical system today, Neo4j is the safer choice. If you need 100 ephemeral graphs for agent workloads at reasonable cost, we're the only option.
File-backed graphs. Bolt-compatible. A fraction of the cost.
Same Cypher queries. Same Neo4j drivers. 20x cheaper. Scale to zero.
GET STARTED →