Cinch is in public alpha.


Your Graph Database Should Be a File

The same insight that made SQLite the most deployed database in the world applies to graphs.

Russell Romney
@russellromney
February 17, 2026

In Part 1, we walked through why the graph database ecosystem is so expensive. The pricing is hostile, the deployment is brutal, and the managed alternatives don't solve the underlying problem.

The underlying problem is the server. Every graph database assumes you need a dedicated server process with the graph in memory. That assumption made sense in 2010. It doesn't in 2026.

Your graph database should be a file.

# The server is the problem

Neo4j runs as a JVM process. It loads the graph into memory (or a memory-mapped page cache). It listens on a port. It serves queries. When nobody is querying it, it's still running. Still consuming RAM. Still costing money.

This is the same architecture as Postgres, MySQL, Redis — a server process managing state. It works well when you have one instance per application. It breaks when you need one instance per workload.

The problem compounds:

  • 1 graph: $65/mo. Fine.
  • 10 graphs: $650/mo. Painful.
  • 100 graphs: $6,500/mo. Prohibitive.
  • 1,000 graphs: Not happening.

And you can't share a single instance because Neo4j doesn't have real multi-tenant isolation. Labels and properties aren't security boundaries.

# SQLite proved this already

SQLite is the most deployed database in the world. Not Postgres. Not MySQL. SQLite. A file.

SQLite doesn't run a server process. There's no daemon to manage, no port to configure, no memory to tune. Your application opens a file, reads and writes SQL, and closes the file. The “database server” is a library linked into your process.

This gives SQLite properties that server databases don't have: zero cost when idle, instant creation, trivial isolation (each database is a separate file), trivial cleanup (delete the file), and the ability to run millions per machine.
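The entire model fits in a few lines of Python's standard library — open a file, run SQL, delete the file. The table and path here are made up for illustration:

```python
import os
import sqlite3
import tempfile

# Each "database" is just a file: create it, use it, delete it.
path = os.path.join(tempfile.mkdtemp(), "tenant_42.db")

conn = sqlite3.connect(path)   # no server process, no port, no daemon
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
conn.execute("INSERT INTO kv VALUES ('greeting', 'hello')")
conn.commit()

value = conn.execute("SELECT v FROM kv WHERE k = 'greeting'").fetchone()[0]
conn.close()

os.remove(path)                # trivial cleanup: delete the file
print(value)                   # → hello
```

No daemon started, no port opened, and cleanup is a single `os.remove`. That is the whole pitch.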

Turso and Neon proved this scales to the cloud. Turso runs SQLite on the edge with libSQL. Neon separates Postgres compute from storage. Both showed that file-backed databases can serve real production workloads.

The same insight applies to graphs. A graph database doesn't need to be a server. It can be a file.

# Embedded engines are already faster

The performance argument isn't theoretical. Embedded graph engines already exist — and they're dramatically faster than Neo4j.

Kuzu (now archived after Apple acquired the team in October 2025) proved this decisively. Benchmarked on 100K nodes and 2.4M edges:

| Query type | Neo4j | Kuzu | Speedup |
|---|---|---|---|
| Second-degree path counting | 3.45s | 0.019s | 180x |
| Filtered path counting | 4.27s | 0.023s | 189x |
| Filtered state aggregation | 0.163s | 0.007s | 24x |
| Top follower aggregation | 1.89s | 0.119s | 16x |
| Multi-hop city filter | 0.044s | 0.008s | 5.4x |
| Simple lookups | 0.694s | 0.126s | 5.5x |

Source: The Data Quarry benchmark, 100K nodes / 2.4M edges, MacBook Pro M2. Every query was faster in Kuzu.

Why is an embedded engine faster than a server-based one? Five architectural advantages that compound:

Columnar storage + CSR indices. Neo4j uses row-oriented fixed-size records with linked-list pointers — great for single-hop OLTP, terrible for cache locality on multi-hop queries. Columnar storage keeps related data contiguous in memory.

Vectorized execution. Neo4j processes tuples one at a time. Kuzu processes vectors of 2,048 tuples at a time. Same reason DuckDB destroys PostgreSQL on analytics — batch processing exploits CPU pipelines.

Factorized execution. This is the killer. On multi-hop many-to-many traversals, intermediate results explode combinatorially. Neo4j materializes the full Cartesian product. Kuzu's factorized representation compresses intermediate results 50–100x. This alone explains the 180x gap on path queries.
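A toy calculation makes the factorization argument concrete. The fanout of 100 is an assumed number for illustration, not taken from the benchmark above:

```python
# Toy illustration (fanout is assumed, not from the benchmark): a 2-hop
# traversal (a)-[:FOLLOWS]->(m)-[:FOLLOWS]->(b) where every node has
# `fanout` neighbors.
fanout = 100

# Flat materialization: one tuple per (m, b) combination.
materialized_tuples = fanout * fanout

# Factorized representation: one factor per hop, combined lazily.
factorized_entries = fanout + fanout

print(materialized_tuples)                        # → 10000
print(factorized_entries)                         # → 200
print(materialized_tuples // factorized_entries)  # → 50 (compression factor)
```

Each extra hop multiplies the flat representation but only adds another factor, which is why the gap widens on exactly the queries where graph databases are supposed to shine.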

No JVM. Neo4j runs on the JVM. Full GC events cause stop-the-world pauses that can reach minutes on large heaps. Native C++ engines have no garbage collector and deterministic memory management.

In-process execution. No Bolt protocol serialization, no network hop, no connection pooling overhead. The query engine runs in your process.

Kuzu was archived, but two active forks continue the work: LadybugDB (same engine, MIT license, targeting regulated industries) and RyuGraph (same engine, adding vector search and full-text search for AI/RAG workloads). Both inherit every performance characteristic.

The embedded graph engine ecosystem proves that the server isn't just expensive — it's slower. The file-backed approach wins on both economics AND performance.

# Applying file-backed architecture to graphs

What does a file-backed graph database look like?

The graph (nodes, relationships, properties, indexes) lives in a file on disk. Not a server process. A file. When a query comes in:

  1. The file is opened (or already memory-mapped if recently active).
  2. Hot data lives in a RAM buffer — the nodes and relationships being actively traversed.
  3. Warm data lives on NVMe — the rest of the graph, accessible in microseconds.
  4. Cold data (archived graphs) lives on cloud storage — loadable in milliseconds when needed.

The query engine is embedded — it runs in the process that opens the file, not in a separate server. This is the SQLite model applied to graph traversal.
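The tiered read path can be sketched in a few lines. This is a hypothetical illustration — the class, the dict-backed tiers, and the promotion policy are stand-ins, not Cinch internals:

```python
# Hypothetical sketch of the tiered read path; dicts stand in for the
# RAM buffer, the NVMe-resident file, and cloud object storage.
class TieredGraphStore:
    def __init__(self):
        self.hot = {}    # RAM buffer: actively traversed nodes
        self.warm = {}   # stand-in for the NVMe-resident graph file
        self.cold = {}   # stand-in for archived graphs in cloud storage

    def get_node(self, node_id):
        if node_id in self.hot:       # already in the buffer
            return self.hot[node_id]
        if node_id in self.warm:      # microsecond-class access
            self.hot[node_id] = self.warm[node_id]  # promote to hot
            return self.hot[node_id]
        node = self.cold[node_id]     # millisecond-class restore
        self.warm[node_id] = node     # rehydrate the warm tier
        self.hot[node_id] = node
        return node

store = TieredGraphStore()
store.cold["alice"] = {"label": "Person", "name": "Alice"}
node = store.get_node("alice")   # first read walks cold -> warm -> hot
print(node["name"], "alice" in store.hot)
```

A real engine promotes pages rather than individual nodes, but the shape is the same: the first query after an idle period pays the restore cost once, and subsequent traversals run against the hot tier.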

The Bolt protocol (Neo4j's wire protocol) is just a transport layer. You can implement it over any backend. The client sends Cypher queries over Bolt. The server parses Cypher, executes against the file-backed graph, returns results over Bolt. Your existing Neo4j drivers work unchanged.

What you lose: some of Neo4j's Enterprise features (causal clustering, specific monitoring integrations). What you gain: 10–20x better economics, instant creation, scale to zero, per-workload isolation, and the ability to run thousands of graphs on one machine.

# What Cinch Graph looks like

Protocol: Bolt (Neo4j wire protocol). Any Neo4j driver works — Python, JavaScript, Java, Go, .NET, Rust.

Query language: Cypher. Same queries you'd write for Neo4j.

Storage: File-backed with three tiers — Hot (RAM buffer for actively-traversed nodes), Warm (NVMe, microsecond access), Cold (cloud storage for archived graphs, millisecond restore).

Isolation: Every graph is a separate file. No shared state. No label-based filtering hacks. Real isolation.

```python
# Same driver. Different URL.
from neo4j import GraphDatabase

driver = GraphDatabase.driver(
    "bolt+s://your-workspace.cinchdb.dev:7687",
    auth=("", "cinch_token"),
)

with driver.session() as session:
    session.run(
        "CREATE (a:Person {name: 'Alice'})"
        "-[:KNOWS]->"
        "(b:Person {name: 'Bob'})"
    )
    result = session.run(
        "MATCH (a)-[:KNOWS]->(b) RETURN a.name, b.name"
    )
```

Create via API in milliseconds. Use with any Neo4j client library. Auto-stop when idle. Wake on first query. Fork to test changes. Delete when done.
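Because every graph is a separate file, that lifecycle — create, fork, delete — reduces to ordinary file operations. The path convention below is hypothetical, purely to show the shape:

```python
# Hypothetical path convention: with one file per graph, isolation,
# forking, and deletion are ordinary file operations.
import os
import shutil
import tempfile

root = tempfile.mkdtemp()

def graph_path(tenant_id):
    return os.path.join(root, f"{tenant_id}.graph")

# "Create" a graph per tenant: no shared server state to partition.
for tenant in ("acme", "globex"):
    open(graph_path(tenant), "wb").close()

# Fork acme's graph to test changes, then delete the fork when done.
shutil.copyfile(graph_path("acme"), graph_path("acme-fork"))
os.remove(graph_path("acme-fork"))

print(sorted(os.listdir(root)))   # → ['acme.graph', 'globex.graph']
```

There is no "drop database" coordination, no connection draining, no shared catalog to lock — the filesystem is the catalog.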

# The economics

| Provider | 1GB graph | 10GB graph | 100 graphs (1GB each) | Scale to zero |
|---|---|---|---|---|
| Neo4j Aura Pro | ~$65/mo | ~$500+/mo | ~$6,500/mo | No |
| Amazon Neptune | ~$115/mo | ~$400+/mo | ~$11,500/mo | Partial |
| Memgraph Cloud | ~$50/mo | ~$300+/mo | ~$5,000/mo | No |
| Cinch Graph | ~$3/mo | ~$25/mo | ~$300/mo* | Yes |

*Assumes 80% of graphs idle at any time. Active graphs: ~$3/mo. Idle graphs: ~$0.10/mo.

The 100-graph column is the killer number. This is where the file-backed architecture absolutely destroys server-based pricing. Neo4j can't offer this because each graph is a server. Cinch can because each graph is a file. Most of those files are idle. Idle files cost almost nothing.

For GraphRAG, agent knowledge bases, per-tenant graphs — the multi-graph scenario is the common case, not the edge case. And it's exactly where the economics diverge by 20x.

# When this doesn't work

We're honest about limitations:

  • One massive graph. If you have a single graph with billions of nodes queried constantly with complex multi-hop traversals, Neo4j in-memory is still faster. Cinch Graph optimizes for many small-to-medium graphs, not one giant one.
  • Neo4j Enterprise features. Causal clustering across regions, enterprise monitoring integrations, Neo4j Bloom visualization — those are Neo4j-specific.
  • Extremely deep traversals. If your traversals are 10+ hops across millions of nodes, the difference between RAM and NVMe becomes noticeable. For most GraphRAG and agent workloads (2–4 hop traversals), it's invisible.
  • Battle-testing. Cinch Graph is early. Neo4j has 15 years of production hardening. If you need guaranteed SLAs on a mission-critical system today, Neo4j is the safer choice. If you need 100 ephemeral graphs for agent workloads at reasonable cost, we're the only option.

File-backed graphs. Bolt-compatible. A fraction of the cost.

Same Cypher queries. Same Neo4j drivers. 20x cheaper. Scale to zero.

GET STARTED →

*Redis is a registered trademark of Redis Ltd. Any rights therein are reserved to Redis Ltd. Cinch is not affiliated with or endorsed by Redis Ltd.