Vector databases are probabilistic and slow. Graph embeddings are deterministic and sub-millisecond. If you are building context for an AI coding agent — or any system where you need to know why a result came back — the difference is not academic. It changes what your application is allowed to promise.
The Pitch in One Paragraph
Terraphim represents concepts as nodes in a knowledge graph and ranks them by how many synonyms and edges connect them. There is no embedding model, no GPU, no per-query distance computation in a 1024-dimensional space. There is an Aho-Corasick automaton built once, queried in O(n+m+z) time over the input length plus the number of matches. The mechanism is described in detail on the Graph Embeddings reference page; this post is about why it matters.
The Numbers
Three numbers carry the argument. Each is reproducible on a laptop.
- 1.4 million patterns matched in under one millisecond, with under 4 GB of RAM. That is the working set behind a multi-role knowledge graph — operator, engineer, analyst — held resident in the same process that serves the query.
- 5–10 nanoseconds per knowledge-graph inference step. Not microseconds. Nanoseconds. Once the automaton is built, traversal is a tight loop over byte slices and graph edges, and modern CPUs are extremely good at that.
- 20 milliseconds to rebuild the embeddings for a role from scratch. Rename a synonym, add a new term, drop an obsolete one — the whole role's graph is reconstituted before your editor has rendered the next frame.
For comparison, a typical vector-DB nearest-neighbour query lands in the 5–50 ms range after you have paid the embedding API call (50–500 ms) and the network round-trip. We are not in the same regime.
Three Consequences
The numbers are interesting on their own. The reason they matter is what they let you build.
1. Full Explainability
Every match in Terraphim traces back to a specific edge in the knowledge graph and a specific synonym in a specific role. There is no "the model said so." When a search returns a document, you can show the user exactly which terms matched, which role's graph supplied the synonym, and which edges connected them. That is not a debugging nicety — it is a regulatory requirement in any domain where you have to defend a decision after the fact. Healthcare, legal, finance, government. Vector search by construction cannot do this.
2. No Training, No Retraining, No Fine-Tuning
Adding a new concept is a text edit. You write the synonym down, you point Terraphim at the file, the graph rebuilds in 20 ms. There is no training run, no GPU bill, no "we need to schedule a retrain on the new corpus." This collapses the loop between noticing a gap and fixing the gap from days or weeks to seconds. For an AI coding agent that needs to learn a project's vocabulary as you onboard, this is the difference between a working tool and a stalled rollout.
3. Language-Agnostic Without Language Detection
Because matching is done on normalised terms — synonyms you supply explicitly — the same node in the graph can carry English, French, Russian, and Mandarin labels at no extra cost. There is no language-detection step, no per-language embedding model, no separate index. The query "consensus" and the query "консенсус" both reach the same node if you have told the graph they are synonyms. Stop-word lists become irrelevant: if a word is not in the graph, it does not match, full stop.
What This Lets You Do
The pieces above are infrastructure. The story arc continues:
- Build hooks that transform AI coding agent output deterministically. When Claude Code suggests
npm install, intercept it via a graph-embeddings match and replace it withbun install. We wrote this up at Teaching AI Coding Agents with Knowledge Graph Hooks — that post is the demo of what this engine enables. - Capture and reuse mistakes. When an agent gets corrected, store the correction as a new synonym and the next session never repeats it. See Teaching AI Agents to Learn from Their Mistakes and Learning via Negativa.
- Run the whole thing in a 4 GB process on your laptop with no network calls. The compactness is not an accident — it is the engineering brief from the Origin Story, which explains where the design came from and why it has stayed this small.
How to Apply
If you want to wire this into your own project, the Command Rewriting How-to walks through the moving parts: where to put your synonyms, how the role graph is built, how hooks call the matcher.
The mechanism — automata, ranking formula, ASCII walk-through — is on the Graph Embeddings reference page. Read that next if you want the data structures.
Why Bother Saying This Out Loud
The current default in the AI tooling ecosystem is to reach for a vector database the moment anyone mentions "semantic search." It is the path of least resistance because the tools are well-marketed and the API surface is familiar. But for a large class of problems — explainability-first systems, on-device agents, anywhere you need a hard latency budget or a hard explainability guarantee — graph embeddings are the better-engineered answer. Not the only answer; the better one for that class.
The promotion campaign over the next few weeks goes deeper: a sub-millisecond context article walks through the FST/Aho-Corasick implementation, and the Context Engineering with Knowledge Graphs book (launching in May) puts it in the wider context of moving from RAG to context graphs.
Until then: read the reference, try the how-to, and let us know in Discourse what you build with it.