pure Rust · zero dependencies · compiles to WebAssembly

An associative memory you can run anywhere.

A small local model decides each turn; neuron-db remembers what it knows, grounds the answer in stored facts, and searches the live web when it doesn't know. The whole loop runs in your browser, with no server and no API key. Write facts in plain language and recall them by meaning, with no tables, schema, embeddings, or GPU.


The model reasons over a knowledge-gap signal: when the store can't answer, it routes to a web fetch and grounds the reply in what it pulls back. Core is pure Rust with zero dependencies in a 1 MB wasm worker; durable storage, encryption, an HTTP server, and the hippocampus are opt-in.

~/app · the grounding loop

# you ask the local model a question
ask me 'who is the current ceo of openai?'
dispatch FETCH   # gary-neuron: store can't answer (gap)
web      ↳ fetched 1 source
store    ↳ 'the ceo of openai is sam altman'
ask me 'who runs openai?'
dispatch ANSWER  # grounded from memory, no web
→ sam altman     # recall ~3.9 µs · dispatch ~54 ms

gary-neuron v5: ~7M-param dispatcher, in every build
route(): on CLI · MCP · WASM
~4 µs: recall p50 · flat as the store grows
0 deps: in the default build
MIT: licensed

LLM memory · the synapse

Link infinite neurons. At no cost.

An LLM's context window is small; neuron-db is the memory that lives outside it. A relational question — "the timezone of the manager of the owner of Aurora" — normally forces the model to recall a fact, wait, recall the next, wait… N hops = N+1 model calls. recall_chain collapses that: the model sends one path, and the synapse walks the whole chain server-side, each hop a microsecond recall. Depth is paid in microseconds, not model turns.

recall_chain(start="Aurora", path=[owner, manager, timezone])

  Aurora ──owner──▶   Kenji
   Kenji ──manager─▶ Marisol
 Marisol ──timezone▶ WET

3 hops · resolved server-side · 2 model calls · ~40 µs synapse
        

A 3-hop or a 30-hop answer costs the LLM the same two calls, and the recall itself costs only microseconds. See it fire in the 3D synapse demo.
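The chain walk above can be sketched in a few lines. This is a toy illustration, not neuron-db's implementation: the `Synapse` type, its `link` method, and the `(subject, relation) -> object` map are assumptions made for the example. It only shows why depth is cheap: each hop is one map lookup, and the model is not consulted between hops.

```rust
use std::collections::HashMap;

/// Toy relation store (hypothetical): (subject, relation) -> object.
struct Synapse {
    edges: HashMap<(String, String), String>,
}

impl Synapse {
    fn new() -> Self {
        Synapse { edges: HashMap::new() }
    }

    fn link(&mut self, subject: &str, relation: &str, object: &str) {
        self.edges
            .insert((subject.to_string(), relation.to_string()), object.to_string());
    }

    /// Walk `path` from `start`, one map lookup per hop.
    /// Returns every node visited, or None if some hop has no stored fact.
    fn recall_chain(&self, start: &str, path: &[&str]) -> Option<Vec<String>> {
        let mut trail = vec![start.to_string()];
        for relation in path {
            let current = trail.last()?.clone();
            let next = self.edges.get(&(current, relation.to_string()))?;
            trail.push(next.clone());
        }
        Some(trail)
    }
}

fn main() {
    let mut s = Synapse::new();
    s.link("Aurora", "owner", "Kenji");
    s.link("Kenji", "manager", "Marisol");
    s.link("Marisol", "timezone", "WET");

    // One call carries the whole path; the walk happens inside the store.
    let trail = s.recall_chain("Aurora", &["owner", "manager", "timezone"]);
    println!("{:?}", trail);
}
```

A missing hop returns `None`, which is the natural place for a caller to treat the question as a knowledge gap.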

live, gpt-4o-mini · 1k–50k facts	neuron-db	markdown dump
multi-hop accuracy (1/2/3 hops)	100%	92–100% (degrades)
context cost / turn	~1.1k tok (flat)	9.9k → 447k
at 6,000 stored facts	$0.19 /1k-q	$10.06 /1k-q
at 50,000 facts	100% · 1.1k tok	won't fit 128k window
model calls per answer (any depth)	2	1
selective recall in 1M facts	100% · ~6 µs	context-bound

The markdown-dump reinjects the whole memory every turn (linear, and it eventually overruns the window). neuron-db injects only what it recalled — flat cost, no ceiling, and it matches or beats accuracy. Read the full comparison →

Mount it in one line. neuron-mcp is a native std-only stdio MCP server. Point any MCP client (Claude Desktop/Code, Cursor) at the binary and your model gets the full toolset: recall / recall_associative / recall_chain / remember / note (typed neurons) / recall_var / forget / stats. No Node, no Python, no HTTP process. Recall stays microseconds whether the user has 10 facts or 10 million.

The model

Two ideas. That's the whole database.

A fact is a sentence, like "the api key is zeta-9931". neuron-db keeps the surprising word as the retrievable value and indexes the rest as cues. A scope is a named bag of facts (user:42). You insert by stating things and read by asking questions. Retrieval is associative (cue overlap), so you never declare a column or write SQL.

NeuronDB (one .db file)
├─ scope "user:42"   facts: "the plan is pro", "region us-west-2" …
├─ scope "user:43"   facts: "the plan is free" …
└─ scope "team:ops"  facts: "the on-call is Dana" …

A scope is like a row keyed by its id, and its facts are the columns, except you never declare them. "what is the api key?" finds the fact above without you naming a column.

Intent	SQL-ish	neuron-db
insert	INSERT	observe(s, "the plan is pro")
read one	SELECT … LIMIT 1	get(s, "what plan?") → "pro"
read + meta	SELECT *	recall(s, q) → {value, fact, coverage}
update	UPDATE	just observe again; newest wins
delete	DELETE	forget(s, "plan") by substring
converse	·	turn(s, msg), stores or answers in one call
encrypt	·	SecureNeuronDB.put/get(…)

There is no UPDATE; facts aren't rows you mutate. To change an answer, state the new fact; recall prefers the most recent match. To make one stick, reinforce it.
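A minimal sketch of the two ideas, assuming a much cruder rule than the real store uses: here the value is simply whatever follows " is ", the cues are the remaining words minus a small stop list, and recall picks the fact with the largest cue overlap, with the newest fact winning ties. `Scope`, `Fact`, `cues`, and the stop list are hypothetical names for this example, not neuron-db types.

```rust
use std::collections::HashSet;

// Words too common to act as cues (an assumption for this sketch).
const STOP: [&str; 5] = ["the", "a", "is", "what", "my"];

fn cues(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .map(|w| w.to_lowercase())
        .filter(|w| !w.is_empty() && !STOP.contains(&w.as_str()))
        .collect()
}

struct Fact {
    cues: HashSet<String>,
    value: String,
}

#[derive(Default)]
struct Scope {
    facts: Vec<Fact>, // insertion order doubles as recency
}

impl Scope {
    /// Insert by stating a sentence of the form "<cues> is <value>".
    fn observe(&mut self, sentence: &str) {
        if let Some((left, right)) = sentence.split_once(" is ") {
            self.facts.push(Fact { cues: cues(left), value: right.trim().to_string() });
        }
    }

    /// Read by asking: best cue overlap wins, newest fact wins ties.
    fn get(&self, question: &str) -> Option<&str> {
        let q = cues(question);
        let mut best: Option<(usize, &Fact)> = None;
        for fact in &self.facts {
            let overlap = fact.cues.intersection(&q).count();
            // `>=` lets a later (newer) fact replace an equal-scoring older one.
            if overlap > 0 && best.map_or(true, |(score, _)| overlap >= score) {
                best = Some((overlap, fact));
            }
        }
        best.map(|(_, f)| f.value.as_str())
    }
}

fn main() {
    let mut s = Scope::default();
    s.observe("the plan is pro");
    s.observe("the api key is zeta-9931");
    println!("{:?}", s.get("what plan?"));
    s.observe("the plan is enterprise"); // the "update": state it again
    println!("{:?}", s.get("what plan?"));
    println!("{:?}", s.get("what is the api key?"));
}
```

Nothing is mutated on update: both plan facts stay in the scope, and recency decides which one is returned.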

Tiers & capabilities

One core, every tier opt-in. Add only what you need.

Every tier shares the same associative store. Plasticity, sharding, durability, encryption, a network server, embedding-free semantic recall, an optional Big-Five personality layer, and an MCP mount are opt-in Cargo features, so the default build stays std-only and wasm-clean.

Neuron std-only

In-memory associative store, the default. A stem→fact inverted index keeps recall sub-linear. Recall in microseconds, no deps, no I/O.

n.observe("the wifi password is hunter2");
n.recall("what is the wifi password?");   // -> "hunter2"

PlasticNeuron adaptive

Recall that adapts: strength on use, lazy exponential decay on disuse, Hebbian links, and a neurotransmitter-style spreading-activation recall. All O(1) scalar updates, with no re-embedding and no re-indexing.

for _ in 0..4 { n.reinforce(id, 1.0); }
n.recall_spreading(q, 2, 10, 0.6, 6);

NeuronRouter shard

One scope recalls best when it's small. The router shards across many small neurons, auto-spills into new shards, and fans a query out to return the single best value.

let mut r = NeuronRouter::new(128);
r.get("what is the north gate code?")

NeuronDB sqlite

A durable database of scopes in one SQLite file, WAL mode, sharded by scope so independent tenants recall and write in parallel — recall scales ~8× across 16 cores. Each write is a durable append-log INSERT (~25k/s, flat as a scope grows). Ships the neuron CLI.

let db = NeuronDB::open("app.db", 500);
db.turn("user:42", "my color is teal");

SecureNeuronDB secure

Values are AES-256-GCM ciphertext, the index is a keyed hash, and the per-scope secret is supplied per call and never stored. A stolen .db file is opaque. Lose the secret, lose the data.

v.put("alice", "alice-secret", "wifi", "hunter2");
v.get("alice", "WRONG", q);   // -> None

HTTP server server

One endpoint per scope from a std TcpListener: turn, get, recall, batch observe, top-k memory block, forget, metrics. Optional Bearer auth via NEURON_DB_KEY.

POST /v1/{scope} {message} → turn
POST /v1/{scope}/recall_many {query,k}

SemanticSpace semantic

Embedding-free fuzzy recall. A corpus-distributional space (Random Indexing, 256-dim, std-only, no model) grounds meaning in co-occurrence, so open-vocabulary paraphrase resolves — "the thing I use to get online" finds the wifi fact. Lexical recall stays the fast path; this is the fallback.

db.train_semantic(corpus);
recall("a gigantic sea creature") → "…Leviathan…"

neuron-mcp mcp

A native std-only stdio MCP server. Point any MCP client (Claude Desktop/Code, Cursor) at the binary and your model gets recall / recall_associative / recall_chain / remember / note / recall_var / forget / stats as tools — no Node, no Python, no HTTP process.

cargo build --release --features mcp --bin neuron-mcp

Beyond a lookup table

A memory that thinks and grows.

Most stores are key-value lookups. neuron-db adds a second tier: a dispatcher hippocampus that reads what the store recalled, decides what to do with it, and escalates to the host model (the reasoning neocortex) when memory can't answer, plus a plastic tier that adapts in the moment without gradient descent. The store stays fast because the hippocampus only ever sees the small working set, not the whole database.

cue ▼

① STORE TIER PlasticNeuron · scales to millions

cheap scalar plasticity decides what is relevant:
• strength (bumped on use), lazy exponential decay
• Hebbian graph, one-hop spreading activation
→ returns a small working set (a handful of facts)

▼

② MODEL TIER gary-neuron v5 dispatcher · bounded window

runs only over that working set (192-384 tokens):
• routes the turn: ANSWER / ESCALATE / FETCH / STORE
• answers from the recalled facts, or grounds via a web fetch
• v5: STORE routing 0→100%, open-ended turns escalate cleanly, holds 1–18-fact working sets
→ cost is O(working set), never O(database)

▼

③ SLEEP consolidation · off the hot path

folds new episodes into hippocampus weights; merges & prunes the store.
the model keeps learning; the store stays lean.

No query ever runs a neural net over the whole database. That split is the entire point: it's how a plastic, thinking, growing memory stays fast.

Adaptation

What you use often surfaces first. Two facts collide on "meeting"; reinforcing "monday" overtakes recency after 2 uses.

strength · O(1)

Forgetting

Unused facts decay cleanly: w·½^(age/half_life), computed lazily at read time. Decay only reranks; it never deletes.

decay · O(1) lazy
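The decay formula is small enough to work through directly. The sketch below assumes a fact stores only a scalar weight and the tick it was last used; the `Fact` struct and its method names are hypothetical. With a starting weight of 0.99 and half_life = 50 it agrees, to within rounding, with the measured sequence reported in the benchmarks (0.99, 0.70, 0.49, 0.25, 0.06 at 0/25/50/100/200 idle ticks).

```rust
/// Weight after `age` idle ticks: w * (1/2)^(age / half_life).
fn decayed(weight: f64, age: f64, half_life: f64) -> f64 {
    weight * 0.5_f64.powf(age / half_life)
}

/// Hypothetical fact record: a scalar weight and the tick of its last use.
struct Fact {
    weight: f64,
    last_used: u64,
}

impl Fact {
    /// Lazy read: no timer rewrites the weight; it is derived when asked for.
    fn effective(&self, now: u64, half_life: f64) -> f64 {
        let age = now.saturating_sub(self.last_used) as f64;
        decayed(self.weight, age, half_life)
    }

    /// Reinforce on use: settle the decay so far, add `amount`, reset the clock.
    fn reinforce(&mut self, now: u64, half_life: f64, amount: f64) {
        self.weight = self.effective(now, half_life) + amount;
        self.last_used = now;
    }
}

fn main() {
    let half_life = 50.0;
    let mut fact = Fact { weight: 0.99, last_used: 0 };
    for now in [0u64, 25, 50, 100, 200] {
        println!("tick {:>3}: {:.2}", now, fact.effective(now, half_life));
    }
    fact.reinforce(200, half_life, 1.0);
    println!("after reinforce: {:.2}", fact.effective(200, half_life));
}
```

Because the weight is computed from `last_used` at read time, an idle fact costs nothing between reads, which is what makes the update O(1) and lazy.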

Association

Facts recalled together wire together. A Hebbian link grows 0.5 → 8.0 over 5 rounds; spreading activation then surfaces the associate.

link · O(1)

Spreading recall

Neurotransmitter-style: release activation at cued facts, gate conflicting relations off, spread across synapses with reuptake decay, reaching a 2-hop fact that shares no word with the query.

spread · O(neighbors)
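A rough sketch of spreading recall, under stated assumptions: facts are numbered nodes, Hebbian links are weighted edges with weights in (0, 1], and `retain` is the fraction of activation that survives reuptake on each hop. The function name and parameters are invented for the example and are not the signature of recall_spreading; the gating of conflicting relations is left out.

```rust
use std::collections::HashMap;

/// Spread activation outward from the cued facts for `hops` rounds.
/// Each hop multiplies the level by the link weight and by `retain`.
fn spread(
    links: &HashMap<usize, Vec<(usize, f64)>>,
    cued: &[usize],
    hops: usize,
    retain: f64,
) -> HashMap<usize, f64> {
    let mut activation: HashMap<usize, f64> = cued.iter().map(|&id| (id, 1.0)).collect();
    let mut frontier: Vec<(usize, f64)> = cued.iter().map(|&id| (id, 1.0)).collect();

    for _ in 0..hops {
        let mut next = Vec::new();
        for (node, level) in frontier {
            if let Some(neighbors) = links.get(&node) {
                for &(neighbor, weight) in neighbors {
                    let passed = level * weight * retain;
                    let slot = activation.entry(neighbor).or_insert(0.0);
                    // Keep the strongest path only, and spread further from it.
                    if passed > *slot {
                        *slot = passed;
                        next.push((neighbor, passed));
                    }
                }
            }
        }
        frontier = next;
    }
    activation
}

fn main() {
    // 0: "the wifi password is hunter2"
    // 1: "the router lives in the hallway"
    // 2: "the hallway key is under the mat"
    let mut links = HashMap::new();
    links.insert(0, vec![(1, 0.8)]);
    links.insert(1, vec![(0, 0.8), (2, 0.5)]);

    let reached = spread(&links, &[0], 2, 0.6);
    let mut ranked: Vec<_> = reached.into_iter().collect();
    ranked.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    println!("{:?}", ranked);
}
```

Fact 2 shares no word with a wifi query, yet it receives activation through fact 1. The work done is proportional to the edges touched, not to the size of the store.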

gary-neuron v5: the dispatcher hippocampus. A ~7M-parameter int8 transformer baked into every build with include_bytes!: no download, no GPU, no network to load it. Each turn it routes to ANSWER / ESCALATE / FETCH / STORE over the recalled working set.

On a held-out test the routing triage is 100% on each class (router_bench 500/500), with STORE routing fixed 0→100%; grounded answers land 94–100% for working sets up to 12 facts, and an open-ended turn over on-topic facts resolves to a clean ESCALATE, with no degenerate output. A dispatch runs in ~54 ms in the browser (SIMD128) or ~255 ms natively.

It is the front gate on every mount (neuron route on the CLI, the MCP route tool, and route() in the WASM binding), a required feature that is never bypassed; each returns a typed {type, value, facts}, so raw model text never reaches the user. The pure-Rust forward pass matches the numpy and TypeScript ports to 0.0000 MSE. The weights are published on Hugging Face as a pure-NumPy mirror.
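On the host side, a typed result means the caller branches on a tag instead of parsing model text. The sketch below is one way a host might consume it. Only the {type, value, facts} shape and the four route names come from the text above; the `Routed` and `Next` types, and the reading of `value` as a reply, a fetch query, or a fact to store, are assumptions for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Answer,
    Escalate,
    Fetch,
    Store,
}

/// Hypothetical mirror of the typed {type, value, facts} result.
struct Routed {
    kind: Kind,
    value: String,
    facts: Vec<String>,
}

/// What the host does next (names invented for this sketch).
#[derive(Debug, PartialEq)]
enum Next {
    Reply(String),
    AskHostModel { context: Vec<String> },
    FetchWeb { query: String },
    Remember(String),
}

fn next_step(r: Routed) -> Next {
    match r.kind {
        // Memory answered: reply with the grounded value.
        Kind::Answer => Next::Reply(r.value),
        // Open-ended turn: hand the recalled facts to the larger model.
        Kind::Escalate => Next::AskHostModel { context: r.facts },
        // Knowledge gap: go to the web, then store what comes back.
        Kind::Fetch => Next::FetchWeb { query: r.value },
        // The user stated something: keep it.
        Kind::Store => Next::Remember(r.value),
    }
}

fn main() {
    let routed = Routed {
        kind: Kind::Fetch,
        value: "who is the current ceo of openai?".to_string(),
        facts: vec![],
    };
    println!("{:?}", next_step(routed));
}
```

Because the match is exhaustive, adding a fifth route later would fail to compile until every host handles it.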

Benchmarks

Measured, with the limits shown.

From the Rust core (release, single core); the original Python prototype is archived on the legacy-python branch. Numbers are reproducible with cargo run --release --bin bench, and the failure mode is in the table too.

Path	throughput (neurons/sec, 3 facts each)
Rust core, in-memory (Neuron)	~215,000
legacy-python, in-memory Neuron (archived)	~30,000
legacy-python, SQLite-backed NeuronDB.turn (archived)	~1,200

Operation (legacy-python reference, archived)	result
write throughput (observe)	~55,000 facts/s
secure put (AES-GCM + keyed index)	~390 /s
secure get	~270 µs
router recall (2,000 facts / 16 shards)	~0.6 ms
arithmetic op (turn evaluates math)	~12 µs

The SQLite number is the realistic API rate (a durable write per neuron). In-memory is the ceiling. Rust is ~7× the Python in-memory rate, with true multi-core concurrency (no GIL) and a single static binary.

Facts in scope (Rust core)	selective cue	broad cue (O(N))	legacy-python
1,000	~5 µs	~0.2 ms	~380 µs
10,000	~5 µs	~2 ms	~1.4 ms
50,000	~5 µs	~11 ms	·
1,000,000	~6 µs	~0.4 s	·

Recall cost is the frequency of the queried words, not the fact count. A selective cue (a distinctive word) hits ~1 fact via the stem→fact inverted index, so it stays flat at ~5-6 µs from 1k to 1,000,000 facts (it does not grow with scope). A broad cue (a word in every fact) scans the whole scope (O(N)). A 2025 pass made the candidate scan ~4× faster (binary-search over sorted stems + precomputed positions) and made the index incremental, so appending a fact then recalling it stays ~10 µs/turn even at 1M facts. Give each memory a distinct subject, or shard with the router.
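Why a selective cue stays flat while a broad cue scans the scope can be seen in a toy inverted index. This sketch indexes whole lowercase words rather than stems, and picks candidates from the shortest posting list among the query's words; the `Index` type and that selection rule are assumptions for the example, not the store's actual scan.

```rust
use std::collections::HashMap;

#[derive(Default)]
struct Index {
    facts: Vec<String>,
    postings: HashMap<String, Vec<usize>>, // word -> ids of facts containing it
}

impl Index {
    /// Appending a fact only touches the posting lists of its own words.
    fn observe(&mut self, fact: &str) {
        let id = self.facts.len();
        for word in fact.split_whitespace() {
            let list = self.postings.entry(word.to_lowercase()).or_default();
            if list.last() != Some(&id) {
                list.push(id);
            }
        }
        self.facts.push(fact.to_string());
    }

    /// Candidate fact ids: the shortest posting list among the query's words.
    /// The cost of a recall is the length of this list, not the fact count.
    fn candidates(&self, query: &str) -> &[usize] {
        query
            .split_whitespace()
            .filter_map(|w| self.postings.get(&w.to_lowercase()))
            .min_by_key(|list| list.len())
            .map(|list| list.as_slice())
            .unwrap_or(&[])
    }
}

fn main() {
    let mut ix = Index::default();
    for i in 0..1000 {
        ix.observe(&format!("device {} status ok", i));
    }
    // Selective cue: "42" appears in one fact.
    println!("selective: {} candidate(s)", ix.candidates("device 42").len());
    // Broad cue: "status" appears in every fact.
    println!("broad: {} candidate(s)", ix.candidates("status").len());
}
```

Giving each memory a distinct subject keeps at least one query word rare, so the shortest list stays short no matter how large the scope grows.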

Turn	facts	recall@1 (rotating probe)	latency	store
2,000	1,015	50%	317 µs	38 KB
4,000	1,724	72%	539 µs	64 KB
6,000	1,630	82%	454 µs	61 KB
8,000	1,383	90%	416 µs	52 KB
10,000	1,262	74%	413 µs	48 KB

Latency stays flat (~0.3-0.5 ms) across 10,000 turns, and consolidation keeps the fact count bounded under continuous writes (between 1,015 and 1,724 in this run, ending at 1,262). Accuracy dips are consolidation pruning a sampled fact, i.e. correct forgetting, not a regression.

Effect	Measured behaviour
Adaptation	Two facts collide on "meeting". After 2 reinforcements of "monday" it overtakes recency; by 40 uses w(mon)=42.0 vs w(fri)=1.0.
Forgetting	Untouched fact decays cleanly (half_life=50): 0.99 → 0.70 → 0.49 → 0.25 → 0.06 at 0/25/50/100/200 idle ticks.
Association	Co-activating two unrelated facts grows their Hebbian link 0.5 → 8.0 over 5 rounds; spreading activation then surfaces the associate.
Consolidation	5 duplicates + 1 decayed fact consolidate 6 → 1 (4 merged, 1 pruned), recall preserved.

None of these are visible to a static recall@1 test, which is exactly why plasticity is measured as a sequence of uses over time. Tests: cargo test over tests/plastic.rs (Rust); the original Python suites are archived on legacy-python.
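The Consolidation row can be reproduced with a toy pass, sketched here under simple assumptions: duplicates are facts with identical text, merging sums their weights, and anything whose weight has fallen below a floor is pruned. The `Fact` struct, the `floor` parameter, and the exact-match rule are hypothetical; the real pass may merge and prune on different criteria.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct Fact {
    text: String,
    weight: f64,
}

/// Merge exact duplicates and prune facts below `floor`.
/// Returns (survivors, merged_count, pruned_count).
fn consolidate(facts: Vec<Fact>, floor: f64) -> (Vec<Fact>, usize, usize) {
    let mut kept: Vec<Fact> = Vec::new();
    let mut slot: HashMap<String, usize> = HashMap::new(); // text -> index in `kept`
    let (mut merged, mut pruned) = (0, 0);

    for fact in facts {
        if fact.weight < floor {
            pruned += 1;
            continue;
        }
        let existing = slot.get(&fact.text).copied();
        match existing {
            Some(i) => {
                // Fold the duplicate's strength into the survivor.
                kept[i].weight += fact.weight;
                merged += 1;
            }
            None => {
                slot.insert(fact.text.clone(), kept.len());
                kept.push(fact);
            }
        }
    }
    (kept, merged, pruned)
}

fn main() {
    let mut facts: Vec<Fact> = (0..5)
        .map(|_| Fact { text: "the launch is friday".to_string(), weight: 1.0 })
        .collect();
    facts.push(Fact { text: "the old gate code is 1234".to_string(), weight: 0.01 });

    let (kept, merged, pruned) = consolidate(facts, 0.05);
    println!("6 -> {} ({} merged, {} pruned)", kept.len(), merged, pruned);
}
```

The surviving fact carries the combined weight of its duplicates, so a recall for the launch still resolves after the store shrinks.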

Keys	facts	recall@1
Distinct (north wifi password, spare gate code…)	400	400/400 · 100%
Colliding stems (project0…project499 → all stem to projec)	500	~1/500

Recall accuracy depends on whether keys are lexically distinct, not how many facts a neuron holds. The stemmer truncates to 6 characters, so keys differing only past that prefix collapse and the most-recent collider wins. Real-world keys (names, relations, attributes) are distinct and recall cleanly at 400+. Near-duplicate keys are the failure mode, and the memory-harness design addresses them with explicit keys, full-token disambiguation, and a dedup/supersede policy.
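The collision is easy to reproduce. The sketch below assumes the stemmer does nothing more than lowercase a word and keep its first 6 characters, which is the only behaviour the text above states; the real stemmer may do more. `index_by` is a hypothetical helper that stores one value per derived key, so a repeated key is overwritten by the newest entry.

```rust
use std::collections::HashMap;

/// Assumed stemmer: lowercase, then keep at most the first 6 characters.
fn stem(word: &str) -> String {
    word.to_lowercase().chars().take(6).collect()
}

/// Store each value under a key derived by `key_of`; a repeated key is
/// overwritten, so the most recent collider wins.
fn index_by<F: Fn(&str) -> String>(
    entries: &[(String, String)],
    key_of: F,
) -> HashMap<String, String> {
    let mut map = HashMap::new();
    for (key, value) in entries {
        map.insert(key_of(key.as_str()), value.clone());
    }
    map
}

fn main() {
    let entries: Vec<(String, String)> = (0..500)
        .map(|i| (format!("project{}", i), format!("owner{}", i)))
        .collect();

    // Stemmed keys: project0 … project499 all collapse to "projec".
    let stemmed = index_by(&entries, stem);
    // Explicit full-token keys: every entry keeps its own slot.
    let explicit = index_by(&entries, |k| k.to_lowercase());

    println!("stemmed keys:  {}", stemmed.len());
    println!("explicit keys: {}", explicit.len());
    println!("projec -> {:?}", stemmed.get("projec"));
}
```

Keying on the full token is the simplest form of the disambiguation mentioned above: it trades some fuzziness for keys that stay distinct past the sixth character.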

The book test — ingest whole books (--features semantic)	result
corpus ingested (5 Project Gutenberg books)	598,684 words → 29,123 facts
store on disk (SQLite)	~8.7 MB (~298 B/sentence)
semantic space (23,919-word vocab, 256-dim)	~25.8 MB
lexical recall over 29k facts, one scope	single-digit ms
semantic fallback (paraphrase, O(N) today)	~tens of ms

Ingesting whole books builds a queryable knowledge base and a distributional semantic space that learns meaning from the text alone (whale → ship · sea · sperm), with zero shared words between a paraphrased query and the passage it recalls. Lexical recall stays in milliseconds at book scale; the fuzzy fallback is O(N), and the next win is an embedding cache + ANN index. Full write-up: SEMANTIC.md →.

Quickstart

Running in under a minute.

The default build is zero-dependency and targets wasm32-unknown-unknown; native tiers are opt-in features that never touch the wasm build. Pick how you want to drive it.

Build from source

You need the Rust toolchain once. build.sh checks cargo and builds the sqlite + secure + server features.

build

# clone and build everything
git clone https://github.com/gary23w/neuron-db
cd neuron-db && ./build.sh

# or pick features yourself
cargo build --release --features "sqlite secure server"

# install the neuron + serve binaries onto PATH
cargo install --path rust/neuron-core --features "sqlite secure server"

CLI: a Unix front door to the store

The neuron binary opens the SQLite file directly, no server required. The CLI, the MCP server, the HTTP server, and the in-browser wasm all route through one shared op vocabulary, so behavior is identical wherever you drive it.

neuron

# state & recall; '-' reads stdin; a miss exits 3, so recall composes in a shell
echo "the launch is Friday" | neuron observe user -
neuron get user "when is the launch"          # -> Friday
neuron get user "gate code" || research ...    # routes on the knowledge gap

# interactive shell: recall, spreading assoc, multi-hop chain, vars
neuron shell case
  case> chain Lena Marsh -> partner -> creditor
  Eliza Crowe  (via Lena Marsh -> Marcus Vane -> Eliza Crowe)

# pipe ANY app's output into a scope (transparent tee + substring filters)
my-service 2>&1 | neuron capture logs --tee --only ERROR
neuron run build -- cargo test       # spawn, tee, record, keep its exit code

# mount neuron-db as Claude Code's memory (never clobbers a config)
neuron mount claude

Rust library: embedded, no server

Add the crate and enable the sqlite feature for the durable NeuronDB.

main.rs

use neuron_core::db::NeuronDB;

let db = NeuronDB::open("app.db", 500);        // file path, max facts/scope

db.observe("user:42", "the plan is pro");          // INSERT
db.get("user:42", "what plan?");                  // Some("pro")

db.observe("user:42", "the plan is enterprise");   // "UPDATE": newest wins
db.get("user:42", "what plan?");                  // Some("enterprise")

db.forget("user:42", Some("plan"));             // DELETE by substring

db.turn("user:42", "my color is teal");           // a statement: turn stores it
let t = db.turn("user:42", "what is my color?");  // a question: t.reply == "teal."

HTTP: query it over the network

Start the server with serve /data/neurons.db 8088. Set NEURON_DB_KEY to require a bearer token on every request.

curl

# turn: store a fact (one endpoint per scope)
curl -d '{"message":"the api key is zeta-9931"}' \
     localhost:8088/v1/user:42

# get a single value back
curl -d '{"query":"what is the api key?"}' \
     localhost:8088/v1/user:42/get          # -> {"value":"zeta-9931"}

# top-k memory block for an LLM context window
curl -d '{"query":"api key","k":5}' \
     localhost:8088/v1/user:42/recall_many
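The same calls can be made from Rust without an HTTP crate. This is a sketch under assumptions: the paths and JSON fields are the ones in the curl examples, the Authorization: Bearer header follows the usual bearer-token convention for NEURON_DB_KEY, and the Content-Type header is a guess about what the server accepts. The helper names are invented, and the response is returned as raw text rather than parsed.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Minimal JSON string escaping for a request body.
fn json_string(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Build a raw HTTP/1.1 POST; `key` adds the optional bearer token.
fn build_request(host: &str, path: &str, body: &str, key: Option<&str>) -> String {
    let auth = match key {
        Some(k) => format!("Authorization: Bearer {}\r\n", k),
        None => String::new(),
    };
    format!(
        "POST {} HTTP/1.1\r\nHost: {}\r\nContent-Type: application/json\r\n{}Content-Length: {}\r\nConnection: close\r\n\r\n{}",
        path, host, auth, body.len(), body
    )
}

/// Send the request and return the raw response text.
fn send(addr: &str, request: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(request.as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}

fn main() {
    let body = format!("{{\"query\":{},\"k\":5}}", json_string("api key"));
    let key = std::env::var("NEURON_DB_KEY").ok();
    let request = build_request("localhost:8088", "/v1/user:42/recall_many", &body, key.as_deref());
    println!("{}", request);

    // Only touch the network when an address is passed, e.g. `localhost:8088`.
    if let Some(addr) = std::env::args().nth(1) {
        match send(&addr, &request) {
            Ok(response) => println!("{}", response),
            Err(err) => eprintln!("request failed: {}", err),
        }
    }
}
```

Run with no arguments it only prints the request, which is a quick way to compare it against what curl sends.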

Docker: run it as a service

The image binds 0.0.0.0 inside the container and stores the db on a /data volume. Put it behind your own TLS-terminating proxy for public exposure.

docker

docker build -t neuron-db .

# generate the key once and keep it: clients send it as the bearer token
export NEURON_DB_KEY=$(openssl rand -hex 16)
docker run -d -p 8088:8088 -v neuron-data:/data \
  -e NEURON_DB_KEY neuron-db

# or with compose (reads the exported key): persists + restarts on failure
docker compose up -d

Encrypted tier: the secret is the login

Values are AES-256-GCM ciphertext; the per-scope secret is supplied every call and never written to disk. A stolen .db is opaque.

secure

// Rust, feature "secure"
let v = SecureNeuronDB::open("vault.db");
v.put("alice", "alice-secret", "wifi password", "hunter2");
v.get("alice", "alice-secret", "what is the wifi password?"); // Some("hunter2")
v.get("alice", "WRONG",        "what is the wifi password?"); // None

# CLI
neuron --db vault.db --secret s3cr3t secure-put alice "wifi password" hunter2
neuron --db vault.db --secret s3cr3t secure-get alice "what is the wifi password?"

MCP: mount it as an LLM's long-term memory

Build the std-only stdio server and point any MCP client at it. The model gets recall / recall_associative / recall_chain / remember / note (typed neurons) / recall_var as tools; recall stays microseconds whether the user has 10 facts or 10 million.

mcp

# build the native stdio MCP server (no Node, no Python)
cargo build --release --features mcp --bin neuron-mcp

# point a client at it — e.g. a Claude Desktop mcpServers entry
{
  "neuron-db": {
    "command": "/path/to/neuron-mcp",
    "env": { "NEURON_MCP_DB": "/data/memory.db" }
  }
}

JavaScript / TypeScript: the npm package

The same Rust core compiled to WebAssembly, as a dependency-free npm package — @gary23w/neuron-db. One typed ES module over the raw mem() FFI; runs in Cloudflare Workers, the browser, Node, Deno, Bun.

npm

# install
npm i @gary23w/neuron-db

// then — browser, Node, Deno, or a Worker
import { NeuronDB } from "@gary23w/neuron-db";
const db = await NeuronDB.forBrowser(new URL("@gary23w/neuron-db/wasm", import.meta.url));

db.observeMany("user:42", ["the api key is zeta-9931"]);
db.recall("user:42", "what is the api key?");   // -> ["the api key is zeta-9931"]
db.route("user:42", userMessage);              // {type:"answer"|"escalate"|"fetch"|"store", value, facts}

A runnable login + memory console: examples/npm-demo.

Why it's different

Clear about the trade.

neuron-db isn't trying to replace a vector DB everywhere. It owns the cheap, high-volume lookup-and-adapt path, and the table below says where it doesn't fit.

✓ Where neuron-db wins

Many small facts. Preferences, entities, events, settings, the exact shape of LLM long-term memory, at ~130× the density of float32 vectors.
No model, no GPU, no index to maintain. Microsecond recall and O(1) updates with nothing to re-embed or re-index.
Runs anywhere. A 1 MB wasm worker at the edge, a static binary on a box, embedded in your Rust app, or a Postgres extension.
Adapts from use. The plastic tier learns what matters and forgets what doesn't, without a training step.
Fuzzy recall without a model. The optional semantic tier grounds meaning in a corpus (Random Indexing) so paraphrase resolves embedding-free — no vector DB, no GPU.

⚠ Reach for vectors when

You need zero-shot semantic match. The built-in semantic tier resolves paraphrase like "the thing I use to get online" → wifi embedding-free — but only after it has seen related text. For semantic match with no corpus to learn from, a pretrained embedding still wins.
Cross-lingual recall or similarity ranking is the core requirement.
Near-duplicate keys. Ten "meeting on {date}" entries collide on stems; use explicit keys and the dedup/supersede harness, or a vector tier.
ACID rows and joins. neuron-db is an associative memory, not a relational database.

It's scalar-first: neuron-db handles the cheap, high-volume lookup-and-adapt path at much higher density, and you add a vector tier only for the slice of queries that truly need meaning matching, paying for vectors only there.

