pure Rust · zero dependencies · compiles to WebAssembly

An associative memory you can run anywhere.

Write facts in plain language, recall them by meaning. No tables, no schema, no embeddings, no model required. The core is pure Rust with zero dependencies and fits in a 1 MB wasm worker. Durable storage, encryption, an HTTP server, and a thinking cortex are opt-in.

A local model thinks; neuron-db (as WASM) remembers and grounds it. The model reasons over a knowledge-gap signal and searches the live web when it doesn't know. Everything runs in your browser: no server, no API key.

~/app · neuron
./build.sh
neuron --db app.db turn me 'my plan is pro'
neuron --db app.db get  me 'what plan am i on?'      # -> pro

~130× denser than a 1536-d vector store · ~48 B / fact serialized · ~6 µs recall, flat to 1M facts · 0 deps in the default build · MIT licensed

130× more facts per GiB vs float32 vectors
215k/s neurons created (Rust, in-memory)
48 B serialized per fact
400/400 recall@1 on distinct keys
0 deps in the default build (std-only, wasm)

LLM memory · the synapse

Link infinite neurons. At no cost.

An LLM's context window is small; neuron-db is the memory that lives outside it. A relational question — "the timezone of the manager of the owner of Aurora" — normally forces the model to recall a fact, wait, recall the next, wait… N hops = N+1 model calls. recall_chain collapses that: the model sends one path, and the synapse walks the whole chain server-side, each hop a microsecond recall. Depth is paid in microseconds, not model turns.

recall_chain(start="Aurora", path=[owner, manager, timezone])
Aurora ──owner──▶ Kenji
Kenji ──manager─▶ Marisol
Marisol ──timezone▶ WET
3 hops · resolved server-side · 2 model calls · ~40 µs synapse

A 3-hop or a 30-hop answer costs the LLM the same two calls. The recall itself is free — see it fire in the 3D synapse demo →
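
To make the chain walk concrete, here is a minimal std-only sketch. It is not neuron-db's implementation: the `Graph` type, the `(subject, relation) -> object` layout, and this `recall_chain` signature are assumptions chosen to mirror the Aurora example above. The point it illustrates is that each hop is one in-memory lookup, so depth adds lookups rather than model turns.

```rust
use std::collections::HashMap;

/// Hypothetical triple store: (subject, relation) -> object.
type Graph = HashMap<(String, String), String>;

/// Walk `path` from `start`, one lookup per hop.
/// Returns every node visited, or None if a hop has no stored fact.
fn recall_chain(graph: &Graph, start: &str, path: &[&str]) -> Option<Vec<String>> {
    let mut visited = vec![start.to_string()];
    let mut current = start.to_string();
    for relation in path {
        let next = graph.get(&(current.clone(), relation.to_string()))?;
        visited.push(next.clone());
        current = next.clone();
    }
    Some(visited)
}

/// The three facts from the example above.
fn demo_graph() -> Graph {
    let mut g = Graph::new();
    for (s, r, o) in [
        ("Aurora", "owner", "Kenji"),
        ("Kenji", "manager", "Marisol"),
        ("Marisol", "timezone", "WET"),
    ] {
        g.insert((s.to_string(), r.to_string()), o.to_string());
    }
    g
}

fn main() {
    let g = demo_graph();
    // One call carries the whole path; the walk happens next to the data.
    let hops = recall_chain(&g, "Aurora", &["owner", "manager", "timezone"]);
    println!("{:?}", hops);
}
```

A path with a missing hop returns `None` instead of a partial answer, which is one reasonable choice; returning the partial walk would be another.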

live, gpt-4o-mini · 1k–50k facts | neuron-db | markdown dump
multi-hop accuracy (1/2/3 hops) | 100% | 92–100% (degrades)
context cost / turn | ~1.1k tok (flat) | 9.9k → 447k
at 6,000 stored facts | $0.19 /1k-q | $10.06 /1k-q
at 50,000 facts | 100% · 1.1k tok | won't fit 128k window
model calls per answer (any depth) | 2 | 1
selective recall in 1M facts | 100% · ~6 µs | context-bound

The markdown dump reinjects the whole memory every turn, so its cost grows linearly and it eventually overruns the window. neuron-db injects only what it recalled: flat cost, no ceiling, and accuracy that matches or beats the dump. Read the full comparison →

Mount it in one line. neuron-mcp is a native std-only stdio MCP server — point any MCP client (Claude Desktop/Code, Cursor) at the binary and your model gets the full toolset: recall / recall_associative / recall_chain / remember / note (typed neurons) / recall_var. No Node, no Python, no HTTP process. Recall stays microseconds whether the user has 10 facts or 10 million.

The model

Two ideas. That's the whole database.

A fact is a sentence, like "the api key is zeta-9931". neuron-db keeps the surprising word as the retrievable value and indexes the rest as cues. A scope is a named bag of facts (user:42). You insert by stating things and read by asking questions. Retrieval is associative (cue overlap), so you never declare a column or write SQL.

NeuronDB (one .db file)
├─ scope "user:42"   facts: "the plan is pro", "region us-west-2"
├─ scope "user:43"   facts: "the plan is free"
└─ scope "team:ops"  facts: "the on-call is Dana"

A scope is like a row keyed by its id, and its facts are the columns, except you never declare them. "what is the api key?" finds the fact above without you naming a column.

Intent | SQL-ish | neuron-db
insert | INSERT | observe(s, "the plan is pro")
read one | SELECT … LIMIT 1 | get(s, "what plan?") → "pro"
read + meta | SELECT * | recall(s, q) → {value, fact, coverage}
update | UPDATE | just observe again; newest wins
delete | DELETE | forget(s, "plan") by substring
converse | · | turn(s, msg), stores or answers in one call
encrypt | · | SecureNeuronDB.put/get(…)

There is no UPDATE; facts aren't rows you mutate. To change an answer, state the new fact; recall prefers the most recent match. To make one stick, reinforce it.
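
As a rough illustration of cue-overlap recall with newest-wins, here is a toy std-only scope. It is a sketch, not the real store: the `Scope` and `Fact` types are hypothetical, and picking the last word of the sentence as the value is a stand-in assumption for however neuron-db decides which word is "surprising".

```rust
/// A stored fact: the cue words plus the value to return.
struct Fact {
    cues: Vec<String>,
    value: String,
}

/// Lowercase each word and trim surrounding punctuation (hyphens are kept).
fn tokens(text: &str) -> Vec<String> {
    text.split_whitespace()
        .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric() && c != '-').to_lowercase())
        .filter(|w| !w.is_empty())
        .collect()
}

/// Hypothetical scope: facts in insertion order, newest last.
struct Scope {
    facts: Vec<Fact>,
}

impl Scope {
    fn new() -> Self {
        Scope { facts: Vec::new() }
    }

    /// ASSUMPTION: the last word is the value, every other word is a cue.
    fn observe(&mut self, sentence: &str) {
        let mut words = tokens(sentence);
        if let Some(value) = words.pop() {
            self.facts.push(Fact { cues: words, value });
        }
    }

    /// Return the value of the fact sharing the most words with the question.
    /// `>=` means a tie goes to the later (newer) fact.
    fn get(&self, question: &str) -> Option<&str> {
        let q = tokens(question);
        let mut best: Option<(usize, &Fact)> = None;
        for fact in &self.facts {
            let overlap = fact.cues.iter().filter(|c| q.contains(*c)).count();
            if overlap > 0 && best.map_or(true, |(b, _)| overlap >= b) {
                best = Some((overlap, fact));
            }
        }
        best.map(|(_, f)| f.value.as_str())
    }
}

fn main() {
    let mut scope = Scope::new();
    scope.observe("the api key is zeta-9931");
    scope.observe("the plan is pro");
    scope.observe("the plan is enterprise"); // restating changes the answer
    println!("{:?}", scope.get("what is the api key?"));
    println!("{:?}", scope.get("what plan?"));
}
```

Nothing is mutated on "update": both plan facts stay in the vector, and the read path simply prefers the newer one when the overlap is equal.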

Tiers & capabilities

One core, every tier opt-in. Add only what you need.

Every tier shares the same associative store. Plasticity, sharding, durability, encryption, a network server, embedding-free semantic recall, and an MCP mount are opt-in Cargo features, so the default build stays std-only and wasm-clean.

Neuron std-only

In-memory associative store, the default. A stem→fact inverted index keeps recall sub-linear. Recall in microseconds, no deps, no I/O.

n.observe("the wifi password is hunter2");
n.recall("what is the wifi password?") → "hunter2"

PlasticNeuron adaptive

Recall that adapts: strength on use, lazy exponential decay on disuse, Hebbian links, and a neurotransmitter-style spreading-activation recall. All O(1) scalar updates, with no re-embedding and no re-indexing.

for _ in 0..4 { n.reinforce(id, 1.0); }
n.recall_spreading(q, 2, 10, 0.6, 6);

NeuronRouter shard

One scope recalls best when it's small. The router shards across many small neurons, auto-spills into new shards, and fans a query out to return the single best value.

let mut r = NeuronRouter::new(128);
r.get("what is the north gate code?")

NeuronDB sqlite

A durable database of scopes in one SQLite file, WAL mode behind a process-wide lock. Each write persists immediately; concurrent threads are safe. Ships the neuron CLI.

let db = NeuronDB::open("app.db", 500);
db.turn("user:42", "my color is teal");

SecureNeuronDB secure

Values are AES-256-GCM ciphertext, the index is a keyed hash, and the per-scope secret is supplied per call and never stored. A stolen .db file is opaque. Lose the secret, lose the data.

v.put("alice", "alice-secret", "wifi", "hunter2");
v.get("alice", "WRONG", q) → None

HTTP server server

One endpoint per scope from a std TcpListener: turn, get, recall, batch observe, top-k memory block, forget, metrics. Optional Bearer auth via NEURON_DB_KEY.

POST /v1/{scope} {message} → turn
POST /v1/{scope}/recall_many {query,k}

SemanticSpace semantic

Embedding-free fuzzy recall. A corpus-distributional space (Random Indexing, 256-dim, std-only, no model) grounds meaning in co-occurrence, so open-vocabulary paraphrase resolves — "the thing I use to get online" finds the wifi fact. Lexical recall stays the fast path; this is the fallback.

db.train_semantic(corpus);
recall("a gigantic sea creature") → "…Leviathan…"

neuron-mcp mcp

A native std-only stdio MCP server. Point any MCP client (Claude Desktop/Code, Cursor) at the binary and your model gets recall / recall_associative / recall_chain / remember / note / recall_var / forget / stats as tools — no Node, no Python, no HTTP process.

cargo build --release --features mcp --bin neuron-mcp

Beyond a lookup table

A memory that thinks and grows.

Most stores are key-value lookups. neuron-db has a second life: an emergence cortex that reads what the store recalled and copies the answer out, plus a plastic hippocampus that adapts in the moment, without gradient descent. The store is the bloodstream; the cortex is the brain that never has to think about the whole body at once.

cue ▼
① STORE TIER  PlasticNeuron · scales to millions
cheap scalar plasticity decides what is relevant:
• strength (bumped on use), lazy exponential decay
• Hebbian graph, one-hop spreading activation
→ returns a small working set (a handful of facts)
② MODEL TIER  gary-neuron cortex · bounded window
runs only over that working set (192-384 tokens):
• cortex reads the context and answers / completes
• plastic hippocampus does surprise-gated adaptation
→ cost is O(working set), never O(database)
③ SLEEP  consolidation · off the hot path
folds new episodes into cortex weights; merges & prunes the store.
the model literally grows; the store stays lean.

No query ever runs a neural net over the whole database. That split is the entire point: it's how a plastic, thinking, growing memory stays fast.

Adaptation

What you use often surfaces first. Two facts collide on "meeting"; reinforcing "monday" overtakes recency after 2 uses.

strength · O(1)

Forgetting

Unused facts decay cleanly: w·½^(age/half_life), computed lazily at read time. Decay only reranks; it never deletes.

decay · O(1) lazy
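
The decay formula above is small enough to sketch directly. The `Trace` struct and its field names are hypothetical; only the formula w·½^(age/half_life) and the idea of computing it at read time come from the text.

```rust
/// A fact's stored strength plus the tick at which it was last used.
/// (Hypothetical layout, for illustration only.)
struct Trace {
    strength: f64,
    last_used: u64,
}

impl Trace {
    /// Effective weight at `now`: strength * 0.5^(age / half_life).
    /// Nothing is written back, so an idle fact costs nothing until it is read.
    fn weight_at(&self, now: u64, half_life: f64) -> f64 {
        let age = now.saturating_sub(self.last_used) as f64;
        self.strength * 0.5_f64.powf(age / half_life)
    }
}

fn main() {
    let t = Trace { strength: 1.0, last_used: 0 };
    for now in [0, 25, 50, 100, 200] {
        println!("idle {:>3} ticks -> weight {:.2}", now, t.weight_at(now, 50.0));
    }
}
```

Because the weight is derived from `last_used` on demand, there is no background sweep: decay only changes the ranking a reader sees, and the stored fact is untouched.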

Association

Facts recalled together wire together. A Hebbian link grows 0.5 → 8.0 over 5 rounds; spreading activation then surfaces the associate.

link · O(1)

Spreading recall

Neurotransmitter-style: release activation at cued facts, gate conflicting relations off, spread across synapses with reuptake decay, reaching a 2-hop fact that shares no word with the query.

spread · O(neighbors)
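
A bare-bones version of spreading activation can be sketched as below. This is an illustration under assumptions, not the store's algorithm: the `spread` function, its parameters (`hops`, `retain`), and the example link weights are all hypothetical, and relation gating is left out. It shows how a fact two links away from the cue can receive activation without sharing a word with the query.

```rust
use std::collections::HashMap;

/// Spread activation from `seeds` across weighted links for `hops` rounds.
/// Each hop passes on `retain` (0..1) of a node's activation, scaled by the
/// link weight; the loss per hop stands in for reuptake decay.
fn spread(
    links: &HashMap<&str, Vec<(&str, f64)>>,
    seeds: &[&str],
    hops: usize,
    retain: f64,
) -> HashMap<String, f64> {
    // Cued facts start fully activated.
    let mut frontier: HashMap<String, f64> =
        seeds.iter().map(|s| (s.to_string(), 1.0)).collect();
    let mut total = frontier.clone();

    for _ in 0..hops {
        let mut next: HashMap<String, f64> = HashMap::new();
        for (node, activation) in &frontier {
            if let Some(neighbors) = links.get(node.as_str()) {
                for (neighbor, weight) in neighbors {
                    *next.entry(neighbor.to_string()).or_insert(0.0) +=
                        activation * weight * retain;
                }
            }
        }
        for (node, activation) in &next {
            *total.entry(node.clone()).or_insert(0.0) += activation;
        }
        frontier = next;
    }
    total
}

fn main() {
    let mut links: HashMap<&str, Vec<(&str, f64)>> = HashMap::new();
    links.insert("wifi password", vec![("router", 0.8)]);
    links.insert("router", vec![("isp account", 0.5)]);

    let activation = spread(&links, &["wifi password"], 2, 0.6);
    let mut ranked: Vec<_> = activation.iter().collect();
    ranked.sort_by(|a, b| b.1.partial_cmp(a.1).unwrap());
    println!("{:?}", ranked);
}
```

The work per hop is proportional to the links leaving the current frontier, which matches the O(neighbors) label on this card.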

The emergence cortex. An 8-layer, 96-dim, 384-context transformer (the gary-neuron model) baked into the binary with include_bytes!, with no files and no network. It was trained to emergence: it learned to read its context window and copy values out of it. The pure-Rust forward pass matches the numpy and TypeScript ports to 0.0000 MSE, and runs inside a 1 MB WebAssembly worker.

Live demos

See it run. In your browser.

Every demo runs client-side, with nothing sent to a server. The live-memory and cortex views execute the real Rust core compiled to WebAssembly; the rest are faithful, interactive views of the documented behaviour.

Benchmarks

Measured, and bounded honestly.

From the Rust core (release, single core); the original Python prototype is archived on the legacy-python branch. Numbers are indicative, reproducible with cargo run --release --bin bench. The honest failure mode is published too.

Path · neurons/sec (3 facts each) | throughput
Rust core, in-memory (Neuron) | ~215,000
legacy-python, in-memory Neuron (archived) | ~30,000
legacy-python, SQLite-backed NeuronDB.turn (archived) | ~1,200

Operation (legacy-python reference, archived) | result
write throughput (observe) | ~55,000 facts/s
secure put (AES-GCM + keyed index) | ~390 /s
secure get | ~270 µs
router recall (2,000 facts / 16 shards) | ~0.6 ms
arithmetic op (turn evaluates math) | ~12 µs

The SQLite number is the realistic API rate (a durable write per neuron). In-memory is the ceiling. Rust is ~7× the Python in-memory rate, with true multi-core concurrency (no GIL) and a single static binary.

Facts in scope (Rust core) | selective cue | broad cue (O(N)) | legacy-python
1,000 | ~5 µs | ~0.2 ms | ~380 µs
10,000 | ~5 µs | ~2 ms | ~1.4 ms
50,000 | ~5 µs | ~11 ms | ·
1,000,000 | ~6 µs | ~0.4 s | ·

Recall cost is the frequency of the queried words, not the fact count. A selective cue (a distinctive word) hits ~1 fact via the stem→fact inverted index, so it stays flat at ~5-6 µs from 1k to 1,000,000 facts. A broad cue (a word in every fact) scans the whole scope (O(N)). A 2025 pass made the candidate scan ~4× faster (binary-search over sorted stems + precomputed positions) and made the index incremental, so appending a fact then recalling it stays ~10 µs/turn even at 1M facts. Give each memory a distinct subject, or shard with the router.
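
The selective-versus-broad distinction falls out of any word→fact inverted index, which the sketch below demonstrates. It is a simplified stand-in, not the core's index: the `Index` type is hypothetical and it keys on whole lowercase words rather than sorted stems. What it shows is that the work for a cue is the length of that word's posting list, and that appending a fact only touches the lists for its own words.

```rust
use std::collections::HashMap;

/// Hypothetical inverted index: word -> ids of the facts containing it.
struct Index {
    facts: Vec<String>,
    postings: HashMap<String, Vec<usize>>,
}

impl Index {
    fn new() -> Self {
        Index { facts: Vec::new(), postings: HashMap::new() }
    }

    /// Append a fact and update only the posting lists of its own words.
    fn observe(&mut self, fact: &str) {
        let id = self.facts.len();
        self.facts.push(fact.to_string());
        for word in fact.split_whitespace() {
            let list = self.postings.entry(word.to_lowercase()).or_default();
            // Skip the push if this fact already appears (repeated word).
            if list.last() != Some(&id) {
                list.push(id);
            }
        }
    }

    /// How many facts a one-word cue has to examine.
    fn candidates(&self, cue: &str) -> usize {
        self.postings.get(&cue.to_lowercase()).map_or(0, |list| list.len())
    }
}

fn main() {
    let mut idx = Index::new();
    for i in 0..1000 {
        idx.observe(&format!("the code for gate{} is {}", i, 1000 + i));
    }
    println!("selective cue 'gate42': {} candidate(s)", idx.candidates("gate42"));
    println!("broad cue 'the': {} candidate(s)", idx.candidates("the"));
}
```

Growing the store from 1,000 to 1,000,000 facts leaves the `gate42` list at one entry, while the `the` list grows with the store. That is the flat-versus-O(N) split in the table above.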

Turn | facts | recall@1 (rotating probe) | latency | store
2,000 | 1,015 | 50% | 317 µs | 38 KB
4,000 | 1,724 | 72% | 539 µs | 64 KB
6,000 | 1,630 | 82% | 454 µs | 61 KB
8,000 | 1,383 | 90% | 416 µs | 52 KB
10,000 | 1,262 | 74% | 413 µs | 48 KB

Latency stays flat (~0.3-0.5 ms) across 10,000 turns; consolidation keeps the fact count bounded under continuous writes (between roughly 1,000 and 1,700 in this run, ending near 1,300). The accuracy dips are consolidation pruning a sampled fact, i.e. correct forgetting, not a regression.

Effect | Measured behaviour
Adaptation | Two facts collide on "meeting". After 2 reinforcements of "monday" it overtakes recency; by 40 uses w(mon)=42.0 vs w(fri)=1.0.
Forgetting | Untouched fact decays cleanly (half_life=50): 0.99 → 0.70 → 0.49 → 0.25 → 0.06 at 0/25/50/100/200 idle ticks.
Association | Co-activating two unrelated facts grows their Hebbian link 0.5 → 8.0 over 5 rounds; spreading activation then surfaces the associate.
Consolidation | 5 duplicates + 1 decayed fact consolidate 6 → 1 (4 merged, 1 pruned), recall preserved.

None of these are visible to a static recall@1 test, which is exactly why plasticity is measured as a sequence of uses over time. Tests: cargo test over tests/plastic.rs (Rust); the original Python suites are archived on legacy-python.
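
The consolidation row can be reproduced with a small merge-and-prune pass. This is a sketch of the general idea only: the `consolidate` function, summing weights on merge, and the 0.1 prune threshold are assumptions, not the documented policy.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Fact {
    text: String,
    weight: f64,
}

/// Counts of what one consolidation pass did.
#[derive(Debug, PartialEq)]
struct Report {
    merged: usize,
    pruned: usize,
}

/// Drop facts whose weight fell below `min_weight`, and fold exact
/// duplicates into the first copy (ASSUMPTION: weights add on merge).
fn consolidate(facts: Vec<Fact>, min_weight: f64) -> (Vec<Fact>, Report) {
    let mut kept: Vec<Fact> = Vec::new();
    let mut report = Report { merged: 0, pruned: 0 };
    for fact in facts {
        if fact.weight < min_weight {
            report.pruned += 1;
            continue;
        }
        if let Some(i) = kept.iter().position(|k| k.text == fact.text) {
            kept[i].weight += fact.weight;
            report.merged += 1;
        } else {
            kept.push(fact);
        }
    }
    (kept, report)
}

fn main() {
    // Five duplicates plus one fact that has decayed to 0.06.
    let mut facts: Vec<Fact> = (0..5)
        .map(|_| Fact { text: "the meeting is monday".into(), weight: 1.0 })
        .collect();
    facts.push(Fact { text: "the old room is b12".into(), weight: 0.06 });

    let (kept, report) = consolidate(facts, 0.1);
    println!("{} fact(s) left, {:?}", kept.len(), report);
}
```

The pass runs off the hot path in this design, so its linear duplicate scan is acceptable for a sketch; a real store would more likely reuse its index to find duplicates.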

Keys | facts | recall@1
Distinct (north wifi password, spare gate code…) | 400 | 400/400 · 100%
Colliding stems (project0…project499 → all stem to projec) | 500 | ~1/500

Recall accuracy depends on whether keys are lexically distinct, not how many facts a neuron holds. The stemmer truncates to 6 characters, so keys differing only past that prefix collapse and the most-recent collider wins. Real-world keys (names, relations, attributes) are distinct and recall cleanly at 400+. Near-duplicate keys are the failure mode, and the memory-harness design addresses them with explicit keys, full-token disambiguation, and a dedup/supersede policy.
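
The collision is easy to see with a truncating stemmer written out. This is a minimal sketch that assumes the stemmer does nothing except lowercase and keep the first six characters; the real one may apply other rules first.

```rust
use std::collections::HashSet;

/// Truncating stemmer: lowercase, then keep at most the first 6 characters.
/// (ASSUMPTION: truncation is the only rule.)
fn stem(word: &str) -> String {
    word.to_lowercase().chars().take(6).collect()
}

/// Number of distinct stems produced by a set of keys.
fn distinct_stems(keys: &[String]) -> usize {
    keys.iter().map(|k| stem(k)).collect::<HashSet<_>>().len()
}

fn main() {
    // 500 keys that differ only past the sixth character.
    let colliding: Vec<String> = (0..500).map(|i| format!("project{}", i)).collect();
    println!("project0..project499 -> {} stem(s)", distinct_stems(&colliding));

    // Lexically distinct keys keep distinct stems.
    let distinct: Vec<String> = ["north", "wifi", "password", "spare", "gate", "code"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    println!("6 distinct words -> {} stem(s)", distinct_stems(&distinct));
}
```

All 500 `projectN` keys share one stem, so only the most recent one is reachable by that cue, which is the ~1/500 row in the table.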

The book test: ingest whole books (--features semantic) | result
corpus ingested (5 Project Gutenberg books) | 598,684 words → 29,123 facts
store on disk (SQLite) | ~8.7 MB (~298 B/sentence)
semantic space (23,919-word vocab, 256-dim) | ~25.8 MB
lexical recall over 29k facts, one scope | single-digit ms
semantic fallback (paraphrase, O(N) today) | ~tens of ms

Ingesting whole books builds a queryable knowledge base and a distributional semantic space that learns meaning from the text alone (whale → ship · sea · sperm), with zero shared words between a paraphrased query and the passage it recalls. Lexical recall stays in milliseconds at book scale; the fuzzy fallback is O(N), and the next win is an embedding cache + ANN index. Full write-up: SEMANTIC.md →.
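
For readers new to Random Indexing, here is a toy std-only version of the idea. It is a sketch under assumptions, not the SemanticSpace code: the sentence-wide context window, the 8 nonzero entries per index vector, and the hash-based vector generation are all illustrative choices. Each word gets a fixed sparse random vector, and a word's meaning vector is the sum of the index vectors of the words it appears with.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

const DIM: usize = 256; // dimensionality named in the text
const NONZERO: usize = 8; // ASSUMPTION: sparse entries per index vector

/// Deterministic sparse index vector for a word: +1/-1 at hashed slots.
fn index_vector(word: &str) -> [f32; DIM] {
    let mut v = [0.0f32; DIM];
    for i in 0..NONZERO {
        let mut h = DefaultHasher::new();
        (word, i).hash(&mut h);
        let bits = h.finish();
        let slot = (bits % (DIM as u64)) as usize;
        let sign = if ((bits >> 32) & 1) == 0 { 1.0 } else { -1.0 };
        v[slot] += sign;
    }
    v
}

/// Context vector of a word = sum of the index vectors of the other
/// words in each sentence it appears in (ASSUMPTION: sentence-wide window).
fn train(corpus: &[&str]) -> HashMap<String, [f32; DIM]> {
    let mut space: HashMap<String, [f32; DIM]> = HashMap::new();
    for sentence in corpus {
        let words: Vec<String> =
            sentence.split_whitespace().map(|w| w.to_lowercase()).collect();
        for (i, word) in words.iter().enumerate() {
            let ctx = space.entry(word.clone()).or_insert([0.0; DIM]);
            for (j, other) in words.iter().enumerate() {
                if i != j {
                    let iv = index_vector(other);
                    for d in 0..DIM {
                        ctx[d] += iv[d];
                    }
                }
            }
        }
    }
    space
}

fn cosine(a: &[f32; DIM], b: &[f32; DIM]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

fn main() {
    let space = train(&["whale swims sea", "shark swims sea", "router sends packets"]);
    println!("whale~shark  {:.2}", cosine(&space["whale"], &space["shark"]));
    println!("whale~router {:.2}", cosine(&space["whale"], &space["router"]));
}
```

"whale" and "shark" never co-occur here, yet they end up close because they keep the same company. No model is involved, only counting, which is also why the space knows nothing about words it has not seen in a corpus.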

Storage density

~130× more facts per GiB than a vector store.

A vector DB spends 1.5-12 KB per item on a dense embedding so it can retrieve by content. neuron-db adds no embedding bytes at all. Its retrieval index is stems plus a few scalars, which fit inside the same bytes as the text. ~48 bytes/fact serialized ≈ 22.4 million facts per GiB.

neuron-db serialized | 1.0×
Vector DB · binary quant 1536-d | 5.4×
Vector DB · int8 SQ 1536-d | 33.4×
Vector DB · f32 768-d | 65.3×
Vector DB · f32 1536-d | 129.3×
pgvector / Qdrant f32 + HNSW | 193.2×
Vector DB · f32 3072-d (OpenAI large) | 257.2×

Bars show relative facts-per-GiB (neuron-db = 100%). The right column reads as "neuron-db fits this many more facts in the same disk." Even against binary quantization, which throws away most recall accuracy, neuron-db is still ~5× denser. The trade is real: this is cue-and-association recall with scalar plasticity, not cosine-similarity semantic search. It's scalar-first by design, with an optional embedding-free semantic tier for paraphrase; reach for a dense vector tier only when ranked similarity search or cross-lingual recall is the core requirement.
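
The headline arithmetic can be checked in a few lines. This sketch counts only raw bytes: 48 B per serialized fact (from the text) against 4 bytes per float32 dimension. The chart's figures sit slightly above these raw ratios (129.3× rather than 128×), which suggests they also include some per-item overhead; that overhead is not modelled here.

```rust
const GIB: u64 = 1 << 30; // bytes in one GiB

/// How many items of a given size fit in one GiB.
fn items_per_gib(bytes_per_item: u64) -> u64 {
    GIB / bytes_per_item
}

/// Raw size of a float32 embedding, ignoring any index or metadata.
fn f32_vector_bytes(dims: u64) -> u64 {
    dims * 4
}

fn main() {
    let fact_bytes = 48; // ~48 B per serialized fact, from the text
    println!("neuron-db: {} facts per GiB", items_per_gib(fact_bytes));
    for dims in [768u64, 1536, 3072] {
        let bytes = f32_vector_bytes(dims);
        println!(
            "f32 {:>4}-d: {:>5} B/item, {:>7} items per GiB, raw ratio {:.1}x",
            dims,
            bytes,
            items_per_gib(bytes),
            bytes as f64 / fact_bytes as f64
        );
    }
}
```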

Quickstart

Running in under a minute.

The default build is zero-dependency and targets wasm32-unknown-unknown; native tiers are opt-in features that never touch the wasm build. Pick how you want to drive it.

Build from source

You need the Rust toolchain once. build.sh checks cargo and builds the sqlite + secure + server features.

build
# clone and build everything
git clone https://github.com/gary23w/neuron-db
cd neuron-db && ./build.sh

# or pick features yourself
cargo build --release --features "sqlite secure server"

# install the neuron + serve binaries onto PATH
cargo install --path rust/neuron-core --features "sqlite secure server"

CLI: query the file from your shell

The neuron CLI opens the SQLite file directly, no server required. The db path comes from --db or $NEURON_DB.

neuron
neuron --db app.db observe user:42 "the plan is pro"      # INSERT
neuron --db app.db get     user:42 "what plan am i on?"   # -> pro
neuron --db app.db recall  user:42 "what plan?"          # value + fact + coverage
neuron --db app.db turn    user:42 "my color is teal"    # store or answer
neuron --db app.db forget  user:42 "plan"               # DELETE matching facts
neuron --db app.db list                                 # all scope ids
neuron --db app.db --json get user:42 "what plan?"      # {"value":"pro"}

Rust library: embedded, no server

Add the crate and enable the sqlite feature for the durable NeuronDB.

main.rs
use neuron_core::db::NeuronDB;

let db = NeuronDB::open("app.db", 500);        // file path, max facts/scope

db.observe("user:42", "the plan is pro");          // INSERT
db.get("user:42", "what plan?");                  // Some("pro")

db.observe("user:42", "the plan is enterprise");   // "UPDATE": newest wins
db.get("user:42", "what plan?");                  // Some("enterprise")

db.forget("user:42", Some("plan"));             // DELETE by substring
db.turn("user:42", "my color is teal");           // a statement: turn stores it
let t = db.turn("user:42", "what is my color?");  // a question: t.reply == "teal."

HTTP: query it over the network

Start the server with serve /data/neurons.db 8088. Set NEURON_DB_KEY to require a bearer token on every request.

curl
# turn: store a fact (one endpoint per scope)
curl -d '{"message":"the api key is zeta-9931"}' \
     localhost:8088/v1/user:42

# get a single value back
curl -d '{"query":"what is the api key?"}' \
     localhost:8088/v1/user:42/get          # -> {"value":"zeta-9931"}

# top-k memory block for an LLM context window
curl -d '{"query":"api key","k":5}' \
     localhost:8088/v1/user:42/recall_many
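
If you would rather call the server from Rust than from curl, a std-only client can be sketched as below. The endpoint paths and JSON bodies are the ones shown in the curl examples; everything else is an assumption, including the `build_request` and `send` helpers, the `application/json` content type, and the `Authorization: Bearer` header as the way the NEURON_DB_KEY token is presented.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Build a raw HTTP/1.1 POST for one of the scope endpoints.
fn build_request(host: &str, path: &str, json_body: &str, bearer: Option<&str>) -> String {
    let mut req = format!(
        "POST {} HTTP/1.1\r\nHost: {}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n",
        path,
        host,
        json_body.len()
    );
    // ASSUMPTION: the token is sent as a standard Bearer header.
    if let Some(token) = bearer {
        req.push_str(&format!("Authorization: Bearer {}\r\n", token));
    }
    req.push_str("\r\n");
    req.push_str(json_body);
    req
}

/// Send the request and return the raw response (status line, headers, body).
#[allow(dead_code)]
fn send(addr: &str, request: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(request.as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}

fn main() {
    let key = std::env::var("NEURON_DB_KEY").ok();
    let req = build_request(
        "localhost:8088",
        "/v1/user:42/get",
        r#"{"query":"what is the api key?"}"#,
        key.as_deref(),
    );
    // Printing instead of sending keeps the example runnable without a server.
    // With one running: send("localhost:8088", &req)
    println!("{}", req);
}
```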

Docker: run it as a service

The image binds 0.0.0.0 inside the container and stores the db on a /data volume. Put it behind your own TLS-terminating proxy for public exposure.

docker
docker build -t neuron-db .
docker run -d -p 8088:8088 -v neuron-data:/data \
  -e NEURON_DB_KEY=$(openssl rand -hex 16) neuron-db

# or with compose: persists + restarts on failure
NEURON_DB_KEY=$(openssl rand -hex 16) docker compose up -d

Encrypted tier: the secret is the login

Values are AES-256-GCM ciphertext; the per-scope secret is supplied every call and never written to disk. A stolen .db is opaque.

secure
// Rust, feature "secure"
let v = SecureNeuronDB::open("vault.db");
v.put("alice", "alice-secret", "wifi password", "hunter2");
v.get("alice", "alice-secret", "what is the wifi password?"); // Some("hunter2")
v.get("alice", "WRONG",        "what is the wifi password?"); // None

# CLI
neuron --db vault.db --secret s3cr3t secure-put alice "wifi password" hunter2
neuron --db vault.db --secret s3cr3t secure-get alice "what is the wifi password?"

MCP: mount it as an LLM's long-term memory

Build the std-only stdio server and point any MCP client at it. The model gets recall / recall_associative / recall_chain / remember / note (typed neurons) / recall_var as tools; recall stays microseconds whether the user has 10 facts or 10 million.

mcp
# build the native stdio MCP server (no Node, no Python)
cargo build --release --features mcp --bin neuron-mcp

# point a client at it — e.g. a Claude Desktop mcpServers entry
{
  "neuron-db": {
    "command": "/path/to/neuron-mcp",
    "env": { "NEURON_MCP_DB": "/data/memory.db" }
  }
}

Why it's different

Honest about the trade.

neuron-db isn't trying to replace a vector DB everywhere. It owns the cheap, high-volume, lookup-and-adapt path, and tells you exactly where it doesn't fit.

Where neuron-db wins

  • Many small facts. Preferences, entities, events, settings: exactly the shape of LLM long-term memory, at ~130× the density of float32 vectors.
  • No model, no GPU, no index to maintain. Microsecond recall and O(1) updates with nothing to re-embed or re-index.
  • Runs anywhere. A 1 MB wasm worker at the edge, a static binary on a box, embedded in your Rust app, or a Postgres extension.
  • Adapts from use. The plastic tier learns what matters and forgets what doesn't, without a training step.
  • Fuzzy recall without a model. The optional semantic tier grounds meaning in a corpus (Random Indexing) so paraphrase resolves embedding-free — no vector DB, no GPU.

Reach for vectors when

  • You need zero-shot semantic match. The built-in semantic tier resolves a paraphrase like "the thing I use to get online" → wifi without embeddings, but only after it has seen related text. For semantic match with no corpus to learn from, a pretrained embedding still wins.
  • Cross-lingual recall or similarity ranking is the core requirement.
  • Near-duplicate keys. Ten "meeting on {date}" entries collide on stems; use explicit keys and the dedup/supersede harness, or a vector tier.
  • ACID rows and joins. neuron-db is an associative memory, not a relational database.

The honest framing is scalar-first: neuron-db handles the cheap, high-volume lookup-and-adapt path at much higher density, and you add a vector tier only for the slice of queries that truly need meaning matching, paying for vectors only there.