Entities

Entities turn unstructured memory text into a navigable layer of people, projects, decisions, technologies, places, and events. When entity extraction is enabled, every memory you save gets a small structured list of what it's about — and the hosted backend resolves those mentions across saves so "John" in memory #4 is the same entity as "John" in memory #200.

This is the first phase of the engine roadmap we published in May 2026. The later phases are knowledge graphs, temporal reasoning, and self-revising memory; see ENGINE_ROADMAP.md in the mnueron repo for the full picture.

Why this matters

Without entities, finding everything about a person or project means running a keyword search that catches some mentions and misses others (synonyms, misspellings, references by role instead of name). With entities, each canonical thing gets a stable ID and a list of memories that mention it.

That unlocks queries like:

"Show me every memory about the Q3 roadmap."
"Who do I have notes about?"
"What decisions has my team made about API design?"

Quick start

Entity extraction is opt-in on both local and hosted modes. Three ways to turn it on:

1. Per-call flag (recommended for most users).

Pass metadata.extract_entities: true on the save:

curl -X POST https://www.mnueron.com/api/memories \
  -H "Authorization: Bearer mnu_..." \
  -H "Content-Type: application/json" \
  -d '{
    "content": "After Q3 review, John from Stripe recommended deprecating v1 in favor of v2.",
    "metadata": { "extract_entities": true }
  }'
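If you'd rather make the same call from code, here is a minimal Python sketch of the curl request above using only the standard library. The helper name build_save_request is ours, not part of any mnueron SDK; the URL, headers, and payload shape are copied from the curl example.

```python
import json
import urllib.request

API_URL = "https://www.mnueron.com/api/memories"  # endpoint from the curl example


def build_save_request(content: str, api_key: str,
                       extract_entities: bool = True) -> urllib.request.Request:
    """Build (but do not send) a POST that saves a memory with the per-call flag.

    Hypothetical helper for illustration; not an official client.
    """
    payload = {
        "content": content,
        "metadata": {"extract_entities": extract_entities},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_save_request(
        "After Q3 review, John from Stripe recommended deprecating v1 in favor of v2.",
        api_key="mnu_...",  # placeholder, as in the curl example
    )
    # Sending is left to the caller, e.g. urllib.request.urlopen(req)
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it makes it easy to inspect the payload before any network call happens.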

2. BYOK opt-in.

If you pass byok_anthropic_key or byok_openai_key in metadata, extraction is implicit — you're already volunteering payment for the LLM call. The key is used once and stripped before the row hits the database.

3. Environment-wide.

For self-hosted setups or paid-tier defaults, set ENABLE_ENTITY_EXTRACTION=true on the server. Every save above the minimum length threshold (200 chars) will then extract entities.
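The three opt-in routes can be read as a single decision. The sketch below is our own illustration, not the server's code: should_extract is a hypothetical name, the 200-character threshold is applied only to the environment-wide route (the only place the text above states it), and "above the threshold" is assumed to mean "at least 200 characters".

```python
import os
from typing import Mapping, Optional

MIN_CONTENT_CHARS = 200  # threshold quoted for the environment-wide switch
BYOK_FIELDS = ("byok_anthropic_key", "byok_openai_key")


def should_extract(content: str, metadata: Mapping,
                   env: Optional[Mapping] = None) -> bool:
    """Return True if any of the three opt-in routes applies to this save."""
    env = os.environ if env is None else env

    # 1. Per-call flag on the save itself.
    if metadata.get("extract_entities") is True:
        return True

    # 2. BYOK: a provider key in metadata implies opt-in.
    if any(metadata.get(field) for field in BYOK_FIELDS):
        return True

    # 3. Environment-wide default, subject to the length threshold.
    #    Assumption: the threshold is inclusive (>= 200 chars).
    env_on = str(env.get("ENABLE_ENTITY_EXTRACTION", "")).lower() == "true"
    return env_on and len(content) >= MIN_CONTENT_CHARS
```

Passing env explicitly, as in should_extract(text, {}, env={"ENABLE_ENTITY_EXTRACTION": "true"}), keeps the function easy to test without touching the real process environment.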

What gets extracted

Each entity is a {name, type, context, canonical_id} object:

{
  "name": "John Smith",
  "type": "person",
  "context": "engineer who reviewed the Q3 deprecation plan",
  "canonical_id": "a1b2c3d4-..."
}

Types we ask the LLM to fit into:

person — named individuals
organization — companies, teams, departments
project — initiatives, roadmaps, milestones
technology — products, frameworks, libraries, APIs
place — physical or virtual locations
decision — recommendations, choices, deprecations
event — meetings, reviews, releases
concept — domain ideas, methodologies
other — fallback

We cap extraction at 25 entities per memory to keep payloads bounded. Decisions count as first-class entities — the same query layer that finds people also finds the decisions made about them.
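If you consume these objects in your own code, a small amount of validation goes a long way. The sketch below models the entity shape and the rules described above (the nine types, the "other" fallback, the cap of 25). The names Entity and normalize_entities are hypothetical, and keeping the first 25 entities when the cap is hit is an assumption; the source doesn't say which ones are dropped.

```python
from dataclasses import dataclass
from typing import List, Optional

ENTITY_TYPES = {
    "person", "organization", "project", "technology", "place",
    "decision", "event", "concept", "other",
}
MAX_ENTITIES = 25  # per-memory cap from the docs


@dataclass
class Entity:
    name: str
    type: str
    context: str = ""
    canonical_id: Optional[str] = None  # assigned by the hosted resolver


def normalize_entities(raw: List[dict]) -> List[Entity]:
    """Clean a raw list of entity dicts (e.g. parsed LLM output)."""
    entities: List[Entity] = []
    for item in raw:
        name = str(item.get("name", "")).strip()
        if not name:
            continue  # skip entries without a usable name
        etype = str(item.get("type", "")).strip().lower()
        if etype not in ENTITY_TYPES:
            etype = "other"  # fallback type
        entities.append(
            Entity(name, etype, str(item.get("context", "")), item.get("canonical_id"))
        )
        if len(entities) == MAX_ENTITIES:
            break  # assumption: keep the first 25, drop the rest
    return entities
```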

Cross-session resolution (hosted only — for now)

On mnueron.com, every extracted entity gets a canonical_id that's stable across saves. The resolver runs this pipeline per entity:

1. Lower-case exact match in the existing entities table.
2. Trigram fuzzy match via pg_trgm, which catches "Acme Inc." vs "acme" and "John Smith" vs "John W. Smith".
3. LLM tiebreak (Haiku): for borderline matches (similarity 0.50 to 0.85), Haiku decides whether the new mention matches one of the candidates or is a brand new entity. All ambiguous cases in one save are batched into a single LLM call.
4. Create new canonical if nothing matches.
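To make the pipeline concrete, here is a rough in-memory sketch of the four steps. It is not the hosted resolver: the trigram function only approximates pg_trgm's similarity in pure Python, the tiebreak callable stands in for the Haiku call (and runs per entity rather than batched per save), and treating scores above 0.85 as automatic matches is our assumption based on the borderline range quoted above.

```python
import re
import uuid
from typing import Callable, Dict, List, Optional, Set

LOW, HIGH = 0.50, 0.85  # borderline similarity range from the docs


def trigrams(text: str) -> Set[str]:
    """Approximate pg_trgm: lower-case, split into words, pad, take 3-grams."""
    grams: Set[str] = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def resolve(name: str, canonicals: Dict[str, str],
            tiebreak: Callable[[str, List[str]], Optional[str]]) -> str:
    """Return a canonical_id for `name`; `canonicals` maps name -> id."""
    # Step 1: lower-case exact match.
    for existing, cid in canonicals.items():
        if existing.lower() == name.lower():
            return cid

    # Step 2: fuzzy match; a score above HIGH is accepted outright (assumption).
    scored = sorted(((similarity(name, e), e) for e in canonicals), reverse=True)
    if scored and scored[0][0] > HIGH:
        return canonicals[scored[0][1]]

    # Step 3: borderline candidates go to the tiebreak (an LLM in production).
    borderline = [e for score, e in scored if LOW <= score <= HIGH]
    if borderline:
        choice = tiebreak(name, borderline)
        if choice in borderline:
            return canonicals[choice]

    # Step 4: nothing matched, so create a new canonical entity.
    cid = str(uuid.uuid4())
    canonicals[name] = cid
    return cid
```

Calling resolve("acme", {"Acme Inc.": "id-acme"}, tiebreak=lambda n, cands: cands[0]) sends the pair to the tiebreak, because in this approximation their similarity lands inside the borderline range.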

The local SQLite store currently runs the extraction step but not the resolution pipeline above, so every save creates fresh entities without de-duplication. Resolution on local is on the v0.5 roadmap.

Finding memories by entity

Once entities are resolved, two new endpoints power discovery.

List/search entities — see what's been extracted across your store:

GET /api/entities?q=john&type=person&limit=20

Returns the canonical entities matching that fuzzy query, with mention counts and last-seen times.

Filter memories by canonical entity:

GET /api/memories?entity=<canonical_id>

Returns every memory that has a memory_entities edge pointing at that entity. Stacks with q, namespace, created_after, etc.

One entity with its memories — single round trip:

GET /api/entities/<canonical_id>?include_memories=1
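The three endpoints above differ only in path and query string, so a few URL builders cover them. This is a sketch with hypothetical helper names; the host is assumed to be the same one used in the curl examples, and the filter names passed to memories_by_entity_url are simply forwarded as given.

```python
from urllib.parse import quote, urlencode

BASE = "https://www.mnueron.com"  # assumption: same host as the curl examples


def entities_search_url(q: str, type: str = "", limit: int = 20) -> str:
    """List/search canonical entities."""
    params = {"q": q}
    if type:
        params["type"] = type
    params["limit"] = limit
    return f"{BASE}/api/entities?{urlencode(params)}"


def memories_by_entity_url(canonical_id: str, **filters: str) -> str:
    """Filter memories by entity; extra filters (q, namespace, ...) stack."""
    params = {"entity": canonical_id, **filters}
    return f"{BASE}/api/memories?{urlencode(params)}"


def entity_detail_url(canonical_id: str, include_memories: bool = True) -> str:
    """One entity, optionally with its memories in the same response."""
    url = f"{BASE}/api/entities/{quote(canonical_id, safe='')}"
    return f"{url}?include_memories=1" if include_memories else url


if __name__ == "__main__":
    print(entities_search_url("john", type="person"))
    print(memories_by_entity_url("a1b2c3d4", namespace="project-x"))
    print(entity_detail_url("a1b2c3d4"))
```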

Retroactive backfill

Existing memories saved before you turned this on can be filled in.

Hosted backend — admin endpoint:

curl -X POST https://www.mnueron.com/api/entities/backfill \
  -H "Authorization: Bearer mnu_..." \
  -H "Content-Type: application/json" \
  -d '{ "limit": 100, "dry_run": true }'

Setting dry_run to true previews the batch; omit it to process memories for real. Each batch defaults to 100 memories and is capped at 1000. Beyond that, paginate with since (epoch ms).
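When scripting backfills, it helps to build the request body in one place. The sketch below covers only what the text above states: the default of 100, the cap of 1000, the dry_run flag, and since as epoch milliseconds. The helper names are hypothetical, and clamping an oversized limit on the client (rather than letting the server decide) is our choice, not documented behavior.

```python
from datetime import datetime, timezone
from typing import Optional

DEFAULT_LIMIT = 100   # documented default batch size
MAX_LIMIT = 1000      # documented cap per batch


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(moment.timestamp() * 1000)


def backfill_body(limit: int = DEFAULT_LIMIT, dry_run: bool = False,
                  since: Optional[datetime] = None) -> dict:
    """Build the JSON body for POST /api/entities/backfill."""
    body = {"limit": max(1, min(limit, MAX_LIMIT))}  # clamp client-side
    if dry_run:
        body["dry_run"] = True
    if since is not None:
        body["since"] = to_epoch_ms(since)
    return body


if __name__ == "__main__":
    # Preview first, then run for real with the same arguments minus dry_run.
    print(backfill_body(dry_run=True))
    print(backfill_body(limit=5000, since=datetime(1970, 1, 2, tzinfo=timezone.utc)))
```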

Local CLI:

ANTHROPIC_API_KEY=... mnueron extract-entities --dry-run
ANTHROPIC_API_KEY=... mnueron extract-entities --ns project-x --limit 200

The local subcommand processes one batch at a time. Re-run with --since to walk forward through older memories. Use --force to re-extract memories that already have an entities array.

Cost reference

Per-memory extraction cost:

Claude Haiku 4.5: ~$0.001 per save (input + output combined)
gpt-4o-mini: ~$0.0001 per save (10x cheaper)
LLM tiebreaks (hosted resolver only): one batched Haiku call per save with ambiguous entities, typically <$0.0005

These add to the existing auto-synopsis cost when both are enabled. With BYOK, you pay your own provider directly; mnueron uses your key once for the extraction call and never stores it.
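For budgeting, the per-save figures multiply out directly. The sketch below is back-of-the-envelope arithmetic using the approximate rates listed above; the dictionary keys are just labels we chose, the tiebreak figure is the quoted upper bound, and the share of saves that need a tiebreak is something you'd have to estimate for your own data.

```python
# Approximate per-save extraction cost in USD, from the list above.
COST_PER_SAVE_USD = {
    "claude-haiku-4.5": 0.001,
    "gpt-4o-mini": 0.0001,
}
TIEBREAK_MAX_USD = 0.0005  # quoted upper bound per save with ambiguous entities


def estimate_extraction_cost(saves: int, model: str,
                             ambiguous_fraction: float = 0.0) -> float:
    """Rough upper estimate: extraction plus hosted tiebreak calls.

    ambiguous_fraction is the share of saves (0.0 to 1.0) that trigger
    a tiebreak; it is a hypothetical input, not a published figure.
    """
    extraction = saves * COST_PER_SAVE_USD[model]
    tiebreaks = saves * ambiguous_fraction * TIEBREAK_MAX_USD
    return extraction + tiebreaks


if __name__ == "__main__":
    for model in COST_PER_SAVE_USD:
        cost = estimate_extraction_cost(10_000, model, ambiguous_fraction=0.2)
        print(f"{model}: about ${cost:.2f} for 10,000 saves")
```

At 10,000 saves, extraction alone comes to roughly $10 with Haiku and roughly $1 with gpt-4o-mini.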

Privacy posture

BYOK keys are stripped from metadata before the row hits Postgres / SQLite, even when extraction is skipped (e.g. short content). The strip happens before the gate check — defense in depth.
All entity-related rows (entities, memory_entities) are RLS-scoped to your org_id. Other orgs can never see your entities, even if they guess a canonical_id (queries return empty under RLS).
Local mode keeps everything in ~/.mnueron/memories.db. Extraction calls send the memory text to whichever LLM provider you've configured, but the extracted entities never leave your machine.
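The strip-before-gate ordering from the first point is easy to get wrong, so here is a small sketch of the idea. It is illustrative only: split_byok and prepare_save are hypothetical names, and the gate is passed in as a callable because the exact gate rules aren't spelled out here.

```python
from typing import Callable, Tuple

BYOK_FIELDS = ("byok_anthropic_key", "byok_openai_key")


def split_byok(metadata: dict) -> Tuple[dict, dict]:
    """Return (storable_metadata, transient_keys) without mutating the input."""
    storable = {k: v for k, v in metadata.items() if k not in BYOK_FIELDS}
    transient = {k: v for k, v in metadata.items() if k in BYOK_FIELDS}
    return storable, transient


def prepare_save(content: str, metadata: dict,
                 gate: Callable[[str, dict, dict], bool]) -> dict:
    """Strip keys first, then decide whether to extract."""
    # Strip first, so no later branch can persist a key by accident.
    storable, keys = split_byok(metadata)
    # Gate second: if extraction is skipped, the keys are already gone.
    extract = gate(content, storable, keys)
    return {
        "row": {"content": content, "metadata": storable},  # what gets stored
        "extract": extract,
        "llm_keys": keys if extract else {},  # only for the one LLM call
    }
```

With a gate such as lambda content, meta, keys: len(content) >= 200, a short memory that carries a BYOK key is stored without the key and without extraction.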

Limitations & roadmap

What's not in v1 of this layer:

Knowledge graph — entities are extracted but relationships between them aren't yet. "John recommended deprecating v1" gives us John and v1 but no recommended edge. Coming in v0.5.
Temporal validity windows — "John worked at Stripe until 2025" gets John and Stripe but no time-bounded edge between them. Coming in v0.6.
Self-revising consolidation — duplicate entities don't auto-merge on a background pass yet. The resolver merges on write, not on reflection.
Embedding-based resolution — v1 uses pg_trgm (string similarity) on the hosted side. Once we add server-side embeddings, the resolver will also catch semantic equivalence (e.g. "the CFO" and "Jane Doe").

For the full multi-quarter plan, see ENGINE_ROADMAP.md in the main mnueron repository.