Entities — automatic extraction + cross-session resolution

Pull the people, projects, decisions, and technologies out of your memories. Then ask "show me everything about X" across your whole memory store.

Entities

Entities turn unstructured memory text into a navigable layer of people, projects, decisions, technologies, places, and events. When entity extraction is enabled, every memory you save gets a small structured list of what it's about — and the hosted backend resolves those mentions across saves so "John" in memory #4 is the same entity as "John" in memory #200.

This is the first phase of the engine roadmap we published in May 2026. The next phases are knowledge graphs, temporal reasoning, and self-revising memory; see ENGINE_ROADMAP.md in the mnueron repo for the full picture.

Why this matters

Without entities, finding everything about a person or project means running a keyword search that catches some mentions and misses others (synonyms, misspellings, references by role instead of name). With entities, each canonical thing gets a stable ID and a list of memories that mention it.

That unlocks queries like:

  • "Show me every memory about the Q3 roadmap."
  • "Who do I have notes about?"
  • "What decisions has my team made about API design?"

Quick start

Entity extraction is opt-in on both local and hosted modes. Three ways to turn it on:

1. Per-call flag (recommended for most users).

Pass metadata.extract_entities: true on the save:

curl -X POST https://www.mnueron.com/api/memories \
  -H "Authorization: Bearer mnu_..." \
  -H "Content-Type: application/json" \
  -d '{
    "content": "After Q3 review, John from Stripe recommended deprecating v1 in favor of v2.",
    "metadata": { "extract_entities": true }
  }'
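
For callers who would rather not shell out to curl, the same request can be sketched in Python with only the standard library. The helper name build_save_request is hypothetical (not part of any mnueron SDK), and the API key is whatever mnu_ token you already hold:

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://www.mnueron.com/api/memories"


def build_save_request(api_key: str, content: str,
                       extract_entities: bool = True) -> urllib.request.Request:
    """Build (but do not send) a save request with the per-call extraction flag."""
    body = {
        "content": content,
        "metadata": {"extract_entities": extract_entities},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Pass the returned object to urllib.request.urlopen to actually send it. Building the request separately from sending it makes the payload easy to inspect or log first.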

2. BYOK opt-in.

If you pass byok_anthropic_key or byok_openai_key in metadata, extraction is implicit — you're already volunteering payment for the LLM call. The key is used once and stripped before the row hits the database.

3. Environment-wide.

For self-hosted setups or paid-tier defaults, set ENABLE_ENTITY_EXTRACTION=true on the server. Every save above the minimum length threshold (200 chars) will then extract entities.
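
Taken together, the three opt-in paths amount to a small decision function. The sketch below is our reading of the rules above, not the server's actual code. It makes two assumptions the text does not confirm: an explicit per-call flag bypasses the length threshold (the curl example above is well under 200 characters), and "above the threshold" means 200 characters or more. The function name should_extract is hypothetical.

```python
BYOK_FIELDS = ("byok_anthropic_key", "byok_openai_key")
MIN_CHARS = 200  # minimum length threshold from the docs


def should_extract(content: str, metadata: dict, env: dict) -> bool:
    """Decide whether a save should trigger entity extraction."""
    # 1. Per-call flag: explicit opt-in (assumed to bypass the length check).
    if metadata.get("extract_entities") is True:
        return True

    # Paths 2 and 3 are assumed to skip short content.
    if len(content) < MIN_CHARS:
        return False

    # 2. BYOK opt-in: supplying a provider key implies extraction.
    if any(metadata.get(field) for field in BYOK_FIELDS):
        return True

    # 3. Environment-wide default set on the server.
    return env.get("ENABLE_ENTITY_EXTRACTION", "").strip().lower() == "true"
```

In a real server the env argument would typically be os.environ; it is a plain dict here so the logic can be exercised without touching process state.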

What gets extracted

Each entity is a {name, type, context, canonical_id} object:

{
  "name": "John Smith",
  "type": "person",
  "context": "engineer who reviewed the Q3 deprecation plan",
  "canonical_id": "a1b2c3d4-..."
}

Types we ask the LLM to fit into:

  • person — named individuals
  • organization — companies, teams, departments
  • project — initiatives, roadmaps, milestones
  • technology — products, frameworks, libraries, APIs
  • place — physical or virtual locations
  • decision — recommendations, choices, deprecations
  • event — meetings, reviews, releases
  • concept — domain ideas, methodologies
  • other — fallback

We cap extraction at 25 entities per memory to keep payloads bounded. Decisions count as first-class entities — the same query layer that finds people also finds the decisions made about them.
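
If you consume these objects in your own code, a small typed wrapper can be handy. The following is a minimal sketch rather than a published schema: the Entity class and normalize_entities helper are hypothetical, and the choices to drop nameless items and to keep the first 25 when truncating are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

ENTITY_TYPES = {
    "person", "organization", "project", "technology", "place",
    "decision", "event", "concept", "other",
}
MAX_ENTITIES = 25  # per-memory cap described above


@dataclass(frozen=True)
class Entity:
    name: str
    type: str
    context: str = ""
    canonical_id: Optional[str] = None  # filled in by the hosted resolver


def normalize_entities(raw: list) -> list:
    """Coerce raw LLM output (a list of dicts) into at most 25 Entity objects."""
    entities = []
    for item in raw:
        name = str(item.get("name", "")).strip()
        if not name:
            continue  # assumption: items without a name are discarded
        etype = item.get("type")
        if not isinstance(etype, str) or etype not in ENTITY_TYPES:
            etype = "other"  # unknown types fall back to "other"
        entities.append(
            Entity(name, etype, str(item.get("context", "")), item.get("canonical_id"))
        )
        if len(entities) == MAX_ENTITIES:
            break
    return entities
```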

Cross-session resolution (hosted only — for now)

On mnueron.com, every extracted entity gets a canonical_id that's stable across saves. The resolver runs this pipeline per entity:

  1. Lower-case exact match in the existing entities table.
  2. Trigram fuzzy match via pg_trgm — catches "Acme Inc." vs "acme" and "John Smith" vs "John W. Smith".
  3. LLM tiebreak (Haiku) — for borderline matches (similarity 0.50 - 0.85), Haiku decides whether the new mention matches one of the candidates or is a brand new entity. All ambiguous cases in one save are batched into a single LLM call.
  4. Create new canonical if nothing matches.
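
The four steps above can be sketched in plain Python to make the control flow concrete. This is an illustration, not the hosted resolver: similarity is a simplified approximation of pg_trgm's trigram similarity, tiebreak stands in for the Haiku call and is invoked per entity rather than batched per save, and treating scores above 0.85 as automatic matches is an assumption. All function names are hypothetical.

```python
import re
import uuid
from typing import Callable, Optional

LOW, HIGH = 0.50, 0.85  # borderline band from step 3


def trigrams(text: str) -> set:
    """Approximate pg_trgm: lower-case, split into words, pad, take 3-grams."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by total distinct trigrams."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def resolve(name: str, existing: dict,
            tiebreak: Callable[[str, list], Optional[str]]) -> tuple:
    """existing maps canonical_id -> canonical name. Returns (canonical_id, how)."""
    # Step 1: lower-case exact match.
    key = name.strip().lower()
    for cid, known in existing.items():
        if known.strip().lower() == key:
            return cid, "exact"

    # Step 2: fuzzy match, best candidate first.
    scored = sorted(
        ((similarity(name, known), cid) for cid, known in existing.items()),
        reverse=True,
    )
    if scored and scored[0][0] > HIGH:
        return scored[0][1], "fuzzy"

    # Step 3: hand borderline candidates to the tiebreaker.
    borderline = [cid for score, cid in scored if LOW <= score <= HIGH]
    if borderline:
        chosen = tiebreak(name, borderline)
        if chosen in borderline:
            return chosen, "tiebreak"

    # Step 4: nothing matched, so mint a new canonical ID.
    return str(uuid.uuid4()), "new"
```

Under this approximation, "Acme Inc." against "acme" scores 5/9 (about 0.56), which lands in the borderline band and goes to the tiebreaker instead of being merged automatically.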

The local SQLite store currently runs extraction but none of the resolution steps above, so every save creates fresh entities without de-duplication. Resolution on local is on the v0.5 roadmap.

Finding memories by entity

Once entities are resolved, two new endpoints power discovery.

List/search entities — see what's been extracted across your store:

GET /api/entities?q=john&type=person&limit=20

Returns the canonical entities matching that fuzzy query, with mention counts and last-seen times.

Filter memories by canonical entity:

GET /api/memories?entity=<canonical_id>

Returns every memory that has a memory_entities edge pointing at that entity. Stacks with q, namespace, created_after, etc.

One entity with its memories — single round trip:

GET /api/entities/<canonical_id>?include_memories=1
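
As a convenience, here is a sketch of URL builders for the three calls above. The function names are hypothetical, and the base host is assumed to be the same one used in the curl examples. The query parameters are the ones listed in this section.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE = "https://www.mnueron.com"  # assumption: same host as the save endpoint


def list_entities_url(q: str, entity_type: Optional[str] = None,
                      limit: int = 20) -> str:
    """List or search canonical entities."""
    params = {"q": q}
    if entity_type:
        params["type"] = entity_type
    params["limit"] = limit
    return f"{BASE}/api/entities?{urlencode(params)}"


def memories_for_entity_url(canonical_id: str, **filters) -> str:
    """Filter memories by entity; extra filters (q, namespace, ...) stack on top."""
    params = {"entity": canonical_id, **filters}
    return f"{BASE}/api/memories?{urlencode(params)}"


def entity_with_memories_url(canonical_id: str) -> str:
    """Fetch one entity together with its memories in a single round trip."""
    return f"{BASE}/api/entities/{quote(canonical_id, safe='')}?include_memories=1"
```

For example, memories_for_entity_url(cid, namespace="project-x") narrows the result to one namespace. Send each URL with the same Authorization header used for saves.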

Retroactive backfill

Existing memories saved before you turned this on can be filled in.

Hosted backend — admin endpoint:

curl -X POST https://www.mnueron.com/api/entities/backfill \
  -H "Authorization: Bearer mnu_..." \
  -H "Content-Type: application/json" \
  -d '{ "limit": 100, "dry_run": true }'

Setting dry_run to true previews the batch; omit it to process memories for real. Each batch defaults to 100 memories and is capped at 1000. Beyond that, paginate with since (epoch ms).
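
A client-side helper can keep batch sizes within these bounds before the request is sent. This is a sketch with a hypothetical function name. How the server itself reacts to an oversized limit is not documented here, so the clamp below is purely defensive.

```python
from typing import Optional

DEFAULT_LIMIT = 100
MAX_LIMIT = 1000


def backfill_body(limit: int = DEFAULT_LIMIT, dry_run: bool = False,
                  since_ms: Optional[int] = None) -> dict:
    """Build the JSON body for POST /api/entities/backfill."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    body = {"limit": min(limit, MAX_LIMIT)}  # clamp to the documented cap
    if dry_run:
        body["dry_run"] = True
    if since_ms is not None:
        body["since"] = since_ms  # epoch milliseconds, used for pagination
    return body
```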

Local CLI:

ANTHROPIC_API_KEY=... mnueron extract-entities --dry-run
ANTHROPIC_API_KEY=... mnueron extract-entities --ns project-x --limit 200

The local subcommand processes one batch at a time. Re-run with --since to walk forward through older memories. Use --force to re-extract memories that already have an entities array.

Cost reference

Per-memory extraction cost:

  • Claude Haiku 4.5: ~$0.001 per save (input + output combined)
  • gpt-4o-mini: ~$0.0001 per save (10x cheaper)
  • LLM tiebreaks (hosted resolver only): one batched Haiku call per save with ambiguous entities, typically <$0.0005

These add to the existing auto-synopsis cost when both are enabled. With BYOK, you pay your own provider directly; mnueron uses your key once for the extraction call and never stores it.
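
To turn the per-save figures into a budget, multiply by your save volume. The sketch below uses the approximate rates listed above and treats the tiebreak figure as an upper bound. The function name and the model keys are illustrative, and real bills depend on memory length.

```python
# Approximate per-save rates from the list above (USD).
COST_PER_SAVE = {
    "claude-haiku-4.5": 0.001,
    "gpt-4o-mini": 0.0001,
}
TIEBREAK_MAX = 0.0005  # upper bound per save that needs a tiebreak call


def estimate_cost(saves: int, model: str, ambiguous_saves: int = 0) -> float:
    """Rough extraction cost in USD for a number of saves."""
    return saves * COST_PER_SAVE[model] + ambiguous_saves * TIEBREAK_MAX
```

At these rates, 10,000 saves come to roughly $10 with Haiku or roughly $1 with gpt-4o-mini, before any tiebreak calls.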

Privacy posture

  • BYOK keys are stripped from metadata before the row hits Postgres / SQLite, even when extraction is skipped (e.g. short content). The strip happens before the gate check — defense in depth.
  • All entity-related rows (entities, memory_entities) are RLS-scoped to your org_id. Other orgs can never see your entities, even if they guess a canonical_id (queries return empty under RLS).
  • Local mode keeps everything in ~/.mnueron/memories.db — entity extraction calls go to whichever LLM provider you've configured but the extracted entities never leave your machine.
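
The strip-before-gate ordering from the first bullet can be sketched as follows. This illustrates the ordering only and is not the production code: prepare_row and the caller-supplied gate function are hypothetical names.

```python
from typing import Callable

BYOK_FIELDS = ("byok_anthropic_key", "byok_openai_key")


def strip_byok(metadata: dict) -> tuple:
    """Return (metadata without BYOK keys, the keys that were removed)."""
    clean = {k: v for k, v in metadata.items() if k not in BYOK_FIELDS}
    keys = {k: metadata[k] for k in BYOK_FIELDS if k in metadata}
    return clean, keys


def prepare_row(content: str, metadata: dict,
                gate: Callable[[str, dict, dict], bool]) -> tuple:
    """Strip keys first, then run the gate, so a skipped save never stores a key."""
    clean, keys = strip_byok(metadata)      # 1. strip unconditionally
    extract = gate(content, clean, keys)    # 2. only then decide on extraction
    row = {"content": content, "metadata": clean}  # what gets persisted
    return row, (keys if extract else {})   # keys live in memory for one call only
```

Because the row is built from the cleaned metadata regardless of what the gate returns, a save that is too short to extract still reaches the database without the key.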

Limitations & roadmap

What's not in v1 of this layer:

  • Knowledge graph — entities are extracted but relationships between them aren't yet. "John recommended deprecating v1" gives us John and v1 but no recommended edge. Coming in v0.5.
  • Temporal validity windows — "John worked at Stripe until 2025" gets John and Stripe but no time-bounded edge between them. Coming in v0.6.
  • Self-revising consolidation — duplicate entities don't auto-merge on a background pass yet. The resolver merges on write, not on reflection.
  • Embedding-based resolution — v1 uses pg_trgm (string similarity) on the hosted side. Once we add server-side embeddings, the resolver will also catch semantic equivalence (e.g. "the CFO" and "Jane Doe").

For the full multi-quarter plan, see ENGINE_ROADMAP.md in the main mnueron repository.

Last updated 2026-05-19