POST /api/memories/search
Single-query search over memories — the canonical "find things that look like this" endpoint. Returns hits ordered by relevance score.
curl -X POST https://www.mnueron.com/api/memories/search \
  -H "Authorization: Bearer $MNUERON_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "indentation preferences",
    "namespace": "preferences",
    "k": 5
  }'
Body
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| query | string | yes | — | Free-text search query |
| k | int | no | 5 | Number of hits to return (top-k), 1 to 100 |
| namespace | string | no | — | Narrow to one project |
| metadata_filter | object | no | — | JSONB containment: a memory matches when its metadata @> this object |
| tags | string[] | no | — | A memory matches when its tags contain every listed tag (tags @> input) |
| created_after / created_before | int | no | — | Epoch milliseconds |
| entity | uuid | no | — | Restrict to memories that mention this entity |
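
As a rough illustration of how the body fields combine, the Python sketch below builds the request without sending it. The helper name `build_search_request` and its keyword arguments are hypothetical. Only the URL, headers, and field names come from this page, and the client-side check on `k` simply mirrors the documented 1 to 100 range.

```python
import json
import urllib.request

# URL taken from the curl example on this page.
SEARCH_URL = "https://www.mnueron.com/api/memories/search"

# Optional body fields listed in the Body table.
OPTIONAL_FIELDS = {
    "namespace", "metadata_filter", "tags",
    "created_after", "created_before", "entity",
}


def build_search_request(token, query, k=5, **filters):
    """Build (but do not send) a search request.

    Hypothetical helper for illustration, not an official client.
    """
    if not 1 <= k <= 100:
        raise ValueError("k must be between 1 and 100")
    unknown = set(filters) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    body = {"query": query, "k": k}
    # Leave out optional fields that were not supplied.
    body.update({name: value for name, value in filters.items()
                 if value is not None})

    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the payload easy to inspect or log first.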
Response
{
  "query": "indentation preferences",
  "hits": [
    {
      "id": "...",
      "content": "Prefers 2-space indentation in JavaScript",
      "namespace": "preferences",
      "score": 0.842,
      "tags": ["style", "js"],
      "metadata": {},
      "created_at": 1737070000000,
      "updated_at": 1737070000000
    }
  ]
}
score is the Postgres ts_rank_cd relevance value. ts_rank_cd is a cover-density ranking that plays a BM25-like role here, but it is not BM25 itself. Higher is more relevant. The exact scale depends on document length distribution in your org; treat it as comparative, not absolute.
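
Because the score is only meaningful relative to other hits, one way to threshold results is against the top hit in the same response rather than a fixed cutoff. The sketch below assumes a list of hit dicts shaped like the example response above. `keep_relative` and the 0.5 ratio are illustrative choices, not part of the API.

```python
def keep_relative(hits, ratio=0.5):
    """Keep hits scoring at least `ratio` times the best score in this response."""
    if not hits:
        return []
    best = max(hit["score"] for hit in hits)
    if best <= 0:
        # No hit carries a positive score, so there is nothing to compare against.
        return list(hits)
    return [hit for hit in hits if hit["score"] >= ratio * best]
```

A ratio-based cutoff keeps working if the overall scale of scores shifts, whereas a hard-coded absolute threshold would not.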
How it works
Today this endpoint uses Postgres FTS (full-text search) over memories.content_tsv. Once the background embedder is deployed, the endpoint will run a hybrid keyword+vector search and fall back to FTS when an embedding isn't available. The response shape stays the same, so client code doesn't need to change.
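
Since the response shape is meant to stay stable across the move to hybrid search, a client can parse hits into a small typed record and ignore any key it doesn't recognize. This is a minimal sketch: the `Hit` dataclass and `parse_hits` are hypothetical names, and the field list is taken from the example response above rather than from a formal schema.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class Hit:
    id: str
    content: str
    namespace: str
    score: float
    tags: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)
    created_at: Optional[int] = None  # epoch milliseconds
    updated_at: Optional[int] = None  # epoch milliseconds


def parse_hits(response):
    """Convert a decoded search response into Hit records, dropping unknown keys."""
    known = {f.name for f in fields(Hit)}
    return [
        Hit(**{key: value for key, value in raw.items() if key in known})
        for raw in response.get("hits", [])
    ]
```

Filtering on known field names means an extra key in a future response would be ignored rather than raising a `TypeError`.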
For multi-query fan-out (e.g. RAG with 8 sub-questions), use bulk search instead.