POST /api/memories/search/bulk runs N queries against the same scope in a
single request and returns all results in one response. This saves N-1
round trips when your app needs to recall against several things at once
(e.g., a meeting-notes app pulling context for each speaker).
Request
{
  "queries": ["onboarding flow", "billing edge cases", "JWT setup"],
  "namespace": "work",
  "k": 5,
  "metadata_filter": { "speaker": "sarah" },
  "created_after": 1700000000000,
  "created_before": 1800000000000
}
All filters apply uniformly across all queries. If you need
different filters per query, call /api/memories?q=… separately.
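A minimal Python sketch of building and sending this request, using only the standard library. The helper names, the `base_url` argument, and the `headers` argument are hypothetical, and authentication is not covered here. The sketch also treats the three filter fields as optional, which the description above does not confirm.

```python
import json
import urllib.request


def build_bulk_body(queries, namespace, k, metadata_filter=None,
                    created_after=None, created_before=None):
    """Assemble the JSON body for POST /api/memories/search/bulk."""
    body = {"queries": list(queries), "namespace": namespace, "k": k}
    # Filters are shared by every query in the batch.
    # Assumption: omitting a filter field is allowed.
    if metadata_filter is not None:
        body["metadata_filter"] = metadata_filter
    if created_after is not None:
        body["created_after"] = created_after
    if created_before is not None:
        body["created_before"] = created_before
    return body


def post_bulk_search(base_url, body, headers=None):
    """Send the body and return the decoded JSON response.

    base_url and headers are placeholders for your deployment and auth.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/memories/search/bulk",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", **(headers or {})},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `build_bulk_body(["onboarding flow", "JWT setup"], "work", 5, metadata_filter={"speaker": "sarah"})` produces a body with the same shape as the example request, minus the two timestamp filters.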
Response
{
  "results": [
    { "query": "onboarding flow", "hits": [/* up to k Memory objects */] },
    { "query": "billing edge cases", "hits": [...] },
    { "query": "JWT setup", "hits": [...] }
  ]
}
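Since each result entry echoes its query, a client can index the hits by query string. The sketch below assumes the response shape shown above. The `hits_by_query` helper is hypothetical, and `{"id": "m1"}` is a stand-in, not the real Memory object shape.

```python
def hits_by_query(response):
    """Map each query string to its list of hits."""
    # Note: duplicate query strings in one batch would overwrite each other.
    return {entry["query"]: entry["hits"] for entry in response["results"]}


# Stand-in response; real hits are Memory objects whose fields are not shown here.
sample = {
    "results": [
        {"query": "onboarding flow", "hits": [{"id": "m1"}]},
        {"query": "JWT setup", "hits": []},
    ]
}
lookup = hits_by_query(sample)
```

If a batch can contain the same query twice, iterate over `response["results"]` directly instead of building a dict.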
Limits
- Max 25 queries per batch (queries[] length)
- Max k=50 per query
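A client with more than 25 queries has to split them across several requests. The sketch below is one way to do that while staying inside both limits. The constant names and `plan_batches` are hypothetical, and clamping an oversized k (rather than rejecting it) is a design choice of this sketch, not documented server behavior.

```python
MAX_QUERIES_PER_BATCH = 25  # limit stated above
MAX_K = 50                  # limit stated above


def plan_batches(queries, k):
    """Return (k to send, list of query batches), both within the limits."""
    capped_k = min(k, MAX_K)
    batches = [
        queries[start:start + MAX_QUERIES_PER_BATCH]
        for start in range(0, len(queries), MAX_QUERIES_PER_BATCH)
    ]
    return capped_k, batches
```

Each batch then becomes the `queries` array of its own bulk request.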
The endpoint runs queries serially server-side on a single DB connection to avoid per-query SET ROLE overhead, so total latency grows roughly linearly with the number of queries. Latency budget: ~50-150 ms per query at small store sizes.
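Because queries run one after another, a back-of-envelope estimate of server-side time is the per-query budget multiplied by the batch size. The sketch below uses the rough 50-150 ms range quoted above. It ignores the network round trip, and the range only holds at small store sizes. The function name is hypothetical.

```python
PER_QUERY_MS = (50, 150)  # rough per-query range quoted above, small stores only


def estimate_batch_latency_ms(num_queries):
    """Return a (low, high) estimate of server-side time for one batch."""
    low, high = PER_QUERY_MS
    return num_queries * low, num_queries * high
```

By this estimate the three-query example request takes about 150-450 ms server-side, and a full 25-query batch about 1.25-3.75 s.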