Your AI agent runs a task, does useful work, then exits. Next run: blank slate. It has no memory of what it learned last time, what state it was in, which contacts it already processed, or which decisions it already made.
This is the agent memory problem. It's not a new idea — every serious agent framework mentions it. But most solutions require you to stand up a vector database, wire up embeddings, build a retrieval layer, and manage all of it in production. That's a lot of infrastructure for "remember what you did yesterday."
Stash is a different approach: a hosted key-value + full-text store, accessible to your agent via MCP, with zero infrastructure to manage.
Most agent memory falls into a few categories:
| Category | Example | How long? |
|---|---|---|
| Episodic | What happened last run | Hours–days |
| Procedural | How to handle edge cases | Weeks–months |
| Semantic | Facts about the domain or users | Long-term |
| Working | Current task state (resume after crash) | Until done |
You don't always need semantic search over all of it. Most agent memory lookups are either "get the most recent state" (episodic) or "find the record about X" (semantic). Stash handles both with context() for standing facts and stash_search() for lookup.
Claude agents can call MCP tools. If you wire Stash as an MCP server, your agent gets stash_add, stash_search, context(), and usage() as native tools — no custom code, no API calls, no database queries.
The agent decides what to save. It decides when to load. The memory lives in Stash, survives between runs, and is searchable by text.
# Example: an agent that processes customer requests
# Run 1: Agent processes request, saves outcome
Agent prompt (run 1):
"Process this request from Alex Chen about their order #8821.
When done, use stash_add to save: what you did, the outcome, and any notes
about their account that would be useful next time."
Agent response: "Resolved the refund. Saving to Stash..."
→ stash_add(collection="customer_notes", content="Alex Chen — order 8821
refund processed 2026-06-08. Preferred contact: email. Noted: sensitive
about shipping delays, acknowledge proactively next time.")
# Run 2 (a week later): fresh context, but memory survives
Agent prompt (run 2):
"Alex Chen just submitted another request — order #9102."
Agent (calls search first):
→ stash_search("Alex Chen")
← "Alex Chen — order 8821, refund processed. Preferred: email. Sensitive
about shipping delays..."
Agent: "Found previous history. Alex had a shipping issue before — I'll
acknowledge that upfront before diving into #9102."
The agent gets institutional memory. It gets better over time. And you didn't build a database.
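To make the two-run flow above concrete, here is a minimal local sketch. `LocalStash` is a hypothetical stand-in, not Stash's API: it persists records to a JSON file and matches on whole words, purely to mimic the save-then-search pattern that `stash_add` and `stash_search` give the agent. The file path, class name, and method names are all assumptions made for illustration.

```python
import json
import os
import re
import tempfile


class LocalStash:
    """Hypothetical local stand-in for a hosted memory store (not Stash's real API)."""

    def __init__(self, path):
        self.path = path

    def _load(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return json.load(f)

    def add(self, collection, content):
        records = self._load()
        records.append({"collection": collection, "content": content})
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(records, f)

    def search(self, query):
        # Naive match: every query word must appear in the record (case-insensitive).
        words = re.findall(r"\w+", query.lower())
        hits = []
        for record in self._load():
            tokens = set(re.findall(r"\w+", record["content"].lower()))
            if all(w in tokens for w in words):
                hits.append(record["content"])
        return hits


store_path = os.path.join(tempfile.mkdtemp(), "memory.json")

# Run 1: the agent resolves the request and saves a note.
LocalStash(store_path).add(
    "customer_notes",
    "Alex Chen: order 8821 refund processed 2026-06-08. Preferred contact: email.",
)

# Run 2: a fresh object with no in-process state still finds the note.
history = LocalStash(store_path).search("alex chen")
print(history)
```

The second `LocalStash` object shares nothing with the first except the file on disk, which is the property that matters: memory lives outside the agent process. A hosted store plays the same role without the agent owning the file.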
Long-running agents (processing a queue, doing research, building a report) crash. When they restart, they start over. With Stash:
# Agent saves checkpoint on each significant step
→ stash_add(collection="agent_state", content=json.dumps({
"run_id": "run_20260608_001",
"step": "processed_items_47",
"last_item_id": "item_1047",
"results_so_far": [...summary...]
}))
# On restart, agent loads the checkpoint
→ stash_search("run_20260608_001")
← Returns the checkpoint — agent resumes from step 47, not step 0
No Redis. No database. No custom checkpoint logic beyond "save a record, search for it later."
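The checkpoint-and-resume pattern can be sketched locally as follows. `CheckpointStore`, `latest`, and `process_queue` are hypothetical names for illustration, and the store is an in-memory list rather than Stash. Where the example above finds the checkpoint by searching for the run ID, this sketch scans newest-first for the same run ID.

```python
import json


class CheckpointStore:
    """Hypothetical in-memory stand-in for a memory store (not Stash's real API)."""

    def __init__(self):
        self.records = []

    def add(self, collection, content):
        self.records.append({"collection": collection, "content": content})

    def latest(self, collection, run_id):
        # Scan newest-first so the most recent checkpoint for this run wins.
        for record in reversed(self.records):
            if record["collection"] != collection:
                continue
            state = json.loads(record["content"])
            if state["run_id"] == run_id:
                return state
        return None


def process_queue(store, run_id, items, crash_after=None):
    """Process items in order, checkpointing after each one; resume if a checkpoint exists."""
    state = store.latest("agent_state", run_id) or {
        "run_id": run_id,
        "next_index": 0,
        "done": [],
    }
    for index in range(state["next_index"], len(items)):
        if crash_after is not None and index == crash_after:
            raise RuntimeError("simulated crash")
        state["done"].append(items[index].upper())  # stand-in for real work
        state["next_index"] = index + 1
        store.add("agent_state", json.dumps(state))
    return state["done"]


store = CheckpointStore()
items = ["a", "b", "c", "d"]

try:
    process_queue(store, "run_001", items, crash_after=2)
except RuntimeError:
    pass  # the process died after finishing two items

# Restart: picks up at index 2 instead of 0.
resumed_from = store.latest("agent_state", "run_001")["next_index"]
result = process_queue(store, "run_001", items)
print(resumed_from, result)
```

Checkpointing after every item keeps the resume logic trivial at the cost of one write per step. A real agent might checkpoint only on significant steps, as the example above suggests.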
| Option | Setup | Search | Cost/mo | MCP-native? |
|---|---|---|---|---|
| Stash (free tier) | Paste URL | FTS5 full-text | £0 | ✓ |
| Stash Pro | Paste URL | FTS5 full-text | £8 | ✓ |
| Pinecone | Signup + API + embeddings | Vector similarity | $25+ | Custom code |
| Supabase + pgvector | DB setup + schema + embeddings | Vector + queries | $25+ | Custom code |
| Local SQLite file | Code + schema | FTS5 (if configured) | $0 | No |
| Mem0 | Signup + SDK integration | Semantic | $9+ | No |
Stash is not the most powerful option. If you need sub-millisecond vector search over 10M embeddings, use Pinecone. But for the common case — agent memory that needs to survive runs and be searchable by text — Stash is the fastest path from zero to working.
The naive approach to agent memory is "stuff everything into the system prompt." This is expensive and gets worse as memory grows:
# Prompt-stuffing approach (expensive):
# all_memory_as_text grows without bound and costs tokens on every call
system_prompt = f"""
You are an agent. Here is everything you know:
{all_memory_as_text}
"""
# MCP pull approach (cheap):
# Agent starts with minimal context
# Calls stash_search() only when it needs specific memory
# Pays tokens only for what's actually retrieved
In our benchmark: ~192 tokens for a 500-record FTS5 search result vs. ~4,800+ for the same data stuffed into a prompt. When your agent runs hundreds of times, this matters.
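As a rough worked example of how that gap compounds, the sketch below multiplies the two per-call figures quoted above by a run count. The run count of 300 is an illustrative assumption, as is the simplification of one lookup (or one stuffed prompt) per run. Since the stuffed figure is a lower bound, the real gap may be larger.

```python
# Per-call token figures quoted above; the run count is an illustrative assumption.
TOKENS_PER_SEARCH_RESULT = 192
TOKENS_PER_STUFFED_PROMPT = 4_800
RUNS = 300  # hypothetical number of agent runs


def total_tokens(tokens_per_call, runs):
    """Memory-related tokens across all runs, assuming one lookup or stuffed prompt per run."""
    return tokens_per_call * runs


pull_total = total_tokens(TOKENS_PER_SEARCH_RESULT, RUNS)
stuffed_total = total_tokens(TOKENS_PER_STUFFED_PROMPT, RUNS)
ratio = TOKENS_PER_STUFFED_PROMPT / TOKENS_PER_SEARCH_RESULT

print(f"pull: {pull_total:,} tokens, stuffed: {stuffed_total:,} tokens, ratio: {ratio:.0f}x")
```

Under these assumptions the script prints `pull: 57,600 tokens, stuffed: 1,440,000 tokens, ratio: 25x`.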
If you're building a Claude agent via the Agents SDK or the Claude API with tool use:
Your agent gets stash_add, stash_search, context(), and usage() as tools.
No schema migration. No embedding pipeline. No infra to run.
Stash is intentionally simple. What it doesn't do:
For production multi-agent systems at scale, evaluate accordingly. For individual agents, side projects, and prototypes: Stash is usually enough, and it's the fastest thing to wire up.
Give your agent memory in 2 minutes
Free tier. No credit card. Connector URL on signup.