Stash / Blog · June 2026 · 6 min read

Claude Persistent Memory via MCP: Keep Context Without Burning Your Context Window

Claude doesn't remember yesterday. Every new conversation starts cold. For casual use that's fine. For anyone who uses Claude seriously — for work, research, personal projects — it's a constant friction point.

The common workaround is to paste context into the system prompt or the first message: "here's my role, here's the project, here's the list of things we've been tracking." It works. It also costs tokens on every single call.

This post is about a better way: storing context outside Claude, in a lightweight MCP record store, and retrieving exactly what you need when you need it.

Why "paste it all in" doesn't scale

The naive approach is to keep a giant "context file" and dump it at the start of every conversation. People do this. The problems:

Token cost compounds. A 2,000-token context block across 10 conversations a day is 20,000 tokens a day that you pay for whether or not it's relevant to the task at hand.
Context window limits. Claude has a finite window. The bigger your context dump, the less room for actual work.
It's never the right subset. You don't need all of your notes for every task. You need the three relevant ones. But you can't know which three until Claude already knows the full picture.

What you actually want: ask for what's relevant, get back only that, pay for only that.
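
To put a number on the first point, here is a small back-of-the-envelope calculation using the figures above (a 2,000-token block, 10 conversations a day). The 30-day projection is an illustrative extension, not a measurement, and the function name is made up for this sketch.

```python
# Figures from the text: a 2,000-token block pasted into 10 conversations a day.
CONTEXT_TOKENS = 2_000
CONVERSATIONS_PER_DAY = 10


def pasted_tokens(days: int) -> int:
    """Tokens spent re-sending the same context block over `days` days."""
    return CONTEXT_TOKENS * CONVERSATIONS_PER_DAY * days


print(pasted_tokens(1))   # 20000
print(pasted_tokens(30))  # 600000
```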

How MCP external memory works

The Model Context Protocol lets Claude connect to external services — tools it can call during a conversation. An MCP record store gives Claude:

A search tool: "find records about project Alpha" → returns the 3 most relevant records, not 300
A context tool: "load my standing context" → returns your role, current projects, working style — one cheap call
A write tool: "remember that the deadline moved to July" → stored, not in Claude's weights, not in your custom instructions

The key difference: you're not paying for context you're not using. Every byte returned by the MCP server is something Claude actually asked for.
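
For a sense of what "Claude asked for" means on the wire: MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method with a tool name and an arguments object. The sketch below builds such a request in Python. The tool name `search` comes from this post; the argument keys `collection` and `query` and the collection name `projects` are assumptions for illustration, not a documented schema.

```python
import json


def tool_call(request_id: int, name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical arguments: the real tool's parameter names may differ.
request = tool_call(1, "search", {"collection": "projects", "query": "project Alpha"})
print(json.dumps(request, indent=2))
```

Only the server's reply to this request enters the conversation, which is why a narrow query keeps the token cost small.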

Stash: the token-light MCP record store

Stash is a hosted MCP server. You connect it to Claude once (paste the connector URL, sign in with Google) and it gives Claude three tools:

context() — loads your standing context (role, projects, preferences). Designed to be called at the start of a "start my day" prompt.
search(collection, query) — full-text search across any named collection. Returns the top matches, not everything.
add(collection, record) — writes a new record. Claude can do this for you mid-conversation.

The store is structured and indexed with FTS5, SQLite's full-text search extension. Queries return in under 50 ms at typical personal-use scales. The connector response is structured and terse, so Claude reads it cleanly without wasting tokens on formatting noise.
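
To make the three tools concrete, here is a minimal local stand-in built on Python's `sqlite3` module and an FTS5 table. It is a sketch of the pattern, not Stash's implementation: the schema, the plain-string records, and the default limit of 3 are all assumptions, and it needs a SQLite build with FTS5 enabled.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with one FTS5 table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    # `collection` is stored but not tokenized; only `body` is searchable.
    db.execute("CREATE VIRTUAL TABLE records USING fts5(collection UNINDEXED, body)")
    return db


def add(db: sqlite3.Connection, collection: str, record: str) -> None:
    """Write one record into a named collection."""
    db.execute("INSERT INTO records (collection, body) VALUES (?, ?)", (collection, record))


def search(db: sqlite3.Connection, collection: str, query: str, limit: int = 3) -> list[str]:
    """Return the best full-text matches in one collection, most relevant first."""
    rows = db.execute(
        "SELECT body FROM records WHERE records MATCH ? AND collection = ? "
        "ORDER BY rank LIMIT ?",
        (query, collection, limit),
    )
    return [body for (body,) in rows]


def context(db: sqlite3.Connection) -> list[str]:
    """Return everything in the 'context' collection, in insertion order."""
    rows = db.execute("SELECT body FROM records WHERE collection = 'context' ORDER BY rowid")
    return [body for (body,) in rows]


db = open_store()
add(db, "context", "Role: product manager at a B2B SaaS startup")
add(db, "notes", "Project Alpha deadline moved to July")
add(db, "notes", "Interview synthesis due Friday")
print(search(db, "notes", "alpha deadline"))
print(context(db))
```

The search returns only the one matching note, not the whole collection, which is the property the token savings depend on.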

A concrete example: the "Start my day" prompt

Here's the pattern that motivated Stash. Add this to your Claude custom instructions:

When I say "start my day", call context() from Stash to load my standing context, then ask what I'm working on today.

Whenever Claude sees "start my day", it calls context(). Stash returns:

{
  "role": "Product manager at a B2B SaaS startup",
  "current_projects": ["Q2 pricing review", "user interview synthesis"],
  "working_style": "direct, no filler, bullet points preferred",
  "timezone": "Europe/London"
}

That's ~80 tokens. Pasting the same information manually as free text would probably take 150–200 tokens, and you'd have to remember to do it every time.

The more useful part: because it's a live store, you can update it. "Add to my context: I'm now also covering the enterprise tier launch." Claude calls add("context", ...). It's there tomorrow. You didn't have to edit a custom instructions block.

The activation model: You fill in your context once. After that, every conversation starts with one cheap context() call instead of a manual paste. The friction drops to near-zero.

Stash vs. alternatives

A few other approaches people use for Claude memory:

Custom instructions

Good for static identity context (your name, your role, always-on preferences). Bad for anything that changes — project statuses, lists, tasks. Custom instructions are baked in on every call whether needed or not.

Pasting a document each time

Free, but high-friction and token-expensive. No search — you paste everything and hope Claude finds what's relevant.

Notion via MCP

Notion's MCP connector gives Claude access to your workspace. It works. The problem: Notion returns full page content including all metadata, version history markup, and property columns. A 500-record Notion database search can cost 4,000–5,000 tokens for results that Stash would return in under 200.

Stash

Designed specifically for Claude use. Terse responses, full-text search, no metadata bloat. The retrieval cost is low by design, not by accident.

Token comparison (real numbers)

We ran a structured test: 500 records, same search query, same scoring criteria, via Notion MCP vs Stash. Results:

Notion MCP: ~4,100 tokens returned for a top-5 result set
Stash: ~175 tokens for the same top-5

That's roughly 23× cheaper per query at equivalent result quality. (This is a preliminary single-run comparison. Your mileage will vary with different data shapes, but the structural reasons for the gap are stable: Notion returns pages with full metadata; Stash returns records with only the fields you asked for.)
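
The ratio follows directly from the two measurements above. A quick check, using only the reported token counts (the variable names are just for this sketch):

```python
# Token counts reported above for the same top-5 result set.
NOTION_TOKENS = 4_100
STASH_TOKENS = 175

ratio = NOTION_TOKENS / STASH_TOKENS          # how many times larger the Notion response is
saved = 1 - STASH_TOKENS / NOTION_TOKENS      # fraction of tokens avoided per query

print(f"{ratio:.1f}x")   # 23.4x
print(f"{saved:.1%}")    # 95.7%
```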

Getting started (three steps)

Add the connector. In Claude: Settings → Connectors → Add custom → paste https://app.stashlite.com/mcp. Sign in with Google. You get a free account, no card required.
Fill in your context. Tell Claude: "Add to my Stash context: my role is [X], I'm working on [Y]." Claude writes it for you.
Add the custom instruction. In Claude custom instructions: "When I say 'start my day', call context() from Stash." Done.

From that point on, every conversation can start with a single cheap call that loads your standing context, instead of a manual paste.

Stash is free to start.
10,000 records · 100 queries/month · no card required.

What Stash is not

Worth being clear: Stash is not a replacement for Notion, a document editor, or a general-purpose database. It's a retrieval layer, good at getting the right records back to Claude cheaply. If you need rich documents, tables with relations, or a visual editing interface, those tools still have their place. Stash sits in the gap between "too heavy for custom instructions" and "too small for Notion."
