Agent Skills as a Memory Layer: Remember From “What Was Said” to “What Worked”

1. The pain
Your coding agent runs in a sandbox, fixes a tricky environment or timeout issue once, then hits the same failure again next week and has no idea it already solved it. Conversation memory might have the fix buried in dozens of turns, and the agent doesn’t reliably reuse it. What’s missing isn’t “more chat history” but skill memory: a memory layer that captures what the agent did and how it turned out, then turns that into reusable procedures and warnings the next run can pull in. This post is about why that matters and how it works.
2. What existing “memory” actually covers
Today’s memory tooling is mostly about two things:
- Conversation memory (Mem0↗, Zep↗, framework buffers): what was said — user and assistant messages, sometimes summarized or chunked.
- Retrieval over ingested content (RAG↗, vector stores): what you fed in — docs, chunks, embeddings.
Both are input-centric. They answer “what did we talk about?” and “what docs did we index?” They don’t answer “what did the agent do, and should we do it again or avoid it?”
When the real question is “how do we deploy?”, “how do we roll back?”, or “what’s the pattern for this class of bug?”, chat history and RAG often fall short. You need something that captures behavior and outcome, then turns that into reusable procedures and warnings.
If you want agents to stop repeating the same mistakes and start reusing what worked, you need skill memory — not just conversation memory.
3. Why agent skills need to be a memory layer
You might already know what agent skills↗ are and why they’re useful. The question here is why they need to be a memory layer — not just skills you hand-write or paste in sometimes.
- Automatic capture and accumulation. Without a memory layer, skills stay manual: you copy a good run into a doc, or hope the next session somehow has the right context. A memory layer means skills are automatically identified, captured, and accumulated from runs. When a task completes or fails, the system distills what happened and updates the right skill — no “save” button, no digging through logs.
- Markdown files. Skill memory is stored as Markdown files. Human-readable, editable, shareable. Your team can open a file, review it, fix a wrong entry, or refactor the structure. Not a black-box API or opaque embeddings.
- Portable. Download skills as a ZIP, put them in git, or mount them in another agent. The same skills work across agents, LLMs, and frameworks — no re-embedding, no vendor lock-in.
- Your structure. You define how skill memory is organized. With SKILL.md↗ you decide what each skill captures, how files are named and laid out, and how entries are formatted. The layer does the capture and writing; you own the schema. So the layer isn’t “we decide the shape of your memory” — it’s “we fill the structure you defined.”
That’s what Agent Skills as a Memory Layer (we use skill memory as the short name) adds: automatic capture and accumulation, Markdown for readability and control, portability everywhere, and your own structure.
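To make “your structure” concrete, here is a hypothetical SKILL.md sketch. The skill name, sections, and fields below are illustrative choices, not a required schema:

```markdown
# api-error-patterns

Capture recurring API failure modes and how we fixed them.

## File structure
One file per error category: timeouts.md, auth.md, rate-limits.md.

## Entry format
Append one `##` heading per incident, dated, with:
Symptom, Root cause, Correct approach, Prevention.

## Guidelines
Record the generalizable pattern, not the one-off bug.
Keep entries short; link out for long procedures.
```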
4. The pipeline: from one run to one more line in a skill file
Here’s the mental model for how skill memory gets written. No need to call “save memory” from your application code — it runs in the background once you’ve connected sessions to a learning space↗.
- Session runs — User messages, agent tool calls, task progress. Normal agent flow.
- Task completes or fails — Some unit of work (extracted automatically or reported by the agent) is marked success/failure, or the agent reports user preferences.
- Distillation — A dedicated pass over that task and its messages produces a structured summary:
  - For success: goal, approach, key decisions, generalizable pattern (SOP).
  - For failure: goal, failure point, flawed reasoning, correct approach, prevention principle.
  - For preferences: factual statements about the user (e.g. “prefers morning meetings”).
- Skill agent — Given the learning space’s skills (e.g. `daily-logs`, `user-general-facts`, or your custom skills↗), it decides:
  - Which existing skill to update, or
  - Whether to create a new skill (at a domain level, e.g. `api-error-handling`, not “fix-401-bug-feb-15”).
  It then reads each skill’s SKILL.md, follows your file structure and guidelines, and appends or creates Markdown entries (with date, principle, steps, symptom/fix/prevention, etc.).
- Next run — The agent has tools like `get_skill` and `get_skill_file` (skill tools↗). It can pull in `api-error-patterns/auth.md` or `project-runbooks/my-service.md` when relevant. Retrieval is tool-based and reasoning-driven, not “embed everything and top-k.” Full files, not chunks.
So: one run → task outcome → distillation → skill agent writes/updates files → next run reads those files when needed. Your app stays focused on agent logic; the memory layer handles the learning pipeline.
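The write side of this loop is small enough to sketch in Python. Here `distill_failure` and `append_entry` are hypothetical stand-ins for the distillation pass and the skill agent’s file edit; the field names and entry layout are illustrative, not Acontext’s actual schema:

```python
from datetime import date
from pathlib import Path

def distill_failure(task: dict) -> dict:
    """Stand-in for the distillation pass over one finished (failed) task."""
    return {
        "title": task["goal"],
        "symptom": task["symptom"],
        "root_cause": task["root_cause"],
        "correct_approach": task["fix"],
        "prevention": task["prevention"],
    }

def append_entry(skill_dir: Path, file_name: str, entry: dict) -> None:
    """Stand-in for the skill agent appending one Markdown entry to a skill file."""
    lines = [
        f"\n## {entry['title']} (date: {date.today().isoformat()})",
        f"- Symptom: {entry['symptom']}",
        f"- Root cause: {entry['root_cause']}",
        f"- Correct approach: {entry['correct_approach']}",
        f"- Prevention: {entry['prevention']}",
    ]
    with (skill_dir / file_name).open("a", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
```

The next run then reads the whole file back, so each appended entry has to stand on its own: dated heading, symptom, fix, prevention.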
5. How Acontext works
Acontext↗ implements this pipeline end-to-end so you don’t have to:
- Sessions & tasks — Store messages↗ in any LLM format; tasks are extracted from the conversation and tracked (success/failed). When a task finishes, that’s the trigger for learning.
- Learning space — A container↗ for the skills that should be updated from runs. You create a space, attach a session to it with `learn(space_id, session_id)`, and optionally include default skills (`daily-logs`, `user-general-facts`) or your own custom skills (a ZIP with a SKILL.md↗).
- Distillation — A single LLM call per finished task: “Is this worth learning? If yes, here’s the distilled context (SOP / failure analysis / user facts).” That gets handed to the skill agent.
- Skill agent — An agent with tools to read skills (`get_skill`, `get_skill_file`), create skills (`create_skill` with full SKILL.md content), and edit skill files (`str_replace_skill_file`, `create_skill_file`, `mv_skill_file`). It follows your SKILL.md↗ rules and writes only structured, concise entries.
- Recall — Your production agent uses Skill Content Tools↗ (`get_skill`, `get_skill_file`) or mounts skills into a sandbox↗ so it can read and execute skill scripts. Same skills, any LLM; no embeddings required for this layer.
Quick start (Python):
1. Upload a skill (optional). To use your own skill: write a SKILL.md↗ (e.g. one file per project, or per error category), zip it, then `skill = client.skills.create(file=FileUpload(filename="my-skill.zip", content=zip_bytes))`. To use only default skills, skip to step 2.
2. Create a learning space. `space = client.learning_spaces.create()`. New spaces get the default skills (`daily-logs`, `user-general-facts`) automatically. To add the skill from step 1: `client.learning_spaces.include_skill(space.id, skill_id=skill.id)`.
3. Run and attach a session. `session = client.sessions.create()`, then `client.learning_spaces.learn(space.id, session_id=session.id)`. Call `.learn()` before or as the agent runs so finished tasks feed the pipeline.
4. Inspect learned skills. Call `client.learning_spaces.wait_for_learning(space.id, session_id=session.id)`, then iterate with `for skill in client.learning_spaces.list_skills(space.id):` and, for each file `f` in `skill.file_index`, use `client.skills.get_file(skill_id=skill.id, file_path=f.path)` to read its content.
6. Example: what skill memory looks like
Suppose your agent often deals with API timeouts and auth errors. Over a few runs, the skill agent might create or update:
`api-error-patterns/timeouts.md`

```markdown
# Timeout Errors

## Health check timeout on deploy (date: 2026-03-02)
- Symptom: Deploy marked success but service wasn’t ready; next request 503.
- Root cause: Health check timeout was 10s; DB migration took longer.
- Correct approach: Set health-check timeout to 30s for deploy; use readiness probe separate from liveness.
- Prevention: Document min timeout per service type; fail deploy if health doesn’t pass in window.
```

`api-error-patterns/auth.md`

```markdown
# Auth Errors

## Token expiry mid-session (date: 2026-03-01)
- Symptom: 401 after N minutes of tool use.
- Root cause: Bearer token not refreshed.
- Correct approach: Refresh token when 401 and retry once; if refresh fails, ask user to re-auth.
- Prevention: Use short-lived tokens + refresh in agent loop; don’t assume one token per session.
```

So the next time the agent hits a timeout or 401, it can call the agent tool `get_skill_file` with `skill_name="api-error-patterns"` and `file_path="timeouts.md"` (or `auth.md`) and use that content in context. When using the SDK↗ directly you’d use `client.skills.get_file(skill_id=skill.id, file_path="timeouts.md")`. No embedding search — the agent reasons about which skill/file to pull and gets the full unit.
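The recall side can be sketched the same way. Below, `get_skill_file` is a local stand-in for the agent tool of the same name, and `recall_for_error` is a toy stand-in for the agent’s reasoning about which file to pull; the directory layout and the status-code routing are assumptions for illustration:

```python
from pathlib import Path

def get_skill_file(skill_name: str, file_path: str,
                   root: Path = Path("skills")) -> str:
    """Local stand-in for the get_skill_file tool: return one FULL skill file,
    not retrieved chunks. `root` is a hypothetical local mirror of the skills."""
    return (root / skill_name / file_path).read_text(encoding="utf-8")

def recall_for_error(status_code: int, root: Path = Path("skills")) -> str:
    """Toy version of the agent's reasoning step: pick the right file
    for the error class it just hit."""
    file_path = "auth.md" if status_code in (401, 403) else "timeouts.md"
    return get_skill_file("api-error-patterns", file_path, root=root)
```

The point of the sketch is the retrieval unit: the agent decides which file is relevant and receives it whole, rather than embedding everything and taking top-k chunks.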
7. How this fits with RAG and conversation memory
Skill memory doesn’t replace what you already use; it sits alongside it.
- Conversation memory / RAG↗ — “What did we say?” and “What docs did we ingest?” Keep using them for chat continuity and document retrieval.
- Skill memory — “What did we do, and what should we do next time?” Use it for procedures, runbooks, error patterns, and preferences that you want as explicit, editable files.
A simple split:
- Chat + RAG → memory of the user and the corpus.
- Skill memory → memory of how we solved (or failed) this kind of problem and how to reuse successful experiences.
8. Why now, and what to do next
Agents are moving from demos to production. Success and failure cases are piling up, but most teams have no automated way to turn those into reusable knowledge. At the same time, frameworks and LLMs are multiplying — the only universal format that travels everywhere is files. Skill memory is exactly that: run → distilled outcome → Markdown files → next run.
If you’re turning good runs into runbooks by hand, you’re already doing this work manually. The missing piece is a pipeline that does it automatically and keeps the output in a format you can read and move.
- See how it works: Long-term skill quickstart↗ — create a learning space, run a session, inspect the learned skills.
- Define your own memory shape: Custom skills (SKILL.md)↗ — one file per contact, per project, per error category.
- Open source: GitHub↗ — run it yourself; your data, your infra.
Acontext is Agent Skills as a Memory Layer: skill memory that automatically captures what the agent did and how it turned out, stored as Markdown you can read, edit, and reuse across agents and LLMs.
Try the quickstart↗ or run it open source↗.