Writing Custom Skills to Extend Your AI Coding Assistant

If you have spent any time pairing with an AI coding assistant, you have probably noticed a pattern: you explain your project conventions once, the model follows them for a while, and then the next session starts fresh. You correct the same naming mistake. You re-explain the same error-handling pattern. You paste the same context over and over.

The fix is not a better prompt. It is a better architecture. Custom skills, stored as files the agent can read and reason about, turn one-off instructions into durable, reusable knowledge. This guide shows you how to design them, where to put them, and how to wire them into an agent built with Claude.

What a Skill Actually Is

A skill is a structured document that tells the agent how to handle a specific category of task. Think of it the way you would think of a team runbook or an internal style guide: it is authoritative, scoped, and written to be followed rather than summarized.

A skill file typically contains three things:

Context. Why this skill exists. What problem it solves, what the team cares about, what assumptions are baked in.
Rules or steps. The concrete behavior you want the model to exhibit. Ordered steps for procedural tasks, rule lists for policy tasks.
Examples. At least one before-and-after pair, or a canonical output. Examples do more work than instructions alone.
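
If you want to enforce that three-part shape, a small check can run in CI or a pre-commit hook. The sketch below assumes each part is introduced by a Markdown heading named Context, Rules, or Examples. Those heading names, and the function name missing_sections, are illustrative assumptions rather than a required format.

```python
import re

# Assumed heading names; the three-part layout is a convention, not a standard.
REQUIRED_SECTIONS = ("Context", "Rules", "Examples")

def missing_sections(skill_text: str) -> list[str]:
    """Return the required section headings that do not appear in the skill."""
    # Collect the text of every Markdown heading line ("# Title", "## Title", ...).
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(
            r"^#{1,6}[ \t]+(.+?)[ \t]*$", skill_text, flags=re.MULTILINE
        )
    }
    return [name for name in REQUIRED_SECTIONS if name.lower() not in headings]

example_skill = """# Error handling

## Context
Clients must never see internal exception text.

## Rules
1. Wrap every error in the shape { error: { code, message } }.

## Examples
Before: return str(exc)
After: return {"error": {"code": "USER_NOT_FOUND", "message": "User not found."}}
"""

print(missing_sections(example_skill))                # []
print(missing_sections("## Context\nOnly context."))  # ['Rules', 'Examples']
```

A check like this only confirms the sections exist. Whether the rules are any good is still a job for code review.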

Skills live in your repository or workspace, not inside the model. That is the key architectural point. The model reads them at runtime, which means you can version them, code-review them, and update them without touching your agent code.

Designing Skills That Actually Stick

The most common mistake is writing a skill like a general-purpose prompt. Vague guidance produces vague behavior. Good skills are narrow and opinionated.

One skill, one job

Resist the urge to create a single mega-document called coding-standards.md that covers everything. Break it into focused files: skills/error-handling.md, skills/api-response-shape.md, skills/database-migrations.md. A focused skill is easier for the model to apply correctly and easier for your team to maintain.

Write rules as constraints, not suggestions

Language matters. Compare these two phrasings:

Weak: “You should probably avoid returning raw exceptions to the client.”
Strong: “Never expose raw exception messages in HTTP responses. Wrap all errors in the shape { error: { code, message } } before returning.”

The second version is unambiguous. The model does not have to guess how seriously you mean it.
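
To make the strong rule concrete, here is roughly the code it asks the assistant to produce. This is a sketch only: the AppError class, the INTERNAL_ERROR code, and the logger name are hypothetical and stand in for whatever your project actually uses.

```python
import logging

logger = logging.getLogger("api")  # hypothetical logger name

class AppError(Exception):
    """Hypothetical application error carrying a client-safe code and message."""

    def __init__(self, code: str, message: str):
        super().__init__(message)
        self.code = code
        self.message = message

def to_error_body(exc: Exception) -> dict:
    """Build the { error: { code, message } } body without leaking raw exception text."""
    if isinstance(exc, AppError):
        # Known errors already carry text that is safe to show to clients.
        return {"error": {"code": exc.code, "message": exc.message}}
    # Unknown errors: keep the details in server logs, return a generic message.
    logger.error("Unhandled exception", exc_info=exc)
    return {"error": {"code": "INTERNAL_ERROR", "message": "An unexpected error occurred."}}
```

With the weak phrasing, returning str(exc) to the client would still be a defensible reading. With the strong phrasing, it is plainly a violation.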

Include negative examples

Show what the wrong output looks like and why it is wrong. Negative examples are underused. They mark the boundary of a rule explicitly, which makes the constraint concrete in a way that positive-only examples cannot.

Keep skills short enough to be read in full

If a skill document runs longer than about 600 words, split it. The model will read the whole thing regardless, but a skill that long usually has a scope problem. The goal is precision, not comprehensiveness.
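
The length guideline is easy to check mechanically. The sketch below counts whitespace-separated words in each skill file and reports the ones over the limit. The function name oversized_skills is hypothetical, and the 600-word default simply mirrors the rule of thumb above.

```python
from pathlib import Path

def oversized_skills(skill_dir: str = "skills", max_words: int = 600) -> dict[str, int]:
    """Map file name to word count for every skill longer than max_words."""
    too_long = {}
    for path in sorted(Path(skill_dir).glob("*.md")):
        if path.name == "index.md":
            continue  # the index is a table of contents, not a skill
        # Whitespace-separated tokens are a rough but adequate word count.
        word_count = len(path.read_text(encoding="utf-8").split())
        if word_count > max_words:
            too_long[path.name] = word_count
    return too_long
```

Calling oversized_skills() from the project root returns an empty dict when every skill is within the limit, which makes it straightforward to turn into a CI check.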

Organizing Skills in Your Repo

A straightforward layout works well for most projects:

project-root/
  skills/
    index.md          # maps skill names to file paths
    error-handling.md
    api-response-shape.md
    database-migrations.md
    testing-conventions.md
  agent/
    main.py
    skill_loader.py

The index.md file is optional but useful. It gives the agent a table of contents so it can decide which skills are relevant to the current task without loading every file on every turn.
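
This guide does not prescribe a format for index.md, so the sketch below assumes one: a bullet per skill in the form "- name | path | one-line description". The description is what lets the model judge relevance without opening the file. Both the format and the names IndexEntry and parse_index are assumptions made for illustration.

```python
from typing import NamedTuple

class IndexEntry(NamedTuple):
    name: str
    path: str
    description: str

def parse_index(index_text: str) -> dict[str, IndexEntry]:
    """Parse bullets of the assumed form '- name | path | description'."""
    entries = {}
    for line in index_text.splitlines():
        line = line.strip()
        if not line.startswith("- "):
            continue  # skip headings, blank lines, and prose
        parts = [part.strip() for part in line[2:].split("|")]
        if len(parts) != 3 or not all(parts):
            continue  # ignore malformed rows rather than guess at their meaning
        entry = IndexEntry(*parts)
        entries[entry.name] = entry
    return entries

sample_index = """Skill index

- error-handling | skills/error-handling.md | How to wrap and return errors
- database-migrations | skills/database-migrations.md | Steps for writing a migration
"""

index = parse_index(sample_index)
print(index["error-handling"].path)  # skills/error-handling.md
```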

Loading Skills at Runtime with the Claude API

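Here is a minimal Python example using the anthropic SDK. The agent reads the skill files from disk, joins them into a system prompt, and sends that prompt with the request. The Messages API is stateless, so a multi-turn agent passes the same system prompt on every call.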

import anthropic
from pathlib import Path

def load_skills(skill_dir: str = "skills") -> str:
    skills_text = []
    for path in sorted(Path(skill_dir).glob("*.md")):
        if path.name == "index.md":
            continue
        content = path.read_text(encoding="utf-8")
        skills_text.append(f"## Skill: {path.stem}\n\n{content}")
    return "\n\n---\n\n".join(skills_text)

def build_system_prompt(skills: str) -> str:
    return f"""You are an expert coding assistant for this project.
Follow the skills below exactly when they apply to the task at hand.
If a skill is relevant, apply it without being asked.

{skills}"""

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from env

skills = load_skills()
system = build_system_prompt(skills)

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    system=system,
    messages=[
        {"role": "user", "content": "Refactor the getUserById function to use our standard error handling."}
    ]
)

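# response.content is a list of content blocks (thinking blocks may appear
# alongside text), so print only the text blocks.
for block in response.content:
    if block.type == "text":
        print(block.text)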

A few things worth noting here. The thinking={"type": "adaptive"} argument tells the model to decide on its own how much reasoning to apply before responding. This is the correct approach for Claude Sonnet 4.6 and newer models. The older fixed budget_tokens approach is deprecated on these versions and should not be used.

The model id claude-sonnet-4-6 is used above as a solid general-purpose choice. For tasks requiring deeper reasoning or larger codebases, claude-fable-5 and claude-opus-4-8 both offer a 1M-token context window, which leaves room to load much more of a large codebase alongside your skills before you run into context limits. claude-haiku-4-5 has a 200K context window and is a good fit for faster, lighter tasks where cost matters more than depth.

Selective Skill Loading for Larger Skill Sets

Once you have more than a dozen skills, loading all of them on every request wastes context and can dilute focus. A better approach is to let the agent first read the index, then fetch only the skills relevant to the current task.

You can implement this with a two-step pattern:

Send the user request plus the skill index to the model. Ask it to return a list of skill names it needs.
Load only those files, then run the actual task with a context that includes only the relevant skills.
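
The two steps above can be sketched as follows. To keep the example runnable without network access, the model call is passed in as a function named ask_model. In a real agent that function would wrap client.messages.create and return the reply text. All function names here are hypothetical, and the sketch assumes the model is asked to reply with a JSON array of skill names.

```python
import json
from pathlib import Path
from typing import Callable

def build_selection_prompt(request: str, index_text: str) -> str:
    """Step 1 prompt: show the index and ask for a JSON array of skill names."""
    return (
        "Below is an index of available skills, followed by a user request.\n"
        "Reply with only a JSON array of the skill names needed for the request.\n\n"
        f"Skill index:\n{index_text}\n\nUser request:\n{request}"
    )

def parse_selection(reply: str, available: set[str]) -> list[str]:
    """Extract skill names from the reply, dropping anything not in the library."""
    try:
        names = json.loads(reply)
    except json.JSONDecodeError:
        return []
    if not isinstance(names, list):
        return []
    return [name for name in names if isinstance(name, str) and name in available]

def load_selected_skills(
    request: str, skill_dir: str, ask_model: Callable[[str], str]
) -> str:
    """Step 2: load only the skills the model asked for and join them."""
    root = Path(skill_dir)
    available = {p.stem for p in root.glob("*.md") if p.name != "index.md"}
    index_text = (root / "index.md").read_text(encoding="utf-8")
    reply = ask_model(build_selection_prompt(request, index_text))
    sections = []
    for name in parse_selection(reply, available):
        content = (root / (name + ".md")).read_text(encoding="utf-8")
        sections.append(f"## Skill: {name}\n\n{content}")
    return "\n\n---\n\n".join(sections)
```

Filtering the reply against the files that actually exist matters here, because the model may name a skill that is not in the library. If the result is an empty string, one reasonable fallback is to load every skill, as in the simpler example earlier.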

This is also a natural fit for the Model Context Protocol. MCP lets you expose your skill library as a tool server. The agent calls a get_skill tool with a skill name, and the server returns the content. This keeps skills out of the system prompt entirely and makes them available on demand, which is cleaner for agentic workflows that run many turns.
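
Whatever server framework you use, the handler behind a get_skill tool can stay small. The sketch below shows only that handler as a plain Python function. Registering it with an MCP server is left out, since that depends on the SDK you choose. The name validation rule is an assumption. It is there because the skill name comes from the model, so it should not be trusted as a file path.

```python
import re
from pathlib import Path

# Assumed naming rule: lowercase letters, digits, and hyphens only.
# This rejects path separators and "..", so a request cannot escape skill_dir.
_SAFE_NAME = re.compile(r"[a-z0-9][a-z0-9-]*")

def get_skill(name: str, skill_dir: str = "skills") -> str:
    """Return the text of one skill, or a readable error the model can act on."""
    if not _SAFE_NAME.fullmatch(name):
        return f"Error: invalid skill name {name!r}."
    path = Path(skill_dir) / (name + ".md")
    if not path.is_file():
        return f"Error: no skill named {name!r}."
    return path.read_text(encoding="utf-8")
```

Returning an error string instead of raising gives the agent something it can read and recover from, for example by consulting the index again.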

Keeping Skills Accurate Over Time

Skills go stale. The error-handling pattern you wrote six months ago may not match what your codebase does today. Build maintenance into your team’s workflow:

Treat skill files as first-class code. Put them through the same pull request review process as application code.
When you catch the agent making a recurring mistake, ask whether the relevant skill is missing or unclear. Write the fix into the skill, not just the conversation.
Schedule a quarterly review of your skill library the same way you review your runbooks. Delete skills that no longer apply.
Link skills to the code they govern. If a skill describes your API response shape, add a comment in the relevant handler pointing to the skill file. This creates a feedback loop when the code changes.
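
The last point, linking skills to the code they govern, can be backed by a small check. The sketch below scans source files for comments that mention a path like skills/error-handling.md and reports references to files that no longer exist. The function name, the default "**/*.py" glob, and the assumption that references are written as repo-relative paths are all illustrative.

```python
import re
from pathlib import Path

# Matches repo-relative references such as "skills/api-response-shape.md".
SKILL_REF = re.compile(r"skills/[\w-]+\.md")

def broken_skill_links(repo_root: str, source_glob: str = "**/*.py") -> list[tuple[str, str]]:
    """Return (source file, referenced skill path) pairs whose skill file is missing."""
    root = Path(repo_root)
    broken = []
    for source in sorted(root.glob(source_glob)):
        if not source.is_file():
            continue
        text = source.read_text(encoding="utf-8", errors="ignore")
        # Deduplicate so a file that mentions the same skill twice is reported once.
        for ref in sorted(set(SKILL_REF.findall(text))):
            if not (root / ref).is_file():
                broken.append((source.relative_to(root).as_posix(), ref))
    return broken
```

Run as part of CI, a non-empty result tells you a skill was renamed or deleted while the code still points at it, which is a good moment to revisit both.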

A Note on Scope

Skills work best for stable, repeatable conventions: naming patterns, error shapes, migration procedures, test structure. They are less useful for things that change frequently or require judgment that depends heavily on runtime context. Do not try to encode every possible decision into skills. The goal is to free the model to focus its reasoning on the novel parts of each task, not to script every step.

Takeaway

Custom skills are the simplest way to make an AI coding assistant behave like a member of your team rather than a capable stranger. Keep each skill narrow and opinionated, store skills as versioned files, load them into your system prompt or surface them through MCP, and maintain them the same way you maintain code. The investment pays back every session where the agent gets it right the first time.