Claude Skills: A Technical Deep Dive into Agentic Orchestration

On December 18, 2025, Anthropic released the Agent Skills specification as an open standard, a move that triggered rapid industry adoption by Microsoft and OpenAI within 48 hours. This wasn't a coincidence. The Agent Skills standard solves a fundamental problem that every developer building agentic AI systems has encountered: the context-competency paradox. Making an agent more capable means loading more instructions into its context, yet the more instructions it carries, the less reliably it follows any one of them.

This post provides a technical deep dive into how Agent Skills work, when to use them, and how they complement the Model Context Protocol (MCP).


The Problem: Context Window Bloat

To make an AI agent capable of handling diverse tasks, developers traditionally preloaded the agent's system prompt with massive libraries of instructions and tool definitions. This approach creates several problems:

  • Token costs: Filling a 200k context window with static instructions is expensive
  • Latency: Larger contexts mean slower inference
  • Lost-in-the-middle syndrome: Models fail to retrieve instructions buried in large context blocks
  • Instruction dilution: As instruction density increases, adherence to individual instructions decreases

While Retrieval-Augmented Generation (RAG) handles declarative knowledge and function calling handles discrete actions, neither solves the orchestration problem—the "how-to" of executing complex, multi-step workflows that require judgment and adaptation.


What Are Agent Skills?

Agent Skills represent a paradigm shift from monolithic system prompts to modular, filesystem-based intelligence. Instead of treating capabilities as intrinsic model weights or massive prompt dumps, Skills are discoverable, loadable modules.

Think of it like Docker for AI capabilities: just as containers standardized software deployment, Agent Skills standardize agentic behaviors. A skill authored for Claude can be executed by OpenAI's models or Microsoft's Copilot agents without modification.

A skill is not a single file but a directory containing:

  • A manifest file (SKILL.md)
  • Optional supporting resources (scripts, references, assets)
.claude/skills/
└── my-skill-name/
    ├── SKILL.md           # Required: Metadata & Instructions
    ├── scripts/           # Optional: Executable code
    │   ├── process.py
    │   └── util.sh
    ├── references/        # Optional: Static knowledge
    │   └── api-spec.json
    └── assets/            # Optional: Templates, Images
        └── template.docx

The SKILL.md Anatomy

The SKILL.md file is the interface between agent and capability. It combines YAML frontmatter for machine-readable metadata with Markdown for human-and-machine-readable instructions.

---
name: generate-receipt
description: Generate a PDF receipt for a transaction. Use when asked to create a receipt or proof of purchase.
allowed-tools: python, read_file
---

# Generate Receipt

This skill creates a formatted PDF receipt based on transaction details.

## Procedure

1. **Extract Details**: Identify customer name, date, items, and total amount
2. **Validation**: Ensure all items sum to the total
3. **Generation**: Run the script to generate the PDF

python3 scripts/create_pdf.py --customer "${CUSTOMER_NAME}" --items "${ITEMS_JSON}"

4. **Verification**: Check if the file was created successfully
5. **Output**: Return the path of the generated PDF
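
Step 2 of the procedure above (checking that the items sum to the total) is the kind of check best delegated to a script rather than to the model's arithmetic. A minimal sketch, assuming each item is a dict with hypothetical "price" and "qty" keys and that amounts arrive as strings:

```python
from decimal import Decimal


def items_match_total(items, total):
    """Return True when the line items sum exactly to the stated total.

    `items` is a list of dicts with hypothetical keys "price" (string)
    and "qty" (int). Decimal avoids binary floating-point rounding errors.
    """
    computed = sum(
        (Decimal(item["price"]) * item["qty"] for item in items),
        Decimal("0"),
    )
    return computed == Decimal(total)
```

Returning a plain boolean keeps the decision about what to do on a mismatch (ask the user, abort, recompute) in the skill's written procedure.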

Frontmatter Schema

| Field | Required | Description | Constraints |
| --- | --- | --- | --- |
| name | Yes | Unique identifier | Max 64 chars, lowercase, hyphens only |
| description | Yes | Semantic trigger for discovery | Max 1024 chars |
| allowed-tools | No | Security scoping | Comma-separated list of permitted tools |
| version | No | Version control | Semantic versioning string |
| license | No | License information | Standard license identifier |
| metadata | No | Additional properties | Custom key-value pairs |
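
The constraints on the two required fields can be checked mechanically before a skill is published. The sketch below is illustrative rather than a reference validator: the function name is hypothetical, and it assumes digits are permitted in `name` alongside lowercase letters and hyphens.

```python
import re

# Assumption: digits are allowed in addition to lowercase letters and hyphens.
NAME_PATTERN = re.compile(r"[a-z0-9-]+")


def validate_frontmatter(meta):
    """Return a list of problems found in the required frontmatter fields."""
    errors = []
    name = meta.get("name", "")
    description = meta.get("description", "")

    if not name:
        errors.append("name is required")
    elif len(name) > 64:
        errors.append("name exceeds 64 characters")
    elif not NAME_PATTERN.fullmatch(name):
        errors.append("name must use only lowercase letters, digits, and hyphens")

    if not description:
        errors.append("description is required")
    elif len(description) > 1024:
        errors.append("description exceeds 1024 characters")

    return errors
```

An empty list means the required fields pass; optional fields such as `version` or `license` are not inspected here.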

The choice of Markdown is architectural, not aesthetic. LLMs are trained on vast repositories of code and documentation, exhibiting high proficiency in interpreting Markdown-formatted instructions.


Progressive Disclosure: The Key Innovation

The most significant technical innovation is Progressive Disclosure—loading information only when strictly necessary. This mechanism operates on three levels:

Level 1: Discovery (Metadata Only)

At startup, the agent scans skills directories (e.g., ~/.claude/skills) and reads only the YAML frontmatter. The name and description fields are injected into the system prompt:

<available_skills>
  <skill>
    <name>generate-receipt</name>
    <description>Generate a PDF receipt for a transaction...</description>
  </skill>
  <skill>
    <name>pdf-extractor</name>
    <description>Extract tables from PDF documents...</description>
  </skill>
</available_skills>

This is extremely lightweight: 100 skills consume approximately 5,000 tokens in this summary form, leaving the vast majority of the context window available for the active conversation.
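
The discovery step could be sketched roughly as follows. This is an illustration under stated assumptions, not how any particular runtime implements it: the function names are hypothetical, and the frontmatter parser handles only flat `key: value` lines (a real implementation would use a YAML library).

```python
from pathlib import Path
from xml.sax.saxutils import escape


def read_frontmatter(skill_md):
    """Parse flat `key: value` pairs between the leading '---' delimiters."""
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # End of frontmatter; the Markdown body is never parsed.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def build_skill_index(skills_root):
    """Build the <available_skills> summary from name and description only."""
    out = ["<available_skills>"]
    for skill_md in sorted(Path(skills_root).glob("*/SKILL.md")):
        meta = read_frontmatter(skill_md)
        if "name" in meta and "description" in meta:
            out.append("  <skill>")
            out.append(f"    <name>{escape(meta['name'])}</name>")
            out.append(f"    <description>{escape(meta['description'])}</description>")
            out.append("  </skill>")
    out.append("</available_skills>")
    return "\n".join(out)
```

Only the frontmatter contributes to the index, so its size grows with the number of skills rather than with the length of their instructions.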

Level 2: Activation (Context Loading)

When the agent determines a skill is needed—based on semantic similarity between the user's request and a skill's description—it invokes the "Skill Tool." This reads the full SKILL.md content and injects it into the active context.

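This dynamic prompt injection mitigates the in-context analogue of "catastrophic forgetting": in long conversations, instructions given early tend to lose influence as the context grows. With Skills, the instructions are fresh in the context window at the moment of need, maximizing attention.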

Level 3: Execution (Resource Access)

Complex skills reference files in assets/, references/, and scripts/ subdirectories. The agent uses standard filesystem tools to access these resources on demand. A database migration skill might reference schema.sql, but the agent only reads it when the specific task requires checking a table definition.


The Skill Tool: A Meta-Mechanism

Skills are implemented via a meta-tool—a tool that modifies the agent's own behavior. The tool definition looks like:

name: Skill
description: >
  Use this tool to load a skill into your context.
  Available skills:
  <available_skills>
    <skill>
      <name>git-helper</name>
      <description>Manage git workflows...</description>
    </skill>
  </available_skills>
parameters:
  type: object
  properties:
    name:
      type: string
      description: The name of the skill to load.

When the model invokes Skill(name="git-helper"), the runtime:

  1. Intercepts the call
  2. Reads the corresponding SKILL.md from disk
  3. Returns its content as the "tool output"

The model processes this output as new system instructions, effectively reprogramming itself in real-time.
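
The three runtime steps above might look like the following on the host side. This is a hedged sketch: `handle_skill_call` and the name pattern are hypothetical, and real runtimes may resolve skills across several directories.

```python
import re
from pathlib import Path

# Assumption: skill names are short slugs, which also rules out path traversal.
SKILL_NAME = re.compile(r"[a-z0-9-]{1,64}")


def handle_skill_call(skills_root, name):
    """Return the SKILL.md text for `name`, to be sent back as tool output."""
    if not SKILL_NAME.fullmatch(name):
        return f"Error: invalid skill name {name!r}"
    skill_md = Path(skills_root) / name / "SKILL.md"
    if not skill_md.is_file():
        return f"Error: no skill named {name!r}"
    return skill_md.read_text(encoding="utf-8")
```

Errors are returned as text rather than raised, so the model sees them as tool output and can choose a different skill.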


Script Execution Patterns

Skills execute code in three primary ways:

Pattern A: Direct Script Execution

# PDF form extraction
python scripts/extract_form_field_info.py <input.pdf> <output.json>

# Excel recalculation
python recalc.py <excel_file> [timeout_seconds]

Pattern B: Pipeline Execution

Scripts chained in sequence for complex workflows:

# PDF form filling pipeline
# 1. Check if PDF has fillable fields
python scripts/check_fillable_fields.py <file.pdf>

# 2. Extract field information
python scripts/extract_form_field_info.py <input.pdf> <field_info.json>

# 3. Convert PDF to images for visual analysis
python scripts/convert_pdf_to_images.py <file.pdf> <output_directory>

# 4. Agent creates field_values.json based on analysis

# 5. Fill the form fields
python scripts/fill_fillable_fields.py <input.pdf> <field_values.json> <output.pdf>
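
When the agent drives such a pipeline, each step should stop the chain if it fails. A minimal sketch of that control flow, assuming every script signals failure through a non-zero exit code (`run_pipeline` is a hypothetical helper, not part of any skill):

```python
import subprocess


def run_pipeline(steps):
    """Run each command in order and stop at the first failure.

    `steps` is a list of argument lists, for example
    [["python", "scripts/check_fillable_fields.py", "form.pdf"], ...].
    Returns (ok, outputs): stdout of each completed step, with the failing
    step's stderr appended last when ok is False.
    """
    outputs = []
    for cmd in steps:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            return False, outputs + [result.stderr]
        outputs.append(result.stdout)
    return True, outputs
```

Step 4 in the pipeline above is performed by the agent itself, so in practice the chain would be run in two parts: steps 1 to 3, then step 5 once field_values.json exists.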

Pattern C: Library Import

Scripts provide classes/functions the agent imports into generated code:

from skills.docx.scripts.document import Document

doc = Document('workspace/unpacked', author="Claude", initials="C")
node = doc["word/document.xml"].get_node(tag="w:del", attrs={"w:id": "1"})
doc.add_comment(start=node, end=node, text="Comment text")
doc.save()

This architecture enables deterministic operations: the LLM provides reasoning (when to run the script), while the script provides execution reliability (mathematical correctness, precise parsing).


Skills vs MCP: Brain vs Hands

A common question is how Skills relate to the Model Context Protocol (MCP). The distinction is fundamental:

| Feature | Agent Skills | Model Context Protocol (MCP) |
| --- | --- | --- |
| Analogy | Brain / Training Manual | Hands / API Adapter |
| Storage | Filesystem (SKILL.md) | Server Process (Local/Remote) |
| Data Format | Unstructured (Markdown + Code) | Structured (JSON schemas) |
| Mechanism | Context Injection | Remote Procedure Call (JSON-RPC) |
| Use Case | Complex workflows, best practices | Data fetching, API actions, CRUD |

MCP solves the connectivity problem—linking agents to data sources. It abstracts databases, repositories, and APIs into a uniform interface.

Skills solve the orchestration problem—defining how those connections are used. A skill contains the standard operating procedure for using MCP tools to accomplish a goal.

Example: The Difference in Action

An MCP server provides a generic query_database tool. A "Financial Reporting" Skill contains instructions on how to use that tool to generate a balance sheet:

"First query the transactions table filtering by the current quarter, then sum the debit column, then format the results..."

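To make the quoted procedure concrete, here is a rough sketch of its first two steps. It uses an SQLite connection as a stand-in for the generic query_database tool; the `posted_on` column, the ISO date format, and both function names are assumptions made for illustration.

```python
import sqlite3
from datetime import date


def quarter_bounds(today):
    """Return ISO start (inclusive) and end (exclusive) dates of today's quarter."""
    first_month = 3 * ((today.month - 1) // 3) + 1
    start = date(today.year, first_month, 1)
    if first_month == 10:
        end = date(today.year + 1, 1, 1)
    else:
        end = date(today.year, first_month + 3, 1)
    return start.isoformat(), end.isoformat()


def total_debits(conn, today):
    """Sum the debit column for transactions posted in the current quarter."""
    start, end = quarter_bounds(today)
    row = conn.execute(
        "SELECT COALESCE(SUM(debit), 0) FROM transactions "
        "WHERE posted_on >= ? AND posted_on < ?",  # hypothetical column name
        (start, end),
    ).fetchone()
    return row[0]
```

The MCP tool only needs to run the query; knowing which query to run, and in what order, is what the skill contributes.
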
Integration Pattern: Compiling MCP to Skills

A powerful pattern is "compiling" MCP capabilities into Skills for token efficiency:

Direct MCP Approach: The agent loads Playwright MCP server schemas (navigate, click, type, evaluate, etc.)—consuming 5,000-8,000 tokens just for tool definitions.

Skill Wrapper Approach: The agent has a browser-use skill (~150 tokens). The skill instructs the agent to run a local script that handles MCP interactions and returns only the final result.

Result: Token overhead drops by approximately 98% (from ~12,500 to ~250 tokens per interaction).
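
The percentage follows directly from the two per-interaction figures quoted above; a short check of the arithmetic (the function name is illustrative):

```python
def overhead_reduction(before_tokens, after_tokens):
    """Fraction of token overhead removed by switching approaches."""
    return 1 - after_tokens / before_tokens


# Figures taken from the text: ~12,500 tokens before, ~250 after.
print(f"{overhead_reduction(12_500, 250):.0%}")  # prints 98%
```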


Real-World Workflow Examples

PDF Form Filling

Step 1: Check fillable fields
    python scripts/check_fillable_fields.py <file.pdf>
    ├── Has fillable fields → Fillable workflow:
    │       1. Extract field info
    │       2. Convert PDF to PNG
    │       3. Analyze purpose
    │       4. Create field_values.json
    │       5. Fill fillable fields
    └── No fillable fields → Non-fillable workflow:
            1. Convert PDF to PNG
            2. Visual analysis
            3. Create fields.json
            4. Add annotations

Excel with Validation Loop

Create/Edit Excel with openpyxl
  - Use formulas, not hardcodes
  - Apply color coding
        │
        ▼
python recalc.py <excel_file>
        │
        ▼
Check JSON output
  - status: "success" → Done
  - status: "errors_found" → Fix errors, loop back

Output format from recalc.py:

{
  "status": "errors_found",
  "total_errors": 2,
  "total_formulas": 42,
  "error_summary": {
    "#REF!": {
      "count": 2,
      "locations": ["Sheet1!B5", "Sheet1!C10"]
    }
  }
}
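
The validation loop can be expressed as a small driver around this JSON report. The sketch below assumes recalc.py prints the report to stdout; `recalc_until_clean`, `fix_errors`, and the retry limit are hypothetical, and the `run` parameter exists only so the loop can be exercised without a real workbook.

```python
import json
import subprocess


def recalc_until_clean(excel_file, fix_errors, max_rounds=3, run=subprocess.run):
    """Recalculate, hand any errors to `fix_errors`, and repeat until clean.

    `fix_errors` receives the report's "error_summary" dict and is expected
    to edit the workbook (in practice, this is the agent's job).
    """
    for _ in range(max_rounds):
        proc = run(["python", "recalc.py", excel_file],
                   capture_output=True, text=True)
        report = json.loads(proc.stdout)  # Assumes JSON on stdout.
        if report["status"] == "success":
            return report
        fix_errors(report["error_summary"])
    raise RuntimeError(f"{excel_file} still has errors after {max_rounds} rounds")
```

Capping the number of rounds keeps a workbook that cannot be repaired from looping indefinitely.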

Security Considerations

Code Execution Risks

Unlike MCP's schema-based contracts with validated inputs/outputs, Skills often rely on direct code execution via Bash or Python. A malicious skill could execute scripts that exfiltrate environment variables.

Mitigations include:

  • Sandboxing: Container execution with no or restricted network access
  • Filesystem scoping: Access limited to project directory
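
Filesystem scoping, for instance, can be approximated by resolving every requested path and rejecting anything that lands outside the project root. A minimal sketch with a hypothetical helper name; it complements a sandbox and does not replace one:

```python
from pathlib import Path


def resolve_in_project(project_root, requested):
    """Resolve `requested` and refuse paths that escape `project_root`."""
    root = Path(project_root).resolve()
    # resolve() collapses ".." segments and follows symlinks, so the check
    # below applies to the real target rather than the spelled-out path.
    target = (root / requested).resolve()
    if target != root and root not in target.parents:
        raise PermissionError(f"{requested!r} is outside {root}")
    return target
```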

allowed-tools Enforcement

The allowed-tools field restricts which system tools a skill can access:

allowed-tools: Read, Grep  # Blocks Write, Bash

In strict environments (Claude Agent SDK), this is a hard constraint enforced by the orchestration layer. In looser implementations, it may function as guidance rather than control.
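
As an illustration of the hard-constraint case, an orchestration layer could check each tool call against the skill's allow-list before dispatching it. The helper names are hypothetical, and treating a missing field as unrestricted is an assumption:

```python
def parse_allowed_tools(value):
    """Split the comma-separated allowed-tools value into a set of names."""
    return {tool.strip() for tool in value.split(",") if tool.strip()}


def authorize_tool_call(allowed_tools, tool_name):
    """Raise PermissionError when `tool_name` is outside the allow-list."""
    if allowed_tools is None:
        return  # Assumption: an absent field means no skill-level restriction.
    if tool_name not in parse_allowed_tools(allowed_tools):
        raise PermissionError(f"tool {tool_name!r} is not permitted by this skill")
```

With `allowed-tools: Read, Grep`, a call to Read passes while a call to Bash raises before anything executes.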


When to Use Skills

Use Skills when you need:

  • Complex multi-step workflows: Database migrations, document processing pipelines
  • Standard operating procedures: Consistent execution of business processes
  • Enterprise knowledge management: Codified institutional knowledge available to all agents
  • Token efficiency: Wrapping verbose tool interactions in compact skill definitions
  • Portability: Behaviors that work across different agent implementations

Use MCP when you need:

  • Direct data access: Querying databases, fetching from APIs
  • Real-time connections: Stateful connections to external systems
  • Standardized tool interfaces: Uniform access to diverse data sources

Use both together for production systems: MCP provides the tools, Skills orchestrate their use. To choose between them for a given data source, weigh how often it is accessed, how complex the workflow around it is, and how much validation its inputs and outputs need.


Skill Discovery Locations

Different platforms adopt consistent paths:

| Platform | Default Path | Notes |
| --- | --- | --- |
| Claude Code | ~/.claude/skills | Also checks project-level .claude/skills |
| OpenAI Codex | ~/.codex/skills | Identical structure |
| GitHub Copilot | .github/skills | Repository-specific |
| Goose | ~/.config/goose/skills | Follows open standard |
| OpenCode | .opencode/skill | Global and project scopes |

Creating Your First Skill

Here's a minimal skill structure:

my-skill/
├── SKILL.md
└── scripts/
    └── run.py

SKILL.md:

---
name: my-skill
description: Brief description of when to use this skill. Trigger phrases: "do X", "help with Y".
allowed-tools: python, read_file
---

# My Skill

## When to Use
Use this skill when the user asks to [specific task].

## Procedure
1. Step one explanation
2. Run the script:

python3 scripts/run.py --arg value

3. Return the result to the user

The description field is critical—it's how the agent discovers your skill. Write it as if describing when to trigger the behavior.
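
If you create skills often, the minimal structure above can be generated with a few lines of Python. This is a convenience sketch, not an official tool: `scaffold_skill` and the placeholder run.py body are hypothetical.

```python
from pathlib import Path

SKILL_TEMPLATE = """---
name: {name}
description: {description}
allowed-tools: python, read_file
---

# {title}

## Procedure
1. Run the script:

python3 scripts/run.py --arg value

2. Return the result to the user
"""


def scaffold_skill(root, name, description):
    """Create <root>/<name>/ with a SKILL.md and a placeholder script."""
    skill_dir = Path(root) / name
    # exist_ok=False: refuse to overwrite a skill that already exists.
    (skill_dir / "scripts").mkdir(parents=True, exist_ok=False)
    title = name.replace("-", " ").title()
    (skill_dir / "SKILL.md").write_text(
        SKILL_TEMPLATE.format(name=name, description=description, title=title),
        encoding="utf-8",
    )
    (skill_dir / "scripts" / "run.py").write_text(
        'print("TODO: implement")\n', encoding="utf-8"
    )
    return skill_dir
```

For example, `scaffold_skill(Path.home() / ".claude" / "skills", "my-skill", "Use when the user asks to do X.")` would place the new skill in the Claude Code directory mentioned earlier.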

