Skip to content

Claude Code Mastery: The Config That Compounds

27 min read5289 words

There are two ways to use Claude Code, and they produce wildly different results.

The first way is to open it, type a request, read the suggestion, accept or reject it, and move on. This is the way most people use it, and it works. You get a real speedup, you ship a bit faster, and you tell people the AI is "pretty good." Nothing wrong with that.

The second way is to treat Claude Code as a programmable agent that you configure once and benefit from forever. You give it memory of how you work, custom commands for your repeated tasks, specialist sub-agents with their own permissions, and a project setup that gets sharper every time you correct it. The first way gives you a faster typist. The second way gives you a system that compounds.

I have spent the last year living in the second camp, and the gap between the two is not a matter of prompting skill. It is a matter of setup. This post is the deep dive into that setup: the .claude directory, the memory files, the rules, the skills, the sub-agents, the plugins, and the MCP servers that turn Claude Code from a chat box into part of my development environment.

If you want the workflow side of this story, the day-to-day rhythm of shipping with AI, I covered that in how I ship faster with AI-augmented development. This post is the configuration companion to that one. Where that post is about how I work, this one is about what I built underneath to make that work possible.

A quick credit before I start. The framing of "beyond the prompt" and the idea of configuration that compounds came partly from reading Arpan Patel's post on Claude Code mastery. I have taken the topic in my own direction, grounded in my actual setup, and corrected a few feature claims along the way, but the inspiration deserves naming.


The mental model that changes everything

Before any of the configuration matters, one mental shift has to land: stop pair-programming with Claude and start delegating to it.

When you pair-program, you guide line by line. You watch every token, you steer constantly, and you stay in the loop on every decision. That is a fine way to learn, but it caps your speedup at how fast you can read. The model performs best when you treat it like an engineer you are delegating to, not a junior you are babysitting through every keystroke.

The second half of that shift is just as important: give Claude a way to verify its own work. This single idea is responsible for most of the quality I get out of it. If Claude can run the tests, check the types, hit the dev server, or read the build output, it will iterate on its own until the result is actually correct instead of handing you something that looks correct. A model that can see the red X will keep going until it turns green. A model that cannot will confidently tell you it is done.

Everything below is, in one way or another, in service of those two ideas. The memory files tell Claude how I want work delegated. The skills and sub-agents give it the structure to own a slice of work. The MCP servers and hooks give it the ability to verify what it did. Configuration is not bureaucracy. It is the scaffolding that makes delegation safe.


The .claude directory, top to bottom

Claude Code reads configuration from two places, and understanding the split is the foundation for everything else.

There is a global scope at ~/.claude/ that applies to every project on your machine, and a project scope at .claude/ in any repo root that applies only to that repo. Project settings layer on top of global ones, so the local repo always gets the final say. This is the same idea as global versus local Git config, and it pays to think about it the same way.

Here is roughly what my global directory looks like:

~/.claude/
├── CLAUDE.md            # My universal agent rules, applied everywhere
├── settings.json        # Permissions, model defaults, hooks
├── settings.local.json  # Machine-specific overrides (gitignored)
├── agents/              # Reusable sub-agent definitions
│   ├── design-critic.md
│   ├── frontend-builder.md
│   ├── security-reviewer.md
│   └── ...
├── commands/            # Personal slash commands
├── rules/               # Cross-cutting rule files I maintain
│   ├── git-commit.md
│   ├── github.md
│   ├── monorepo.md
│   └── ...
└── skills/              # Custom skills

And here is what a project scope adds on top of it:

<repo>/.claude/
├── CLAUDE.md            # Project-specific instructions, committed
├── settings.json        # Team-shared permissions and hooks, committed
├── settings.local.json  # My personal project overrides, gitignored
├── agents/              # Project-only sub-agents
└── skills/              # Project-only skills

A note worth being precise about, because the internet gets this wrong. The official .claude subdirectories that Claude Code loads automatically are agents/, skills/, and commands/. There is no canonical rules/ directory that the tool reads on its own. My rules/ folder is a personal organizational convention. I keep each cross-cutting concern as its own file and pull them into my instruction set, rather than stuffing everything into one enormous CLAUDE.md. If you copy my structure, copy it knowing that rules/ is my convention, not a built-in feature. I would rather you understand the mechanism than cargo-cult the folder.

The split between scopes matters more than it looks. Global is for who I am as a developer: my commit style, my safety constraints, my workspace map. Project is for what this codebase is: its stack, its conventions, its current sprint. Keeping those separate means I never accidentally teach Claude that "all projects use Tailwind" just because the one I happened to be in does.


CLAUDE.md: the highest-leverage file you will write

If you take one thing from this post, take this: your CLAUDE.md is the single highest-leverage configuration you will ever write. It is the difference between Claude asking five clarifying questions per task and Claude nailing it on the first pass.

CLAUDE.md is memory. It is injected into context at the start of a session, so everything in it is "known" before you type a word. That makes it powerful and also makes it expensive: every line competes for attention, so a bloated file is worse than a focused one. The discipline is to write the things that change Claude's behavior and cut the things that just describe the obvious.

My global CLAUDE.md is organized around behavior, not facts. A few of the sections that earn their place:

  • A core workflow contract. Understand, then plan, then get approval, then code. In that order, every time, unless I explicitly say go ahead. This one rule alone has prevented more wasted output than anything else I have configured.
  • Git commit rules. My shell has cat aliased to bat, so the usual cat <<'EOF' heredoc trick for multi-line commits breaks. My CLAUDE.md tells Claude to use printf instead and to always include a co-author trailer. Without this, every commit attempt would fail in a confusing way and I would have to explain it again.
  • Safety constraints. Never print real secrets, always reach for .env.example, flag any credentials it discovers, never run git add -A because that is how secrets sneak into a commit.
  • A workspace map. A table of where everything in my ~/Developer/ tree lives. When I say "follow the pattern from the dashboard project," Claude knows exactly where to look without me pasting a path.

A project-level CLAUDE.md is a different animal. It describes the codebase, not me. Here is a trimmed version of what one of mine looks like:

# Portfolio Site (mikul.me)
 
## Stack
- Next.js 16 (App Router) with MDX content
- Tailwind CSS 4, TypeScript strict mode
- Deployed on Vercel
 
## Conventions
- Blog posts live in content/blog/YYYY/MM-month/ as .mdx
- Posts are scanned from the filesystem; no registry to update
- SEO metadata is required, frontmatter drives it
- No em dashes anywhere in content (house style)
 
## Architecture decisions
- Content collections over a CMS
- RSS and sitemap auto-generate from the post tree
 
## Current focus
- New AI-engineering blog series

The most important property of this file is that it is alive. I update it the moment I notice Claude repeating a mistake, because a repeated mistake is almost always a missing line of context, not a model failure. Claude is genuinely good at writing these rules for itself, too. When it gets something wrong, I will often just say "update CLAUDE.md so you do not do that again," and it writes a cleaner constraint than I would have bothered to.

CLAUDE.local.md: your private margin notes

Sitting next to CLAUDE.md is CLAUDE.local.md, and the distinction is simple but easy to get backwards.

CLAUDE.md is shared. It gets committed, and it represents the team's contract for how work happens in this repo. CLAUDE.local.md is personal and stays out of version control. It is where I keep the things that are true for me on this project but should not be imposed on everyone: a reminder that my local database runs on a nonstandard port, a note about a flaky test I keep meaning to fix, a personal shorthand for a command I run constantly.

The mental test I use: if a new teammate cloning the repo would benefit from knowing it, it goes in CLAUDE.md. If it is a quirk of my machine or my habits, it goes in CLAUDE.local.md. Getting this split right keeps the shared file clean and keeps my personal noise out of everyone else's context window.


Rules files: one concern per file

A single CLAUDE.md that tries to hold everything becomes a wall of text that is hard to edit and harder for the model to weigh. My fix is to break cross-cutting concerns into individual files and treat each as a focused, single-responsibility module.

A few of the rule files I actually run, to make this concrete rather than abstract:

  • A commit-message rule that encodes the printf recipe, the conventional-commit prefixes I use, the title-length limit, and the co-author trailer. Commits are something I do dozens of times a day, so getting this consistent has outsized value.
  • A GitHub account-routing rule. I work across a personal account and a work account, and pushing from the wrong one is a genuine hazard. The rule tells Claude which account owns which directory tree and to switch the gh CLI identity before any push in a work repo, then switch back. This is the kind of thing I would forget at 5pm on a Friday, which is exactly when it matters.
  • A monorepo rule that activates when a project has a turbo.json or a pnpm-workspace.yaml, reminding Claude to respect package boundaries, install dependencies into the right workspace, and never import from another package's src/ directly.
  • A role-awareness rule that asks Claude to keep a reviewer's eye, a security mindset, and an accessibility check running in the background as it writes, so it flags an obvious XSS hole or a missing ARIA label without me having to invoke a dedicated review pass.

The reason this structure works is editability. When my commit convention changes, I edit one small file instead of hunting through a giant document. When I want to share just my monorepo conventions with someone, I hand them one file. Each rule is independently understandable, independently editable, and independently shareable. That is the whole point.


Settings and permissions: tuning the friction

settings.json is the least glamorous file in the directory and one of the most consequential, because it decides how often Claude has to stop and ask you for permission.

Claude Code gates tool use behind a permission system. Some actions run freely, some prompt you first, and some are blocked outright. The settings.json file is where you draw those lines, with allow and deny lists that match tools and command patterns. The goal is not to approve everything. It is to approve the safe, repetitive things so the prompts you do get are meaningful.

In practice I tune this by watching where I keep clicking "yes." If I approve the same read-only command twenty times in a week, that is a signal to move it onto the allow list. Reading files, running git status, listing a directory, hitting the type-checker: these are safe and constant, so they live on the allow list and never interrupt me. Anything that writes outside the repo, touches credentials, or talks to the network stays gated. The deny list is reserved for the genuinely destructive, the commands I never want to run by accident regardless of context.

The split between settings.json and settings.local.json mirrors the memory files. The shared settings.json is committed and represents the team's agreement about what is safe in this repo. The settings.local.json is gitignored and holds my personal overrides, so I can loosen or tighten things on my own machine without changing the contract for everyone else.

Getting this right is what makes delegation tolerable. Too loose and you have handed an autonomous agent a blank check. Too tight and you are clicking "allow" so often that you might as well be doing the work yourself. The sweet spot is a permission set where every prompt that interrupts you is one that genuinely deserves a human decision.


Skills: reusable expertise that loads on demand

Skills are where Claude Code stops being a general assistant and starts being your assistant. A skill is a packaged, reusable procedure that lives in a directory with a SKILL.md at its root.

The format is markdown with a small YAML frontmatter block:

---
name: pr-ready
description: >
  PR readiness check. Use before opening a pull request to run
  lint, type-check, and tests, and to review git hygiene and docs.
---
 
# PR Readiness Check
 
When invoked, do the following in order:
 
1. Run the project's lint command and report failures.
2. Run the type-checker; do not proceed past type errors.
3. Run the test suite and summarize results.
4. Check that the branch is named sensibly and the diff is scoped.
5. Confirm docs were updated if behavior changed.
 
Report a go / no-go verdict with the specific blocking items.

Two design properties make skills worth understanding deeply.

The first is progressive disclosure. A skill does not consume any context until it is actually invoked. At session start, Claude only knows the skill's name and description, not its full body. That means you can have dozens of skills installed without paying a context tax for the ones you are not using. The description is the part Claude reads to decide whether a skill is relevant, so writing a sharp, trigger-oriented description is the difference between a skill that fires at the right moment and one that sits unused.

The second is invocation control. By default Claude can auto-invoke a skill when it judges the task matches. For most skills that is what you want. But for anything safety-critical, where you never want the model to decide on its own to run it, you set disable-model-invocation: true in the frontmatter. Now the skill only runs when you explicitly call it with its slash name. I use this for anything that touches deploys or external services, where an eager auto-invocation would be the opposite of helpful.

Skills can also bundle helper files. A SKILL.md can sit next to shell scripts, templates, or reference documents in its directory, and the instructions can point Claude at them. This turns a skill from a static prompt into a small toolkit. My most-used skills are the boring, repeated processes: a spec-first intake that turns a vague request into an approved one-page plan before any code is written, a quick-review pass over the current diff, and a pre-PR gate. None of them are clever. All of them save me from re-explaining a process I run every day.


Sub-agents: parallel work in isolated context

Sub-agents are the feature that most changed how I scale work, and they are worth understanding precisely because they behave differently from everything above.

A sub-agent runs in an isolated context window. It does not see my whole conversation. It receives only the brief I hand it and works from there, then reports a result back. This isolation is a feature, not a limitation: it means a sub-agent stays focused on its slice without being distracted or polluted by the rest of the session, and it means I can run several at once without their contexts bleeding into each other.

Each sub-agent is defined by a markdown file with frontmatter for its description, the tools it is allowed to use, the model it should run on, and its system prompt:

---
name: pr-reviewer
description: Read-only PR reviewer. Catches correctness bugs and risky changes before a human reviews.
tools: Read, Grep, Glob, Bash(git diff:*), Bash(git log:*)
model: opus
---
 
You are a senior engineer reviewing a pull request. You have
read-only access. You cannot edit, write, or create files.
 
Focus on correctness, edge cases, and whether the change follows
the project's established patterns. Rank findings by severity.
Do not nitpick style; the formatter handles that.

Look closely at that frontmatter, because three properties there are the whole reason sub-agents are powerful.

Scoped tool permissions. This reviewer has no Edit or Write tool. It physically cannot modify the codebase. When I want a critic, I want a critic that is structurally incapable of "helpfully" fixing things while it reviews. The permission scope enforces the role.

Per-agent model selection. You can pin each sub-agent to a specific model. I run a careful reviewer on a strong reasoning model and run a high-volume mechanical task on a cheaper, faster one. The orchestrator and the specialist do not have to share a model, so you tune cost and capability per job.

A focused brief. Because the sub-agent starts with zero context from our conversation, I have learned to always brief it with three things: the main task and why it matters, the specific slice this agent owns, and the constraints and prior decisions it needs to respect. A sub-agent handed a one-line instruction with no context produces generic, mis-targeted work. The three-section brief is non-negotiable in my setup.

I keep a small roster of specialists rather than one do-everything agent. A design critic that reviews UI for hierarchy, spacing, and accessibility and is read-only by design. A frontend builder that owns production component work. A security reviewer for auth flows and injection risks. A research analyst that goes and reads library docs and the live web so I do not have to. A refactoring specialist for behavior-preserving restructures. Each one has a tight description so the right specialist gets picked for the right job.

The patterns that come out of this are the real payoff:

  • Parallel research. When I am weighing two approaches, I send one agent to investigate each, and I get both analyses back at roughly the same time. Any A-versus-B decision gets cut roughly in half.
  • Writer and reviewer separation. One agent implements, a separate read-only agent reviews. The reviewer never saw the writer's reasoning, so it catches what the writer rationalized away.
  • Fan-out search. When I need to know how something is used across a large codebase, a search-focused agent sweeps many files and returns just the conclusion, so my main context never fills with file dumps.

The rule of thumb I land on: a sub-agent is for work that should happen in isolation, in parallel, or under different permissions. If you just want a guided procedure in your current context, that is a skill, not an agent.


Hooks: automation that runs whether you ask or not

Everything so far has been about giving Claude better context and better tools. Hooks are a different category: deterministic automation that fires on events, with no model judgment involved. A hook is a shell command wired to a trigger, and it runs every time that trigger fires, every time, no exceptions.

The triggers that matter most are the ones around tool use. A hook can run before a tool is used or after it, and it can run on session events too. That small surface unlocks a surprising amount of leverage. Three hooks in particular have earned permanent spots in my setup.

Auto-formatting on every edit

Every time Claude edits a file, a post-edit hook runs the project's formatter, whether that is Prettier, Biome, or something else. This sounds trivial and is not. It removes an entire category of noise from diffs and reviews, and I never have to ask Claude to "fix the formatting" because formatting is no longer something Claude does. It is something that happens to the file regardless. The model focuses on logic; the hook handles style.

Blocking dangerous commands

A pre-command hook inspects commands before they run and blocks the genuinely destructive ones unless I explicitly override. A force-push, a hard reset, a recursive force-delete: all stopped by default. The value here is not that Claude is reckless. It is that any autonomous system occasionally reaches for the wrong tool, and a hook is a guardrail that does not depend on the model getting it right. It has caught a stray force-push for me more than once, and once is enough to justify it forever.

Audit logging

A logging hook records what happened in a session: what was asked, what changed, what commands ran. This is not about distrust. It is about being able to reconstruct what happened three days later when something turns out to be subtly wrong. I have traced a bug back to a specific session this way, and that traceability is worth the trivial cost of writing a log line.

The deeper point about hooks is that they are where consistency comes from. Context files and skills shape what Claude tends to do. Hooks guarantee what actually happens. When I want a behavior to be reliable rather than likely, I do not write it as a rule and hope the model honors it. I wire it as a hook so it runs deterministically. That distinction, between guidance the model interprets and automation the system enforces, is one of the more important ones to internalize. I went deeper on this in my dedicated hooks post, but even the three above will change how a session feels.


Plugins and the marketplace

Skills, sub-agents, hooks, and MCP servers are individually useful. Plugins are how you ship a bundle of them as one installable unit.

A plugin can package any combination of those pieces, plus language-server (LSP) servers for real code intelligence, and you install it from a marketplace the same way you would add a package. Anthropic maintains official LSP plugins for languages like TypeScript, Python, and Rust, which give Claude proper go-to-definition and type information instead of guessing from text. That is a meaningful upgrade for navigating an unfamiliar codebase.

One correction worth making, because the claim circulates: there is a built-in /code-review skill that reviews your current diff for correctness issues. It is a solid first-pass reviewer, but it is a single bundled skill, not a swarm of parallel auditors. For heavyweight, genuinely parallel multi-agent review, the separate /ultrareview flow exists, and it runs deeper analysis in the cloud. Use the lightweight one for the everyday diff and reach for the heavy one when the change is big enough to justify it. Knowing which is which saves you from over- or under-reviewing.

The way I think about plugins: a skill is a tool I built for myself, and a plugin is a toolset someone packaged for everyone. When a plugin's defaults match how I work, installing it is faster than rebuilding the parts. When they do not, I am better off composing my own skills and agents, which is most of what this post has been about.


MCP servers: from file-reader to system-aware agent

Out of the box, Claude Code reads and writes files and runs shell commands. That is a lot, but it is bounded by your filesystem. Model Context Protocol servers are how you push past that boundary and let Claude reach into the systems you actually use.

An MCP server exposes a set of tools over a standard protocol, and once it is connected, those tools become available to Claude the same way its built-in tools are. The catalog of what people connect is broad: databases, design tools, error trackers, issue trackers, documentation sources, note systems. My own stack leans on a live web-research server for current information, a library-docs server so Claude pulls real, version-accurate API docs instead of relying on training data, a tasks-and-notes integration, and separate GitHub connections for my personal and work accounts.

Configuration happens at one of three scopes, and choosing the right one matters:

  • Local scope keeps a server private to you on a single project.
  • Project scope shares it with the team via a committed .mcp.json, so everyone who clones the repo gets the same integrations.
  • User scope makes it available across all your projects.

The practical advice is to scope narrowly. Every connected server adds tools, and while those tools load on demand rather than running on every turn, a sprawling set of integrations still adds surface area and noise. I only expose the servers a given project genuinely needs. A documentation project does not need my database tools, and my portfolio does not need a Jira connection.

The mental upgrade MCP delivers is the one that ties this whole post together. Without it, Claude is a very capable engineer who can only touch the files in front of it. With it, Claude can read the real docs before writing the integration, check the live state of the system it is changing, and verify its work against the actual tools instead of against its assumptions. That is the verification loop from the very top of this post, extended out past the edge of the repo.


Commands worth knowing

Most people learn three or four slash commands and stop. A handful of the less obvious ones are where real session-level control lives. These are the built-ins I reach for, and I am deliberately listing only ones I have verified are real, because the internet is full of confidently described commands that do not actually exist.

  • /compact summarizes the conversation when the context window starts filling up, keeping the thread alive without losing it. I run it at logical checkpoints, after finishing a feature and before starting the next, so the summary breaks on a clean boundary instead of mid-thought.
  • /context shows what is currently occupying the context window. When responses start feeling vague, this tells me whether I am running low on room and should compact or start fresh.
  • /agents manages sub-agents, and /mcp lists the connected MCP servers and their tools. These are how I check what capabilities are actually available in the current session rather than assuming.
  • /goal switches into a planning-forward mode before a large change, which pairs naturally with the plan-then-approve workflow I lean on.
  • /batch decomposes a change into independent units and runs them in isolated worktrees, the right tool for a large mechanical refactor that splits cleanly across files.
  • /loop runs a prompt or command on a recurring interval, useful for polling a long-running process or repeating a check.
  • /resume brings back a previous session, so a thread I closed yesterday is not gone for good.
  • /model switches the active model mid-session, so I can drop to something faster for grunt work and come back up for the hard thinking.

The commands are not the skill, though. The skill is knowing when context is the bottleneck versus when capability is. A vague answer late in a long session is usually a context problem that /compact fixes, not a model problem that a sharper prompt fixes. Learning to read which is which is worth more than memorizing the full command list.


How it all comes together on a normal day

Configuration is only worth it if it changes the day. Here is how mine actually plays out.

I start the morning by asking Claude where I left things. It reads the recent commits, checks for uncommitted work, and gives me a status report in seconds. That eliminates the "where was I" tax that used to eat the first ten minutes of every session.

When a feature comes in, the spec-first skill kicks in before any code: a one-page plan with goal, scope, non-goals, and success criteria that I approve or edit. Approval is the gate. Nothing substantial gets built until the plan is right, which is far cheaper than building the wrong thing fast.

During the build, I lean on parallelism. If I am evaluating two libraries, two research agents go read about them at once. If I am implementing something testable, one context writes the code while a read-only reviewer waits to tear it apart. When the work spans many files in a mechanical way, a refactoring specialist handles the repetitive transformation while I keep the architectural decisions.

And throughout, the verification loop runs. Hooks format every file Claude touches, so diffs stay clean and I never ask it to "fix the formatting." A pre-command guard blocks the genuinely dangerous Git operations unless I override them, which has saved me from a stray force-push more than once. The build and the tests are right there for Claude to run, so when it says something is done, it has usually watched it pass.

None of this is the model being magic. It is the model operating inside a system I built one correction at a time.


Why the second way wins over time

The reason the configured approach pulls away from the casual one is compounding. A good prompt helps you once. A good CLAUDE.md line helps you on every task in that project, forever, until you change it. A sharp sub-agent definition pays out every time that kind of work shows up. A rule file you wrote in January is still saving you in May without a single additional keystroke.

That is the actual skill, and it is not prompting. Prompting is a perishable input you regenerate every time. Configuration is an asset that appreciates. The developers who get the most out of these tools are not the ones with the cleverest prompts. They are the ones who treat their setup as a living system, who turn every repeated mistake into a permanent rule, and who keep asking what they can hand off safely now that they could not hand off last month.

My setup today looks nothing like it did a year ago, and a year from now it will have moved again. The constant is the practice: notice friction, encode the fix, never solve the same problem by hand twice. Do that consistently and the tool stops being something you use and becomes something you have built.


If you want the day-to-day workflow that sits on top of this configuration, read how I ship faster with AI-augmented development. For extending Claude with your own tools, see my guide to building custom MCP servers. For the automation layer that powers the verification loop, read the hooks deep dive. And the topic framing here was inspired by Arpan Patel's Claude Code mastery post, taken in my own direction.