Agentic Workflows for Solo Developers: Build Your Own AI Team
I build side projects alone. No team, no PM, no QA engineer reviewing my pull requests. For years that meant making hard trade-offs: ship fast but skip tests, write features but neglect documentation, push code but skip the review. Something always dropped.
That changed when I started treating AI agents not as a chat window I occasionally ask questions in, but as a team I orchestrate. Today I ship features with tests, docs, and reviews — all as a solo developer. This post explains exactly how.
If you are new to AI Engineering as a discipline, start with the practitioner's guide in Further Reading for the broader context. This post is the practical, hands-on companion focused on what solo developers can do right now.
The Solo Developer's Dilemma
Every solo developer knows this feeling. You have a backlog of features, a handful of bugs, docs that are three versions behind, and zero test coverage on the module you shipped last Tuesday. You are simultaneously the architect, developer, tester, tech writer, and DevOps engineer.
The traditional answer was "just prioritize." But prioritizing means something does not get done. Tests get skipped. Documentation rots. Code reviews do not happen because there is nobody to review it.
The newer answer is agentic workflows — not as a buzzword, but as a practical way to fill the roles you cannot hire for.
What "Agentic" Actually Means (Without the Enterprise Jargon)
Let me cut through the noise. An agent is just a program that can take actions in a loop until a task is done. That is it. No blockchain, no complex infrastructure, no vendor platform required.
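To make "actions in a loop until a task is done" concrete, here is a toy sketch of that loop. The `decideNextAction` function stands in for a model call, and every name here is illustrative rather than any framework's real API:

```typescript
type Action = { done: boolean; output: string };

// Stand-in for a model call: decides the next action
// from the goal and what has happened so far.
function decideNextAction(goal: string, history: string[]): Action {
  if (history.length >= 3) return { done: true, output: `finished: ${goal}` };
  return { done: false, output: `step ${history.length + 1}` };
}

function runAgent(goal: string): string {
  const history: string[] = [];
  while (true) {
    const action = decideNextAction(goal, history);
    if (action.done) return action.output;
    history.push(action.output); // take an action, record the result, loop
  }
}

console.log(runAgent("summarize repo"));
```

The real versions replace `decideNextAction` with an LLM call and the recorded outputs with tool results, but the control flow is exactly this simple.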
The difference between asking ChatGPT a question and running an agentic workflow:
| Chat | Agentic Workflow |
|---|---|
| You ask, it answers | You define a goal, it works toward it |
| Single turn | Multi-step with decisions |
| No side effects | Reads files, writes code, runs tests |
| You drive the process | The agent drives, you review |
When I say "build your own AI team," I mean configuring agents that handle specific roles — research, code review, testing, documentation, deployment — and orchestrating them into repeatable workflows. No custom model training. No infrastructure. Just smart configuration of tools that already exist.
For a deeper comparison of agents versus simple chatbots, see AI Agent vs Chatbot: What is the Difference.
The AI Team: Five Agents Every Solo Developer Needs
1. The Research Agent
Before writing any code, I need to understand the landscape. What library should I use? What changed in the latest version? What are the edge cases others have hit?
The research agent handles this. It scans documentation, reads through GitHub issues, summarizes findings, and presents options.
How I implement it: I use Claude Code with MCP servers connected to Perplexity for web search, GitHub for repository scanning, and Context7 for up-to-date library docs. A typical prompt looks like:
```
Research the best approach for implementing OAuth 2.0 PKCE flow
in a Next.js 15 app. Compare next-auth v5 vs lucia-auth vs
rolling our own with arctic. Focus on: setup complexity,
maintenance burden, and edge cases with token refresh.
Summarize in a comparison table.
```

The agent searches documentation, reads GitHub issues, checks recent release notes, and returns a structured comparison. What used to take me 45 minutes of tab-hopping takes about 3 minutes.
2. The Code Review Agent
This is the agent I wish I had years ago. When you work solo, nobody catches the bug you introduced at 11 PM. Nobody points out that you duplicated logic from another module. Nobody asks "did you consider the error case?"
I configure my code review agent through CLAUDE.md rules and Claude Code skills:
```markdown
## Code Review Rules
- Check for error handling on all async operations
- Flag any function longer than 40 lines
- Identify duplicated logic across modules
- Verify TypeScript types are not using `any`
- Check that new database queries have appropriate indexes
- Ensure API responses match the documented schema
```

After implementing a feature, I run the review agent against my changes. It catches real issues — not just style nits, but logic errors, missing edge cases, and security concerns.
3. The Testing Agent
Test generation is where agents deliver the most obvious ROI. Writing tests is tedious. Agents are very good at tedious.
I point the testing agent at a module and ask it to generate comprehensive test coverage:
```
Generate tests for the PaymentService class. Cover:
- Happy path for each public method
- Error cases (network failure, invalid input, timeout)
- Edge cases (zero amount, negative amount, currency mismatch)
- Integration test for the full checkout flow
Use vitest. Match the test patterns in tests/services/.
```

The agent reads the implementation, understands the interfaces, and generates tests that actually test meaningful behavior — not just "function exists" assertions. I review and adjust, but the heavy lifting is done.
4. The Documentation Agent
I used to let documentation slide for weeks. Now documentation updates are part of every feature workflow.
The documentation agent reads the code changes, understands the public API surface, and generates or updates docs:
```
Update the API documentation for the /api/projects endpoint.
The handler now supports filtering by status and pagination
with cursor-based navigation. Follow the documentation
patterns in docs/api/. Include request/response examples.
```

It also generates JSDoc comments, updates README sections, and maintains changelogs. The key is giving it clear patterns to follow — point it at your existing docs and it matches the style.
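For the JSDoc side, the output looks roughly like this. The handler below is a hypothetical, toy in-memory implementation, shown only to illustrate the comment style the agent produces:

```typescript
/**
 * Lists projects, optionally filtered by status.
 *
 * @param status - Only return projects in this state; omit for all.
 * @param cursor - Opaque pagination cursor from a previous response.
 * @returns The matching project names plus a cursor for the next page.
 */
function listProjects(
  status?: "active" | "archived",
  cursor?: string
): { projects: string[]; nextCursor: string | null } {
  // Toy in-memory data so the example runs end to end;
  // a real handler would query the repository layer.
  const all = [
    { name: "alpha", status: "active" },
    { name: "beta", status: "archived" },
  ];
  const projects = all
    .filter((p) => !status || p.status === status)
    .map((p) => p.name);
  return { projects, nextCursor: null };
}
```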
5. The Deployment Agent
Pre-deploy checks are where solo developers get burned. You forget to run the build. You miss a type error. You push a migration that breaks staging.
The deployment agent runs a checklist before every deploy:
```
Pre-deploy check for the payments feature branch:
1. Run full type check (tsc --noEmit)
2. Run all tests, report any failures
3. Check for console.log statements in src/
4. Verify all environment variables are documented
5. Check database migrations are reversible
6. Verify API endpoints match the OpenAPI spec
7. Report bundle size changes vs main branch
```

This is essentially a CI pipeline, but one I can run conversationally and adjust on the fly.
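The same checklist idea also works as a plain script when you want it deterministic. Here is a minimal sketch of a checklist runner; the command list is illustrative, and `runChecks` is my own helper, not part of any tool:

```typescript
import { execSync } from "node:child_process";

type Check = { name: string; command: string };

// Runs each shell command in order and collects the names of
// failures, so one broken check does not hide the others.
function runChecks(checks: Check[]): string[] {
  const failed: string[] = [];
  for (const { name, command } of checks) {
    try {
      execSync(command, { stdio: "pipe" }); // throws on nonzero exit
    } catch {
      failed.push(name);
    }
  }
  return failed;
}

// A subset of the checklist above, expressed as commands.
const preDeploy: Check[] = [
  { name: "type check", command: "npx tsc --noEmit" },
  { name: "tests", command: "npx vitest run" },
];
// In CI you would call runChecks(preDeploy) and fail the build
// if the returned array is non-empty.
```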
Practical Implementation with Claude Code
The theory is nice. Here is how I actually wire this up.
Sub-Agents for Parallel Work
Claude Code supports sub-agents — background tasks that run independently while you continue working. This is the core mechanism for running your "team" in parallel.
When I start a feature, I kick off multiple agents:
```
Run these in parallel:
1. Research: Check if react-query v6 has breaking changes
   from v5 that affect our query invalidation patterns
2. Test: Generate test stubs for the new CacheService class
3. Docs: Draft the API documentation for the new /cache endpoints
```

While those run, I write the implementation. By the time I am done, I have research findings to validate my approach, test stubs to fill in, and draft documentation to review.
CLAUDE.md as Your Team Playbook
Your CLAUDE.md file is not just configuration — it is your team's operating manual. I structure mine to encode the rules every "team member" should follow. For a comprehensive guide on structuring this file, see the Claude Code Best Practices reference.
```markdown
# Project Rules

## Architecture Decisions
- All API routes use the controller-service-repository pattern
- Database access only through repository classes
- No direct Prisma calls outside /repositories

## Code Standards
- Functions max 30 lines
- All public functions need JSDoc comments
- Error handling: use Result<T, E> pattern, no throwing

## Testing Standards
- Minimum 80% coverage on service layer
- Integration tests for all API endpoints
- Use factory functions for test data, not raw objects

## Review Checklist
- No TODO comments without linked issues
- All new endpoints added to OpenAPI spec
- Database queries analyzed for N+1 patterns
```

Every agent reads this file. Every agent follows these rules. It is like having a team handbook that everyone actually reads.
Hooks as Automated Workflows
Hooks are the automation layer that makes agents feel like a real team. They trigger automatically based on events:
```json
{
  "hooks": {
    "pre-commit": [
      {
        "command": "npx tsc --noEmit",
        "description": "Type check before commit"
      },
      {
        "command": "npx vitest run --changed",
        "description": "Run tests on changed files"
      }
    ],
    "post-save": [
      {
        "command": "npx eslint --fix ${file}",
        "description": "Auto-fix lint issues on save"
      }
    ]
  }
}
```

Hooks ensure that regardless of which "agent" is writing code, the output meets your standards. Think of hooks as the team lead who checks everyone's work before it ships.
MCP Servers as Team Tools
MCP servers give your agents access to external tools — just like giving a team member access to your project management tool, your database, or your monitoring dashboard.
My standard setup includes:
- GitHub MCP: Read issues, create PRs, review code, manage releases
- Perplexity MCP: Web search for research tasks
- Context7 MCP: Up-to-date library documentation
- Notion MCP: Read and update project documentation
- Database MCP: Query production data safely (read-only)
Each MCP server extends what your agents can do without you manually copy-pasting context.
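For reference, project-scoped MCP servers live in a `.mcp.json` file at the repo root. A minimal sketch with one server (the package name and token variable are examples; check each server's own docs for its exact invocation):

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "<your-token>"
      }
    }
  }
}
```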
Real Workflow Examples
Workflow 1: Ship a Feature End-to-End
Here is my actual process for shipping a feature as a solo developer with agents:
Step 1 — Plan (5 min). I describe the feature to Claude Code and ask for a task breakdown. The agent reads the codebase, understands existing patterns, and proposes an implementation plan with files to create or modify.
Step 2 — Research (runs in background). I kick off a research sub-agent to check for any library updates, known issues, or patterns relevant to the feature.
Step 3 — Implement (30-60 min). I write the core implementation with Claude Code assisting. The agent follows the patterns in CLAUDE.md, creates files in the right locations, and handles boilerplate.
Step 4 — Test (runs in parallel). While I review the implementation, the testing agent generates test cases. I review and adjust the tests, then run the full suite.
Step 5 — Document (5 min). The documentation agent updates relevant docs, adds JSDoc comments, and updates the changelog.
Step 6 — Review (10 min). I run the code review agent against all changes. It flags issues I fix before committing.
Step 7 — Deploy check (3 min). The deployment agent runs pre-deploy validation. Green across the board, I push.
Total time: about 90 minutes for a feature that includes implementation, tests, docs, and review. Without agents, the implementation alone, with no tests or docs, used to take about the same time. With agents, I get the complete package.
Workflow 2: Investigate and Fix a Bug
Step 1. I paste the error message and ask the agent to investigate. It searches the codebase, identifies potential causes, and narrows down the issue.
Step 2. The agent proposes a fix with an explanation of the root cause. I review the reasoning — this is important. Do not blindly accept bug fixes.
Step 3. I ask the testing agent to generate a regression test that reproduces the bug, then verify the fix resolves it.
Step 4. Code review agent checks the fix does not introduce new issues.
This workflow turns a 2-hour debugging session into a 20-minute investigation-and-fix cycle for most bugs.
Workflow 3: Research and Prototype
When evaluating a new technology or approach:
Step 1. Research agent surveys the landscape — documentation, GitHub stars and activity, known limitations, community sentiment.
Step 2. Based on findings, I ask Claude Code to build a minimal prototype. Not a toy demo, but something that exercises the actual integration points I care about.
Step 3. I evaluate the prototype against my requirements and make a decision.
This replaces the "spend a weekend experimenting" pattern with a focused 2-hour evaluation. For a detailed look at how I structure these Claude Code workflows, see Claude Code Workflows: Ship Faster with AI Development.
What Agents Cannot Do (Yet)
I want to be honest about the boundaries. Agents are a multiplier, not a replacement for thinking.
Architecture decisions. An agent can propose an architecture. It cannot judge whether that architecture fits your specific constraints, team dynamics, growth trajectory, and business model. I still make every architecture decision myself and use agents to validate and stress-test them.
Novel design. When you need something genuinely new — a creative UI pattern, an unconventional data model, an innovative feature — agents generate variations of what exists. They do not invent. The creative spark is still yours.
Stakeholder communication. Agents cannot sit in a meeting, read the room, negotiate scope, or manage expectations. The human side of building software is firmly in your domain.
Taste. Knowing when code is "good enough" versus when it needs more polish. Knowing which features to build and which to skip. Knowing when to take on tech debt and when to pay it down. These judgment calls require experience and context that agents do not have.
Debugging truly novel issues. Agents are excellent at pattern-matching known issues. When the bug is genuinely novel — a race condition in a specific deployment configuration, a memory leak triggered by an unusual data pattern — you still need to think through it yourself. The agent assists, but you lead.
Cost Management: When to Use Agents vs Doing It Yourself
Agents consume tokens, and tokens cost money. Here is my framework for deciding when to deploy an agent versus doing the work myself:
Use agents for:
- Repetitive tasks (test generation, documentation updates, boilerplate)
- Research that requires scanning many sources
- Code review (they catch things you miss when reviewing your own code)
- Any task where the agent's output needs minor editing, not a full rewrite
Do it yourself when:
- The task requires deep creative thinking
- You need to learn the concept (letting the agent do it means you do not learn)
- The context is so specific that you would spend more time explaining than doing
- The output quality needs to be perfect on the first try (agents get you 80-90% there)
Cost benchmarks from my usage: A typical feature workflow (research + implement + test + document + review) costs roughly $2-5 in API tokens. Compare that to the value of shipping a fully tested, documented feature in 90 minutes instead of half a day. The ROI is not even close.
My Actual Setup
Here is what I use daily on my side projects:
Primary tool: Claude Code with Opus as the main model. I use it for implementation, code review, and orchestration of other tasks.
CLAUDE.md files: Every project has a detailed CLAUDE.md with architecture decisions, code standards, testing patterns, and review checklists. This is the single highest-leverage thing I have done. It turns every Claude Code session into one that understands my project's conventions.
MCP servers: GitHub (for PR workflows), Perplexity (for research), Context7 (for library docs), Notion (for project management).
Hooks: Pre-commit type checking and linting. Post-save auto-formatting. These run regardless of whether I am typing code or an agent is generating it.
Skills: Custom Claude Code skills for common workflows — /review for deep code review, /test-gen for generating tests, /deploy-check for pre-deployment validation.
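Slash commands like these are just prompt files. One way to define a `/review` command is a markdown file under `.claude/commands/` (path and wording are illustrative; skills use a similar file-based format):

```markdown
<!-- .claude/commands/review.md -->
Review the changes in $ARGUMENTS against the Review Checklist
in CLAUDE.md. For each issue, cite the file and line, explain
the risk, and propose a fix. Do not comment on style that the
linter already enforces.
```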
Workflow discipline: I never let an agent ship code I have not reviewed. The agent proposes, I approve. This keeps me in the loop on every change while still benefiting from the speed multiplier.
For a comprehensive comparison of agent frameworks if you want to build more custom solutions, check out AI Agent Framework Comparison 2026.
Getting Started
If you are a solo developer and this sounds like a lot, start small:
- Install Claude Code and set up a CLAUDE.md for your main project. Just document your code conventions and architecture decisions.
- Add one MCP server — GitHub is the most immediately useful.
- Try one workflow — next time you ship a feature, ask Claude Code to generate tests after you write the implementation.
- Expand gradually. Add hooks for automated checks. Add more MCP servers. Build custom skills for your repeated workflows.
You do not need to build all five agents on day one. Start with the one that addresses your biggest pain point. For most solo developers, that is either testing (because nobody writes enough tests alone) or code review (because reviewing your own code is nearly impossible).
The goal is not to replace yourself. It is to stop making trade-offs between shipping fast and shipping well. With the right agentic setup, a solo developer can do both.
Further Reading
- AI Engineering in 2026: The Complete Practitioner's Guide — the broader discipline behind agentic workflows
- Claude Code Best Practices: The Ultimate Reference Guide — 50 tips for getting the most out of Claude Code
- Mastering Claude Code Hooks — deep dive into automation with hooks
- Claude Code Workflows: Ship Faster — structured workflows for AI-assisted development
- AI Agent Framework Comparison 2026 — choosing the right tools for custom agents