In my last post I mentioned that I used Kiro to ship an entire blog migration in a single day — infrastructure, content, CI/CD, the works. I promised a deeper look at the AI tooling. Here it is.
How did I get into it? Well, on February 13th a very dear friend, Alex Wood, messaged me and showed me a package he had been building with his Kiro skills. He blogs about how he does production coding, and that post only served to whet my appetite. Once he handed me the package, I was off to the races. And I haven’t stopped. My delivery cycles have only gotten tighter and the value delivered has gone off the charts.
So, here’s the short version: I’ve taken the skills Alex handed me and built a library of reusable AI workflows, called skills, that handle the repetitive parts of my development process: spec writing, code implementation, code review, session management, deployment. Each skill is a markdown file that guides Kiro through a specific workflow, step by step, with human review gates built in. They live in the repo alongside the code they operate on.
This isn’t about asking an AI to “write me some code.” It’s about encoding the way I actually work into something repeatable.
What Are Skills?
A Kiro skill is a markdown file that lives in your repo at .kiro/skills/{name}/SKILL.md. Each one describes a workflow — when to run it, what steps to follow, what inputs it needs, what outputs it produces, and what rules to follow. When you invoke a skill, Kiro reads the instructions and executes the workflow, using its tools (file system, terminal, web search, AWS CLI) to do the actual work.
Best of all, Kiro’s implementation of skills is backed by the Agent Skills spec. These aren’t just some random prompts that shipped because they happened to work in Kiro. They’re a real standard, used by some of the biggest names in the industry. Have a look for yourself.
Think of a skill as a runbook, except the “entity” following the runbook is an AI that can actually execute the steps.
Here’s the structure:
.kiro/
  skills/
    create-spec/
      SKILL.md        ← workflow instructions
    capture-skill/
      SKILL.md
    session-handoff/
      SKILL.md
    implement-and-review-loop/
      SKILL.md
    ...

Skills are portable. I’ve copied skills between projects, adapting the project-specific parts while keeping the workflow structure intact. The blog you’re reading right now was built using skills originally developed for a completely different project.
Here’s the Most Valuable Player (MVP) of the skills collection. Capture Skill lets you do anything from capturing a collection of prompts you’ve entered, to reading a blog post and extracting the delta in how its author thinks about something, and then adding the result to your skills collection. It’s amazing.
Skill 1: Capture Skill
This is the meta skill — the one that creates other skills. When I find myself repeating a workflow or explaining the same process to Kiro more than once, I run capture-skill to turn it into a reusable prompt.
# Capture Skill
Create a new Kiro CLI skill (prompt) from a conversation,
a pasted prompt, or a description.
## Input
Optional: a name for the new skill (e.g., "refactor-code").
Will ask if not provided.
## Process
### Step 1: Gather Source Material
Ask the user to provide one of:
1. A prompt they've written — paste the text directly
2. A description of what the skill should do — help draft it
3. Reference to earlier in this conversation — extract
the relevant workflow
### Step 2: Analyze and Structure
From the provided material, identify:
- Core purpose — what does this skill accomplish?
- Required inputs — what arguments does it need?
- Step-by-step process — break into clear phases
- Expected outputs — what files/artifacts are produced?
- Error cases — what could go wrong?
### Step 3: Draft the Skill
Create a markdown prompt following the structure of existing
skills in `.kiro/skills/`:
# {Skill Name}
{One-line description}
## Input
{Describe expected input or arguments}
## Process
### Phase 1: {Name}
{Steps...}
### Phase 2: {Name}
{Steps...}
## Output
{What the user gets when complete}
## Rules
- {Constraints and conventions}
### Step 4: Review with User
Present the draft and ask:
- Does this capture what you wanted?
- Any steps to add or remove?
- Confirm the skill name.
### Step 5: Save
1. Validate name: alphanumeric, hyphens, underscores only.
Max 50 characters.
2. Save to `.kiro/skills/{name}/SKILL.md`
3. Confirm creation and show how to use it:
`/skills` → select `{name}`
## Skill Naming Rules
- ✅ Valid: `review-pr`, `debug-test`, `deploy-stacks`
- ❌ Invalid: `my skill` (spaces), `review.code` (dots)
## Rules
- Keep skills focused — one skill per workflow.
- Include verification steps where appropriate.
- Make the skill self-contained — readable cold.
- Ask the user rather than guessing on unclear points.

The key insight is that skills are self-improving. Every time I use a skill and find a gap — a missing step, a wrong assumption, an edge case — I either fix it inline or run capture-skill to create a new one. The library grows organically from real work, not from trying to anticipate every scenario upfront. Best of all, with these fixes the skills become more in line with how you work. Just ask Kiro to “create or improve skills” and it analyzes the delta and self-improves. Magic.
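As an aside, the naming constraints from Step 5 (alphanumeric, hyphens, and underscores only, at most 50 characters) are easy to make concrete. Here is a minimal Python sketch of such a validator; it's an illustration of the rule, not Kiro's actual implementation:

```python
import re

# Mirrors the Step 5 naming rules: letters, digits, hyphens,
# and underscores only, 1-50 characters total.
NAME_RE = re.compile(r"^[A-Za-z0-9_-]{1,50}$")

def is_valid_skill_name(name: str) -> bool:
    """Return True if `name` satisfies the skill naming rules."""
    return bool(NAME_RE.fullmatch(name))
```

Applied to the examples above, `review-pr` passes while `my skill` (spaces) and `review.code` (dots) are rejected.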
Skill 2: Improve Skill
The companion to capture-skill. After using a skill and noticing something off — a missing step, a rule that got violated, output that didn’t match expectations — I run improve-skill to make targeted fixes while the context is still fresh.
# Improve Skill
Review the current chat session where a skill was used and
improve it based on feedback.
## Workflow
1. Review the conversation first to identify skill gaps:
- Where the skill's output diverged from expectations
- Missing steps that had to be done manually
- Rules that were violated or missing
- Output format issues
2. If gaps are obvious from context, propose the specific
skills and fixes directly. If not clear, ask:
"Which skill needs improvement?" and "What went wrong?"
3. Read the current skill file(s) from `.kiro/skills/`.
4. Multiple skills can be improved in one pass — don't
force one-at-a-time.
5. For each skill, describe the change and apply it.
6. After applying changes, summarize all improvements made.
## Rules
- Don't rewrite the entire skill — make targeted
improvements.
- Preserve what's working well.
- Add examples from the current session where they
clarify the improvement.
- When a rule was violated, strengthen the rule text
(e.g., add ⚠️ MANDATORY markers) rather than just
restating it.
- Prefer adding guardrails (hard gates, warnings) over
relying on behavioral compliance.

This is how skills evolve. capture-skill creates them, improve-skill refines them. The combination means the library gets better every session without requiring a dedicated “skill maintenance” effort.
Skill 3: Session Handoff
This one solved a problem that was driving me crazy. I’d be deep in a feature, end the session, come back the next day, and spend valuable time re-explaining context to Kiro. What branch am I on? What task was I working on? What decisions did I make and why? What’s deployed? What’s broken?
The session-handoff skill captures all of that into a structured document:
# Session Handoff
Generate a session handoff document capturing the current working
state for the next session.
## Workflow
1. Check `git status`, `git branch`, and `git log --oneline -5`
for current state.
2. Review the conversation history to identify what was
accomplished this session.
3. Write the handoff to `.kiro/context/session-handoff.md`
(rolling file — overwrite each time).
## Required Sections
- BEFORE RESUMING: Blockers that must be resolved before any
work (e.g., expired credentials, pending merges, VPN).
This section goes first so the next session doesn't start
broken.
- IMMEDIATE NEXT STEPS: Numbered list of what to do first
next session (deploy, merge, test, etc.).
- CURRENT STATE: Branch name, clean/dirty, last commit,
phase progress (X/Y tasks), test counts, live URLs.
- WHAT WE DID THIS SESSION: Each task completed with details,
decisions made, bugs found and fixed.
- WHAT'S REMAINING: Table of next tasks with status and notes.
- AWS RESOURCES: Cloud resources with IDs, regions, and values
(Lambda, S3, SSM params, etc.).
- KEY FILES: Table of important file paths grouped by area.
- KEY DECISIONS: Numbered list of architectural decisions to
carry forward (with rationale).
- STAKEHOLDER PREFERENCES: Workflow preferences, naming
conventions, review process.
## Rules
- Be specific — include resource IDs, exact commands,
and file paths.
- Document decisions and their rationale, not just what
was done.
- Note any blockers or environment issues encountered.
- After saving, immediately run the `auto-memory` skill
to capture learnings into `.kiro/steering/memory.md`.
- After auto-memory, remind the user: "Any skills need
refining from this session? Run `improve-skill` if so."

When I end a session, I say “handoff” and Kiro checks git status, reviews what we accomplished, and writes a handoff document to .kiro/context/session-handoff.md. It includes specific resource IDs, exact commands, file paths, and the reasoning behind decisions — not just what was done, but why.
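To make the shape of that handoff file concrete, here is a minimal Python sketch of assembling a few of the required sections into the rolling document. The function name and fields are illustrative, not Kiro's actual schema:

```python
def render_handoff(branch: str, last_commit: str, blockers: list[str],
                   next_steps: list[str], done: list[str]) -> str:
    """Assemble a session-handoff document with blockers first,
    so the next session doesn't start broken."""
    lines = ["# Session Handoff", "", "## BEFORE RESUMING"]
    lines += [f"- {b}" for b in blockers] or ["- None"]
    lines += ["", "## IMMEDIATE NEXT STEPS"]
    lines += [f"{i}. {s}" for i, s in enumerate(next_steps, 1)]
    lines += ["", "## CURRENT STATE",
              f"- Branch: {branch}",
              f"- Last commit: {last_commit}",
              "", "## WHAT WE DID THIS SESSION"]
    lines += [f"- {d}" for d in done]
    return "\n".join(lines) + "\n"
```

The ordering matters: blockers render first, matching the rule that the BEFORE RESUMING section leads the document.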
Skill 4: Session Resume
The companion to handoff. When I start a new session, I say “resume” and Kiro reads the handoff document, loads the project context from the steering files, checks git status to make sure reality matches the document, and presents a summary of where we left off and what’s next.
# Session Resume
Resume a previous working session by loading the latest
handoff context.
## Workflow
1. Read all steering files from `.kiro/steering/` to load
project context (structure, tech stack, branching rules,
product overview).
2. Read all agent definitions from `.kiro/agents/` to
understand available code review and automation agents.
3. Read all skill definitions from `.kiro/skills/` to
understand available workflows.
4. List files in `.kiro/context/` sorted by modification
date (newest first).
5. Read the most recent `session-handoff.md`.
6. Present a summary:
- Where we left off: Current branch, phase, and
immediate next steps
- What's live: Key resources and endpoints
- What's next: Top 3-5 action items from the handoff
- Blockers: Anything that needs attention before
resuming (expired creds, unpushed commits, etc.)
7. Run `git status` and `git branch` to confirm current
state matches the handoff doc. Flag any discrepancies.
8. Ask: "Ready to pick up from here? Or do you want to
pivot to something else?"
## Rules
- Don't dump the entire handoff doc — summarize the
actionable parts.
- If git state doesn't match the handoff, call it out
clearly.
- If there are unpushed commits or uncommitted changes,
mention them first.
- Load steering docs for project context but don't
recite them.
- Carry forward stakeholder preferences from the
handoff doc.

The handoff/resume pair means I never lose context between sessions. I can pick up a project after a week away and be productive in under a minute. The AI reads its own notes from last time and knows exactly where things stand.
Skill 5: Create Spec
This is one of my favorites for software engineering. We know that spec-driven development drives better results out of LLMs. This skill builds on that assumption and super-charges it. The create-spec skill orchestrates a full specification pipeline (requirements gathering, high-level design, low-level design, and task planning), producing a single unified document that becomes the input for implementation. Want to go turn by turn? Have Kiro ask you questions. Want to bring in MCP servers or tools? Easy: just tell it to use them.
The workflow has human review gates after every phase. Kiro doesn’t just generate a spec and hand it to you. It generates the requirements, stops, and asks for feedback. Then it generates the high-level design based on the approved requirements, stops again, and asks for feedback. Each phase builds on the previous one, and nothing advances without approval.
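That gate pattern (generate, stop for review, revise until approved) is the same in every phase, so it is worth sketching once. A minimal Python illustration, with stand-in callables rather than anything Kiro-specific:

```python
def run_gated_phase(generate, present, revise):
    """Produce a draft, then loop on human feedback until approved.
    `present` returns None when the reviewer approves, otherwise
    it returns feedback text for the next revision."""
    draft = generate()
    while (feedback := present(draft)) is not None:
        draft = revise(draft, feedback)
    return draft  # only an approved draft ever escapes the loop
```

Nothing advances past `present` without approval, which is exactly the "STOP and ask for feedback" behavior described above.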
# Create Spec
Orchestrate the full specification pipeline — requirements →
high-level design → low-level design → task plan — producing
a single unified spec document. This spec is the primary input
for `implement-and-review-loop`.
## Input
A description of what to build, or "resume" to continue an
in-progress spec.
## Output
A single file: `docs/{feature-name}-spec.md`
## Process
### Phase 0: Research (if AWS/infrastructure features)
If the feature involves AWS APIs, Lambda runtimes, SDK
capabilities, or infrastructure patterns:
1. Use tools to research — call search_documentation,
web_search, or read_documentation to verify assumptions
about API capabilities, SDK support, and runtime
limitations BEFORE writing requirements.
2. Document findings — add a "Research Findings" section
to the requirements with what's supported, what's not,
and links to docs.
3. Flag constraints early — if research reveals a limitation,
surface it in requirements as a constraint, not as a
surprise in HLD.
### Phase 1: Requirements
1. Gather the user's description, ask clarifying questions.
2. Produce the Requirements section.
3. STOP — present for review.
4. If feedback given, revise and re-present. Loop until
approved.
5. Save the spec file. Confirm: "Requirements approved.
Moving to High-Level Design."
### Phase 2: High-Level Design
1. Produce the HLD section using requirements as context.
2. STOP — present for review.
3. If feedback given, revise. Loop until approved.
4. Update the spec file. Confirm: "HLD approved. Moving
to Low-Level Design."
### Phase 3: Low-Level Design
1. Produce the LLD section using requirements + HLD
as context.
2. STOP — present for review.
3. If feedback given, revise. Loop until approved.
4. Update the spec file. Confirm: "LLD approved. Moving
to Task Plan."
### Phase 4: Task Plan
1. Break the LLD into implementation tasks with
dependencies.
2. Validate tasks with tools — for each task that
references AWS APIs, env vars, SDK methods, or CLI
commands, verify with tools before writing.
3. STOP — present for review.
4. If feedback given, revise. Loop until approved.
5. Update the spec file. Confirm: "Spec complete! Ready
for implement-and-review-loop."
### Phase 5: Final Summary
Present:
- Spec file path
- Requirement count (FR + NFR)
- Component count from LLD
- Task count with dependency waves
- "Run implement-and-review-loop to start building."
## Resuming an In-Progress Spec
If the user says "resume":
1. Find the most recent spec in `docs/` ending in
`-spec.md`.
2. Determine which phase is next.
3. Pick up from there.
## Rules
- Human review gate after every phase — never
auto-advance.
- Each phase builds on the previous.
- Every requirement must be traceable through HLD → LLD
→ at least one task.
- If a spec already exists for this feature, ask whether
to revise it or start fresh.

Phase 0 is worth calling out. Before writing any requirements, Kiro researches the AWS services involved — checking API capabilities, SDK support, and known limitations. This prevents the painful mid-design pivot where you discover that the thing you designed doesn’t actually work the way you assumed. I’ve had that happen enough times to make research a mandatory first step.
The output is a single markdown file with requirements, design, and tasks all in one place. Every requirement traces through the design to at least one implementation task. When I hand this to the implement-and-review-loop skill, it has full context for every task it works on.
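The traceability rule (every requirement reaches at least one task) is mechanical enough to sketch as a check. The data shapes here are assumptions for illustration, not the spec file's actual format:

```python
def untraced_requirements(requirements: list[str],
                          task_traces: dict[str, list[str]]) -> list[str]:
    """Return requirement IDs that no task claims to implement.
    `task_traces` maps a task ID to the requirement IDs it covers."""
    covered = {req for reqs in task_traces.values() for req in reqs}
    return [r for r in requirements if r not in covered]
```

A non-empty result is the signal to go back to the task plan before handing the spec to implement-and-review-loop.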
Skill 6: Implement and Review Loop
This is where the actual coding happens, and it’s where things get agentic. The implement-and-review-loop skill chains implementation and code review in an automated cycle. It reads a task from the spec, implements it, runs a multi-agent code review, fixes the findings, and re-reviews until the code is clean.
# Implement and Review Loop
Orchestrate an implement → review → fix cycle for tasks
in a spec. Chains implementation and review in a loop
until code is clean.
THIS IS THE DEFAULT ENTRY POINT for implementation work.
When the user asks to "implement", "build", or "code",
use this skill.
## Input
A task number, "next task", or "implement all open tasks".
The spec is read from `docs/` — look for `*-spec.md`.
## Process
### Phase 1: Implement
1. Read the task from the spec.
2. Implement the changes described.
3. Run guard-rails — all build gates must pass.
### Phase 1.5: Verify Build
After implementation and before review, run guard-rails:
1. All build gates must pass (hard fail blocks the loop).
2. All test gates must pass (hard fail blocks the loop).
3. New code coverage check — flag untested public functions.
4. Secrets scan — block if detected.
5. Branch check — block if on main.
### Phase 2: Review (MANDATORY — never skip)
Invoke review agents in parallel:
- Security reviewer
- Infrastructure reviewer
- Maintainability reviewer
- Performance reviewer
- Test quality reviewer
Each finding gets a severity:
- 🔴 Must Fix — broken builds, security issues
- 🟡 Should Fix — maintainability, performance
- 🟢 Nit — style, naming (logged but not acted on)
### Phase 3: Fix (if actionable findings exist)
1. For each actionable finding (🔴 + 🟡), apply the fix.
2. Re-run the relevant test suite(s).
3. If tests fail, feed the error back and retry the fix
(max 2 retries per finding).
### Phase 4: Re-review (if fixes were applied)
1. Run quick-review on the fix diff only.
2. If new 🔴 or 🟡 findings emerge, loop back to Phase 3.
3. Max 3 total review→fix iterations to prevent infinite
loops. If still unresolved after 3 passes, present
remaining findings to user for manual decision.
### Phase 5: Present Final State
STOP. Present to user:
- Summary of files created/modified
- Test count (total passing)
- Spec progress (X/Y tasks complete)
- Review iterations completed
- Any remaining findings that couldn't be auto-resolved
- Newly eligible tasks
- "Ready to commit?"
After approval, offer the full chain:
1. build-and-deploy
2. push-and-pr
3. session-handoff
### Phase 6: Commit (only after approval)
1. Stage specific files (not `git add .`).
2. Commit with descriptive message including review stats.
### Phase 7: Next Task (batch mode only)
If running "implement all", move to next eligible task
and repeat from Phase 1.
## Rules
- NEVER skip Phase 2 (review). Every task gets reviewed.
- When batching: implement one task → review → fix →
commit → next task. Do NOT batch multiple tasks into
a single review.
- Max 3 review→fix iterations. Escalate to human after.
- Never commit without user approval.

The review phase is the part I’m most particular about. It uses specialized review agents — separate AI personas that each focus on one aspect of code quality. The security reviewer is paranoid about auth bypasses and hardcoded secrets. The infrastructure reviewer thinks about what breaks at 3 AM. The maintainability reviewer thinks about the developer who has to modify this code six months from now.
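The bounded loop in Phases 3 and 4 is the part that prevents runaway cycles, so here is a minimal Python sketch of that control flow, with stand-in review and fix callables rather than Kiro's actual machinery:

```python
# Escalate to the human after this many review->fix passes,
# matching the "max 3 iterations" rule above.
MAX_ITERATIONS = 3

def review_fix_loop(review, fix):
    """`review` returns the current actionable findings; `fix`
    attempts to resolve them. Returns (clean, remaining_findings)."""
    for _ in range(MAX_ITERATIONS):
        findings = review()
        if not findings:
            return True, []          # clean: nothing left to fix
        fix(findings)
    return False, review()           # escalate leftovers to the human
```

If the loop exits unclean, the remaining findings go into the final summary for a manual decision instead of another automated pass.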
Here’s the full set of agent definitions. Each one lives in .kiro/agents/ as a JSON file:
{
"name": "review-security",
"description": "Security-focused code reviewer",
"prompt": "You are a security-focused code reviewer.
Focus exclusively on:
- Authentication and authorization: Are auth checks
present and correct? Can they be bypassed?
- Input validation: Are all user inputs validated and
sanitized? SQL injection, XSS, command injection?
- Secrets management: Are secrets, keys, or credentials
hardcoded or logged?
- Data exposure: Does the API return more data than
necessary? Are error messages leaking internals?
- Dependency risks: Are there known-vulnerable packages?
- IAM permissions: Are AWS IAM policies least-privilege?
- Encryption: Is data encrypted at rest and in transit?
For each finding: state the risk, rate severity, suggest
a fix. Be paranoid. Assume attackers will find every
weakness.",
"tools": ["fs_read", "code", "grep", "glob"]
}

{
"name": "review-infrastructure",
"description": "AWS and infrastructure-focused code reviewer",
"prompt": "You are an AWS infrastructure code reviewer.
Focus exclusively on:
- CDK patterns: Cross-stack coupling? Correct use of
RemovalPolicy? Stack dependency ordering?
- IAM: Least-privilege policies? Overly broad wildcards?
Missing condition keys?
- Encryption: S3 encryption enabled? KMS keys where
needed? SSL/TLS enforced?
- Networking: Security groups too permissive? Public
access where it shouldn't be?
- Cost: Over-provisioned resources? Missing lifecycle
rules? Inefficient storage classes?
- Monitoring: Missing CloudWatch alarms or metrics?
- Resilience: Single points of failure? Missing multi-AZ?
- Tagging: Resources missing required tags?
For each finding: explain the operational risk, rate
severity, suggest a fix. Think about what breaks at
3 AM when nobody is watching.",
"tools": ["fs_read", "code", "grep", "glob"]
}

{
"name": "review-maintainability",
"description": "Maintainability-focused code reviewer",
"prompt": "You are a maintainability-focused code reviewer.
Focus exclusively on:
- Code organization: Are files, classes, and methods in
the right place?
- Naming: Are names descriptive and consistent?
- Separation of concerns: Are responsibilities properly
divided?
- DRY violations: Is there duplicated logic that should
be extracted?
- Error handling: Are errors handled consistently?
- Configuration: Is config manageable across environments?
- Documentation: Are public APIs documented?
- Testability: Is the code structured for easy unit
testing? Are dependencies injectable?
For each finding: explain why it hurts maintainability,
rate severity, suggest a refactoring. Think about the
developer who has to modify this code 6 months from now.",
"tools": ["fs_read", "code", "grep", "glob"]
}

{
"name": "review-performance",
"description": "Performance-focused code reviewer",
"prompt": "You are a performance-focused code reviewer.
Focus exclusively on:
- Resource allocation: Are objects created unnecessarily
per-request? Are HTTP clients and SDK clients reused?
- Data fetching: N+1 problems? Missing pagination?
- Memory: Unnecessary allocations? Missing disposal?
- Async patterns: Blocking calls in async methods?
Thread pool starvation risks?
- Caching: Are there missed caching opportunities?
- Payload sizes: Are API responses unnecessarily large?
- Infrastructure: Over-provisioned resources? Missing
auto-scaling?
For each finding: explain the performance impact, rate
severity, suggest a fix. Think about what happens at
10x and 100x the current load.",
"tools": ["fs_read", "code", "grep", "glob"]
}

{
"name": "review-test-quality",
"description": "Test quality and coverage reviewer",
"prompt": "You are a test quality reviewer.
Focus exclusively on:
- Coverage gaps: Are new code paths covered by tests?
- Edge cases: Boundary conditions? Null inputs? Empty
collections?
- Assertion quality: Are assertions specific? Do tests
verify behavior or just that code doesn't throw?
- Test isolation: Do tests depend on external state or
ordering?
- Mock usage: Are dependencies properly mocked?
- Naming: Do test names describe scenario and outcome?
- Negative tests: Are failure paths tested?
Also flag when production code changes have NO
corresponding test changes.
For each finding: explain what could go undetected,
rate severity, suggest a test to add. Assume every
untested path will break in production.",
"tools": ["fs_read", "code", "grep", "glob"]
}

Each agent gets read-only access to the codebase and produces structured findings. The loop collects all findings, applies fixes for the actionable ones, and re-reviews until clean. The whole cycle — implement, review with five agents, fix, re-review — happens without me touching the keyboard. I review the final summary and approve or request changes.
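The "invoke in parallel, collect findings" step maps cleanly onto a fan-out pattern. A hypothetical Python sketch, with lambdas standing in for the real agent personas:

```python
from concurrent.futures import ThreadPoolExecutor

def run_reviews(diff: str, reviewers: dict) -> dict[str, list[str]]:
    """Fan one diff out to all reviewers concurrently and collect
    each reviewer's findings under its name."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, diff)
                   for name, fn in reviewers.items()}
        return {name: f.result() for name, f in futures.items()}
```

The actionable set for the fix phase is then just the union of the non-empty findings lists.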
The Compound Effect
Individually, any one of these skills saves time. But the real value is in how they chain together. A typical feature development session looks like this:
- Resume — load context from last session
- Create spec — requirements → design → tasks (with human gates)
- Implement and review — code → 5-agent review → fix → re-review
- Handoff — save state for next session
Each skill produces artifacts that the next skill consumes. The spec feeds the implementation loop. The implementation loop produces code that gets reviewed. The handoff captures everything for the resume. It’s a pipeline, and each stage is a reusable, improvable component.
I’ve been running this workflow on a real project for several months now. The skills have been refined through actual use — every time something goes wrong or a step is missing, I fix the skill. They’re not theoretical. They’re battle-tested.
Getting Started
If you want to try this yourself, start small. Don’t try to build the whole pipeline at once. Pick one workflow that you repeat often — maybe it’s how you write commit messages, or how you review PRs, or how you set up a new feature branch. Write a skill for that one thing. Use it a few times. Improve it. Then add another.
The Kiro skills documentation covers the format and conventions. But, honestly, you can just have Kiro start writing skills for you. And then you improve them over time. Skills are just markdown files in your repo — no special tooling beyond Kiro itself.
The best skills come from real friction, not from imagining what might be useful. Pay attention to the moments where you’re explaining the same process to your AI assistant for the third time. That’s a skill waiting to be written.
Until next time — go build something cool! Keith
