name: bulletproof
description: Use when building a feature, refactoring, fixing a complex bug, changing architecture, or starting any non-trivial coding task. 12-stage verified dev workflow from research to deploy. Adapted for Claude.ai (no sub-agents), Python/Docker/Traefik/MikroTik/embedded stacks, Gitea CI/CD, and SonarQube.

Bulletproof — Adaptive Development Workflow

Based on: Artemiy Miller's Bulletproof v5.0
Adapted for: Claude.ai · Arthur Abelentsev's infrastructure stack
Version: 5.1-aa · March 2026

Core Principle

Code to solve problems, not code for code's sake.

Before EVERY change ask: "Does this actually solve our problem? Is this the most efficient solution?" If the answer isn't clear — stop, research alternatives, pick the best one.


Pick Your Mode

Not every task needs the full pipeline.

| Size | Examples | Mode | Stages |
|------|----------|------|--------|
| S | Bug fix, config tweak, 1-2 files | Lightweight | 1 → 4 → 5 → 6 → 7 → Gates |
| M | New feature, module refactor, 3-10 files | Standard | Stages 1-10 |
| L | Architecture change, new service/container, 10+ files | Full | Stages 1-12 (all) |

How stages relate: Stages 5-6-7 (Self-Audit, Verification, Impact) run inside each implementation phase as an inner loop. Stages 8-12 run once after all phases complete as an outer loop.
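
The mode table can be read as a lookup from task size to stage list. A minimal sketch, assuming file count is the main signal; `pick_size`, `Size`, and `STAGES` are hypothetical names, and treating exactly 10 files as M is an assumption, because the table lists 10 under both M and L:

```python
from __future__ import annotations

from enum import Enum


class Size(Enum):
    S = "Lightweight"
    M = "Standard"
    L = "Full"


# Stage lists copied from the mode table; "Gates" is kept as a plain label.
STAGES: dict[Size, list[int | str]] = {
    Size.S: [1, 4, 5, 6, 7, "Gates"],
    Size.M: list(range(1, 11)),   # Stages 1-10
    Size.L: list(range(1, 13)),   # Stages 1-12
}


def pick_size(files_touched: int, architecture_change: bool = False) -> Size:
    """Classify a task using the rough file-count thresholds from the table."""
    # Assumption: exactly 10 files counts as M; more than 10 counts as L.
    if architecture_change or files_touched > 10:
        return Size.L
    if files_touched >= 3:
        return Size.M
    return Size.S
```

File count is only a proxy. A two-file change that adds a new container is still size L, which is why the sketch takes a separate `architecture_change` flag.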


Stack-Specific Conventions (ALWAYS applies)

These conventions apply automatically to all code produced under this workflow:

Python

  • Use ~= (compatible release) in requirements.txt, never == or >=
  • Formatting: ruff format + ruff check
  • Type hints on all public functions
  • pathlib.Path over os.path
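
The Python conventions can be shown together in one small helper: a typed public function that uses `pathlib.Path` to flag requirement lines without a `~=` specifier. This is a sketch, not an existing tool; `non_compatible_pins` is a hypothetical name, and the comment stripping is deliberately naive (it would mangle a URL requirement containing `#`):

```python
from pathlib import Path


def non_compatible_pins(requirements: Path) -> list[str]:
    """Return requirement lines that do not use the ~= specifier."""
    offenders: list[str] = []
    for raw in requirements.read_text(encoding="utf-8").splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options such as -r
            continue
        if "~=" not in line:
            offenders.append(line)
    return offenders
```

An empty result means every dependency uses a compatible-release pin such as `requests~=2.32`.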

Docker / Compose

  • Filename: always compose.yaml (never docker-compose.yml)
  • Pin image tags to specific versions, never use :latest in production
  • Health checks for every service
  • Named volumes over bind mounts for persistent data
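
Two of these rules (pinned tags, a health check per service) are easy to check mechanically. The sketch below assumes the compose file has already been parsed into a dict, for example with a YAML loader of your choice; `compose_violations` is a hypothetical helper, not part of Docker's tooling:

```python
from typing import Any


def compose_violations(compose: dict[str, Any]) -> list[str]:
    """List services that break the image-tag or health-check conventions."""
    problems: list[str] = []
    for name, svc in compose.get("services", {}).items():
        image = svc.get("image", "")
        # Only the last path component can carry a tag; a registry host may contain ':'.
        tag = image.rsplit("/", 1)[-1].partition(":")[2]
        if image and tag in ("", "latest"):
            problems.append(f"{name}: image '{image}' is not pinned to a version")
        if "healthcheck" not in svc:
            problems.append(f"{name}: no healthcheck defined")
    return problems
```

Services built from a local Dockerfile have no `image` key and are only checked for the health check.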

Gitea

  • Clone via ssh://gitea-lan/<org>/<repo>.git
  • One commit per logical change — if changes span multiple files, stage and commit together
  • Gitea Actions CI for build/test/deploy pipelines
  • Deploy pattern: .deploy.env + Gitea Secrets

Infrastructure (Traefik, MikroTik, WireGuard)

  • Traefik v3 with Docker provider; labels on compose services
  • Let's Encrypt via DNS challenge for wildcard certs
  • MikroTik config changes: always test with /system scheduler rollback timer before commit
  • WireGuard peer configs: document AllowedIPs and routing table in comments
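
For the Traefik convention, the labels a compose service typically needs can be generated rather than typed by hand, which avoids the label typos Stage 6 warns about. A minimal sketch: the label keys follow Traefik's Docker provider naming, while the entrypoint name `websecure` and the resolver name `letsencrypt` are assumptions that must match your static configuration:

```python
def traefik_labels(
    name: str,
    host: str,
    port: int,
    entrypoint: str = "websecure",      # assumed entrypoint name
    certresolver: str = "letsencrypt",  # assumed certificate resolver name
) -> list[str]:
    """Build the label list for one HTTP router and its backing service."""
    router = f"traefik.http.routers.{name}"
    return [
        "traefik.enable=true",
        f"{router}.rule=Host(`{host}`)",
        f"{router}.entrypoints={entrypoint}",
        f"{router}.tls.certresolver={certresolver}",
        f"traefik.http.services.{name}.loadbalancer.server.port={port}",
    ]
```

The returned strings can be pasted under a service's `labels:` key in compose.yaml.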

Embedded Firmware

  • For any embedded/MCU/firmware task: read the embedded-firmware-engineer skill first. It contains NASA/JPL Power of Ten rules, banned functions, DMA/cache coherence, GPIO policy, watchdog strategy, brown-out testing, and code review checklists specific to bare-metal and RTOS development.
  • PlatformIO as build system; platformio.ini must pin platform and framework versions
  • Build flags: -Wall -Werror -Wextra -Wpedantic

Stage 1: Deep Research

Mode: Read-Only. No code. No changes.

  • Investigate the problem area: structure, patterns, dependencies, existing tests
  • WebSearch: Who has already solved this problem? How did they solve it? What is the most efficient known solution? Don't reinvent — find the best existing approach first.
  • Analyze all findings and draw a conclusion: which solution is the BEST and why. The research artifact must end with a clear recommendation, not just a list of options.
  • Save to thoughts/research/YYYY-MM-DD-<task>.md (see templates/research.md for format)
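
The artifact naming scheme is easy to get inconsistent across tasks. One way to keep it uniform is a small path builder; `research_path` and its slug rule (lowercase, non-alphanumerics collapsed to hyphens) are assumptions, not something the workflow prescribes:

```python
from __future__ import annotations

import re
from datetime import date
from pathlib import Path


def research_path(task: str, day: date | None = None) -> Path:
    """Build thoughts/research/YYYY-MM-DD-<task>.md for a task title."""
    # Assumed slug rule: lowercase, runs of other characters become one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", task.lower()).strip("-")
    day = day or date.today()
    return Path("thoughts") / "research" / f"{day.isoformat()}-{slug}.md"
```

The same pattern would apply to specs/ and plans/, which use the same date prefix.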

Stage 2: Spec / PRD

Mode: Write specs only. No code.

Spec = WHAT and WHY. Not how. Spec = contract.

  • Read Research Artifact from thoughts/research/
  • Create specs/YYYY-MM-DD-<n>.md (see templates/spec.md for format)
  • Key sections: Problem, Goal, Scope, Acceptance Criteria, Constraints, Non-Goals

Skip for size S tasks.


Stage 3: Planning + Questions

Mode: Write plans only. No code yet.

  • Read both Spec (specs/) and Research (thoughts/research/)
  • Find gaps: what hasn't been thought through? Which edge cases are missing? What could break?
  • Be creative and proactive: anticipate ALL possible problems BEFORE writing code. Think several steps ahead. What could go wrong in a week? A month? Under load? With unexpected user behavior? Solve problems before they exist.
  • WebSearch: How have others solved this exact problem? What libraries/patterns exist? What's the proven best practice? Choose the most efficient solution, not the first one that comes to mind.
  • After verifying the approach — rewrite the plan into an improved version incorporating all findings, edge cases, and research results. Not just patch it — rewrite it better.

Challenge Loop (mandatory before finalizing plan)

Before finalizing the plan, answer 3 questions:

1. DOES THIS SOLVE THE PROBLEM?
   Compare every plan item against acceptance criteria from spec.
   If any criterion is uncovered — the plan is incomplete.

2. IS THIS THE MOST EFFICIENT SOLUTION?
   Search: who has already solved this problem? What approach did they use?
   Name 2-3 alternative approaches (including ones found via research).
   For each: pros, cons, effort.
   Justify why the chosen approach is better than all alternatives.

3. IS THERE "CODE FOR CODE'S SAKE"?
   Every change must directly serve acceptance criteria.
   If a change isn't tied to solving the problem — remove it.
   Drive-by refactoring = separate task, not part of this one.
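
Questions 1 and 3 are two views of the same mapping between plan items and acceptance criteria. A sketch of that check, assuming each plan item is tagged with the criteria it serves; `challenge_plan` and the `AC1`-style IDs are hypothetical:

```python
def challenge_plan(
    criteria: set[str], plan: dict[str, set[str]]
) -> tuple[set[str], list[str]]:
    """Return (criteria no plan item covers, plan items that serve no criterion)."""
    covered: set[str] = set().union(*plan.values())
    uncovered = criteria - covered  # question 1: the plan is incomplete if non-empty
    unjustified = [item for item, serves in plan.items() if not serves & criteria]
    return uncovered, unjustified  # question 3: candidates for "code for code's sake"
```

For example, with criteria `{"AC1", "AC2", "AC3"}` and a plan where "rename helpers" serves nothing and no item serves AC3, the function reports AC3 as uncovered and "rename helpers" as unjustified.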

Review Cycle

  1. Claude drafts the plan
  2. User reviews in chat, adds notes/corrections
  3. Claude addresses all notes, rewrites affected sections
  4. Repeat until user approves

Questions for User

  • Only for real forks where there's a genuine decision to make
  • For each question: recommend which option you think is best and why
  • Don't ask the obvious

Final Plan

Create plans/YYYY-MM-DD-<n>.md (see templates/plan.md for full template with Challenge Log, phases, prompts)


Stage 4: Phased Implementation

Each phase = separate logical unit, feature branch.

Order within each phase:

  1. Create/switch to feature branch: feature/<task>
  2. Update status → in_progress
  3. TDD: tests FIRST (red)
  4. Implement: code to make tests pass (green)
  5. Refactor (if needed)
  6. Self-Audit (Stage 5)
  7. Verification (Stage 6)
  8. Impact Analysis (Stage 7)
  9. Gates (see Gates section)
  10. Commit — one commit per logical change, descriptive message
  11. Status → completed, write to Changelog
  12. Handoff (write progress/<task>-handoff.md, see templates/handoff.md)

Stage 5: Self-Audit (after each phase)

Mandatory BEFORE marking completed:

Check the phase implementation:

1. SPEC COMPLIANCE
   Open spec. Walk through every acceptance criterion.
   For each: implemented? Where exactly in code?
   If any not covered — finish it.

2. CHALLENGE THE SOLUTION
   Look at the written code with fresh eyes.
   Does this actually solve the problem from spec?
   Is there a simpler/more efficient way?
   Any "code for code's sake" — changes unrelated to the task?

Stage 6: Verification — Deep Bug Hunt

Not just linting. Thoughtful review with false-positive filtering.

Step 1: Find errors

Check ALL code from this phase for:
- Logic errors (wrong conditions, off-by-one, race conditions)
- Data handling (null/undefined, type mismatches)
- Security (injection, auth bypass, exposed secrets)
- Performance (N+1 queries, memory leaks, unnecessary allocations)
- Docker: health check failures, volume mount conflicts, port collisions
- Infrastructure: Traefik label typos, routing priority conflicts

Step 2: Verify bugs are REAL

For EACH found bug:
1. Is this a REAL bug or a false positive?
2. Can you prove this bug is reproducible?
3. If you can't prove it — it's NOT a bug. Don't touch it.

RULE: Don't fix code "for beauty" or "just in case".
Fix ONLY proven bugs that actually affect functionality.
Every "fix" without proof = risk of introducing a new bug.
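
"Prove it" usually means a minimal reproduction written before any fix. The sketch below is illustrative only: `last_page_index` is a made-up function with a suspected off-by-one, and `reproduce` states the expected value and checks whether the code really deviates from it:

```python
def last_page_index(total_items: int, page_size: int) -> int:
    """Hypothetical code under review: zero-based index of the last page."""
    return total_items // page_size  # suspected off-by-one for exact multiples


def reproduce() -> bool:
    """Return True only if the suspected bug is demonstrably real."""
    # 20 items at 10 per page fill pages 0 and 1, so the last index must be 1.
    return last_page_index(20, 10) != 1
```

If `reproduce()` returns False, the suspicion was a false positive and the code stays untouched. If it returns True, the reproduction becomes the first regression test for the fix.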

Step 3: Logic and efficiency check

Final code cleanliness check:
- Logic: is the data flow correct from input to output?
- Efficiency: any redundant operations?
- Readability: is the code understandable without comments?
BUT: don't refactor "for beauty". Only if it affects correctness.

Stage 7: Impact Analysis — "Did we break anything?"

The most underestimated stage. 75% of AI agents break previously working code.

MANDATORY CHECK BEFORE MERGE:

1. REGRESSION
   What other modules/functions depend on changed files?
   Run ALL project tests (not just current phase).
   If anything broke — this is priority #1.

2. SIDE EFFECTS
   Did any contracts/interfaces change (API, props, types)?
   If yes — who uses them? Are all consumers updated?
   Docker: did any service ports, volumes, or network names change?
   Traefik: do routing rules still resolve correctly?

3. THINK AHEAD
   What problems could these changes cause in a week/month?
   Edge cases we haven't tested?
   What happens with: zero data? Huge data? Concurrent requests?
   What if the user does something unexpected?

4. COMPATIBILITY
   Backward compatibility preserved?
   Data migrations needed?
   Docker volume data backward-compatible with new container version?
   Feature flags needed for gradual rollout?
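
The regression question ("what depends on the changed files?") is a reverse walk over the import graph. A minimal sketch, assuming the graph is already available as a mapping from module to the modules it imports; how that mapping is produced is left open, and `affected_modules` is a hypothetical name:

```python
from collections import deque


def affected_modules(imports: dict[str, set[str]], changed: set[str]) -> set[str]:
    """Return every module that directly or transitively imports a changed one."""
    # Invert the graph: module -> set of modules that import it.
    importers: dict[str, set[str]] = {}
    for module, deps in imports.items():
        for dep in deps:
            importers.setdefault(dep, set()).add(module)

    affected: set[str] = set()
    queue = deque(changed)
    while queue:  # breadth-first walk up the importer chain
        current = queue.popleft()
        for parent in importers.get(current, ()):
            if parent not in affected and parent not in changed:
                affected.add(parent)
                queue.append(parent)
    return affected
```

The result is the minimum set of modules to re-read for side effects. It does not replace running the full test suite.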

Stage 8: Integration Check

  • All phases completed → run gates across entire project
  • Audit: everything from spec implemented?
  • Every acceptance criterion → fulfilled?

Stage 9: Code Review (fresh perspective)

Review as if seeing this code for the first time.

See agents/code-reviewer.md for the full review checklist.

Key areas:

  • Edge cases, race conditions, backward compat, security, error handling, performance
  • Docker/Compose: service dependencies, restart policies, resource limits
  • Infrastructure: Traefik routing, TLS configuration, firewall rules

Warning: AI reviewing its own code has blind spots. For critical infrastructure changes — flag for human review.


Stage 10: Security Scan (for M and L)

# SonarQube analysis (preferred — already in the stack)
# Push to Gitea → Gitea Actions triggers SonarQube scan

# Alternative: local semgrep
semgrep --config=auto .

For Docker/Compose changes, additionally check:

  • No secrets in compose.yaml or Dockerfiles (use .env or Gitea Secrets)
  • Images from trusted registries only
  • No privileged containers without justification
  • Network segmentation: services not exposed beyond what's needed

Stage 11: Fixes + Re-verification

If review/scan found issues:

  1. Fix (only proven bugs — rule from Stage 6)
  2. Re-run gates
  3. Repeat Impact Analysis (Stage 7) to confirm the fixes didn't break anything else
  4. Re-review if major changes were made

Stage 12: Cleanup + Deploy

  • Archive plan: mv plans/<file> plans/archive/
  • Keep spec as documentation
  • Squash merge → main (via Gitea PR)
  • Deploy — ONLY on explicit user request

Deterministic Gates

A phase CANNOT be completed without passing ALL required gates.

Tier 1: Required (block the phase)

Python projects:

ruff check .                              # 0 lint errors
ruff format --check .                     # formatting verified
pytest --tb=short -q                      # all tests green
python -m py_compile <main_module>.py     # syntax OK

Docker/Compose projects:

docker compose -f compose.yaml config     # compose file valid
docker compose build                      # all images build
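docker compose up -d && sleep 10 && \
  docker compose ps --format json | \
  python3 -c "import sys, json; \
    raw = sys.stdin.read().strip(); \
    svcs = json.loads(raw) if raw.startswith('[') else [json.loads(l) for l in raw.splitlines()]; \
    sys.exit(0 if svcs and all(s['State'] == 'running' and s.get('Health', '') in ('', 'healthy') for s in svcs) else 1)"
                                          # all services running, none unhealthy (accepts array or line-delimited JSON)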
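
The health gate is easier to test as a standalone function than as an inline one-liner. The sketch below assumes that `docker compose ps --format json` prints either one JSON array or one JSON object per line, depending on the Compose version, and that each entry has `State` and `Health` fields; `services_ok` is a hypothetical helper:

```python
import json


def services_ok(ps_output: str) -> bool:
    """True if at least one service exists and all are running and not unhealthy."""
    text = ps_output.strip()
    if not text:
        return False  # no containers at all counts as a failed gate
    if text.startswith("["):
        services = json.loads(text)  # single JSON array
    else:
        services = [json.loads(line) for line in text.splitlines() if line.strip()]
    # An empty Health value means the service defines no health check.
    return bool(services) and all(
        s.get("State") == "running" and s.get("Health", "") in ("", "healthy")
        for s in services
    )
```

A service whose health is still `starting` fails this check, so the gate should be retried or run after a wait rather than loosened.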

Embedded (PlatformIO):

pio check                                 # static analysis
pio run                                   # firmware builds
pio test                                  # unit tests pass (native)

Tier 2: Recommended

# Python
pip-audit                                 # dependency vulnerabilities
mypy --strict .                           # type checking (if project uses mypy)

# Docker
docker scout cves <image>                 # image CVE scan (if available)

# General
semgrep --config=auto .                   # security patterns

Tier 3: Deep Security (SonarQube)

# Via Gitea Actions pipeline — push triggers analysis
# Or manually:
sonar-scanner -Dsonar.projectKey=<key> -Dsonar.host.url=<url>

If a gate fails — fix and re-run. Never skip.
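
"Never skip" is simplest to enforce when the gates run from one script that stops at the first failure. A minimal sketch using only the standard library; `run_gates` and `PYTHON_TIER1` are hypothetical names, and the command list simply mirrors the Tier 1 Python gates:

```python
import subprocess
import sys
from collections.abc import Sequence


def run_gates(gates: Sequence[Sequence[str]]) -> bool:
    """Run each gate command in order; stop and report at the first failure."""
    for cmd in gates:
        try:
            code = subprocess.run(cmd, check=False).returncode
        except FileNotFoundError:
            code = 127  # tool not installed: treated as a failed gate, not a skip
        if code != 0:
            print(f"GATE FAILED ({code}): {' '.join(cmd)}", file=sys.stderr)
            return False
    return True


# Tier 1 gates for a Python project, as listed in this section.
PYTHON_TIER1 = [
    ["ruff", "check", "."],
    ["ruff", "format", "--check", "."],
    ["pytest", "--tb=short", "-q"],
]
```

Calling `run_gates(PYTHON_TIER1)` from the project root returns True only when every command exits with 0.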


Git Discipline

  • Each task = feature/<task> branch
  • One commit per logical change — group related file changes into a single commit
  • Commit after each passed gate (checkpoint for rollback)
  • NEVER push to main directly
  • Squash merge on completion (via Gitea PR)
  • Clone format: ssh://gitea-lan/<org>/<repo>.git

Model Recommendations

| Stage | Model | Why |
|-------|-------|-----|
| Research, Planning | Opus | Cross-file reasoning, deep analysis |
| Implementation | Sonnet | Speed, cost-efficiency |
| Code Review, Security | Opus | Deep analysis, fresh perspective |

Project Structure

project/
├── specs/                      # WHAT and WHY
├── plans/                      # HOW
│   └── archive/                # completed plans
├── thoughts/research/          # research artifacts
├── progress/                   # handoff files
├── compose.yaml                # Docker Compose (if applicable)
├── platformio.ini              # PlatformIO config (if embedded)
├── requirements.txt            # Python deps with ~= specifiers
├── sonar-project.properties    # SonarQube config (if applicable)
└── .gitea/
    └── workflows/              # Gitea Actions CI/CD
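
The directory part of this layout can be created idempotently with `pathlib`. A sketch under the assumption that only the directories are scaffolded and the files (compose.yaml, requirements.txt, and so on) are added as the project needs them; `scaffold` and `SKELETON` are hypothetical names:

```python
from pathlib import Path

# Directories from the tree above; files are intentionally left out.
SKELETON = ("specs", "plans/archive", "thoughts/research", "progress", ".gitea/workflows")


def scaffold(root: Path) -> list[Path]:
    """Create any missing workflow directories under root; return those created."""
    created: list[Path] = []
    for rel in SKELETON:
        target = root / rel
        if not target.is_dir():
            target.mkdir(parents=True, exist_ok=True)
            created.append(target)
    return created
```

Running it a second time creates nothing and returns an empty list, so it is safe to call at the start of every task.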