---
name: bulletproof
description: Use when building a feature, refactoring, fixing a complex bug, changing architecture, or starting any non-trivial coding task. 12-stage verified dev workflow from research to deploy. Adapted for Claude.ai (no sub-agents), Python/Docker/Traefik/MikroTik/embedded stacks, Gitea CI/CD, and SonarQube.
---

# Bulletproof — Adaptive Development Workflow

> **Based on:** Artemiy Miller's Bulletproof v5.0
> **Adapted for:** Claude.ai · Arthur Abelentsev's infrastructure stack
> **Version:** 5.1-aa · March 2026

## Core Principle

**Code to solve problems, not code for code's sake.**

Before EVERY change ask: "Does this actually solve our problem? Is this the most efficient solution?"
If the answer isn't clear — stop, research alternatives, pick the best one.

---

## Pick Your Mode

Not every task needs the full pipeline.

| Size | Examples | Mode | Stages |
|------|----------|------|--------|
| **S** | Bug fix, config tweak, 1-2 files | Lightweight | 1 → 4 → 5 → 6 → 7 → Gates |
| **M** | New feature, module refactor, 3-10 files | Standard | Stages 1-10 |
| **L** | Architecture change, new service/container, 10+ files | Full | Stages 1-12 (all) |

**How stages relate:** Stages 5-6-7 (Self-Audit, Verification, Impact) run **inside each implementation phase** as an inner loop. Stages 8-12 run **once after all phases complete** as an outer loop.
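
The sizing table can be read as a small lookup. The sketch below is only an illustration of that reading, not part of the workflow itself: `pick_size` and `stages_for` are hypothetical names, and because the table lists 10 files under both M and L, the code assumes that exactly 10 files still counts as M.

```python
# Stage lists copied from the "Pick Your Mode" table.
STAGES = {
    "S": [1, 4, 5, 6, 7],       # followed by the Gates
    "M": list(range(1, 11)),    # Stages 1-10
    "L": list(range(1, 13)),    # Stages 1-12 (all)
}


def pick_size(files_changed: int, architecture_change: bool = False) -> str:
    """Return "S", "M" or "L" for a task (hypothetical helper).

    Assumption: exactly 10 files is treated as M, since the table
    lists 10 under both the M and the L row.
    """
    if files_changed < 1:
        raise ValueError("files_changed must be at least 1")
    if architecture_change or files_changed > 10:
        return "L"
    if files_changed >= 3:
        return "M"
    return "S"


def stages_for(size: str) -> list[int]:
    """Return a copy of the stage numbers that apply to a task size."""
    return list(STAGES[size])
```

Under these assumptions, `pick_size(4)` gives `"M"`, and a two-file change that adds a new container (`architecture_change=True`) gives `"L"`.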

---

## Stack-Specific Conventions (ALWAYS apply)

These conventions apply automatically to all code produced under this workflow:

### Python
- Use `~=` (compatible release) in `requirements.txt`, never `==` or `>=`
- Formatting: `ruff format` + `ruff check`
- Type hints on all public functions
- `pathlib.Path` over `os.path`

### Docker / Compose
- Filename: always `compose.yaml` (never `docker-compose.yml`)
- Pin image tags to specific versions, never use `:latest` in production
- Health checks for every service
- Named volumes over bind mounts for persistent data

### Gitea
- Clone via `ssh://gitea-lan//.git`
- **One commit per logical change** — if changes span multiple files, stage and commit together
- Gitea Actions CI for build/test/deploy pipelines
- Deploy pattern: `.deploy.env` + Gitea Secrets

### Infrastructure (Traefik, MikroTik, WireGuard)
- Traefik v3 with Docker provider; labels on compose services
- Let's Encrypt via DNS challenge for wildcard certs
- MikroTik config changes: always test with `/system scheduler` rollback timer before commit
- WireGuard peer configs: document AllowedIPs and routing table in comments

### Embedded Firmware
- **For any embedded/MCU/firmware task: read the `embedded-firmware-engineer` skill first.** It contains NASA/JPL Power of Ten rules, banned functions, DMA/cache coherence, GPIO policy, watchdog strategy, brown-out testing, and code review checklists specific to bare-metal and RTOS development.
- PlatformIO as build system; `platformio.ini` must pin platform and framework versions
- Build flags: `-Wall -Werror -Wextra -Wpedantic`

---

## Stage 1: Deep Research

**Mode: Read-Only. No code. No changes.**

- Investigate the problem area: structure, patterns, dependencies, existing tests
- **WebSearch: Who has already solved this problem? How did they solve it? What is the most efficient known solution?** Don't reinvent — find the best existing approach first.
- **Analyze all findings and make a conclusion: which solution is the BEST and why.** The research artifact must end with a clear recommendation, not just a list of options.
- Save to `thoughts/research/YYYY-MM-DD-.md` (see `templates/research.md` for format)

---

## Stage 2: Spec / PRD

**Mode: Write specs only. No code.**

**Spec = WHAT and WHY. Not how. Spec = contract.**

- Read Research Artifact from `thoughts/research/`
- Create `specs/YYYY-MM-DD-.md` (see `templates/spec.md` for format)
- Key sections: Problem, Goal, Scope, Acceptance Criteria, Constraints, Non-Goals

**Skip for size S tasks.**

---

## Stage 3: Planning + Questions

**Mode: Write plans only. No code yet.**

- Read **both** Spec (`specs/`) and Research (`thoughts/research/`)
- Find gaps: what's unthought? What edge cases? What could break?
- **Be creative and proactive: anticipate ALL possible problems BEFORE writing code.** Think several steps ahead. What could go wrong in a week? A month? Under load? With unexpected user behavior? Solve problems before they exist.
- **WebSearch: How have others solved this exact problem? What libraries/patterns exist? What's the proven best practice?** Choose the most efficient solution, not the first one that comes to mind.
- After verifying the approach — **rewrite the plan into an improved version** incorporating all findings, edge cases, and research results. Not just patch it — rewrite it better.

### Challenge Loop (mandatory before finalizing plan)

```
Before finalizing the plan, answer 3 questions:

1. DOES THIS SOLVE THE PROBLEM?
   Compare every plan item against acceptance criteria from spec.
   If any criterion is uncovered — the plan is incomplete.

2. IS THIS THE MOST EFFICIENT SOLUTION?
   Search: who has already solved this problem? What approach did they use?
   Name 2-3 alternative approaches (including ones found via research).
   For each: pros, cons, effort.
   Justify why the chosen approach is better than all alternatives.

3. IS THERE "CODE FOR CODE'S SAKE"?
   Every change must directly serve acceptance criteria.
   If a change isn't tied to solving the problem — remove it.
   Drive-by refactoring = separate task, not part of this one.
```

### Review Cycle

1. Claude drafts the plan
2. User reviews in chat, adds notes/corrections
3. Claude addresses all notes, rewrites affected sections
4. Repeat until user approves

### Questions for User
- Only for real forks where there's a genuine decision to make
- For each question: **recommend which option you think is best and why**
- Don't ask the obvious

### Final Plan

Create `plans/YYYY-MM-DD-.md` (see `templates/plan.md` for full template with Challenge Log, phases, prompts)

---

## Stage 4: Phased Implementation

**Each phase = separate logical unit, feature branch.**

Order within each phase:
1. Create/switch to feature branch: `feature/`
2. Update status → `in_progress`
3. **TDD**: tests FIRST (red)
4. **Implement**: code to make tests pass (green)
5. **Refactor** (if needed)
6. **Self-Audit** (Stage 5)
7. **Verification** (Stage 6)
8. **Impact Analysis** (Stage 7)
9. **Gates** (see Gates section)
10. **Commit** — one commit per logical change, descriptive message
11. Status → `completed`, write to Changelog
12. **Handoff** (write `progress/-handoff.md`, see `templates/handoff.md`)

---

## Stage 5: Self-Audit (after each phase)

**Mandatory BEFORE marking `completed`:**

```
Check the phase implementation:

1. SPEC COMPLIANCE
   Open spec. Walk through every acceptance criterion.
   For each: implemented? Where exactly in code?
   If any not covered — finish it.

2. CHALLENGE THE SOLUTION
   Look at the written code with fresh eyes.
   Does this actually solve the problem from spec?
   Is there a simpler/more efficient way?
   Any "code for code's sake" — changes unrelated to the task?
```

---

## Stage 6: Verification — Deep Bug Hunt

**Not just linting.
Thoughtful review with false-positive filtering.**

### Step 1: Find errors

```
Check ALL code from this phase for:
- Logic errors (wrong conditions, off-by-one, race conditions)
- Data handling (null/undefined, type mismatches)
- Security (injection, auth bypass, exposed secrets)
- Performance (N+1 queries, memory leaks, unnecessary allocations)
- Docker: health check failures, volume mount conflicts, port collisions
- Infrastructure: Traefik label typos, routing priority conflicts
```

### Step 2: Verify bugs are REAL

```
For EACH found bug:
1. Is this a REAL bug or a false positive?
2. Can you prove this bug is reproducible?
3. If you can't prove it — it's NOT a bug. Don't touch it.

RULE: Don't fix code "for beauty" or "just in case".
Fix ONLY proven bugs that actually affect functionality.
Every "fix" without proof = risk of introducing a new bug.
```

### Step 3: Logic and efficiency check

```
Final code cleanliness check:
- Logic: is the data flow correct from input to output?
- Efficiency: any redundant operations?
- Readability: is the code understandable without comments?

BUT: don't refactor "for beauty". Only if it affects correctness.
```

---

## Stage 7: Impact Analysis — "Did we break anything?"

**The most underestimated stage. 75% of AI agents break previously working code.**

```
MANDATORY CHECK BEFORE MERGE:

1. REGRESSION
   What other modules/functions depend on changed files?
   Run ALL project tests (not just current phase).
   If anything broke — this is priority #1.

2. SIDE EFFECTS
   Did any contracts/interfaces change (API, props, types)?
   If yes — who uses them? Are all consumers updated?
   Docker: did any service ports, volumes, or network names change?
   Traefik: do routing rules still resolve correctly?

3. THINK AHEAD
   What problems could these changes cause in a week/month?
   Edge cases we haven't tested?
   What happens with: zero data? Huge data? Concurrent requests?
   What if the user does something unexpected?

4. COMPATIBILITY
   Backward compatibility preserved?
   Data migrations needed?
   Docker volume data backward-compatible with new container version?
   Feature flags needed for gradual rollout?
```

---

## Stage 8: Integration Check

- All phases `completed` → run gates across entire project
- Audit: everything from spec implemented?
- Every acceptance criterion → fulfilled?

---

## Stage 9: Code Review (fresh perspective)

**Review as if seeing this code for the first time.**

See `agents/code-reviewer.md` for the full review checklist. Key areas:
- Edge cases, race conditions, backward compat, security, error handling, performance
- Docker/Compose: service dependencies, restart policies, resource limits
- Infrastructure: Traefik routing, TLS configuration, firewall rules

**Warning**: AI reviewing its own code has blind spots. For critical infrastructure changes — flag for human review.

---

## Stage 10: Security Scan (for M and L)

```bash
# SonarQube analysis (preferred — already in the stack)
# Push to Gitea → Gitea Actions triggers SonarQube scan

# Alternative: local semgrep
semgrep --config=auto .
```

For Docker/Compose changes, additionally check:
- No secrets in compose.yaml or Dockerfiles (use .env or Gitea Secrets)
- Images from trusted registries only
- No privileged containers without justification
- Network segmentation: services not exposed beyond what's needed

---

## Stage 11: Fixes + Re-verification

If review/scan found issues:
1. Fix (only proven bugs — rule from Stage 6)
2. Re-run gates
3. Repeat Impact Analysis (Stage 7) — fixes didn't break anything else?
4. Re-review if major changes were made

---

## Stage 12: Cleanup + Deploy

- Archive plan: `mv plans/ plans/archive/`
- Keep spec as documentation
- Squash merge → main (via Gitea PR)
- **Deploy — ONLY on explicit user request**

---

## Deterministic Gates

A phase CANNOT be `completed` without passing ALL required gates.

### Tier 1: Required (block the phase)

**Python projects:**
```bash
ruff check .              # 0 lint errors
ruff format --check .     # formatting verified
pytest --tb=short -q      # all tests green
python -m py_compile .py  # syntax OK
```

**Docker/Compose projects:**
```bash
docker compose -f compose.yaml config    # compose file valid
docker compose build                     # all images build
docker compose up -d && sleep 10 && \
  docker compose ps --format json | \
  python3 -c "import sys,json; \
    svcs=json.loads(sys.stdin.read()); \
    exit(0 if all(s['Health']=='healthy' or s['State']=='running' for s in svcs) else 1)"  # all services healthy
```

**Embedded (PlatformIO):**
```bash
pio check                 # static analysis
pio run                   # firmware builds
pio test                  # unit tests pass (native)
```

### Tier 2: Recommended (for M and L)

```bash
# Python
pip-audit                 # dependency vulnerabilities
mypy --strict .           # type checking (if project uses mypy)

# Docker
docker scout cves         # image CVE scan (if available)

# General
semgrep --config=auto .   # security patterns
```

### Tier 3: Deep Security (SonarQube)

```bash
# Via Gitea Actions pipeline — push triggers analysis
# Or manually:
sonar-scanner -Dsonar.projectKey= -Dsonar.host.url=
```

If a gate fails — fix and re-run. Never skip.
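
One caveat on the Docker/Compose gate in Tier 1: its inline check accepts any container whose `State` is `running`, so a container that is running but reports `unhealthy` would still pass. It also parses the output as a single JSON document, while some Compose releases appear to emit one JSON object per line. A stricter variant might look like the sketch below. The function names and the `require_healthcheck` flag are hypothetical, and the field names (`Service`, `Name`, `State`, `Health`) are assumed to match what `docker compose ps --format json` prints on your version.

```python
import json


def parse_ps_output(raw: str) -> list[dict]:
    """Parse `docker compose ps --format json` output.

    Accepts either a single JSON array or one JSON object per line
    (assumption: these are the two shapes Compose may produce).
    """
    raw = raw.strip()
    if not raw:
        return []
    if raw.startswith("["):
        return json.loads(raw)
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def unhealthy_services(raw: str, require_healthcheck: bool = False) -> list[str]:
    """Return the names of services that fail the health gate.

    A service passes only if it is running AND its health status is
    "healthy". An empty status (no health check defined) passes unless
    require_healthcheck is True.
    """
    accepted = ("healthy",) if require_healthcheck else ("", "healthy")
    failed = []
    for svc in parse_ps_output(raw):
        running = svc.get("State") == "running"
        health_ok = svc.get("Health", "") in accepted
        if not (running and health_ok):
            failed.append(svc.get("Service") or svc.get("Name", "<unknown>"))
    return failed
```

To use it as a gate, a small wrapper could feed it the command's stdout and exit non-zero whenever the returned list is not empty. Since the conventions call for a health check on every service, `require_healthcheck=True` is probably the closer match. Note that a service still in the `starting` state after the `sleep 10` would then fail the gate.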

---

## Git Discipline

- Each task = `feature/` branch
- **One commit per logical change** — group related file changes into a single commit
- Commit after each passed gate (checkpoint for rollback)
- NEVER push to main directly
- Squash merge on completion (via Gitea PR)
- Clone format: `ssh://gitea-lan//.git`

---

## Model Recommendations

| Stage | Model | Why |
|-------|-------|-----|
| Research, Planning | Opus | Cross-file reasoning, deep analysis |
| Implementation | Sonnet | Speed, cost-efficiency |
| Code Review, Security | Opus | Deep analysis, fresh perspective |

---

## Project Structure

```
project/
├── specs/                    # WHAT and WHY
├── plans/                    # HOW
│   └── archive/              # completed plans
├── thoughts/research/        # research artifacts
├── progress/                 # handoff files
├── compose.yaml              # Docker Compose (if applicable)
├── platformio.ini            # PlatformIO config (if embedded)
├── requirements.txt          # Python deps with ~= specifiers
├── sonar-project.properties  # SonarQube config (if applicable)
└── .gitea/
    └── workflows/            # Gitea Actions CI/CD
```
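
The directory part of this layout can be created up front. The following is a minimal sketch, assuming the workflow directories are exactly the five shown in the tree. `scaffold` and `WORKFLOW_DIRS` are hypothetical names. The optional files (`compose.yaml`, `platformio.ini`, and so on) are deliberately left out because they depend on the project type.

```python
from pathlib import Path

# Directories taken from the project tree; files are intentionally omitted.
WORKFLOW_DIRS = (
    "specs",
    "plans/archive",
    "thoughts/research",
    "progress",
    ".gitea/workflows",
)


def scaffold(root: Path) -> list[Path]:
    """Create missing workflow directories under root.

    Returns only the directories that were newly created, so running it
    a second time on the same root returns an empty list.
    """
    created = []
    for rel in WORKFLOW_DIRS:
        path = root / rel
        if not path.exists():
            path.mkdir(parents=True)  # also creates parents such as plans/
            created.append(path)
    return created
```

Calling `scaffold(Path("."))` from a repository root would add whatever is missing and leave existing directories untouched.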