pm-skills/pm-ai-shipping/skills/intended-vs-implemented/SKILL.md
Pawel Huryn 8202bdd7f1 Release v2.0.0: add pm-ai-shipping plugin, red-team execution skill, refresh README
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 18:49:54 +02:00

name: intended-vs-implemented
description: The method for finding the gap between what a system is supposed to do and what the code actually does: the class of bug that generic scanners miss because they have no model of intent. Defines what counts as documented intent, what counts as implementation evidence, which mismatches matter, and how to avoid hand-wavy findings. Use when auditing AI-built code, reviewing access control against documented permissions, or checking whether a codebase matches its own documentation.

Intended vs. Implemented: Auditing the Gap

Purpose

A linter scans code in a vacuum. It can tell you the code is internally consistent; it cannot tell you the code does what you meant, because it has no model of your intent. The highest-value security and correctness bugs live in that gap — a permission documented but never enforced, a "cron-only" endpoint anyone can call, a field marked public-only that leaks private data.
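
To make the gap concrete, here is a minimal sketch that assumes the docs declare a cleanup endpoint "cron-only". The handler names, the Request type, the X-Cron-Secret header, and the shared secret are all hypothetical and chosen only for illustration. Both handlers would pass a linter; only one enforces the documented rule.

```python
import hmac
from dataclasses import dataclass, field

# Assumption: the cron caller proves itself with a shared secret in a header.
CRON_SECRET = "example-secret"


@dataclass
class Request:
    """Hypothetical stand-in for a web framework's request object."""
    headers: dict = field(default_factory=dict)


def cleanup_as_implemented(request: Request) -> int:
    """Internally consistent, but nothing backs the 'cron only' claim."""
    # cron only  <- a comment is not an enforcement point
    return 200


def cleanup_as_intended(request: Request) -> int:
    """Same handler with the documented boundary enforced on the server."""
    supplied = request.headers.get("X-Cron-Secret", "")
    # Constant-time comparison avoids leaking the secret through timing.
    if not hmac.compare_digest(supplied.encode(), CRON_SECRET.encode()):
        return 403
    return 200
```

An anonymous call to `cleanup_as_implemented` returns 200, while the same call to `cleanup_as_intended` returns 403. That difference is visible only to a reviewer who knows the documented intent.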

This skill is the method for finding that gap. It depends on intent having been written down first (see the shipping-artifacts skill), which is exactly why commodity tools can't replicate it.

Context

Use this when documented intent exists — permissions.md, architecture.md, variables.md, etc. If those docs are absent or stale, that absence is itself the first finding: you cannot audit intent you never recorded. Recommend documenting first, then auditing.
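
The precondition can be checked before any audit starts. The sketch below assumes the three file names mentioned above are the expected set and that they live in one directory. Both are assumptions, and the function name is hypothetical. It detects only absent or empty files; staleness still needs a human comparison against the code.

```python
from pathlib import Path

# Assumption: these names come from the examples in the text; the real set may differ.
EXPECTED_DOCS = ("permissions.md", "architecture.md", "variables.md")


def missing_intent_docs(doc_dir: Path, expected=EXPECTED_DOCS) -> list[str]:
    """Return the expected doc files that are absent or contain only whitespace."""
    missing = []
    for name in expected:
        path = doc_dir / name
        if not path.is_file() or not path.read_text(encoding="utf-8").strip():
            missing.append(name)
    return missing
```

A non-empty result is the first finding of the audit: recommend documenting those areas before comparing anything to code.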

Method

  1. Establish intent. Read the /documentation/*.md set as the statement of what should be true: who may access what, which boundaries are trusted, which data is public. The docs define intent; they are claims to verify against the code, not proof that the code complies.

  2. Gather implementation evidence. Read the code that enforces (or fails to enforce) each claim. Evidence is a cited file and line — the actual authorization check, the actual query filter, the actual sanitizer. "It's probably handled upstream" is not evidence; the code path is.

  3. Compare claim to code, one boundary at a time. For each documented rule, ask: does an enforcement point actually implement it, on the server, on every path? Distrust comments like "internal only," "admin only," or "validated elsewhere" — verify them in code.

  4. Classify each mismatch by whether it matters. A mismatch matters when exploiting it lets a real actor reach data, money, infrastructure, or another tenant's resources that they shouldn't. It does not matter when the only party affected is the actor, acting on their own data. Drop cosmetic drift; keep boundary-crossing drift.

  5. Avoid hand-wavy findings. Every finding names: the documented intent (quote the doc), the implemented reality (cite the code), the attacker and victim, and the concrete fix. If you cannot cite both sides of the gap, it is a question to investigate, not a finding to report.
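
Steps 4 and 5 can be sketched as a small triage rule. The Candidate record, its field names, and the three labels are hypothetical, not a schema defined by this skill. The point is only that a finding requires both sides of the gap to be cited and a real boundary to be crossed.

```python
from dataclasses import dataclass
from typing import Optional

# Boundaries named in step 4; crossing one of them makes a mismatch matter.
BOUNDARIES = {"data", "money", "infrastructure", "tenant"}


@dataclass
class Candidate:
    doc_quote: Optional[str]        # documented intent, quoted from the docs
    code_citation: Optional[str]    # file and line of the check, or of its absence
    attacker: Optional[str]
    victim: Optional[str]
    fix: Optional[str]
    boundary: Optional[str] = None  # which boundary the mismatch crosses, if any


def triage(c: Candidate) -> str:
    """Return 'question', 'drop', or 'finding'."""
    # Step 5: without both sides of the gap cited, it is only a question.
    if not c.doc_quote or not c.code_citation:
        return "question"
    # Step 4: drift that crosses no boundary is cosmetic.
    if c.boundary not in BOUNDARIES:
        return "drop"
    # Step 5: a reportable finding also names attacker, victim, and fix.
    if not (c.attacker and c.victim and c.fix):
        return "question"
    # Step 4: an actor affecting only themselves does not matter.
    if c.attacker == c.victim:
        return "drop"
    return "finding"
```

For example, a candidate quoting "Only admins may export" against a hypothetical citation such as `api/export.py:17`, with a distinct attacker and victim, a fix, and the "tenant" boundary, triages as a finding. The same candidate without a code citation is a question to investigate.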

What counts

  • Intent: a documented rule, boundary, scope, or public/private classification.
  • Implementation evidence: a cited enforcement point (or its provable absence) in the code.
  • A mismatch that matters: doc says one thing, code does another, and the difference crosses a trust, cost, data, or tenant boundary.

Notes

  • Documented-but-unenforced is a finding on its own — rank it by what crossing the gap exposes.
  • Undocumented-but-enforced is usually fine, but flag it: the docs are now stale, which weakens the next audit.
  • This method feeds the security and performance audits; it does not replace their sink-level analysis — it adds the intent axis they lack.
  • Never fabricate intent to manufacture a gap. If the docs are silent, say the docs are silent.
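
The first two notes amount to a two-way set difference. The sketch below assumes each rule has already been reduced to a comparable identifier, which is the hard, manual part of the audit. The identifiers and the function name are hypothetical.

```python
def compare_rules(documented: set[str], enforced: set[str]) -> dict[str, list[str]]:
    """Split rules by which side of the doc/code gap they appear on."""
    return {
        # Findings on their own; rank by what crossing the gap exposes.
        "documented_but_unenforced": sorted(documented - enforced),
        # Usually fine, but the docs are stale and should be updated.
        "undocumented_but_enforced": sorted(enforced - documented),
        "consistent": sorted(documented & enforced),
    }


result = compare_rules(
    documented={"export:admin-only", "cleanup:cron-only"},
    enforced={"export:admin-only", "upload:size-limit"},
)
print(result["documented_but_unenforced"])  # ['cleanup:cron-only']
print(result["undocumented_but_enforced"])  # ['upload:size-limit']
```

A rule that appears in neither set produces no output at all, which matches the last note: where the docs are silent, the comparison reports nothing rather than inventing a gap.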