Fix validator nested-name misparse, unify with CI, add authorized-use banner

Issues found in review:

1. tools/validate-skill.py: parse_frontmatter operated on the stripped line, so an indented nested `name:` (under framework-mapping lists, e.g. `name: 'Create Fake Materials: Fake Website'`) clobbered the skill's top-level `name`. That produced 94 spurious "invalid kebab-case name" failures out of 762. Now indented (non-list) key lines are ignored, so only top-level keys define frontmatter fields. Result: 762/762 pass.

2. Two divergent validators: the CI workflow had its own weaker inline parser (no subdomain/tag/description checks) requiring a different field set than tools/validate-skill.py. CI now delegates to tools/validate-skill.py --all (single source of truth); REQUIRED_FIELDS aligned to include version/author/license. The duplicate-name and stats steps are unchanged.

3. README: added an explicit authorized-&-lawful-use disclaimer next to the existing "not affiliated with Anthropic" note, since the library ships offensive/dual-use techniques.

No skill content changed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 19:54:37 +03:00 · 2026-06-22 18:09:19 +02:00
parent 13a1c4afd9
commit 5f5edbb30b
3 changed files with 20 additions and 53 deletions
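The nested-name misparse described in the commit message can be reproduced with a small sketch. This is not the code in tools/validate-skill.py: `parse_top_level`, `is_kebab`, the `skip_indented` flag, and the sample frontmatter (including the `EXAMPLE-1` id) are hypothetical names chosen for illustration. It only assumes that the parser matches `key: value` on the stripped line, which is what the message says the old behaviour was.

```python
import re

# Hypothetical frontmatter: a top-level `name` plus a nested `name` inside a
# framework-mapping list item, as described in the commit message.
SAMPLE = """\
name: create-fake-website
description: Example skill
frameworks:
  - id: EXAMPLE-1
    name: 'Create Fake Materials: Fake Website'
"""

KEY_RE = re.compile(r"^(\w[\w_-]*):\s*(.*)$")


def parse_top_level(text, skip_indented):
    """Collect `key: value` pairs, optionally ignoring indented lines."""
    data = {}
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blanks and list items; this sketch does not model lists.
        if not stripped or stripped.startswith("- "):
            continue
        # The fix: an indented line belongs to a nested structure.
        if skip_indented and line[:1].isspace():
            continue
        m = KEY_RE.match(stripped)
        if m:
            data[m.group(1)] = m.group(2).strip().strip("'\"")
    return data


def is_kebab(name):
    """Same character class the old inline CI check used for names."""
    return re.fullmatch(r"[a-z0-9-]+", name) is not None


buggy = parse_top_level(SAMPLE, skip_indented=False)
fixed = parse_top_level(SAMPLE, skip_indented=True)

print(buggy["name"], is_kebab(buggy["name"]))
print(fixed["name"], is_kebab(fixed["name"]))
```

Without the indentation check, the later nested `name:` overwrites the earlier top-level one and the kebab-case test fails on a value the skill author never set as the skill name. With the check, only column-0 keys are recorded.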
@@ -15,57 +15,11 @@ jobs:
    steps:
      - uses: actions/checkout@v4
-      - name: Validate SKILL.md frontmatter with Python
-        run: |
-          python3 << 'EOF'
-          import os
-          import re
-          import sys
-          REQUIRED_FIELDS = ['name', 'description', 'domain', 'subdomain', 'tags', 'version', 'author', 'license']
-          errors = []
-          checked = 0
-          for root, dirs, files in os.walk('skills'):
-              for file in files:
-                  if file == 'SKILL.md':
-                      path = os.path.join(root, file)
-                      checked += 1
-                      with open(path, 'r', encoding='utf-8') as f:
-                          content = f.read()
-                      # Check frontmatter exists
-                      fm_match = re.match(r'^---\n(.*?)\n---', content, re.DOTALL)
-                      if not fm_match:
-                          errors.append(f"{path}: Missing YAML frontmatter")
-                          continue
-                      fm = fm_match.group(1)
-                      # Check required fields
-                      for field in REQUIRED_FIELDS:
-                          if not re.search(rf'^{field}:', fm, re.MULTILINE):
-                              errors.append(f"{path}: Missing required field '{field}'")
-                      # Check name format (kebab-case)
-                      name_match = re.search(r'^name:\s*(.+)$', fm, re.MULTILINE)
-                      if name_match:
-                          name = name_match.group(1).strip().strip('"')
-                          if not re.match(r'^[a-z0-9-]+$', name):
-                              errors.append(f"{path}: Name '{name}' must be kebab-case")
-                          if len(name) > 64:
-                              errors.append(f"{path}: Name '{name}' exceeds 64 characters")
-          print(f"Checked {checked} SKILL.md files")
-          if errors:
-              print(f"\n{len(errors)} validation error(s):")
-              for e in errors:
-                  print(f"  ❌ {e}")
-              sys.exit(1)
-          else:
-              print(f"✅ All {checked} skills valid")
-          EOF
+      # Single source of truth: tools/validate-skill.py validates required
+      # frontmatter fields, kebab-case name, description length, subdomain, and
+      # tag count. (Previously this step duplicated a weaker inline parser.)
+      - name: Validate SKILL.md frontmatter
+        run: python3 tools/validate-skill.py --all
      - name: Check for duplicate skill names
        run: |
@@ -32,7 +32,9 @@
 ---
-> ⚠️ **Community Project** — This is an independent, community-created project. Not affiliated with Anthropic PBC. 
+> ⚠️ **Community Project** — This is an independent, community-created project. Not affiliated with Anthropic PBC.
+>
+> 🔐 **Authorized & lawful use only.** This library includes offensive and dual-use techniques (e.g. red-team C2, phishing simulation, exploitation) intended for **authorized penetration testing, security research, defense, and education**. Only use them against systems you own or have **explicit written permission** to test, and comply with all applicable laws and rules of engagement. You are solely responsible for how you use these skills. See [SECURITY.md](SECURITY.md) and [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md).
 ## Give any AI agent the security skills of a senior analyst
@@ -10,7 +10,10 @@ import re
 import sys
 import glob
-REQUIRED_FIELDS = ["name", "description", "domain", "subdomain", "tags"]
+# Kept in sync with the CI workflow (.github/workflows/validate-skills.yml),
+# which now delegates to this script so there is a single source of truth.
+REQUIRED_FIELDS = ["name", "description", "domain", "subdomain", "tags",
+                   "version", "author", "license"]
 # Canonical subdomain → set of accepted aliases (including canonical itself).
 # When a skill uses an alias, the validator accepts it but the canonical form
@@ -132,6 +135,14 @@ def parse_frontmatter(text):
            data[current_key] = list(list_values)  # copy so future mutations don't leak
            continue
+        # Only TOP-LEVEL keys (column 0) define frontmatter fields. An indented
+        # ``key: value`` line belongs to a nested structure (e.g. a framework
+        # mapping object that has its own ``name:``/``id:``) and must NOT be
+        # treated as a top-level field — otherwise a nested ``name:`` clobbers
+        # the skill's real ``name``.
+        if line[:1].isspace():
+            continue
        # Handle inline list: tags: [a, b, c]
        m = re.match(r"^(\w[\w_-]*):\s*\[(.+)\]\s*$", stripped)
        if m: