Anthropic-Cybersecurity-Skills/.github/workflows/validate-skills.yml
Homan Ansari 5f5edbb30b Fix validator nested-name misparse, unify with CI, add authorized-use banner
Issues found in review:

1. tools/validate-skill.py: parse_frontmatter operated on the stripped line, so
   an indented nested `name:` (under framework-mapping lists, e.g.
   `name: 'Create Fake Materials: Fake Website'`) clobbered the skill's
   top-level `name`. That produced 94 spurious "invalid kebab-case name"
   failures out of 762. Now indented (non-list) key lines are ignored, so only
   top-level keys define frontmatter fields. Result: 762/762 pass.

2. Two divergent validators: the CI workflow had its own weaker inline parser
   (no subdomain/tag/description checks) requiring a different field set than
   tools/validate-skill.py. CI now delegates to tools/validate-skill.py --all
   (single source of truth); REQUIRED_FIELDS aligned to include
   version/author/license. The duplicate-name and stats steps are unchanged.

3. README: added an explicit authorized-&-lawful-use disclaimer next to the
   existing "not affiliated with Anthropic" note, since the library ships
   offensive/dual-use techniques.

No skill content changed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 18:09:19 +02:00
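
The fix described in item 1 can be illustrated with a minimal sketch. This is not the code in tools/validate-skill.py, which is not shown here; the function name `parse_top_level_frontmatter` and the sample field `mappings` are hypothetical. It assumes frontmatter is delimited by `---` lines and that only unindented `key: value` lines should define fields.

```python
def parse_top_level_frontmatter(text: str) -> dict:
    """Hypothetical sketch: collect only unindented `key: value` lines."""
    fields = {}
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return fields
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Skip blanks, comments, list items, and anything indented (nested keys).
        if not line.strip() or line[0] in " \t#-":
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip("'\"")
    return fields


# Hypothetical sample: the nested `name:` mirrors the one quoted in the commit message.
sample = """---
name: example-skill
subdomain: example-subdomain
mappings:
  - technique: example
    name: 'Create Fake Materials: Fake Website'
---
body
"""

fields = parse_top_level_frontmatter(sample)
print(fields["name"])
```

Testing the raw line rather than the stripped line is the point: once leading whitespace is removed, a nested `name:` is indistinguishable from the top-level one and overwrites it.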


name: Validate SKILL.md files

on:
  push:
    paths:
      - 'skills/**'
  pull_request:
    paths:
      - 'skills/**'

jobs:
  validate:
    runs-on: ubuntu-latest
    name: Validate SKILL.md frontmatter
    steps:
      - uses: actions/checkout@v4

      # Single source of truth: tools/validate-skill.py validates required
      # frontmatter fields, kebab-case name, description length, subdomain, and
      # tag count. (Previously this step duplicated a weaker inline parser.)
      - name: Validate SKILL.md frontmatter
        run: python3 tools/validate-skill.py --all

      - name: Check for duplicate skill names
        run: |
          python3 << 'EOF'
          import os
          import re
          from collections import Counter

          names = []
          for root, dirs, files in os.walk('skills'):
              for file in files:
                  if file == 'SKILL.md':
                      path = os.path.join(root, file)
                      with open(path, 'r', encoding='utf-8') as f:
                          content = f.read()
                      fm_match = re.match(r'^---\n(.*?)\n---', content, re.DOTALL)
                      if fm_match:
                          name_match = re.search(r'^name:\s*(.+)$', fm_match.group(1), re.MULTILINE)
                          if name_match:
                              names.append(name_match.group(1).strip().strip('"'))

          duplicates = [name for name, count in Counter(names).items() if count > 1]
          if duplicates:
              print(f"❌ Duplicate skill names found: {duplicates}")
              exit(1)
          print(f"✅ No duplicate names in {len(names)} skills")
          EOF

      - name: Report skill counts
        if: always()
        run: |
          echo "## Skill Database Stats" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          python3 << 'EOF'
          import os
          import re
          from collections import Counter

          subdomain_counts = Counter()
          total = 0
          for root, dirs, files in os.walk('skills'):
              for file in files:
                  if file == 'SKILL.md':
                      total += 1
                      path = os.path.join(root, file)
                      with open(path, 'r', encoding='utf-8') as f:
                          content = f.read()
                      fm_match = re.match(r'^---\n(.*?)\n---', content, re.DOTALL)
                      if fm_match:
                          sd_match = re.search(r'^subdomain:\s*(.+)$', fm_match.group(1), re.MULTILINE)
                          if sd_match:
                              subdomain_counts[sd_match.group(1).strip()] += 1

          print(f"**Total Skills: {total}**")
          print("")
          print("| Subdomain | Count |")
          print("|-----------|-------|")
          for sd, count in sorted(subdomain_counts.items(), key=lambda x: -x[1]):
              print(f"| {sd} | {count} |")
          EOF
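
The duplicate-name step can be exercised locally without walking the `skills` directory. The sketch below reuses the two regular expressions from the workflow but operates on in-memory strings; the helper names `extract_name` and `find_duplicates` and the sample documents are hypothetical, introduced only for illustration. It also shows why this step is unlikely to be affected by nested `name:` keys: with `re.MULTILINE`, `^name:` matches only at the start of a line, so an indented key is not picked up.

```python
import re
from collections import Counter

# Same patterns as the workflow step, compiled once.
FRONTMATTER_RE = re.compile(r"^---\n(.*?)\n---", re.DOTALL)
NAME_RE = re.compile(r"^name:\s*(.+)$", re.MULTILINE)


def extract_name(content):
    """Return the top-level `name` from a SKILL.md string, or None."""
    fm = FRONTMATTER_RE.match(content)
    if not fm:
        return None
    m = NAME_RE.search(fm.group(1))
    return m.group(1).strip().strip('"') if m else None


def find_duplicates(documents):
    """Return the sorted list of names that occur in more than one document."""
    names = [n for n in map(extract_name, documents) if n is not None]
    return sorted(name for name, count in Counter(names).items() if count > 1)


# Hypothetical inputs: two skills share a name; the third only nests it.
docs = [
    "---\nname: alpha-skill\nsubdomain: x\n---\nbody",
    '---\nname: "alpha-skill"\n---\nbody',
    "---\nname: beta-skill\nrefs:\n  - name: alpha-skill\n---\nbody",
]
print(find_duplicates(docs))
```

Note that, like the workflow step, this strips only double quotes from the value, so a single-quoted name would be compared with its quotes intact.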