Anthropic-Cybersecurity-Skills/skills/detecting-indirect-prompt-injection/references/api-reference.md
mukul975 8cae0648ec Add 55 new skills across 3 new domains + 6 undercovered areas (762 -> 817)
2026-06-22 19:08:16 +02:00

API Reference — Indirect Prompt Injection Detection

LLM Guard

Install: `pip install llm-guard`

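| API | Description |
| --- | --- |
| `from llm_guard.input_scanners import PromptInjection` | Import the injection scanner |
| `from llm_guard.input_scanners.prompt_injection import MatchType` | Import the match-type enum used by the constructor |
| `PromptInjection(threshold=0.5, match_type=MatchType.FULL)` | Construct the scanner (`FULL` or `SENTENCE` match) |
| `scanner.scan(text)` | Returns `(sanitized_text, is_valid, risk_score)` |
| `from llm_guard import scan_prompt` | Run multiple scanners over a prompt |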

`is_valid == False` indicates that the scanner flagged the text as a likely injection.
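The scanner contract above can be wrapped in a small helper. This is a minimal sketch, not the library's prescribed usage: `check_untrusted_text` and `build_scanner` are hypothetical names introduced here, and the `MatchType` import path is taken from the LLM Guard package layout. The real scanner is built lazily because constructing it loads a model.

```python
from typing import Protocol, Tuple


class Scanner(Protocol):
    """Hypothetical type: anything exposing the scan() contract above."""

    def scan(self, prompt: str) -> Tuple[str, bool, float]: ...


def check_untrusted_text(scanner: Scanner, text: str) -> dict:
    """Run one scanner over untrusted content and summarize the verdict."""
    sanitized, is_valid, risk_score = scanner.scan(text)
    return {
        "sanitized": sanitized,
        "injection_detected": not is_valid,  # is_valid == False means flagged
        "risk_score": risk_score,
    }


def build_scanner(threshold: float = 0.5):
    """Construct the LLM Guard scanner (imported lazily; loads a model)."""
    from llm_guard.input_scanners import PromptInjection
    from llm_guard.input_scanners.prompt_injection import MatchType

    return PromptInjection(threshold=threshold, match_type=MatchType.FULL)
```

With `llm-guard` installed, a call such as `check_untrusted_text(build_scanner(), retrieved_page_text)` would scan one piece of retrieved content before it reaches the model; `retrieved_page_text` is a placeholder for your own input.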

Transformers detector models

Install: `pip install transformers torch`

| API / Model | Description |
| --- | --- |
| `pipeline("text-classification", model=...)` | Load a classifier pipeline |
| `protectai/deberta-v3-base-prompt-injection-v2` | Open prompt-injection classifier (labels: `SAFE` / `INJECTION`) |
| `meta-llama/Llama-Prompt-Guard-2-86M` | Meta jailbreak/injection classifier (gated license) |
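A classifier pipeline returns one `{"label": ..., "score": ...}` dict per input string, so the verdict logic can be kept separate from model loading. The sketch below assumes the ProtectAI model and its `SAFE` / `INJECTION` labels from the table; `flag_injection`, `load_classifier`, and the 0.5 threshold are illustrative choices, not part of either library.

```python
from typing import Callable, List


def flag_injection(
    classifier: Callable[[str], List[dict]],
    text: str,
    threshold: float = 0.5,  # assumed cutoff; tune on your own data
) -> bool:
    """Return True when the top label is INJECTION with score >= threshold."""
    top = classifier(text)[0]  # one result dict per input string
    return top["label"] == "INJECTION" and top["score"] >= threshold


def load_classifier():
    """Load the ProtectAI detector (imported lazily; downloads the model)."""
    from transformers import pipeline

    return pipeline(
        "text-classification",
        model="protectai/deberta-v3-base-prompt-injection-v2",
        truncation=True,  # long documents exceed the model's input window
        max_length=512,
    )
```

Keeping `flag_injection` independent of the model makes it possible to swap in another detector, provided its label names are mapped accordingly.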

Content extraction

| API | Description |
| --- | --- |
| `BeautifulSoup(html, "html.parser")` | Parse HTML |
| `soup.find_all(string=lambda t: isinstance(t, Comment))` | Extract HTML comments (`Comment` is imported from `bs4`) |
| `pypdf.PdfReader(path).pages[i].extract_text()` | Extract PDF text |
| `pytesseract.image_to_string(Image.open(path))` | OCR text from an image |
| `PIL.Image.open(path).getexif()` | Read EXIF metadata (public API; `_getexif()` is a private method) |
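HTML comments are invisible to a human reader but still reach a model that ingests raw markup, which is why they are extracted for scanning. The sketch below swaps in the standard library's `html.parser.HTMLParser` as a dependency-free stand-in for the BeautifulSoup call in the table; `CommentCollector` and `extract_comments` are hypothetical names.

```python
from html.parser import HTMLParser
from typing import List


class CommentCollector(HTMLParser):
    """Collect the text of every non-empty HTML comment."""

    def __init__(self) -> None:
        super().__init__()
        self.comments: List[str] = []

    def handle_comment(self, data: str) -> None:
        text = data.strip()
        if text:
            self.comments.append(text)


def extract_comments(html: str) -> List[str]:
    """Return comment bodies in document order."""
    parser = CommentCollector()
    parser.feed(html)
    parser.close()
    return parser.comments
```

Each returned string can then be passed to whichever injection scanner is in use, alongside the visible page text.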

Normalization helpers

| Technique | Method |
| --- | --- |
| Strip zero-width chars | `str.translate` mapping U+200B, U+200C, U+200D, U+2060, and U+FEFF to `None` |
| Strip Unicode tag chars | Drop characters whose `ord` is in the range 0xE0000-0xE007F |
| Canonicalize | `unicodedata.normalize("NFKC", text)` |
| Decode Base64 | `base64.b64decode(token)` |
| Decode ROT13 | `codecs.decode(text, "rot_13")` |
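The helpers above compose into a short normalization pass that runs before classification. This is a minimal standard-library sketch: the function names are hypothetical, and the zero-width set is limited to U+200B, U+200C, U+200D, U+2060, and U+FEFF rather than a contiguous range, since a range up to U+FEFF would also delete ordinary text.

```python
import base64
import codecs
import unicodedata
from typing import Optional

# Individual zero-width code points mapped to None so translate() deletes them.
ZERO_WIDTH = dict.fromkeys([0x200B, 0x200C, 0x200D, 0x2060, 0xFEFF])


def strip_invisible(text: str) -> str:
    """Remove zero-width characters and Unicode tag characters."""
    text = text.translate(ZERO_WIDTH)
    return "".join(ch for ch in text if not 0xE0000 <= ord(ch) <= 0xE007F)


def canonicalize(text: str) -> str:
    """Strip invisible characters, then fold compatibility forms with NFKC."""
    return unicodedata.normalize("NFKC", strip_invisible(text))


def try_base64(token: str) -> Optional[str]:
    """Decode a candidate Base64 token; return None if it is not valid UTF-8 Base64."""
    try:
        return base64.b64decode(token, validate=True).decode("utf-8")
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        return None


def rot13(text: str) -> str:
    """Decode (or encode) ROT13; the transform is its own inverse."""
    return codecs.decode(text, "rot_13")
```

A reasonable order is to canonicalize first, then attempt the decoders on suspicious tokens and scan both the original and decoded forms.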

External References