Anthropic-Cybersecurity-Skills/skills/assessing-vector-and-embedding-weaknesses/references/api-reference.md
mukul975 8cae0648ec Add 55 new skills across 3 new domains + 6 undercovered areas (762 -> 817)
Demand-driven expansion targeting the fastest-growing 2025-2026 threat and
skills categories (ISC2/WEF/CrowdStrike/Mandiant signals):

- AI Security (NEW domain, 12 skills): LLM red-teaming with garak/PyRIT,
  prompt injection (direct/indirect/RAG), MCP tool-poisoning, agentic tool
  invocation, guardrails, model/data poisoning, system-prompt leakage,
  embedding/vector weaknesses, model extraction, continuous red-teaming
- Supply Chain Security (NEW domain, 5 skills): SBOMs, dependency confusion,
  malicious-npm triage, typosquatting, SLSA/Sigstore provenance
- Hardware & Firmware Security (NEW domain, 4 skills): CHIPSEC/UEFI audit,
  Secure Boot bypass, TPM measured-boot attestation, ESP bootkit hunting
- Identity (10): Entra ID/ROADtools, GraphRunner, AADInternals, ADCS/Certipy,
  shadow credentials, coercion, BloodHound CE, device-code phishing, SSO abuse
- Cloud-native (8): Stratus, Pacu, CloudFox, container escape, K8s RBAC,
  Falco, Trivy, kube-bench
- Offensive C2 (6): Sliver, Havoc, NetExec, DPAPI, NTLM relay ESC8, redirectors
- DFIR (6): Hayabusa, Chainsaw, KAPE, Velociraptor, EZ Tools, Plaso
- Backfill (4): OpenCTI, MISP, honeytokens, post-quantum crypto migration

Each skill follows the repo taxonomy (SKILL.md + references/{standards,api-reference}.md
+ scripts/agent.py + LICENSE), with researched real tool commands (no placeholders),
complete frontmatter, and ATT&CK/ATLAS + NIST CSF mappings. Updates README domain
table, skill count, and index.json.
2026-06-22 19:08:16 +02:00

# API and Command Reference
## sentence-transformers (embedding generation)
| Call | Purpose |
|------|---------|
| `SentenceTransformer("all-MiniLM-L6-v2")` | Load an embedding model (384-dim) |
| `model.encode(texts)` | Return a NumPy array of embeddings, one row per input text (`texts` is a list of strings) |
| `model.encode(text, normalize_embeddings=True)` | L2-normalized vectors (for cosine) |
## scikit-learn similarity
| Call | Purpose |
|------|---------|
| `cosine_similarity(a, b)` | Pairwise cosine similarity matrix |
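
To make the matrix shape concrete, here is a minimal pure-Python sketch of what a pairwise cosine-similarity call computes. The helper names `l2_normalize`, `dot`, and `pairwise_cosine` are hypothetical and exist only for illustration; this is not scikit-learn's implementation, and the vectors are toy values. It also shows why L2-normalized vectors let a plain dot product stand in for cosine similarity.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0.0 else [x / norm for x in vec]

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def pairwise_cosine(a, b):
    """Return a len(a) x len(b) matrix; entry [i][j] is cos(a[i], b[j])."""
    a_unit = [l2_normalize(row) for row in a]
    b_unit = [l2_normalize(row) for row in b]
    # After normalization, cosine similarity reduces to a dot product.
    return [[dot(u, v) for v in b_unit] for u in a_unit]

# Toy 2-D vectors (illustrative only): 2 queries against 3 corpus rows.
queries = [[1.0, 0.0], [1.0, 1.0]]
corpus = [[2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]]
sim = pairwise_cosine(queries, corpus)  # 2 rows (queries) x 3 columns (corpus)
```

Row `i` of `sim` holds the scores of query `i` against every corpus vector, which is the layout to expect when ranking neighbours from a similarity matrix.
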
## Qdrant client (qdrant-client)
| Call | Purpose |
|------|---------|
| `QdrantClient(url="http://localhost:6333")` | Connect |
| `client.get_collection(name)` | Inspect vector size + distance metric |
| `client.count(name)` | Corpus size |
| `client.search(collection_name, query_vector, limit, query_filter)` | k-NN search with optional filter (newer qdrant-client releases deprecate this in favor of `client.query_points`) |
| `client.upsert(name, points=[PointStruct(id=..., vector=..., payload=...)])` | Insert/update points (`PointStruct` takes keyword arguments) |
| `Filter(must=[FieldCondition(key=..., match=MatchValue(value=...))])` | Metadata filter (tenant isolation) |
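
The filter row matters most for tenant isolation: the condition has to apply to every query, not be left to the caller. The sketch below is a pure-Python stand-in, not qdrant-client code. It mimics a k-NN search with a `must` match filter over an in-memory list. The function `search_with_filter`, the `tenant_id` payload key, and the sample points are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def search_with_filter(points, query_vector, limit, must=None):
    """Brute-force k-NN over `points`, keeping only payloads that match every
    key/value pair in `must` (an AND filter, like a list of match conditions)."""
    must = must or {}
    candidates = [
        p for p in points
        if all(p["payload"].get(k) == v for k, v in must.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda p: cosine(p["vector"], query_vector),
        reverse=True,
    )
    return ranked[:limit]

# Hypothetical corpus: two tenants sharing one collection.
points = [
    {"id": 1, "vector": [1.0, 0.0], "payload": {"tenant_id": "acme"}},
    {"id": 2, "vector": [0.9, 0.1], "payload": {"tenant_id": "globex"}},
    {"id": 3, "vector": [0.0, 1.0], "payload": {"tenant_id": "acme"}},
]

unfiltered = search_with_filter(points, [1.0, 0.0], limit=2)
filtered = search_with_filter(points, [1.0, 0.0], limit=2, must={"tenant_id": "acme"})
```

Without the filter, the nearest foreign-tenant point (id 2) is returned to the caller; with it, only `acme` rows come back, even though one of them is a much weaker match.
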
## Chroma (chromadb)
| Call | Purpose |
|------|---------|
| `chromadb.Client()` / `chromadb.PersistentClient(path=...)` | Connect (in-memory / on-disk) |
| `collection.query(query_embeddings=[...], n_results=k, where={...})` | k-NN with metadata filter |
| `collection.add(ids, embeddings, metadatas, documents)` | Insert |
## Pinecone (pinecone-client)
| Call | Purpose |
|------|---------|
| `Pinecone(api_key=...)` | Connect |
| `index.query(vector=..., top_k=k, namespace="tenant", filter={...})` | k-NN; namespace = tenant boundary |
| `index.upsert(vectors=[(id, vec, meta)], namespace=...)` | Insert |
## Assessment metrics
| Metric | Meaning |
|--------|---------|
| Inversion cosine | Similarity between reconstructed candidate and target vector; high = recoverable. |
| Membership delta | top-1 score(in-corpus query) - top-1 score(control query); large positive = membership leak. |
| Poison dominance | Fraction of unrelated queries returning the poison chunk in top_k. |
| Cross-tenant count | Number of foreign-tenant rows returned to a tenant query (should be 0). |
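
As a rough illustration of how three of these metrics could be computed from search results, here is a minimal sketch. The function names, the input shapes (lists of result IDs, payload dicts with a `tenant_id` key), and every sample value are assumptions made for the example; they are not part of any library listed here. Membership delta is taken as a difference of the two top-1 scores, consistent with "large positive" indicating a leak.

```python
def membership_delta(in_corpus_top1, control_top1):
    """Top-1 score for a suspected in-corpus query minus top-1 for a control."""
    return in_corpus_top1 - control_top1

def poison_dominance(results_per_query, poison_id):
    """Fraction of queries whose top_k result IDs include the poison chunk."""
    if not results_per_query:
        return 0.0
    hits = sum(1 for ids in results_per_query if poison_id in ids)
    return hits / len(results_per_query)

def cross_tenant_count(rows, tenant_id, key="tenant_id"):
    """Number of returned rows whose payload belongs to a different tenant."""
    return sum(1 for row in rows if row.get(key) != tenant_id)

# Toy inputs (made up for illustration, not measured results).
delta = membership_delta(0.97, 0.61)
dominance = poison_dominance(
    [["p1", "d4"], ["d2", "d9"], ["d7", "p1"], ["p1", "d3"]],  # top_k IDs per query
    poison_id="p1",
)
leaks = cross_tenant_count(
    [{"tenant_id": "acme"}, {"tenant_id": "globex"}, {"tenant_id": "acme"}],
    tenant_id="acme",
)
```

With these toy inputs, the poison chunk appears in three of four result lists and one foreign-tenant row leaks. Any pass/fail thresholds for delta or dominance would have to be chosen per assessment.
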
## vec2text (research baseline)
| Call | Purpose |
|------|---------|
| `vec2text.load_pretrained_corrector("gtr-base")` | Load inversion corrector for compatible embedder |
| `vec2text.invert_embeddings(embeddings, corrector)` | Reconstruct text from embeddings |