mirror of
https://github.com/mukul975/Anthropic-Cybersecurity-Skills.git
synced 2026-06-12 06:04:56 +03:00
27c6414ca5
Complete skill folder anatomy across all cybersecurity skills: - scripts/agent.py: 80-150 line Python agents using real libraries (impacket, boto3, azure-mgmt-*, kubernetes, pefile, yara, scapy, shodan, stix2, etc.) - references/api-reference.md: real API documentation with method signatures - LICENSE: MIT license for all skill folders
# API Reference: PDF Malware Analysis Tools

## PDFiD - PDF Keyword Scanner

### Syntax

```bash
pdfid.py document.pdf
pdfid.py -n document.pdf    # Hide keywords with zero counts
pdfid.py -e document.pdf    # Extra data (entropy)
pdfid.py -f document.pdf    # Force scan (ignore header)
```

### Suspicious Keywords

| Keyword | Risk | Description |
|---------|------|-------------|
| `/JS` | HIGH | JavaScript code |
| `/JavaScript` | HIGH | JavaScript action |
| `/AA` | HIGH | Additional Actions (auto-execute) |
| `/OpenAction` | HIGH | Action on document open |
| `/Launch` | HIGH | Launch external application |
| `/EmbeddedFile` | MEDIUM | Embedded file object |
| `/AcroForm` | MEDIUM | Interactive form |
| `/JBIG2Decode` | HIGH | JBIG2 exploit vector (CVE-2009-0658) |
| `/RichMedia` | MEDIUM | Flash/multimedia content |
| `/XFA` | MEDIUM | XML Forms (script capable) |
| `/ObjStm` | LOW | Object streams (can hide objects) |

### Output Format

```
PDF Header: %PDF-1.7
obj                   45
endobj                45
stream                12
/JS                    2
/JavaScript            1
/OpenAction            1
/EmbeddedFile          0
```

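The counters in this output can be combined with the risk levels from the Suspicious Keywords table for quick triage. The following is a minimal sketch, assuming the text layout shown above (a name, whitespace, then an integer count); `parse_pdfid_output` and `flag_keywords` are hypothetical helpers, not part of PDFiD.

```python
import re

# Risk levels copied from the Suspicious Keywords table.
KEYWORD_RISK = {
    "/JS": "HIGH",
    "/JavaScript": "HIGH",
    "/AA": "HIGH",
    "/OpenAction": "HIGH",
    "/Launch": "HIGH",
    "/JBIG2Decode": "HIGH",
    "/EmbeddedFile": "MEDIUM",
    "/AcroForm": "MEDIUM",
    "/RichMedia": "MEDIUM",
    "/XFA": "MEDIUM",
    "/ObjStm": "LOW",
}

# Assumed line shape: a name, whitespace, then an integer count.
COUNT_LINE = re.compile(r"^\s*(\S+)\s+(\d+)")


def parse_pdfid_output(text):
    """Return {name: count} for every counter line in PDFiD text output."""
    counts = {}
    for line in text.splitlines():
        match = COUNT_LINE.match(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def flag_keywords(counts):
    """Return sorted (keyword, risk, count) tuples for keywords with a non-zero count."""
    return sorted(
        (name, KEYWORD_RISK[name], count)
        for name, count in counts.items()
        if name in KEYWORD_RISK and count > 0
    )


# Sample text taken from the Output Format block.
SAMPLE = """\
PDF Header: %PDF-1.7
obj                   45
endobj                45
stream                12
/JS                    2
/JavaScript            1
/OpenAction            1
/EmbeddedFile          0
"""

counts = parse_pdfid_output(SAMPLE)
flagged = flag_keywords(counts)
for name, risk, count in flagged:
    print(f"{risk:<6} {name} x{count}")
```

The header line is skipped because its second token is not a number, and `/EmbeddedFile` is dropped because its count is zero.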
## pdf-parser.py - PDF Object Parser

### Syntax

```bash
pdf-parser.py document.pdf                         # List all objects
pdf-parser.py -o 5 document.pdf                    # Show object 5
pdf-parser.py -s "/JS" document.pdf                # Search for keyword
pdf-parser.py -f document.pdf                      # Pass streams through their filters (e.g. FlateDecode)
pdf-parser.py -c document.pdf                      # Show content of objects without streams or without filters
pdf-parser.py -o 5 -f -d object5.bin document.pdf  # Decode the stream of object 5 and dump it to object5.bin
pdf-parser.py --object 5 --filter document.pdf     # Decompress the stream of object 5
```

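For a stream whose only filter is `/FlateDecode`, the decompression step performed by `--filter` amounts to a zlib inflate. The following is a minimal sketch, assuming the raw stream bytes have already been extracted; `inflate_stream` is a hypothetical helper, and cascaded filters or predictors are not handled.

```python
import zlib


def inflate_stream(raw):
    """Inflate a /FlateDecode stream body; return None if it is not valid zlib data."""
    try:
        return zlib.decompress(raw)
    except zlib.error:
        return None


# Simulate the bytes found between `stream` and `endstream` (illustrative payload).
javascript = b"app.alert('hello');"
compressed = zlib.compress(javascript)

print(inflate_stream(compressed))
print(inflate_stream(b"not zlib"))
```

Returning `None` instead of raising keeps a batch scan going when a stream is truncated or uses a different filter.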
## peepdf - Interactive PDF Analysis

### Syntax

```bash
peepdf -i document.pdf    # Interactive mode
peepdf -f document.pdf    # Force mode (ignore parsing errors)
peepdf -l document.pdf    # Loose mode (catch malformed objects)
```

### Interactive Commands

```
info                      # Document summary
tree                      # Object tree
object 5                  # Show object
stream 5                  # Show stream content
js_analyse object 5       # Analyze the JavaScript in object 5
extract js > output.js    # Extract JavaScript
```

## Known PDF Exploit CVEs

| CVE | Component | Description |
|-----|-----------|-------------|
| CVE-2009-0658 | JBIG2Decode | Buffer overflow in JBIG2 decoder |
| CVE-2009-0927 | Collab.getIcon | Stack buffer overflow in JavaScript method |
| CVE-2008-2992 | util.printf | Stack buffer overflow via crafted format string |
| CVE-2010-0188 | LibTIFF | TIFF image processing overflow |
| CVE-2013-0640 | XFA | XML Forms Architecture memory corruption |
| CVE-2018-4990 | JPEG2000 | Double-free triggered by a crafted embedded JPEG2000 image |

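Two rows of the table name JavaScript APIs, so JavaScript extracted from a sample can be checked against them directly. The following is a minimal sketch, assuming the script text is already deobfuscated; `JS_API_CVES` and `match_known_apis` are hypothetical names, and the sample string is made up for illustration.

```python
# JavaScript API names and the CVEs the table associates with them.
JS_API_CVES = {
    "Collab.getIcon": "CVE-2009-0927",
    "util.printf": "CVE-2008-2992",
}


def match_known_apis(js_source):
    """Return sorted (api, cve) pairs for table entries named in the JavaScript text."""
    return sorted(
        (api, cve) for api, cve in JS_API_CVES.items() if api in js_source
    )


# Made-up snippet standing in for JavaScript extracted from a PDF.
sample_js = "var n = 1.0; util.printf('%5000f', n);"
hits = match_known_apis(sample_js)
print(hits)
```

A plain substring match does not survive obfuscation such as string concatenation or `eval`, so a hit is a lead for manual review and a miss proves nothing.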
## YARA Rules for PDF Malware

### Example Rule

```yara
rule PDF_Suspicious {
    meta:
        description = "PDF with JavaScript and auto-execution"
    strings:
        $pdf = "%PDF-"
        $js = "/JS" nocase
        $openaction = "/OpenAction"
        $launch = "/Launch"
    condition:
        // Parentheses are required: `and` binds tighter than `or`
        $pdf at 0 and (($js and $openaction) or $launch)
}
```

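The intended logic of the rule (a PDF header at offset 0, plus either JavaScript with `/OpenAction` or a `/Launch` action) can be approximated in plain Python to sanity-check the condition without a YARA installation. This is a sketch of the boolean logic only, not a replacement for YARA; `looks_suspicious` and the sample byte strings are hypothetical.

```python
def looks_suspicious(data):
    """Approximate the rule: %PDF- at offset 0 and ((/JS and /OpenAction) or /Launch)."""
    has_header = data.startswith(b"%PDF-")
    has_js = b"/js" in data.lower()  # mirrors the `nocase` modifier on $js
    has_openaction = b"/OpenAction" in data
    has_launch = b"/Launch" in data
    return has_header and ((has_js and has_openaction) or has_launch)


# Hand-written fragments, not real PDF files.
benign = b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
auto_js = b"%PDF-1.7\n1 0 obj\n<< /OpenAction << /S /JavaScript /JS (app.alert(1)) >> >>\nendobj\n"
not_pdf = b"MZ\x90\x00 /Launch"

for name, sample in [("benign", benign), ("auto_js", auto_js), ("not_pdf", not_pdf)]:
    print(name, looks_suspicious(sample))
```

The `not_pdf` case shows why the grouping matters: without the outer parentheses, `/Launch` alone would match any file regardless of the header check.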
## Python PDF Libraries

### PyPDF2

```python
from PyPDF2 import PdfReader

reader = PdfReader("document.pdf")

# Number of pages in the document
print(len(reader.pages))

# Extract the text layer of each page
for page in reader.pages:
    print(page.extract_text())
```

### pikepdf
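
```python
import pikepdf

with pikepdf.open("document.pdf") as pdf:
    # pdf.objects yields the objects themselves, not object numbers
    for obj in pdf.objects:
        # Only dictionaries and streams can carry a /JS key
        if isinstance(obj, (pikepdf.Dictionary, pikepdf.Stream)) and "/JS" in obj:
            obj_num, generation = obj.objgen
            print(f"JavaScript in object {obj_num}")
```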