# API Reference: Gitleaks Secret Scanning
## Libraries Used
| Library | Purpose |
|---------|---------|
| `subprocess` | Execute gitleaks CLI commands |
| `json` | Parse gitleaks JSON report output |
| `pathlib` | Handle repository and report file paths |
| `os` | Read `GITLEAKS_CONFIG` environment variable |
## Installation
```bash
# Install gitleaks binary
# macOS
brew install gitleaks
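# Linux: release assets are versioned tarballs, so set VERSION to a tag from the releases page
VERSION="<version>"  # placeholder: the release number without the leading "v"
curl -sSfL "https://github.com/gitleaks/gitleaks/releases/download/v${VERSION}/gitleaks_${VERSION}_linux_x64.tar.gz" | tar -xz gitleaks
chmod +x gitleaks && sudo mv gitleaks /usr/local/bin/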
# Docker
docker pull ghcr.io/gitleaks/gitleaks:latest
```
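After installing, it can help to confirm from Python that the binary is actually on `PATH` before a scan is attempted. The following is a minimal sketch; the helper name `gitleaks_version` is hypothetical, and it assumes the installed binary supports the `gitleaks version` subcommand.

```python
import shutil
import subprocess


def gitleaks_version(binary="gitleaks"):
    """Return the version string printed by the binary, or None if it is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    # `gitleaks version` prints the version and exits
    result = subprocess.run(
        [path, "version"], capture_output=True, text=True, timeout=30
    )
    return result.stdout.strip() or None
```

A `None` result indicates that the installation step above did not put the binary where the current environment can find it.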
## CLI Commands
### Scan a Git Repository
```bash
# The `git` command takes the repository path as a positional argument
gitleaks git --report-format=json --report-path=results.json /path/to/repo
```
### Scan a Directory (Non-Git)
```bash
# The `dir` command takes the directory or file path as a positional argument
gitleaks dir --report-format=json --report-path=results.json /path/to/code
```
### Scan from stdin
```bash
echo "aws_access_key_id=AKIAIOSFODNN7EXAMPLE" | gitleaks stdin
```
### Key CLI Flags
| Flag | Description |
|------|-------------|
| `--source`, `-s` | Path to scan for the legacy `detect` command; the `git` and `dir` commands take the path as a positional argument |
| `--config`, `-c` | Path to custom gitleaks.toml config |
| `--report-format`, `-f` | Output format: `json`, `csv`, `junit`, `sarif` |
| `--report-path`, `-r` | Path to write the report file |
| `--baseline-path` | Ignore known findings from baseline file |
| `--exit-code` | Exit code when leaks found (default: 1) |
| `--redact` | Redact secrets in output (percent: 0-100) |
| `--verbose`, `-v` | Show verbose scan output |
| `--no-git` | Legacy `detect` flag: treat source as a plain directory (use `gitleaks dir` instead) |
| `--log-level` | Log level: trace, debug, info, warn, error |
| `--max-target-megabytes` | Skip files larger than this size |
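
When gitleaks is driven from a script, the reporting flags in the table can be assembled as a list rather than a shell string. This is a minimal sketch; `build_report_flags` and its parameters are hypothetical names, and it covers only the flags listed above.

```python
def build_report_flags(report_path, report_format="json", config=None,
                       baseline=None, redact=False, exit_code=None):
    """Return a list of gitleaks reporting flags suitable for subprocess.run()."""
    allowed = {"json", "csv", "junit", "sarif"}
    if report_format not in allowed:
        raise ValueError(f"unsupported report format: {report_format!r}")
    flags = ["--report-format", report_format, "--report-path", str(report_path)]
    if config:
        flags += ["--config", str(config)]
    if baseline:
        flags += ["--baseline-path", str(baseline)]
    if redact:
        flags.append("--redact")  # redact secrets in logs and stdout
    if exit_code is not None:
        flags += ["--exit-code", str(exit_code)]
    return flags
```

The returned list is meant to be appended to the subcommand, for example `["gitleaks", "git", *build_report_flags("results.json")]`, which avoids shell quoting issues.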
## Custom Configuration (.gitleaks.toml)
```toml
title = "Custom Gitleaks Config"
[extend]
useDefault = true # Extend the default ruleset
[[rules]]
id = "custom-internal-token"
description = "Internal API token pattern"
regex = '''(?i)internal[_-]?token\s*[:=]\s*['"]?([a-zA-Z0-9]{32,})'''
tags = ["internal", "token"]
keywords = ["internal_token", "internal-token"]
[[rules]]
id = "custom-db-password"
description = "Database password in config"
regex = '''(?i)(db|database|mysql|postgres)[_-]?pass(word)?\s*[:=]\s*['"]?[^\s'"]{8,}'''
tags = ["database", "password"]
[rules.allowlist]
paths = ['''test/.*''', '''mock/.*''']
regexTarget = "line"
regexes = ['''(?i)example|placeholder|changeme|test''']
# Global allowlist: applies to every rule
[allowlist]
description = "Global allowlist"
paths = ['''vendor/.*''']
commits = ["abc123def456"]
```
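Custom rules are easier to trust once their patterns have been tried against sample lines. The sketch below checks the two regexes above with Python's `re` module. Gitleaks itself uses Go's regular expression engine, so this is only an approximation that assumes the patterns behave the same in both; the helper `would_flag` is hypothetical, and it models only the line-level `regexes` allowlist on the database rule, not the `paths` entries.

```python
import re

# Patterns copied from the custom rules above
INTERNAL_TOKEN = re.compile(
    r"""(?i)internal[_-]?token\s*[:=]\s*['"]?([a-zA-Z0-9]{32,})"""
)
DB_PASSWORD = re.compile(
    r"""(?i)(db|database|mysql|postgres)[_-]?pass(word)?\s*[:=]\s*['"]?[^\s'"]{8,}"""
)
# Line-level allowlist attached to the custom-db-password rule
DB_ALLOW = re.compile(r"(?i)example|placeholder|changeme|test")


def would_flag(line):
    """Return the ID of the first custom rule that matches the line, or None."""
    if INTERNAL_TOKEN.search(line):
        return "custom-internal-token"
    if DB_PASSWORD.search(line) and not DB_ALLOW.search(line):
        return "custom-db-password"
    return None
```

For instance, `would_flag("db_password=changeme123")` returns `None` because the allowlist pattern matches the line, while a line with a non-placeholder value is reported under `custom-db-password`.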
## Python Integration
### Run Gitleaks and Parse Results
```python
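import json
import os
import subprocess
import tempfile
from pathlib import Path


def scan_repository(repo_path, config_path=None):
    """Run `gitleaks git` on repo_path and return the list of findings."""
    # Fall back to the GITLEAKS_CONFIG environment variable if no config is given
    config_path = config_path or os.environ.get("GITLEAKS_CONFIG")
    # Write the report to a private temporary directory instead of a fixed /tmp path
    with tempfile.TemporaryDirectory() as tmp:
        report_path = Path(tmp) / "gitleaks-report.json"
        cmd = [
            "gitleaks", "git",
            "--report-format", "json",
            "--report-path", str(report_path),
            "--exit-code", "0",  # findings are read from the report, not the exit status
        ]
        if config_path:
            cmd.extend(["--config", str(config_path)])
        cmd.append(str(repo_path))  # the repository path is a positional argument
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
        if result.returncode != 0:
            raise RuntimeError(f"gitleaks failed: {result.stderr.strip()}")
        if not report_path.exists():
            return []
        with open(report_path) as f:
            return json.load(f)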
```
### Categorize Findings by Severity
```python
# Rule IDs treated as high severity; adjust to match the rule IDs in the config in use
HIGH_SEVERITY_RULES = {
    "aws-access-key", "aws-secret-key", "gcp-api-key",
    "github-pat", "private-key", "generic-api-key",
}


def categorize_findings(findings):
    """Split findings into high, medium, and low buckets based on RuleID."""
    high, medium, low = [], [], []
    for f in findings:
        rule = f.get("RuleID", "")
        if rule in HIGH_SEVERITY_RULES:
            high.append(f)
        elif "password" in rule or "token" in rule:
            medium.append(f)
        else:
            low.append(f)
    return {"high": high, "medium": medium, "low": low}
```
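The categorized result can be reduced to counts and a pass/fail decision for a pipeline. This is a minimal sketch, assuming the dictionary shape returned by `categorize_findings` above; the `summarize` helper and its `fail_on` parameter are hypothetical.

```python
def summarize(categorized, fail_on=("high",)):
    """Return (counts per severity, whether any fail_on severity has findings)."""
    counts = {level: len(items) for level, items in categorized.items()}
    should_fail = any(counts.get(level, 0) > 0 for level in fail_on)
    return counts, should_fail
```

Which severities should block a build is a policy choice, so the default of failing only on `high` is an assumption rather than a recommendation.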
## GitHub Actions Integration
```yaml
name: Gitleaks Secret Scan
on: [push, pull_request]
jobs:
  gitleaks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```
## Output Format
```json
[
  {
    "Description": "Detected a Generic API Key",
    "StartLine": 42,
    "EndLine": 42,
    "StartColumn": 15,
    "EndColumn": 55,
    "Match": "REDACTED",
    "Secret": "REDACTED",
    "File": "config/settings.py",
    "Commit": "a1b2c3d4e5f6",
    "Author": "developer@example.com",
    "Date": "2025-01-15T10:30:00Z",
    "RuleID": "generic-api-key",
    "Tags": ["api", "key"],
    "Fingerprint": "a1b2c3d4:config/settings.py:generic-api-key:42"
  }
]
```
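A report in this shape can be post-processed before it is shown to reviewers. The sketch below drops repeated entries by `Fingerprint`, groups the rest by `File`, and masks the `Secret` value. The helper names `mask` and `group_by_file` are hypothetical, and using `Fingerprint` as the deduplication key is an assumption made for illustration.

```python
from collections import defaultdict


def mask(secret, visible=4):
    """Keep the first `visible` characters and replace the rest with asterisks."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return secret[:visible] + "*" * (len(secret) - visible)


def group_by_file(findings):
    """Deduplicate findings by Fingerprint and group them by file path."""
    seen = set()
    grouped = defaultdict(list)
    for finding in findings:
        fingerprint = finding.get("Fingerprint")
        if fingerprint is not None:
            if fingerprint in seen:
                continue  # skip entries already recorded
            seen.add(fingerprint)
        grouped[finding.get("File", "")].append({
            "rule": finding.get("RuleID", ""),
            "line": finding.get("StartLine"),
            "secret": mask(finding.get("Secret", "")),
        })
    return dict(grouped)
```

Masking in the consumer is a second line of defense; running the scan with `--redact` keeps secrets out of logs and stdout in the first place.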