mirror of
https://github.com/mukul975/Anthropic-Cybersecurity-Skills.git
synced 2026-06-11 13:44:56 +03:00
350 lines
13 KiB
Markdown
350 lines
13 KiB
Markdown
---
|
|
name: deobfuscating-javascript-malware
|
|
description: >
|
|
Deobfuscates malicious JavaScript code used in web-based attacks, phishing pages, and
|
|
dropper scripts by reversing encoding layers, eval chains, string manipulation, and
|
|
control flow obfuscation to reveal the original malicious logic. Activates for requests
|
|
involving JavaScript malware analysis, script deobfuscation, web skimmer analysis, or
|
|
obfuscated dropper investigation.
|
|
domain: cybersecurity
|
|
subdomain: malware-analysis
|
|
tags: [malware, JavaScript, deobfuscation, web-malware, script-analysis]
|
|
version: 1.0.0
|
|
author: mahipal
|
|
license: MIT
|
|
---
|
|
|
|
# Deobfuscating JavaScript Malware
|
|
|
|
## When to Use
|
|
|
|
- Investigating a phishing page with obfuscated JavaScript that performs credential harvesting or redirect
|
|
- Analyzing a web skimmer (Magecart-style) injected into an e-commerce site
|
|
- Deobfuscating a JavaScript dropper that downloads and executes second-stage malware
|
|
- Examining malicious email attachments containing HTML files with embedded obfuscated scripts
|
|
- Analyzing browser exploit kits that use heavy JavaScript obfuscation to hide exploit delivery
|
|
|
|
**Do not use** for obfuscated JavaScript that is merely minified production code; use a standard beautifier instead.
|
|
|
|
## Prerequisites
|
|
|
|
- Node.js 18+ installed for executing and debugging JavaScript in a controlled environment
|
|
- Python 3.8+ with `jsbeautifier` library for code formatting
|
|
- Browser developer tools (Chrome DevTools) for controlled execution in an isolated browser
|
|
- CyberChef (https://gchq.github.io/CyberChef/) for encoding/decoding operations
|
|
- de4js or JStillery for automated JavaScript deobfuscation
|
|
- Isolated analysis VM with no access to production systems or sensitive data
|
|
|
|
## Workflow
|
|
|
|
### Step 1: Safely Extract and Examine the Obfuscated Script
|
|
|
|
Isolate the malicious JavaScript without executing it:
|
|
|
|
```bash
|
|
# Extract JavaScript from HTML file
|
|
python3 << 'PYEOF'
|
|
from html.parser import HTMLParser
|
|
|
|
class ScriptExtractor(HTMLParser):
|
|
def __init__(self):
|
|
super().__init__()
|
|
self.in_script = False
|
|
self.scripts = []
|
|
self.current = ""
|
|
|
|
def handle_starttag(self, tag, attrs):
|
|
if tag == "script":
|
|
self.in_script = True
|
|
self.current = ""
|
|
|
|
def handle_endtag(self, tag):
|
|
if tag == "script":
|
|
self.in_script = False
|
|
if self.current.strip():
|
|
self.scripts.append(self.current)
|
|
|
|
def handle_data(self, data):
|
|
if self.in_script:
|
|
self.current += data
|
|
|
|
with open("malicious_page.html") as f:
|
|
parser = ScriptExtractor()
|
|
parser.feed(f.read())
|
|
|
|
for i, script in enumerate(parser.scripts):
|
|
with open(f"script_{i}.js", "w") as f:
|
|
f.write(script)
|
|
print(f"Extracted script_{i}.js ({len(script)} bytes)")
|
|
PYEOF
|
|
|
|
# Beautify the extracted JavaScript
|
|
npx js-beautify script_0.js -o script_0_pretty.js
|
|
```
|
|
|
|
### Step 2: Identify Obfuscation Techniques
|
|
|
|
Categorize the obfuscation methods used:
|
|
|
|
```
|
|
Common JavaScript Obfuscation Techniques:
|
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
String Encoding:
|
|
- Hex encoding: "\x68\x65\x6c\x6c\x6f" -> "hello"
|
|
- Unicode escapes: "\u0068\u0065\u006c\u006c\u006f" -> "hello"
|
|
- Base64: atob("aGVsbG8=") -> "hello"
|
|
- charCodeAt/fromCharCode: String.fromCharCode(104,101,108,108,111)
|
|
- Array-based lookup: var _0x1234 = ["hello","world"]; _0x1234[0]
|
|
|
|
Eval Chains:
|
|
- eval(atob("..."))
|
|
- eval(unescape("..."))
|
|
- new Function("return " + decoded)()
|
|
- document.write("<script>" + decoded + "</script>")
|
|
- setTimeout(decoded, 0)
|
|
|
|
Control Flow:
|
|
- Switch-case dispatcher with shuffled case order
|
|
- Opaque predicates (always-true/false conditions)
|
|
- Dead code insertion
|
|
- Variable name mangling (_0x4a3b, _0xab12)
|
|
|
|
Anti-Analysis:
|
|
- Debugger traps: setInterval(function(){debugger;}, 100)
|
|
- Console detection: overriding console.log
|
|
- Timing checks: performance.now() deltas
|
|
- DevTools detection: window.outerWidth - window.innerWidth > 100
|
|
```
|
|
|
|
### Step 3: Remove Anti-Analysis Protections
|
|
|
|
Neutralize anti-debugging and anti-analysis traps:
|
|
|
|
```javascript
|
|
// Remove debugger traps before analysis
|
|
// Replace in the obfuscated script:
|
|
|
|
// Before:
|
|
setInterval(function() { debugger; }, 100);
|
|
|
|
// After (neutralized):
|
|
setInterval(function() { /* debugger removed */ }, 100);
|
|
|
|
// Neutralize DevTools detection
|
|
// Before:
|
|
if (window.outerWidth - window.innerWidth > 160) { window.location = "about:blank"; }
|
|
|
|
// After:
|
|
if (false) { window.location = "about:blank"; }
|
|
|
|
// Neutralize timing checks
|
|
// Override performance.now to return consistent values
|
|
const originalNow = performance.now;
|
|
performance.now = function() { return 0; };
|
|
```
|
|
|
|
### Step 4: Decode String Obfuscation Layers
|
|
|
|
Progressively decode encoded strings:
|
|
|
|
```python
|
|
# Python script to decode common JS obfuscation patterns
|
|
import re
|
|
import base64
|
|
import urllib.parse
|
|
|
|
def decode_hex_strings(code):
|
|
"""Replace \\xNN sequences with ASCII characters"""
|
|
def hex_replace(match):
|
|
hex_str = match.group(0)
|
|
try:
|
|
return bytes.fromhex(hex_str.replace("\\x", "")).decode("ascii")
|
|
except:
|
|
return hex_str
|
|
return re.sub(r'(?:\\x[0-9a-fA-F]{2})+', hex_replace, code)
|
|
|
|
def decode_unicode_escapes(code):
|
|
"""Replace \\uNNNN sequences with characters"""
|
|
def unicode_replace(match):
|
|
return chr(int(match.group(1), 16))
|
|
return re.sub(r'\\u([0-9a-fA-F]{4})', unicode_replace, code)
|
|
|
|
def decode_charcode_arrays(code):
|
|
"""Resolve String.fromCharCode calls"""
|
|
def charcode_replace(match):
|
|
codes = [int(c.strip()) for c in match.group(1).split(",")]
|
|
return '"' + "".join(chr(c) for c in codes) + '"'
|
|
return re.sub(r'String\.fromCharCode\(([0-9,\s]+)\)', charcode_replace, code)
|
|
|
|
def decode_base64_strings(code):
|
|
"""Resolve atob() calls with static strings"""
|
|
def atob_replace(match):
|
|
try:
|
|
decoded = base64.b64decode(match.group(1)).decode("utf-8")
|
|
return f'"{decoded}"'
|
|
except:
|
|
return match.group(0)
|
|
return re.sub(r'atob\(["\']([A-Za-z0-9+/=]+)["\']\)', atob_replace, code)
|
|
|
|
# Apply all decoders
|
|
with open("script_0.js") as f:
|
|
code = f.read()
|
|
|
|
code = decode_hex_strings(code)
|
|
code = decode_unicode_escapes(code)
|
|
code = decode_charcode_arrays(code)
|
|
code = decode_base64_strings(code)
|
|
|
|
with open("script_0_decoded.js", "w") as f:
|
|
f.write(code)
|
|
print("Decoded strings written to script_0_decoded.js")
|
|
```
|
|
|
|
### Step 5: Resolve Eval Chains Safely
|
|
|
|
Unwrap eval/Function constructor chains without executing:
|
|
|
|
```javascript
|
|
// Node.js script to safely resolve eval chains
|
|
// Run in isolated environment: node --experimental-vm-modules deobfuscate.js
|
|
|
|
const vm = require('vm');
|
|
|
|
// Create sandboxed context with logging
|
|
const sandbox = {
|
|
eval: function(code) {
|
|
console.log("=== EVAL INTERCEPTED ===");
|
|
console.log(code.substring(0, 500));
|
|
console.log("========================");
|
|
return code; // Return the code instead of executing it
|
|
},
|
|
document: {
|
|
write: function(html) {
|
|
console.log("=== DOCUMENT.WRITE INTERCEPTED ===");
|
|
console.log(html.substring(0, 500));
|
|
},
|
|
getElementById: function() { return { innerHTML: "" }; }
|
|
},
|
|
window: { location: { href: "" } },
|
|
atob: function(s) { return Buffer.from(s, 'base64').toString(); },
|
|
unescape: unescape,
|
|
setTimeout: function(fn) { if (typeof fn === 'string') console.log("TIMEOUT CODE:", fn); },
|
|
console: console,
|
|
String: String,
|
|
Array: Array,
|
|
parseInt: parseInt,
|
|
RegExp: RegExp,
|
|
};
|
|
|
|
const context = vm.createContext(sandbox);
|
|
|
|
// Load and execute the obfuscated script in sandbox
|
|
const fs = require('fs');
|
|
const code = fs.readFileSync('script_0.js', 'utf8');
|
|
|
|
try {
|
|
vm.runInContext(code, context, { timeout: 5000 });
|
|
} catch(e) {
|
|
console.log("Execution error (expected):", e.message);
|
|
}
|
|
```
|
|
|
|
### Step 6: Analyze the Deobfuscated Payload
|
|
|
|
Examine the revealed malicious logic:
|
|
|
|
```
|
|
Deobfuscated Malware Categories and IOC Extraction:
|
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
Credential Harvester:
|
|
- Form action URLs (exfiltration endpoints)
|
|
- XMLHttpRequest/fetch destinations
|
|
- Targeted input field names (username, password, cc_number)
|
|
|
|
Web Skimmer (Magecart):
|
|
- Payment form overlay injection
|
|
- Card data exfiltration URLs
|
|
- Keylogger event listeners (onkeypress, oninput)
|
|
|
|
Redirect Script:
|
|
- Destination URLs in location.href assignments
|
|
- Conditional redirects based on user-agent or referrer
|
|
- Cloaking logic (show benign content to bots)
|
|
|
|
Exploit Kit Landing:
|
|
- Browser/plugin version checks
|
|
- Exploit payload URLs
|
|
- Shellcode embedded as arrays or encoded strings
|
|
```
|
|
|
|
## Key Concepts
|
|
|
|
| Term | Definition |
|
|
|------|------------|
|
|
| **Eval Chain** | Nested layers of eval(), Function(), or document.write() calls that each decode one layer of obfuscation before passing to the next |
|
|
| **String Array Rotation** | Obfuscation technique storing all strings in a shuffled array and accessing them by computed index to hide string literals |
|
|
| **Dead Code Insertion** | Adding non-functional code blocks that never execute to increase analysis complexity and confuse pattern matching |
|
|
| **Opaque Predicate** | Conditional expression whose outcome is predetermined but difficult to determine statically; used to obscure control flow |
|
|
| **Anti-Debugging** | JavaScript techniques to detect and thwart browser DevTools or debugger usage including debugger statements and timing checks |
|
|
| **Web Skimmer** | Malicious JavaScript injected into e-commerce sites to steal payment card data from checkout forms (Magecart attack) |
|
|
|
|
## Tools & Systems
|
|
|
|
- **CyberChef**: GCHQ's web-based tool for encoding/decoding transformations useful for unwinding multi-layer obfuscation
|
|
- **de4js**: Online JavaScript deobfuscator supporting common obfuscation tools (obfuscator.io, JScrambler)
|
|
- **Node.js VM Module**: Sandboxed JavaScript execution environment for safely evaluating obfuscated code with intercepted APIs
|
|
- **Chrome DevTools**: Browser developer tools for stepping through JavaScript execution with breakpoints and console access
|
|
- **JSDetox**: JavaScript malware analysis tool providing execution emulation and deobfuscation
|
|
|
|
## Common Scenarios
|
|
|
|
### Scenario: Deobfuscating a Magecart Web Skimmer
|
|
|
|
**Context**: A compromised e-commerce site has obfuscated JavaScript injected into its checkout page. The script needs deobfuscation to identify the data exfiltration endpoint and determine what customer data was stolen.
|
|
|
|
**Approach**:
|
|
1. Extract the injected script from the page source (often appended to a legitimate JS file or loaded from an external domain)
|
|
2. Beautify the code and identify the obfuscation technique (typically string array + rotation + hex encoding)
|
|
3. Decode string encoding layers (hex -> Unicode -> base64) using the Python decoder script
|
|
4. Resolve the string array by evaluating the array definition and rotation function
|
|
5. Identify the form targeting logic (querySelector for payment form fields)
|
|
6. Extract the exfiltration URL from the XMLHttpRequest or fetch call
|
|
7. Document stolen data fields and exfiltration endpoint for incident response
|
|
|
|
**Pitfalls**:
|
|
- Executing obfuscated scripts on a connected system (the script may phone home during analysis)
|
|
- Not removing anti-debugging traps before using browser DevTools (infinite debugger loops)
|
|
- Missing additional obfuscation layers loaded dynamically from external URLs
|
|
- Overlooking base64-encoded inline images or data URIs that may contain additional scripts
|
|
|
|
## Output Format
|
|
|
|
```
|
|
JAVASCRIPT MALWARE DEOBFUSCATION REPORT
|
|
=========================================
|
|
Source: checkout.js (injected into example-shop.com)
|
|
Obfuscation: obfuscator.io (string array + rotation + hex encoding)
|
|
Layers Removed: 3
|
|
|
|
OBFUSCATION TECHNIQUES IDENTIFIED
|
|
[1] String array with 247 entries, rotated by 0x1a3
|
|
[2] Hex-encoded string references (\x68\x65\x6c\x6c\x6f)
|
|
[3] Base64-wrapped eval chain (2 layers)
|
|
[4] Anti-debugging: setInterval debugger trap
|
|
|
|
DEOBFUSCATED FUNCTIONALITY
|
|
Type: Magecart Payment Card Skimmer
|
|
Target Forms: input[name*="card"], input[name*="cc_"]
|
|
Data Captured: Card number, expiration, CVV, cardholder name
|
|
Exfil Method: POST via XMLHttpRequest
|
|
Exfil URL: hxxps://analytics-cdn[.]com/collect
|
|
Exfil Format: JSON { "cn": card_number, "exp": expiry, "cv": cvv }
|
|
Trigger: Form submit event on checkout page
|
|
|
|
EXTRACTED IOCs
|
|
Domains: analytics-cdn[.]com
|
|
IPs: 185.220.101[.]42
|
|
URLs: hxxps://analytics-cdn[.]com/collect
|
|
hxxps://analytics-cdn[.]com/gate.js
|
|
```
|