# Parallelization Patterns Reference How to decide what to parallelize and which pattern to use. ## Core Principle Parallelize when tasks are **independent**: no shared mutable state, no ordering dependency, and results can be combined without conflict. ## Pattern 1: Independent Tasks **When**: Two or more tasks share no state and have no ordering dependency. **Always parallel.** This is the simplest and most common case. ### Examples - Linting + type checking + unit tests (different tools, same codebase, read-only) - Researching two unrelated libraries - Generating tests for unrelated modules - Reviewing separate files ### Structure ``` [Dispatcher] |--- Agent A: lint src/ |--- Agent B: typecheck src/ |--- Agent C: run tests \--- Agent D: security scan [Collect all results] ``` ### Decision Criteria - Do they read/write the same files? No -> parallel - Does one need output from another? No -> parallel - Can they run in any order? Yes -> parallel ## Pattern 2: Fan-Out / Fan-In **When**: A single task can be split into N identical subtasks, then results are merged. ### Examples - Process each file in a directory independently - Run the same analysis on multiple services - Test multiple configurations - Investigate multiple potential causes of a bug ### Structure ``` [Dispatcher: split work into N chunks] |--- Agent 1: process chunk 1 |--- Agent 2: process chunk 2 |--- Agent 3: process chunk 3 \--- Agent N: process chunk N [Collector: merge results from all agents] ``` ### Implementation Split items across agents (round-robin, by directory, or by type), dispatch all simultaneously, collect results, handle failures by retrying individually, then merge into unified output. ## Pattern 3: Pipeline (Sequential) **When**: Output of step N is input to step N+1. **Must be sequential.** Cannot parallelize. ### Examples - Parse code -> analyze AST -> generate report - Fetch data -> transform -> validate -> persist - Write code -> run tests -> fix failures ### Structure ``` [Step 1: parse] --> [Step 2: analyze] --> [Step 3: report] ``` ### When Pipelines Contain Parallelizable Steps A pipeline stage itself might fan out: ``` [Step 1: identify files] --> [Step 2: analyze each file in parallel (fan-out/fan-in)] --> [Step 3: merge analysis into report] ``` ## Pattern 4: Pipeline with Parallel Stages **When**: Some pipeline stages can run in parallel, others must be sequential. ### Example: Feature Implementation ``` [Sequential: write plan] --> [Parallel: implement module A, implement module B, implement module C] --> [Sequential: integration test] --> [Parallel: write docs, update changelog] --> [Sequential: final review] ``` ## Decision Matrix | Task Characteristic | Pattern | Parallelizable? | |---|---|---| | No shared state, no ordering | Independent | Yes | | Same operation on many items | Fan-out/fan-in | Yes | | Output feeds next step | Pipeline | No | | Mixed dependencies | Pipeline + parallel stages | Partially | | Shared mutable state | Sequential or lock-based | No (usually) | | Non-deterministic ordering matters | Sequential | No | ## Common Parallel Task Patterns ### File-Per-Agent Split work by file or directory. Each agent owns its files exclusively. ``` Agent 1: src/auth/** Agent 2: src/orders/** Agent 3: src/users/** ``` **Best for**: code review, refactoring, test generation, documentation. **Watch out for**: shared utilities, cross-module imports. Assign shared code to one agent or make it read-only for all. ### Test Suite Splitting Split tests by module, type, or estimated runtime. ``` Agent 1: unit tests (fast) Agent 2: integration tests (medium) Agent 3: e2e tests (slow) ``` **Best for**: CI acceleration, pre-merge validation. ### Multi-Service Investigation When debugging spans multiple services, assign one agent per service. ``` Agent 1: investigate auth service logs Agent 2: investigate order service logs Agent 3: investigate payment service logs ``` **Best for**: distributed system debugging, incident response. ### Research Branches Explore multiple hypotheses or approaches simultaneously. ``` Agent 1: research approach A (Redis caching) Agent 2: research approach B (CDN edge caching) Agent 3: research approach C (application-level memoization) ``` **Best for**: technology evaluation, design exploration, root cause hypotheses. ## Anti-Patterns | Anti-Pattern | Problem | Fix | |---|---|---| | Parallelizing dependent tasks | Race conditions, wrong results | Identify dependencies first, use pipeline | | Too many agents | Overhead exceeds benefit | 2-5 agents is typical sweet spot | | No merge strategy | Results conflict or duplicate | Define merge/dedup logic before dispatching | | Shared file writes | Corruption, lost changes | Assign file ownership to one agent | | No failure handling | One failure blocks everything | Collect partial results, retry individually | ## Checklist Before Parallelizing 1. **List all tasks** that need to happen 2. **Draw dependencies** between them (which needs output from which?) 3. **Group independent tasks** into parallel batches 4. **Define the merge strategy** for collecting results 5. **Assign ownership** so no two agents write the same file 6. **Plan for failure** of individual agents 7. **Estimate whether parallelism helps** (overhead vs time saved) ## Quick Reference: Dispatch Decision - Single atomic operation -> just do it, no parallelism - Splittable into independent chunks -> fan-out/fan-in - Each step depends on previous output -> pipeline (sequential) - Mix of independent and dependent steps -> pipeline with parallel stages - Everything independent -> run all in parallel