5.6 KiB
Parallelization Patterns Reference
How to decide what to parallelize and which pattern to use.
Core Principle
Parallelize when tasks are independent: no shared mutable state, no ordering dependency, and results can be combined without conflict.
Pattern 1: Independent Tasks
When: Two or more tasks share no state and have no ordering dependency.
Always parallel. This is the simplest and most common case.
Examples
- Linting + type checking + unit tests (different tools, same codebase, read-only)
- Researching two unrelated libraries
- Generating tests for unrelated modules
- Reviewing separate files
Structure
[Dispatcher]
|--- Agent A: lint src/
|--- Agent B: typecheck src/
|--- Agent C: run tests
\--- Agent D: security scan
[Collect all results]
Decision Criteria
- Do they read/write the same files? No -> parallel
- Does one need output from another? No -> parallel
- Can they run in any order? Yes -> parallel
Pattern 2: Fan-Out / Fan-In
When: A single task can be split into N identical subtasks, then results are merged.
Examples
- Process each file in a directory independently
- Run the same analysis on multiple services
- Test multiple configurations
- Investigate multiple potential causes of a bug
Structure
[Dispatcher: split work into N chunks]
|--- Agent 1: process chunk 1
|--- Agent 2: process chunk 2
|--- Agent 3: process chunk 3
\--- Agent N: process chunk N
[Collector: merge results from all agents]
Implementation
Split items across agents (round-robin, by directory, or by type), dispatch all simultaneously, collect results, handle failures by retrying individually, then merge into unified output.
Pattern 3: Pipeline (Sequential)
When: Output of step N is input to step N+1.
Must be sequential. Cannot parallelize.
Examples
- Parse code -> analyze AST -> generate report
- Fetch data -> transform -> validate -> persist
- Write code -> run tests -> fix failures
Structure
[Step 1: parse] --> [Step 2: analyze] --> [Step 3: report]
When Pipelines Contain Parallelizable Steps
A pipeline stage itself might fan out:
[Step 1: identify files]
--> [Step 2: analyze each file in parallel (fan-out/fan-in)]
--> [Step 3: merge analysis into report]
Pattern 4: Pipeline with Parallel Stages
When: Some pipeline stages can run in parallel, others must be sequential.
Example: Feature Implementation
[Sequential: write plan]
--> [Parallel: implement module A, implement module B, implement module C]
--> [Sequential: integration test]
--> [Parallel: write docs, update changelog]
--> [Sequential: final review]
Decision Matrix
| Task Characteristic | Pattern | Parallelizable? |
|---|---|---|
| No shared state, no ordering | Independent | Yes |
| Same operation on many items | Fan-out/fan-in | Yes |
| Output feeds next step | Pipeline | No |
| Mixed dependencies | Pipeline + parallel stages | Partially |
| Shared mutable state | Sequential or lock-based | No (usually) |
| Non-deterministic ordering matters | Sequential | No |
Common Parallel Task Patterns
File-Per-Agent
Split work by file or directory. Each agent owns its files exclusively.
Agent 1: src/auth/**
Agent 2: src/orders/**
Agent 3: src/users/**
Best for: code review, refactoring, test generation, documentation.
Watch out for: shared utilities, cross-module imports. Assign shared code to one agent or make it read-only for all.
Test Suite Splitting
Split tests by module, type, or estimated runtime.
Agent 1: unit tests (fast)
Agent 2: integration tests (medium)
Agent 3: e2e tests (slow)
Best for: CI acceleration, pre-merge validation.
Multi-Service Investigation
When debugging spans multiple services, assign one agent per service.
Agent 1: investigate auth service logs
Agent 2: investigate order service logs
Agent 3: investigate payment service logs
Best for: distributed system debugging, incident response.
Research Branches
Explore multiple hypotheses or approaches simultaneously.
Agent 1: research approach A (Redis caching)
Agent 2: research approach B (CDN edge caching)
Agent 3: research approach C (application-level memoization)
Best for: technology evaluation, design exploration, root cause hypotheses.
Anti-Patterns
| Anti-Pattern | Problem | Fix |
|---|---|---|
| Parallelizing dependent tasks | Race conditions, wrong results | Identify dependencies first, use pipeline |
| Too many agents | Overhead exceeds benefit | 2-5 agents is typical sweet spot |
| No merge strategy | Results conflict or duplicate | Define merge/dedup logic before dispatching |
| Shared file writes | Corruption, lost changes | Assign file ownership to one agent |
| No failure handling | One failure blocks everything | Collect partial results, retry individually |
Checklist Before Parallelizing
- List all tasks that need to happen
- Draw dependencies between them (which needs output from which?)
- Group independent tasks into parallel batches
- Define the merge strategy for collecting results
- Assign ownership so no two agents write the same file
- Plan for failure of individual agents
- Estimate whether parallelism helps (overhead vs time saved)
Quick Reference: Dispatch Decision
- Single atomic operation -> just do it, no parallelism
- Splittable into independent chunks -> fan-out/fan-in
- Each step depends on previous output -> pipeline (sequential)
- Mix of independent and dependent steps -> pipeline with parallel stages
- Everything independent -> run all in parallel