mirror of
https://github.com/duthaho/claudekit.git
synced 2026-06-10 20:24:57 +03:00
# Parallelization Patterns Reference

How to decide what to parallelize and which pattern to use.

## Core Principle

Parallelize when tasks are **independent**: no shared mutable state, no ordering dependency, and results can be combined without conflict.

## Pattern 1: Independent Tasks

**When**: Two or more tasks share no state and have no ordering dependency.

**Always parallel.** This is the simplest and most common case.

### Examples

- Linting + type checking + unit tests (different tools, same codebase, read-only)
- Researching two unrelated libraries
- Generating tests for unrelated modules
- Reviewing separate files

### Structure

```
[Dispatcher]
  |--- Agent A: lint src/
  |--- Agent B: typecheck src/
  |--- Agent C: run tests
  \--- Agent D: security scan
[Collect all results]
```

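As a rough illustration of the dispatcher shape above, here is a minimal sketch using Python's `asyncio`. The `run_agent` coroutine is a hypothetical stand-in that only sleeps; a real agent would call a tool or subprocess instead.

```python
import asyncio


async def run_agent(name: str, seconds: float) -> str:
    """Hypothetical stand-in for a real agent; it sleeps instead of working."""
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def dispatch_all() -> list[str]:
    # gather() runs all coroutines concurrently and returns their results
    # in the order they were passed in, not in completion order.
    return await asyncio.gather(
        run_agent("lint", 0.03),
        run_agent("typecheck", 0.01),
        run_agent("tests", 0.02),
        run_agent("security scan", 0.01),
    )


results = asyncio.run(dispatch_all())
```

Because the result order matches the dispatch order, the collector can pair each result with its agent without extra bookkeeping.
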
### Decision Criteria

- Does either task write files the other reads or writes? No -> parallel
- Does one need output from another? No -> parallel
- Can they run in any order? Yes -> parallel

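These criteria can be checked mechanically once each task declares what it touches. The sketch below assumes a hypothetical `Task` record with `reads`, `writes`, and `needs` fields (none of these names come from an existing tool) and treats read-only access to the same files as safe, as in the lint and type-check example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task description used only for this illustration."""
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()
    needs: frozenset = frozenset()  # names of tasks whose output this one consumes


def can_run_in_parallel(a: Task, b: Task) -> bool:
    # Conflict: either task writes something the other reads or writes.
    file_conflict = bool(
        a.writes & (b.reads | b.writes) or b.writes & (a.reads | a.writes)
    )
    # Dependency: either task consumes the other's output.
    depends = a.name in b.needs or b.name in a.needs
    return not file_conflict and not depends


lint = Task("lint", reads=frozenset({"src/"}))
typecheck = Task("typecheck", reads=frozenset({"src/"}))
formatter = Task("format", reads=frozenset({"src/"}), writes=frozenset({"src/"}))
report = Task("report", needs=frozenset({"lint"}))
```

Here `lint` and `typecheck` can run together, while `formatter` conflicts with both because it rewrites the files they read, and `report` has to wait for `lint`.
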
## Pattern 2: Fan-Out / Fan-In

**When**: A single task can be split into N identical subtasks, then results are merged.

### Examples

- Process each file in a directory independently
- Run the same analysis on multiple services
- Test multiple configurations
- Investigate multiple potential causes of a bug

### Structure

```
[Dispatcher: split work into N chunks]
  |--- Agent 1: process chunk 1
  |--- Agent 2: process chunk 2
  |--- Agent 3: process chunk 3
  \--- Agent N: process chunk N
[Collector: merge results from all agents]
```

### Implementation

Split items across agents (round-robin, by directory, or by type), dispatch all simultaneously, collect results, handle failures by retrying individually, then merge into unified output.

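The steps above can be sketched with a thread pool. This is an illustration under stated assumptions, not a reference implementation: `process_chunk` is a hypothetical placeholder for per-agent work, and the merge step is a plain dictionary update.

```python
from concurrent.futures import ThreadPoolExecutor


def round_robin(items, n):
    """Split items into n chunks, dealing them out one at a time."""
    return [items[i::n] for i in range(n)]


def process_chunk(chunk):
    """Hypothetical per-agent work: here, just measure each item's length."""
    return {item: len(item) for item in chunk}


def fan_out_fan_in(items, n_agents=3, retries=1):
    chunks = [c for c in round_robin(items, n_agents) if c]
    merged, failed = {}, []
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        # Fan-out: submit every chunk before waiting on any result.
        futures = [(chunk, pool.submit(process_chunk, chunk)) for chunk in chunks]
        # Fan-in: merge results in submission order, noting failed chunks.
        for chunk, future in futures:
            try:
                merged.update(future.result())
            except Exception:
                failed.append(chunk)
    # Retry only the chunks that failed, one at a time.
    for chunk in failed:
        for _ in range(retries):
            try:
                merged.update(process_chunk(chunk))
                break
            except Exception:
                continue
    return merged


result = fan_out_fan_in(["a.py", "bb.py", "ccc.py", "dddd.py"])
```

Round-robin is the simplest split; splitting by directory or type only changes how `chunks` is built, not the dispatch or merge logic.
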
## Pattern 3: Pipeline (Sequential)

**When**: Output of step N is input to step N+1.

**Must be sequential.** Each step has to wait for the previous one to finish.

### Examples

- Parse code -> analyze AST -> generate report
- Fetch data -> transform -> validate -> persist
- Write code -> run tests -> fix failures

### Structure

```
[Step 1: parse] --> [Step 2: analyze] --> [Step 3: report]
```

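A pipeline of this shape is just function composition. The sketch below uses toy `parse`, `analyze`, and `report` functions (word counting rather than real AST analysis) purely to show how each step consumes the previous step's output.

```python
from functools import reduce


def parse(source: str) -> list[str]:
    return source.split()


def analyze(tokens: list[str]) -> dict[str, int]:
    counts: dict[str, int] = {}
    for token in tokens:
        counts[token] = counts.get(token, 0) + 1
    return counts


def report(counts: dict[str, int]) -> str:
    return ", ".join(f"{word}={n}" for word, n in sorted(counts.items()))


def run_pipeline(value, steps):
    # Each step receives the previous step's output, so the order is fixed.
    return reduce(lambda acc, step: step(acc), steps, value)


output = run_pipeline("a b a", [parse, analyze, report])
```

Reordering `steps` would break the chain, because `analyze` expects a token list and `report` expects the counts.
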
### When Pipelines Contain Parallelizable Steps

A pipeline stage itself might fan out:

```
[Step 1: identify files]
  --> [Step 2: analyze each file in parallel (fan-out/fan-in)]
  --> [Step 3: merge analysis into report]
```

## Pattern 4: Pipeline with Parallel Stages

**When**: Some pipeline stages can run in parallel, others must be sequential.

### Example: Feature Implementation

```
[Sequential: write plan]
  --> [Parallel: implement module A, implement module B, implement module C]
  --> [Sequential: integration test]
  --> [Parallel: write docs, update changelog]
  --> [Sequential: final review]
```

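One way to model this example is as a list of stages, where tasks inside a stage run concurrently and stages run one after another. The sketch below assumes a hypothetical `task` coroutine that only records its name; real work would replace the sleep.

```python
import asyncio


async def task(name: str, log: list[str]) -> None:
    """Hypothetical unit of work: sleeps, then records that it finished."""
    await asyncio.sleep(0.01)
    log.append(name)


async def run_stages(stages: list[list[str]]) -> list[str]:
    log: list[str] = []
    for stage in stages:
        # Tasks within a stage run concurrently; the next stage starts
        # only after every task in the current stage has finished.
        await asyncio.gather(*(task(name, log) for name in stage))
    return log


stages = [
    ["write plan"],
    ["module A", "module B", "module C"],
    ["integration test"],
    ["write docs", "update changelog"],
    ["final review"],
]
log = asyncio.run(run_stages(stages))
```

A single-item stage behaves like a sequential step, so the same loop covers both the sequential and the parallel parts of the plan.
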
## Decision Matrix

| Task Characteristic | Pattern | Parallelizable? |
|---|---|---|
| No shared state, no ordering | Independent | Yes |
| Same operation on many items | Fan-out/fan-in | Yes |
| Output feeds next step | Pipeline | No |
| Mixed dependencies | Pipeline + parallel stages | Partially |
| Shared mutable state | Sequential or lock-based | No (usually) |
| Result depends on execution order | Sequential | No |

## Common Parallel Task Patterns

### File-Per-Agent

Split work by file or directory. Each agent owns its files exclusively.

```
Agent 1: src/auth/**
Agent 2: src/orders/**
Agent 3: src/users/**
```

**Best for**: code review, refactoring, test generation, documentation.

**Watch out for**: shared utilities, cross-module imports. Assign shared code to one agent or make it read-only for all.

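Exclusive ownership can be verified before dispatch. The sketch below is one possible check, assuming a hypothetical `ownership` mapping from agent name to glob patterns; it uses `fnmatch`, where `*` also matches path separators, so `src/auth/*` covers nested files.

```python
from fnmatch import fnmatch

# Hypothetical ownership map; agent names and patterns are illustrative.
ownership = {
    "agent-1": ["src/auth/*"],
    "agent-2": ["src/orders/*"],
    "agent-3": ["src/users/*"],
}


def owners_of(path, ownership):
    """Return every agent whose patterns match the given path."""
    return [
        agent
        for agent, patterns in ownership.items()
        if any(fnmatch(path, pattern) for pattern in patterns)
    ]


def check_ownership(paths, ownership):
    """Return (paths claimed by several agents, paths nobody owns)."""
    shared, unowned = {}, []
    for path in paths:
        owners = owners_of(path, ownership)
        if len(owners) > 1:
            shared[path] = owners
        elif not owners:
            unowned.append(path)
    return shared, unowned


paths = ["src/auth/login.py", "src/orders/cart.py", "src/utils/log.py"]
shared, unowned = check_ownership(paths, ownership)
```

In this example `src/utils/log.py` comes back as unowned, which is the shared-utility case: it should be assigned to a single agent or treated as read-only.
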
### Test Suite Splitting

Split tests by module, type, or estimated runtime.

```
Agent 1: unit tests (fast)
Agent 2: integration tests (medium)
Agent 3: e2e tests (slow)
```

**Best for**: CI acceleration, pre-merge validation.

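Splitting by estimated runtime can be approximated with a greedy longest-first assignment: sort suites by duration and always hand the next one to the least-loaded agent. The suite names and durations below are made up for illustration.

```python
import heapq


def split_by_runtime(durations: dict[str, float], n_agents: int) -> list[list[str]]:
    """Greedy longest-first split: each suite goes to the least-loaded agent."""
    heap = [(0.0, i) for i in range(n_agents)]  # (total seconds, agent index)
    buckets: list[list[str]] = [[] for _ in range(n_agents)]
    for suite, seconds in sorted(durations.items(), key=lambda kv: -kv[1]):
        load, i = heapq.heappop(heap)
        buckets[i].append(suite)
        heapq.heappush(heap, (load + seconds, i))
    return buckets


# Hypothetical runtime estimates in seconds.
durations = {
    "e2e": 300.0,
    "integration": 120.0,
    "unit_a": 90.0,
    "unit_b": 80.0,
    "lint": 20.0,
}
buckets = split_by_runtime(durations, 2)
```

With two agents, the slow `e2e` suite ends up alone (300 s) while the other four suites share the second agent (310 s), which is more balanced than splitting by test type alone.
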
### Multi-Service Investigation

When debugging spans multiple services, assign one agent per service.

```
Agent 1: investigate auth service logs
Agent 2: investigate order service logs
Agent 3: investigate payment service logs
```

**Best for**: distributed system debugging, incident response.

### Research Branches

Explore multiple hypotheses or approaches simultaneously.

```
Agent 1: research approach A (Redis caching)
Agent 2: research approach B (CDN edge caching)
Agent 3: research approach C (application-level memoization)
```

**Best for**: technology evaluation, design exploration, root cause hypotheses.

## Anti-Patterns

| Anti-Pattern | Problem | Fix |
|---|---|---|
| Parallelizing dependent tasks | Race conditions, wrong results | Identify dependencies first, use a pipeline |
| Too many agents | Overhead exceeds benefit | 2-5 agents is a typical sweet spot |
| No merge strategy | Results conflict or duplicate | Define merge/dedup logic before dispatching |
| Shared file writes | Corruption, lost changes | Assign file ownership to one agent |
| No failure handling | One failure blocks everything | Collect partial results, retry individually |

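The fix in the last row, collecting partial results, might look like the following sketch. The `agent` coroutine and the simulated failure for `"payments"` are assumptions made for the example.

```python
import asyncio


async def agent(name: str) -> str:
    """Hypothetical agent; the 'payments' one fails to simulate an outage."""
    await asyncio.sleep(0.01)
    if name == "payments":
        raise RuntimeError("log source unavailable")
    return f"{name}: ok"


async def collect(names: list[str]):
    # return_exceptions=True returns exceptions as values, so one failure
    # does not discard the results of the agents that succeeded.
    outcomes = await asyncio.gather(
        *(agent(n) for n in names), return_exceptions=True
    )
    done = {n: r for n, r in zip(names, outcomes) if not isinstance(r, Exception)}
    failed = [n for n, r in zip(names, outcomes) if isinstance(r, Exception)]
    return done, failed


done, failed = asyncio.run(collect(["auth", "orders", "payments"]))
```

The `failed` list is what gets retried individually, while `done` can already be merged.
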
## Checklist Before Parallelizing

1. **List all tasks** that need to happen
2. **Draw dependencies** between them (which needs output from which?)
3. **Group independent tasks** into parallel batches
4. **Define the merge strategy** for collecting results
5. **Assign ownership** so no two agents write the same file
6. **Plan for failure** of individual agents
7. **Estimate whether parallelism helps** (overhead vs time saved)

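Steps 2 and 3 of the checklist can be automated once the dependencies are written down. The sketch below uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+); the task names in `deps` are illustrative.

```python
from graphlib import TopologicalSorter


def parallel_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into batches; each batch depends only on earlier batches."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the dependencies are circular
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose inputs are all done
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Each key maps to the set of tasks whose output it needs.
deps = {
    "plan": set(),
    "module_a": {"plan"},
    "module_b": {"plan"},
    "integration_test": {"module_a", "module_b"},
    "docs": {"integration_test"},
    "changelog": {"integration_test"},
}
batches = parallel_batches(deps)
```

Every inner list is a batch that can be dispatched in parallel; batches of size one are the sequential steps.
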
## Quick Reference: Dispatch Decision

- Single atomic operation -> just do it, no parallelism
- Splittable into independent chunks -> fan-out/fan-in
- Each step depends on previous output -> pipeline (sequential)
- Mix of independent and dependent steps -> pipeline with parallel stages
- Everything independent -> run all in parallel
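
As a rough sketch, the decision list can be expressed as a function over a dependency map. The `choose_pattern` name, its `splittable` flag, and the returned labels are assumptions for illustration; it treats a task set as a pipeline when no two tasks are ever ready at the same time.

```python
from graphlib import TopologicalSorter


def choose_pattern(deps: dict[str, set[str]], splittable: bool = False) -> str:
    """Map a dependency graph to one of the dispatch decisions listed above."""
    if len(deps) == 1:
        return "fan-out/fan-in" if splittable else "no parallelism"
    if not any(deps.values()):
        return "run all in parallel"
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    widths = []  # how many tasks are ready at each step
    while sorter.is_active():
        ready = sorter.get_ready()
        widths.append(len(ready))
        sorter.done(*ready)
    if max(widths) == 1:
        return "pipeline (sequential)"
    return "pipeline with parallel stages"


chain = {"parse": set(), "analyze": {"parse"}, "report": {"analyze"}}
mixed = {"plan": set(), "a": {"plan"}, "b": {"plan"}, "review": {"a", "b"}}
independent = {"lint": set(), "typecheck": set(), "tests": set()}
```

For the three sample graphs this yields a sequential pipeline, a pipeline with parallel stages, and a fully parallel run, respectively.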