The Algorithm Design Manual
The practical algorithm designer's reference — technique + problem catalog. Skiena's 'war stories' show how problems arise in practice. The Hitchhiker's Guide catalogs 75 classic problems with implementations.
- Model real-world problems as known algorithmic problems (graph, DP, matching, etc.)
- Select the correct data structure from requirements (hash table vs BST vs heap vs union-find)
- Choose a sorting algorithm based on data characteristics and stability requirements
- Apply BFS vs DFS based on the problem (shortest path, cycle detection, topological sort)
- Select a shortest-path algorithm: BFS / Dijkstra / Bellman-Ford / Floyd-Warshall
- Recognize and formulate dynamic programming subproblems with recurrence relations
- Design backtracking search with pruning strategies for constraint satisfaction
- Apply simulated annealing and greedy heuristics to NP-hard instances
- Identify NP-complete problems and choose an exact, approximation, or heuristic approach
- Reduce known NP-hard problems to novel problems to prove hardness
Install this skill and Claude can model real-world problems as known algorithmic problems, select the right data structure or graph algorithm, recognize NP-completeness, and implement or critique solutions grounded in Skiena's catalog of 75 classic problems.
Transforms algorithm selection from guesswork into engineering — Claude maps messy real problems to known solutions instead of reinventing algorithms, catching complexity pitfalls before they become production incidents.
- Identifying a scheduling system with dependencies as topological sort on a DAG and recommending a shortest-path variant for critical-path analysis
- Recognizing a proposed vehicle routing problem as a disguised TSP instance and recommending a heuristic with a known approximation guarantee
- Designing the edit-distance recurrence for a string alignment feature, selecting bottom-up tabulation, and optimizing to O(n) space
The Algorithm Design Manual Skill
Core Philosophy
Two skills define good algorithm designers:
- Technique: knowing the fundamental design approaches (DP, DFS, backtracking, etc.)
- Modeling: abstracting a messy real-world problem into a clean, known algorithmic problem
Modeling is more important than technique. Most problems are disguised instances of classic problems. Your job is to recognize which one.
The Hitchhiker’s Mantra: Before implementing from scratch, find out what your problem is called, what is known about it, and where an implementation exists. Re-implementation is usually the wrong move.
Algorithm Analysis
Big-Oh Dominance Hierarchy
1 < log n < n < n log n < n² < n³ < 2ⁿ < n!
- O(n log n): what you can sort in — practical upper bound for most problems
- O(n²): painful but often acceptable up to n = 10,000
- O(2ⁿ): only for n ≤ 30-40 without clever tricks
- O(n!): only for n ≤ 12-15 without pruning
Logarithms Are Everywhere
- Binary search: lg n comparisons
- Balanced BST height: lg n
- Merge sort recursion depth: lg n
- Bits to represent n: lg n
- Whenever you repeatedly halve something, you get a logarithm
Data Structures: Choosing the Right One
Decision Tree
Need fast search + insert + delete?
├── Keys are integers in small range → array/bit vector
├── Ordered access needed → balanced BST (Red-Black, AVL)
├── No ordering needed, just membership → hash table
└── Text strings → suffix tree/array
Need min/max repeatedly?
└── Priority queue (binary heap)
Need connected component or cycle detection?
└── Union-Find (disjoint sets)
Need predecessor/successor efficiently?
└── Balanced BST or skip list
Hash Tables vs BSTs
| Operation | Hash Table | Balanced BST |
|---|---|---|
| Search | O(1) avg | O(log n) |
| Min/Max | O(n) | O(log n) |
| Predecessor | O(n) | O(log n) |
| Range query | O(n) | O(k + log n) |
Use hash tables when you only need equality lookup. Use BSTs when you need ordered operations.
Suffix Arrays/Trees
For string problems: substring search, longest common substring, string compression. Build suffix array in O(n log n), query in O(m log n) where m is pattern length.
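As an illustration, here is a minimal suffix-array sketch in Python. It uses the naive O(n² log n) construction for clarity (the O(n log n) constructions are considerably longer); the query is the O(m log n) binary search described above.

```python
def suffix_array(s):
    # Naive construction: sort suffix start indices by suffix text.
    # O(n^2 log n); fine for illustration, not for huge strings.
    return sorted(range(len(s)), key=lambda i: s[i:])

def contains(s, sa, pattern):
    # Binary search for the first suffix >= pattern: O(m log n) per query.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if s[sa[mid]:sa[mid] + len(pattern)] < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(sa) and s[sa[lo]:sa[lo] + len(pattern)] == pattern

sa = suffix_array("banana")   # suffixes sorted: a, ana, anana, banana, na, nana
```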
Sorting: When and What
Applications of Sorting (surprising uses)
- Uniqueness testing: sort, then scan for adjacent duplicates
- Frequency counting: sort, then count runs
- Finding closest pairs: sort, check adjacent elements only
- Set intersection: merge two sorted arrays in O(n+m)
- Scheduling problems: sort by deadline, slack, ratio
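Two of these uses sketched in Python (illustrative, not tuned):

```python
def has_duplicates(items):
    # Uniqueness testing: after sorting, any duplicate must be adjacent.
    # O(n log n) total instead of O(n^2) pairwise comparison.
    a = sorted(items)
    return any(a[i] == a[i + 1] for i in range(len(a) - 1))

def intersect_sorted(a, b):
    # Set intersection: merge two sorted lists in O(n + m),
    # keeping only the elements present in both.
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1
        elif a[i] > b[j]:
            j += 1
        else:
            out.append(a[i])
            i += 1
            j += 1
    return out
```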
Algorithm Selection
| Input | Best Choice |
|---|---|
| General data, large | Merge sort (stable, O(n log n) worst-case) |
| In-memory, don’t need stability | Quicksort (best constants, O(n log n) avg) |
| Nearly sorted | Insertion sort (O(n) on sorted input) |
| Need k-th smallest, not full sort | Quickselect O(n) avg |
| Bounded integers [0, k] | Counting sort O(n + k) |
| Large records, expensive comparison | Indirect sort (sort indices, then permute) |
Graph Algorithms
Graph Modeling (the key skill)
Almost any combinatorial problem can be modeled as a graph problem. When you see:
- Relationships between entities → undirected graph
- Dependencies, precedence → directed graph (DAG)
- Optimal connections → minimum spanning tree
- Shortest path → BFS (unweighted) or Dijkstra (positive weights) or Bellman-Ford (negative weights)
- Matching/assignment problems → bipartite matching (max flow)
- Circuit/scheduling → topological sort
BFS vs DFS: Use Cases
| BFS | DFS |
|---|---|
| Shortest path (unweighted) | Cycle detection |
| Level-by-level traversal | Topological sort |
| Connected components | Strongly connected components |
| Bipartiteness testing | Articulation vertices / bridges |
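A minimal sketch of both traversals in Python, assuming an adjacency-list dict (`adj[u]` = list of neighbors): BFS for unweighted shortest paths, DFS with three colors for directed cycle detection.

```python
from collections import deque

def bfs_dist(adj, s):
    # BFS from s: dist[v] = fewest edges from s to v (unweighted graph).
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def has_cycle(adj, n):
    # DFS with three colors: an edge into a GRAY (in-progress) vertex
    # is a back edge, which means a directed cycle.
    WHITE, GRAY, BLACK = 0, 1, 2
    color = [WHITE] * n
    def dfs(u):
        color[u] = GRAY
        for v in adj.get(u, []):
            if color[v] == GRAY or (color[v] == WHITE and dfs(v)):
                return True
        color[u] = BLACK
        return False
    return any(color[u] == WHITE and dfs(u) for u in range(n))
```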
Minimum Spanning Trees
- Prim’s: grow the tree greedily from one vertex. O(n²) with an adjacency matrix (good for dense graphs), O(m log n) with a heap.
- Kruskal’s: sort edges by weight, add each edge that doesn’t create a cycle (use union-find). O(m log n), good for sparse graphs.
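A compact Kruskal sketch with union-find (path compression only; union by rank omitted for brevity):

```python
def kruskal(n, edges):
    # edges: list of (weight, u, v) tuples on vertices 0..n-1.
    # Returns (total MST weight, list of tree edges).
    parent = list(range(n))
    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    total, tree = 0, []
    for w, u, v in sorted(edges):        # consider edges cheapest-first
        ru, rv = find(u), find(v)
        if ru != rv:                      # skip edges that would close a cycle
            parent[ru] = rv
            total += w
            tree.append((u, v))
    return total, tree
```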
Shortest Paths
Unweighted → BFS (O(n+m))
Non-negative weights → Dijkstra (O((n+m) log n) with priority queue)
Negative weights → Bellman-Ford (O(nm))
All pairs → Floyd-Warshall (O(n³))
DAG → Topological order + DP (O(n+m))
Dijkstra fails on negative edges — use Bellman-Ford if negative weights are possible.
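A heap-based Dijkstra sketch in Python, assuming non-negative weights per the rules above (`adj[u]` = list of `(v, weight)` pairs):

```python
import heapq

def dijkstra(adj, s):
    # Returns a dict of shortest distances from s.
    # Requires all weights >= 0; see Bellman-Ford otherwise.
    dist = {s: 0}
    pq = [(0, s)]                         # (distance, vertex) min-heap
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                      # stale entry, u already settled cheaper
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist
```

Lazy deletion (skipping stale heap entries) keeps the code short at the cost of extra heap items; this is the common O((n+m) log n) formulation.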
Network Flow = Bipartite Matching
With unit capacities, the max flow from source to sink equals the maximum matching in a bipartite graph. Use it to solve:
- Job/machine assignment
- Traffic routing
- Project scheduling with resource constraints
Key insight: model the problem as a flow network, use max-flow algorithm, interpret flow as the solution.
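For unit capacities the augmenting-path idea specializes to Kuhn's matching algorithm; a minimal sketch (`adj[u]` = right-side vertices adjacent to left vertex `u`):

```python
def max_bipartite_matching(adj, n_left, n_right):
    # Kuhn's algorithm: repeatedly search for an augmenting path,
    # the same idea as unit-capacity max flow. O(V * E) overall.
    match_r = [-1] * n_right              # match_r[v] = left partner of v, or -1
    def augment(u, seen):
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                # v is free, or v's current partner can be rematched elsewhere.
                if match_r[v] == -1 or augment(match_r[v], seen):
                    match_r[v] = u
                    return True
        return False
    return sum(augment(u, set()) for u in range(n_left))
```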
Dynamic Programming
Recognition Pattern
Apply DP when:
- The problem asks for an optimal value (min/max/count)
- The problem has overlapping subproblems (same subproblem solved repeatedly in recursion)
- Subproblems have optimal substructure (optimal solution uses optimal solutions to subproblems)
DP Recipe
- Define the subproblem — what question does dp[i] or dp[i][j] answer?
- Write the recurrence — express dp[i] in terms of smaller dp values
- Identify the base cases
- Determine computation order (top-down memoization or bottom-up tabulation)
- Extract the answer from the table
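Applying the recipe to edit distance, with the standard one-row space optimization (a sketch):

```python
def edit_distance(s, t):
    # Subproblem: dp[i][j] = edits to turn s[:i] into t[:j].
    # Only the previous row is needed, so space is O(len(t)).
    prev = list(range(len(t) + 1))        # base case: "" -> t[:j] takes j inserts
    for i in range(1, len(s) + 1):
        cur = [i]                          # base case: s[:i] -> "" takes i deletes
        for j in range(1, len(t) + 1):
            cur.append(min(
                prev[j] + 1,                               # delete s[i-1]
                cur[j - 1] + 1,                            # insert t[j-1]
                prev[j - 1] + (s[i - 1] != t[j - 1]),      # match / substitute
            ))
        prev = cur
    return prev[-1]
```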
Classic DP Problems
| Problem | Subproblem | Recurrence |
|---|---|---|
| Edit distance | dp[i][j] = edits to match s[0..i] to t[0..j] | min(dp[i-1][j]+1, dp[i][j-1]+1, dp[i-1][j-1]+(s[i]≠t[j])) |
| Longest increasing subsequence | dp[i] = LIS ending at index i | max(dp[j]+1) for all j < i where a[j] < a[i] |
| 0/1 knapsack | dp[i][w] = max value using first i items with weight ≤ w | max(dp[i-1][w], dp[i-1][w-w_i]+v_i) |
| Matrix chain multiply | dp[i][j] = min cost to multiply matrices i..j | min over k of dp[i][k]+dp[k+1][j]+dims |
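The LIS recurrence transcribes directly into code; this is the plain O(n²) version (an O(n log n) variant exists using patience sorting):

```python
def lis_length(a):
    # dp[i] = length of the longest increasing subsequence ending at index i.
    dp = [1] * len(a)
    for i in range(len(a)):
        for j in range(i):
            if a[j] < a[i]:                # a[i] can extend the LIS ending at j
                dp[i] = max(dp[i], dp[j] + 1)
    return max(dp, default=0)
```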
DP vs Greedy
- Greedy: make locally optimal choice at each step; no reconsideration
- DP: considers all possible splits/choices, caches subresults
- When greedy works, it’s faster. But proving greedy correctness is hard. DP is safer.
Backtracking
Structure
solve(candidate_solution):
    if is_complete(candidate_solution):
        process(candidate_solution)
        return
    for each extension of candidate_solution:
        if is_valid(extension):
            apply(extension)
            solve(candidate_solution)
            undo(extension)
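The template instantiated for n-queens, where the validity check doubles as feasibility pruning (a sketch; sets record attacked columns and diagonals):

```python
def n_queens(n):
    # Count placements of n non-attacking queens, one queen per row.
    solutions = 0
    cols, diag1, diag2 = set(), set(), set()
    def place(row):
        nonlocal solutions
        if row == n:                       # is_complete: all rows filled
            solutions += 1
            return
        for c in range(n):
            if c in cols or row + c in diag1 or row - c in diag2:
                continue                   # invalid extension, prune
            cols.add(c); diag1.add(row + c); diag2.add(row - c)   # apply
            place(row + 1)
            cols.remove(c); diag1.remove(row + c); diag2.remove(row - c)  # undo
    place(0)
    return solutions
```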
Pruning Strategies
- Feasibility pruning: stop if current partial solution can’t lead to a valid complete solution
- Bound pruning: stop if even the best possible completion can’t beat current best (branch and bound)
- Symmetry breaking: avoid generating equivalent solutions
- Ordering heuristics: try most constrained variable/value first (MRV in CSPs)
When Backtracking is the Right Tool
- Generate all valid configurations (permutations, subsets, combinations)
- Constraint satisfaction (sudoku, n-queens, graph coloring)
- Situations where enumeration with pruning is far faster than exhaustive search
Heuristic Methods for NP-Hard Problems
Simulated Annealing
When exact solution is infeasible:
- Start with any solution
- Make a random neighboring move
- Accept if better; accept if worse with probability e^(-ΔCost/T)
- Decrease temperature T over time (cooling schedule)
- At T→0, only improvements accepted (degenerates to hill-climbing)
Key parameters: initial temperature, cooling rate, neighborhood definition.
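A generic annealing skeleton following the steps above. The toy objective `f`, the neighbor move, and the parameter values are made up for the example; real uses need problem-specific neighborhoods and tuning.

```python
import math, random

def anneal(initial, neighbor, cost, t0=1.0, cooling=0.995, steps=5000):
    # neighbor(x) proposes a random move; worse moves are accepted with
    # probability exp(-delta / T); T decays geometrically each step.
    cur, cur_cost = initial, cost(initial)
    best, best_cost = cur, cur_cost
    t = t0
    for _ in range(steps):
        cand = neighbor(cur)
        delta = cost(cand) - cur_cost
        if delta <= 0 or random.random() < math.exp(-delta / t):
            cur, cur_cost = cand, cur_cost + delta
            if cur_cost < best_cost:       # remember the best solution seen
                best, best_cost = cur, cur_cost
        t *= cooling                       # cooling schedule
    return best, best_cost

# Toy use: minimize a bumpy 1-D function over the integers.
random.seed(0)
f = lambda x: (x - 7) ** 2 + 3 * (x % 3)
sol, val = anneal(0, lambda x: x + random.choice([-1, 1]), f)
```

Tracking `best` separately matters: the current solution may drift uphill during hot phases, but the best-seen solution never degrades.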
Greedy Heuristics
- Often 2-approximation or better for NP-hard problems
- Fast, simple, often surprisingly good
- Nearest neighbor for TSP: typically ~20-25% above optimal in practice, though its worst case is unboundedly bad
When to Use Approximation vs Exact
- Is n small enough for exact? (≤ 20 for brute force, ≤ 100 for DP on some problems)
- Do you need guaranteed optimal or “good enough”?
- Is there a known PTAS or constant-factor approximation?
NP-Completeness
What NP-Complete Means Practically
If a problem is NP-complete:
- Don’t waste time searching for a polynomial algorithm
- Identify which special cases are tractable (planarity? bounded treewidth? special structure?)
- Use approximation algorithms with provable bounds
- Use heuristics (simulated annealing, genetic algorithms) for practical instances
- Consider parameterized complexity (fixed-parameter tractable?)
Key NP-Complete Problems to Know
- SAT: satisfiability of boolean formulas — the canonical NP-complete problem
- 3-SAT: each clause has exactly 3 literals
- Clique: find k vertices all connected to each other
- Vertex Cover: find k vertices covering all edges
- Hamiltonian Cycle: visit every vertex exactly once
- TSP: shortest tour visiting all cities
- Subset Sum / Knapsack: find subset summing to target
- Graph Coloring: color graph with k colors, no adjacent vertices same color
Proving a Problem NP-Hard
Reduce a known NP-hard problem to your problem: “if I could solve your problem in polynomial time, I could also solve [known NP-hard problem] in polynomial time.”
How to Design Algorithms (Skiena’s Method)
- Understand the problem — work through small examples by hand
- Model the problem — what known algorithmic problem does this resemble?
- Consider simple approaches first — O(n²) beats a buggy O(n log n)
- Think about special cases — often the general case reduces to a special case you know
- Try each paradigm: DP, greedy, divide-and-conquer, backtracking, reduction to graph problem
- Seek a related problem — search the Hitchhiker’s Guide catalog