The Algorithm Design Manual

The Algorithm Design Manual, 2nd Edition · Steven S. Skiena · 730 pages

The practical algorithm designer's reference — technique + problem catalog. Skiena's 'war stories' show how problems arise in practice. The Hitchhiker's Guide catalogs 75 classic problems with implementations.

Capabilities (10)
  • Model real-world problems as known algorithmic problems (graph, DP, matching, etc.)
  • Select correct data structure from requirements (hash table vs BST vs heap vs union-find)
  • Choose sorting algorithm based on data characteristics and stability requirements
  • Apply BFS vs DFS based on problem (shortest path, cycle detection, topological sort)
  • Select shortest path algorithm: BFS / Dijkstra / Bellman-Ford / Floyd-Warshall
  • Recognize and formulate dynamic programming subproblems with recurrence relations
  • Design backtracking search with pruning strategies for constraint satisfaction
  • Apply simulated annealing and greedy heuristics for NP-hard instances
  • Identify NP-complete problems and choose: exact / approximation / heuristic approach
  • Reduce novel problems to known NP-hard problems to prove hardness
How to use

Install this skill and Claude can model any real-world problem as a known algorithmic problem, select the right data structure or graph algorithm, recognize NP-completeness, and implement or critique solutions grounded in Skiena's catalog of 75 classic problems.

Why it matters

Transforms algorithm selection from guesswork into engineering — Claude maps messy real problems to known solutions instead of reinventing algorithms, catching complexity pitfalls before they become production incidents.

Example use cases
  • Identifying a scheduling system with dependencies as topological sort on a DAG and recommending a shortest-path variant for critical-path analysis
  • Proving a proposed vehicle routing problem is a disguised TSP instance and recommending a greedy heuristic with a known approximation ratio
  • Designing the edit-distance recurrence for a string alignment feature, selecting bottom-up tabulation, and optimizing to O(n) space

The Algorithm Design Manual Skill

Core Philosophy

Two skills define good algorithm designers:

  1. Technique: knowing the fundamental design approaches (DP, DFS, backtracking, etc.)
  2. Modeling: abstracting a messy real-world problem into a clean, known algorithmic problem

Modeling is more important than technique. Most problems are disguised instances of classic problems. Your job is to recognize which one.

The Hitchhiker’s Mantra: Before implementing from scratch, find out what your problem is called, what is known about it, and where an implementation exists. Re-implementation is usually the wrong move.


Algorithm Analysis

Big-Oh Dominance Hierarchy

1 < log n < n < n log n < n² < n³ < 2ⁿ < n!
  • O(n log n): what you can sort in — practical upper bound for most problems
  • O(n²): painful but often acceptable up to n = 10,000
  • O(2ⁿ): only for n ≤ 30-40 without clever tricks
  • O(n!): only for n ≤ 12-15 without pruning

Logarithms Are Everywhere

  • Binary search: lg n comparisons
  • Balanced BST height: lg n
  • Merge sort recursion depth: lg n
  • Bits to represent n: lg n
  • Whenever you repeatedly halve something, you get a logarithm
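The halving pattern made concrete, a standard binary search sketch:

```python
def binary_search(a, target):
    """Return the index of target in sorted list a, or -1 if absent.
    Each iteration halves the search range, so it runs in O(log n)."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == target:
            return mid
        elif a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1
```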

Data Structures: Choosing the Right One

Decision Tree

Need fast search + insert + delete?
  ├── Keys are integers in small range → array/bit vector
  ├── Ordered access needed → balanced BST (Red-Black, AVL)
  ├── No ordering needed, just membership → hash table
  └── Text strings → suffix tree/array

Need min/max repeatedly?
  └── Priority queue (binary heap)

Need connected component or cycle detection?
  └── Union-Find (disjoint sets)

Need predecessor/successor efficiently?
  └── Balanced BST or skip list

Hash Tables vs BSTs

| Operation   | Hash Table | Balanced BST |
|-------------|------------|--------------|
| Search      | O(1) avg   | O(log n)     |
| Min/Max     | O(n)       | O(log n)     |
| Predecessor | O(n)       | O(log n)     |
| Range query | O(n)       | O(k + log n) |

Use hash tables when you only need equality lookup. Use BSTs when you need ordered operations.

Suffix Arrays/Trees

For string problems: substring search, longest common substring, string compression. Build suffix array in O(n log n), query in O(m log n) where m is pattern length.
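As a sketch, a naive suffix-array build with the O(m log n) membership query (sorting suffixes directly costs O(n² log n); the prefix-doubling construction achieves the O(n log n) quoted above):

```python
def suffix_array(s):
    # Naive construction: sort suffix start positions by the suffixes themselves.
    # Simple and correct, but O(n^2 log n); fine for small strings.
    return sorted(range(len(s)), key=lambda i: s[i:])

def contains(s, sa, pattern):
    """Binary search over sorted suffixes: O(m log n) per query, m = len(pattern)."""
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if s[sa[mid]:sa[mid] + len(pattern)] < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(sa) and s[sa[lo]:sa[lo] + len(pattern)] == pattern
```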


Sorting: When and What

Applications of Sorting (surprising uses)

  • Uniqueness testing: sort, then scan for adjacent duplicates
  • Frequency counting: sort, then count runs
  • Finding closest pairs: sort, check adjacent elements only
  • Set intersection: merge two sorted arrays in O(n+m)
  • Scheduling problems: sort by deadline, slack, ratio
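Several of these reduce to "sort, then scan adjacent pairs"; closest pair on a line as a minimal sketch:

```python
def closest_pair(nums):
    """Closest pair of numbers in O(n log n): after sorting, the closest
    pair must be adjacent, so one linear scan over neighbors suffices."""
    s = sorted(nums)
    return min(zip(s, s[1:]), key=lambda p: p[1] - p[0])
```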

Algorithm Selection

| Input                               | Best Choice                                 |
|-------------------------------------|---------------------------------------------|
| General data, large                 | Merge sort (stable, O(n log n) worst-case)  |
| In-memory, don't need stability     | Quicksort (best constants, O(n log n) avg)  |
| Nearly sorted                       | Insertion sort (O(n) on sorted input)       |
| Need k-th smallest, not full sort   | Quickselect, O(n) avg                       |
| Bounded integers [0, k]             | Counting sort, O(n + k)                     |
| Large records, expensive comparison | Indirect sort (sort indices, then permute)  |
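The quickselect row, sketched with list partitioning (clear but not in-place; production versions partition in place):

```python
import random

def quickselect(a, k):
    """Return the k-th smallest element (0-indexed) in O(n) expected time,
    without fully sorting: partition around a random pivot, recurse one side."""
    pivot = random.choice(a)
    lows   = [x for x in a if x < pivot]
    pivots = [x for x in a if x == pivot]
    highs  = [x for x in a if x > pivot]
    if k < len(lows):
        return quickselect(lows, k)
    elif k < len(lows) + len(pivots):
        return pivot
    else:
        return quickselect(highs, k - len(lows) - len(pivots))
```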

Graph Algorithms

Graph Modeling (the key skill)

Almost any combinatorial problem can be modeled as a graph problem. When you see:

  • Relationships between entities → undirected graph
  • Dependencies, precedence → directed graph (DAG)
  • Optimal connections → minimum spanning tree
  • Shortest path → BFS (unweighted) or Dijkstra (positive weights) or Bellman-Ford (negative weights)
  • Matching/assignment problems → bipartite matching (max flow)
  • Circuit/scheduling → topological sort

BFS vs DFS: Use Cases

| BFS                        | DFS                              |
|----------------------------|----------------------------------|
| Shortest path (unweighted) | Cycle detection                  |
| Level-by-level traversal   | Topological sort                 |
| Connected components       | Strongly connected components    |
| Bipartiteness testing      | Articulation vertices / bridges  |
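The canonical BFS use, unweighted shortest-path distances, as a sketch over an adjacency-list dict:

```python
from collections import deque

def bfs_shortest_path(adj, source):
    """Distances (in edges) from source; adj maps vertex -> list of neighbors.
    First visit to a vertex is via a shortest path, so no revisiting is needed."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:          # unvisited: record distance, enqueue
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist
```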

Minimum Spanning Trees

  • Prim’s: grow the tree greedily from one vertex. Good for dense graphs with adjacency matrix.
  • Kruskal’s: sort edges by weight, add if doesn’t create cycle (use union-find). Good for sparse graphs.
  • Both: O(m log n)
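Kruskal's fits in a few lines once union-find is available; a minimal sketch (path-halving find, no union by rank):

```python
def kruskal(n, edges):
    """MST by Kruskal's: sort edges, add each unless it closes a cycle.
    edges are (weight, u, v) with vertices 0..n-1; returns (total weight, tree edges)."""
    parent = list(range(n))

    def find(x):                       # union-find root with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, tree = 0, []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:                   # different components: edge is safe
            parent[ru] = rv
            total += w
            tree.append((u, v))
    return total, tree
```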

Shortest Paths

Unweighted → BFS (O(n+m))
Non-negative weights → Dijkstra (O((n+m) log n) with priority queue)
Negative weights → Bellman-Ford (O(nm))
All pairs → Floyd-Warshall (O(n³))
DAG → Topological order + DP (O(n+m))

Dijkstra fails on negative edges — use Bellman-Ford if negative weights are possible.
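A minimal lazy-deletion Dijkstra sketch over an adjacency-list dict (stale heap entries are skipped on pop instead of decreasing keys):

```python
import heapq

def dijkstra(adj, source):
    """Shortest-path distances with non-negative weights.
    adj maps u -> [(v, w), ...]; runs in O((n + m) log n) with a binary heap."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                   # stale entry: a shorter path was already found
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```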

Network Flow = Bipartite Matching

Max flow from source to sink = maximum matching in bipartite graph. Use to solve:

  • Job/machine assignment
  • Traffic routing
  • Project scheduling with resource constraints

Key insight: model the problem as a flow network, use max-flow algorithm, interpret flow as the solution.


Dynamic Programming

Recognition Pattern

Apply DP when:

  1. The problem asks for an optimal value (min/max/count)
  2. The problem has overlapping subproblems (same subproblem solved repeatedly in recursion)
  3. Subproblems have optimal substructure (optimal solution uses optimal solutions to subproblems)

DP Recipe

  1. Define the subproblem — what question does dp[i] or dp[i][j] answer?
  2. Write the recurrence — express dp[i] in terms of smaller dp values
  3. Identify the base cases
  4. Determine computation order (top-down memoization or bottom-up tabulation)
  5. Extract the answer from the table

Classic DP Problems

| Problem                        | Subproblem                                              | Recurrence                                                  |
|--------------------------------|---------------------------------------------------------|-------------------------------------------------------------|
| Edit distance                  | dp[i][j] = edits to match s[0..i] to t[0..j]            | min(dp[i-1][j]+1, dp[i][j-1]+1, dp[i-1][j-1]+(s[i]≠t[j]))   |
| Longest increasing subsequence | dp[i] = LIS ending at index i                           | max(dp[j]+1) for all j < i where a[j] < a[i]                |
| 0/1 knapsack                   | dp[i][w] = max value using first i items with weight ≤ w | max(dp[i-1][w], dp[i-1][w-w_i]+v_i)                        |
| Matrix chain multiply          | dp[i][j] = min cost to multiply matrices i..j           | min over k of dp[i][k]+dp[k+1][j]+dims                      |
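The edit-distance recurrence, tabulated bottom-up; keeping only the previous row cuts space from O(nm) to O(m):

```python
def edit_distance(s, t):
    """Levenshtein distance in O(len(s) * len(t)) time, O(len(t)) space:
    each dp row depends only on the row above it."""
    prev = list(range(len(t) + 1))     # base case: edits from empty prefix of s
    for i, sc in enumerate(s, 1):
        curr = [i]                     # base case: delete all i characters of s
        for j, tc in enumerate(t, 1):
            curr.append(min(prev[j] + 1,                  # delete sc
                            curr[j - 1] + 1,              # insert tc
                            prev[j - 1] + (sc != tc)))    # substitute or match
        prev = curr
    return prev[-1]
```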

DP vs Greedy

  • Greedy: make locally optimal choice at each step; no reconsideration
  • DP: considers all possible splits/choices, caches subresults
  • When greedy works, it’s faster. But proving greedy correctness is hard. DP is safer.

Backtracking

Structure

solve(candidate):
    if is_complete(candidate):
        process(candidate)
        return

    for each move extending candidate:
        if is_valid(move):
            apply(move, candidate)      # extend the partial solution in place
            solve(candidate)
            undo(move, candidate)       # restore state before trying the next move

Pruning Strategies

  1. Feasibility pruning: stop if current partial solution can’t lead to a valid complete solution
  2. Bound pruning: stop if even the best possible completion can’t beat current best (branch and bound)
  3. Symmetry breaking: avoid generating equivalent solutions
  4. Ordering heuristics: try most constrained variable/value first (MRV in CSPs)

When Backtracking is the Right Tool

  • Generate all valid configurations (permutations, subsets, combinations)
  • Constraint satisfaction (sudoku, n-queens, graph coloring)
  • Search where enumeration with pruning is much better than exhaustive search
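As a worked instance of the template with feasibility pruning and explicit undo, counting n-queens placements:

```python
def n_queens(n):
    """Count valid n-queens placements by backtracking row by row.
    Sets of occupied columns and diagonals give O(1) feasibility pruning."""
    count = 0
    cols, diag1, diag2 = set(), set(), set()

    def place(row):
        nonlocal count
        if row == n:                   # complete solution reached
            count += 1
            return
        for col in range(n):
            if col in cols or row + col in diag1 or row - col in diag2:
                continue               # prune: square is attacked
            cols.add(col); diag1.add(row + col); diag2.add(row - col)
            place(row + 1)
            cols.discard(col); diag1.discard(row + col); diag2.discard(row - col)  # undo

    place(0)
    return count
```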

Heuristic Methods for NP-Hard Problems

Simulated Annealing

When exact solution is infeasible:

  1. Start with any solution
  2. Make a random neighboring move
  3. Accept if better; accept if worse with probability e^(-ΔCost/T)
  4. Decrease temperature T over time (cooling schedule)
  5. As T→0, only improvements are accepted (degenerates to hill-climbing)

Key parameters: initial temperature, cooling rate, neighborhood definition.
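The loop above as a generic sketch; `cost`, `neighbor`, and the parameter defaults are placeholders to tune per problem:

```python
import math
import random

def anneal(cost, neighbor, x0, t0=1.0, cooling=0.995, steps=5000):
    """Generic simulated annealing: accept worse moves with prob e^(-dCost/T),
    geometric cooling, and track the best solution seen."""
    x, best = x0, x0
    t = t0
    for _ in range(steps):
        y = neighbor(x)                # random neighboring move
        delta = cost(y) - cost(x)
        if delta <= 0 or random.random() < math.exp(-delta / t):
            x = y
            if cost(x) < cost(best):
                best = x
        t *= cooling                   # cooling schedule
    return best
```

Toy usage: `anneal(lambda x: (x - 3) ** 2, lambda x: x + random.uniform(-1, 1), 0.0)` should land near the minimum at 3.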

Greedy Heuristics

  • Often 2-approximation or better for NP-hard problems
  • Fast, simple, often surprisingly good
  • Nearest neighbor for TSP: typically ~20-25% above optimal on random instances

When to Use Approximation vs Exact

  • Is n small enough for exact? (≤ 20 for brute force, ≤ 100 for DP on some problems)
  • Do you need guaranteed optimal or “good enough”?
  • Is there a known PTAS or constant-factor approximation?

NP-Completeness

What NP-Complete Means Practically

If a problem is NP-complete:

  1. Don’t waste time searching for a polynomial algorithm
  2. Identify which special cases are tractable (planarity? bounded treewidth? special structure?)
  3. Use approximation algorithms with provable bounds
  4. Use heuristics (simulated annealing, genetic algorithms) for practical instances
  5. Consider parameterized complexity (fixed-parameter tractable?)

Key NP-Complete Problems to Know

  • SAT: satisfiability of boolean formulas — the canonical NP-complete problem
  • 3-SAT: each clause has exactly 3 literals
  • Clique: find k vertices all connected to each other
  • Vertex Cover: find k vertices covering all edges
  • Hamiltonian Cycle: visit every vertex exactly once
  • TSP: shortest tour visiting all cities
  • Subset Sum / Knapsack: find subset summing to target
  • Graph Coloring: color graph with k colors, no adjacent vertices same color

Proving a Problem NP-Hard

Reduce a known NP-hard problem to your problem: “if I could solve your problem in polynomial time, I could also solve [known NP-hard problem] in polynomial time.”


How to Design Algorithms (Skiena’s Method)

  1. Understand the problem — work through small examples by hand
  2. Model the problem — what known algorithmic problem does this resemble?
  3. Consider simple approaches first — O(n²) beats a buggy O(n log n)
  4. Think about special cases — often the general case reduces to a special case you know
  5. Try each paradigm: DP, greedy, divide-and-conquer, backtracking, reduction to graph problem
  6. Seek a related problem — search the Hitchhiker’s Guide catalog