Mathematics for Computer Science
MIT's discrete mathematics course textbook — proof techniques (induction, contradiction), number theory (modular arithmetic, Fermat/Euler), graph theory (trees, coloring, planar, Hall's theorem), recurrences (Master Theorem, linear recurrences), counting (combinations, inclusion-exclusion, pigeonhole), probability (Bayes, independence, birthday paradox), random variables (binomial, geometric), expected value (linearity), and tail bounds (Markov, Chernoff).
- Apply induction (simple and strong) to prove algorithm correctness
- Use Fermat's Little Theorem and Euler's Theorem to compute modular inverses
- Analyze graphs: connectivity, trees, bipartiteness, planarity, Hall's theorem
- Solve divide-and-conquer recurrences with the Master Theorem
- Count arrangements using sum/product rules, combinations, inclusion-exclusion, pigeonhole
- Apply Bayes' theorem correctly to conditional probability problems
- Use linearity of expectation (works even for dependent variables)
- Apply tail bounds: Markov (crude), Chebyshev (uses variance), Chernoff (exponential, tightest)
- Explain the birthday paradox and its implications for hash collisions and cryptography
Install this skill and Claude can construct formal induction proofs for algorithm correctness, apply number-theoretic results for cryptographic modular arithmetic, solve recurrences with the Master Theorem, and select and apply the tightest tail bound for any probabilistic argument.
Provides the rigorous mathematical foundation that algorithm analysis, cryptography, and randomized systems require — separating engineers who can prove correctness from those who can only test it.
- Constructing a strong induction proof that a recursive divide-and-conquer algorithm maintains its invariant, then solving the recurrence with the Master Theorem
- Computing the expected number of insertions before a hash collision using the birthday paradox and recommending a minimum hash width for a given false-positive budget
- Using Fermat's Little Theorem to compute a modular inverse and walking through why RSA correctness depends on Euler's theorem applied to the product of two large primes
Mathematics for Computer Science Skill
Proof Techniques (Chapters 1-3)
Direct Proof
Prove P → Q by assuming P and deriving Q through logical steps.
Proof by Contradiction
Assume ¬Q, show it leads to contradiction:
Claim: √2 is irrational.
Assume √2 = p/q (rational, lowest terms).
Then 2 = p²/q² → p² = 2q² → p is even → p = 2k
→ 4k² = 2q² → q² = 2k² → q is even.
Contradiction: both even, but assumed lowest terms.
Induction Template
Claim: P(n) holds for all n ≥ 0.
Base case: Verify P(0) directly.
Inductive step: Assume P(n) (inductive hypothesis).
Prove P(n+1) using the hypothesis.
Strong Induction
Can assume P(0), P(1), …, P(n) to prove P(n+1). Useful when n+1 depends on earlier values, not just n.
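The recursive shape of strong induction shows up directly in code. A minimal Python sketch (the function name is illustrative): the classic claim "every integer ≥ 2 is a product of primes" is proved by strong induction, and the corresponding recursion relies on the claim holding for both smaller factors, not just for n − 1.

```python
from math import prod

def prime_factorization(n: int) -> list[int]:
    """Return a list of primes whose product is n (requires n >= 2).

    Mirrors the strong-induction proof: either n has no nontrivial
    divisor (so n is prime), or n = d * (n // d) and we invoke the
    hypothesis on BOTH smaller values, not just on n - 1.
    """
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return prime_factorization(d) + prime_factorization(n // d)
    return [n]  # no divisor found: n itself is prime

assert sorted(prime_factorization(60)) == [2, 2, 3, 5]
assert prod(prime_factorization(60)) == 60
```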
Number Theory (Chapters 4-5)
Modular Arithmetic
- a ≡ b (mod n) iff n divides (a − b)
- Addition, multiplication, and exponentiation all work mod n
gcd(a, b): Euclidean algorithm: gcd(a, b) = gcd(b, a mod b), base case gcd(a, 0) = a
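The Euclidean algorithm is a two-line loop; a minimal Python sketch:

```python
def gcd(a: int, b: int) -> int:
    """Euclidean algorithm: gcd(a, b) = gcd(b, a mod b); base case gcd(a, 0) = a."""
    while b != 0:
        a, b = b, a % b
    return a

assert gcd(252, 198) == 18
```

(Python's standard library already provides this as `math.gcd`.)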
Key Theorems
Fermat’s Little Theorem: if p is prime and p ∤ a, then a^(p-1) ≡ 1 (mod p)
→ Used to find multiplicative inverses: a^(-1) ≡ a^(p-2) (mod p)
Euler’s Theorem: a^φ(n) ≡ 1 (mod n) where φ(n) is Euler’s totient function
→ For prime p: φ(p) = p-1
→ For n = pq (distinct primes): φ(n) = (p-1)(q-1) (basis of RSA)
Fundamental Theorem of Arithmetic: every integer > 1 has a unique prime factorization.
Extended Euclidean Algorithm: finds x, y such that ax + by = gcd(a,b) — used to compute multiplicative inverses mod n.
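Both inverse-finding routes can be checked against each other in a few lines of Python. This sketch (function name illustrative) computes 7⁻¹ mod 101 via Fermat's Little Theorem and via the extended Euclidean algorithm; the Fermat route requires the modulus to be prime, while extended Euclid only needs gcd(a, n) = 1.

```python
def ext_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Extended Euclid: return (g, x, y) with a*x + b*y = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

p, a = 101, 7                       # p prime, p does not divide a
inv_fermat = pow(a, p - 2, p)       # Fermat: a^(p-2) mod p
g, x, _ = ext_gcd(a, p)             # works for any modulus coprime to a
inv_euclid = x % p

assert g == 1
assert inv_fermat == inv_euclid
assert (a * inv_fermat) % p == 1    # it really is the inverse
```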
Graph Theory (Chapters 6-7)
Definitions
- Graph G = (V, E): vertices V, edges E ⊆ V×V
- Degree: number of edges incident on a vertex; handshake lemma: Σ deg(v) = 2|E|
- Path: sequence of distinct vertices connected by edges
- Connected: a path exists between every pair of vertices
- Cycle: closed walk that returns to its starting vertex, repeating no other vertex
Trees
A tree is a connected graph with no cycles.
- n vertices, n-1 edges
- Any two vertices connected by exactly one path
- Removing any edge disconnects it; adding any edge creates a cycle
Spanning tree: subgraph that is a tree containing all vertices.
Coloring
- k-colorable: vertices can be colored with k colors such that no two adjacent vertices share a color
- Bipartite: 2-colorable; equivalent to having no odd cycles
- 4-Color Theorem: every planar graph is 4-colorable
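The "bipartite iff 2-colorable iff no odd cycle" equivalence is constructive: a breadth-first search either produces a valid 2-coloring or finds two adjacent vertices forced to the same color, witnessing an odd cycle. A minimal Python sketch (names illustrative, graph given as an adjacency dict):

```python
from collections import deque

def two_color(adj):
    """Return a 2-coloring {vertex: 0 or 1} of an undirected graph,
    or None if the graph contains an odd cycle (i.e. is not bipartite)."""
    color = {}
    for start in adj:                      # handle disconnected graphs
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]   # alternate colors by BFS layer
                    queue.append(v)
                elif color[v] == color[u]:
                    return None               # same-layer edge: odd cycle
    return color

square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}   # 4-cycle: bipartite
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}            # odd cycle: not
assert two_color(square) is not None
assert two_color(triangle) is None
```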
Euler’s Formula (Planar Graphs)
For a connected planar graph: V - E + F = 2 (V vertices, E edges, F faces including outer face)
Hall’s Marriage Theorem
A bipartite graph G with parts X and Y has a matching that saturates X (a perfect matching when |X| = |Y|) iff for every subset S ⊆ X: |N(S)| ≥ |S| (S has at least as many neighbors as members).
Sums and Asymptotics (Chapters 10-11)
Common Sums
Geometric sum: 1 + r + r² + ... + r^n = (r^(n+1) - 1) / (r - 1) for r ≠ 1
Powers of 2: 1 + 2 + 4 + ... + 2^n = 2^(n+1) - 1
Harmonic: 1 + 1/2 + 1/3 + ... + 1/n = Θ(ln n)
Quadratic: 1 + 4 + 9 + ... + n² = n(n+1)(2n+1)/6 = Θ(n³)
Asymptotic Notation
- f = O(g): f grows at most as fast as g
- f = Ω(g): f grows at least as fast as g
- f = Θ(g): f and g grow at the same rate
Big-O hierarchy (slowest → fastest growth):
1 < log n < √n < n < n log n < n² < n³ < 2^n < n!
Recurrences (Chapters 12-13)
Solving Divide-and-Conquer Recurrences
For T(n) = aT(n/b) + f(n):
Master Theorem (simplified):
If f(n) = O(n^(log_b(a) - ε)): T(n) = Θ(n^log_b(a)) — leaves dominate
If f(n) = Θ(n^log_b(a)): T(n) = Θ(n^log_b(a) · log n) — balanced
If f(n) = Ω(n^(log_b(a) + ε)): T(n) = Θ(f(n)) — root dominates (also requires the regularity condition a·f(n/b) ≤ c·f(n) for some c < 1)
Merge sort: T(n) = 2T(n/2) + Θ(n) → T(n) = Θ(n log n) (balanced case)
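For the common special case f(n) = Θ(n^d), the three cases reduce to comparing d against log_b(a). A minimal Python sketch of that comparison (function name illustrative; this handles only polynomial f, not the full theorem):

```python
import math

def master_theorem(a: float, b: float, d: float) -> str:
    """Classify T(n) = a*T(n/b) + Θ(n^d) under the simplified Master Theorem."""
    crit = math.log(a, b)             # critical exponent log_b(a)
    if math.isclose(d, crit):
        return f"Θ(n^{crit:g} · log n)"   # balanced case
    if d < crit:
        return f"Θ(n^{crit:g})"           # leaves dominate
    return f"Θ(n^{d:g})"                  # root dominates

assert master_theorem(2, 2, 1) == "Θ(n^1 · log n)"   # merge sort
assert master_theorem(4, 2, 1) == "Θ(n^2)"           # 4 half-size subproblems, linear merge
assert master_theorem(1, 2, 1) == "Θ(n^1)"           # one half-size subproblem, linear work
```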
Linear Recurrences
a_n = c₁a_(n-1) + c₂a_(n-2) + ... + c_k a_(n-k)
Solution: find roots of characteristic polynomial x^k - c₁x^(k-1) - ... - c_k = 0:
- Distinct roots r₁, r₂, …: a_n = A₁r₁^n + A₂r₂^n + ...
- A root r repeated m times contributes terms r^n, n·r^n, …, n^(m-1)·r^n
- Fit the constants A₁, A₂, … using initial conditions
Fibonacci: F(n) = F(n-1) + F(n-2) → characteristic: x² - x - 1 = 0 → roots φ = (1+√5)/2 ≈ 1.618, ψ = (1-√5)/2
Counting (Chapters 14-16)
Fundamental Rules
- Sum rule: if A and B are disjoint, |A ∪ B| = |A| + |B|
- Product rule: |A × B| = |A| × |B|
- Bijection rule: if bijection A → B exists, |A| = |B|
- Division rule: if k-to-1 function from A → B, |A| = k × |B|
Key Formulas
Permutations of n things, r at a time: P(n,r) = n!/(n-r)!
Combinations (choose r from n): C(n,r) = n! / (r! × (n-r)!) = "n choose r"
Arrangements with repeats: n! / (k₁! × k₂! × ... × kᵣ!) [Bookkeeper rule]
Inclusion-Exclusion
|A ∪ B| = |A| + |B| - |A ∩ B|
|A ∪ B ∪ C| = |A| + |B| + |C| - |A∩B| - |A∩C| - |B∩C| + |A∩B∩C|
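The three-set identity is easy to verify by brute force on concrete sets, here the multiples of 2, 3, and 5 below 30 (the sets are illustrative):

```python
# Brute-force check of inclusion-exclusion on three concrete sets.
A = set(range(0, 30, 2))   # multiples of 2 below 30
B = set(range(0, 30, 3))   # multiples of 3 below 30
C = set(range(0, 30, 5))   # multiples of 5 below 30

lhs = len(A | B | C)
rhs = (len(A) + len(B) + len(C)
       - len(A & B) - len(A & C) - len(B & C)
       + len(A & B & C))
assert lhs == rhs   # both count each element exactly once
```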
Pigeonhole Principle
If n+1 objects are placed in n bins, some bin contains ≥ 2 objects. Strong form: if N objects in k bins, some bin contains ≥ ⌈N/k⌉ objects.
Binomial Theorem
(x + y)^n = Σ C(n,k) × x^k × y^(n-k)
Probability (Chapters 18-20)
Four-Step Method
- Define sample space Ω (all equally likely outcomes, or weight by probability)
- Define event E ⊆ Ω
- Assign probabilities to outcomes
- Compute Pr[E] = Σ Pr[ω] for ω ∈ E
Conditional Probability and Bayes’ Theorem
Pr[A | B] = Pr[A ∩ B] / Pr[B]
Bayes' Theorem:
Pr[A | B] = Pr[B | A] × Pr[A] / Pr[B]
Medical test example: even a 99% accurate test on a rare disease (1 in 1000) gives <10% positive predictive value — because false positives dominate.
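The medical-test numbers follow directly from Bayes' theorem; a short Python check, assuming "99% accurate" means both sensitivity Pr[+ | disease] and specificity Pr[− | healthy] equal 0.99:

```python
prevalence = 1 / 1000     # Pr[disease]
sensitivity = 0.99        # Pr[positive | disease]
specificity = 0.99        # Pr[negative | healthy]

# Total probability of testing positive: true positives + false positives.
pr_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes: Pr[disease | positive] = Pr[positive | disease] * Pr[disease] / Pr[positive]
ppv = sensitivity * prevalence / pr_pos

assert ppv < 0.10   # about 9%: false positives from the healthy 99.9% dominate
```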
Independence
- A and B independent iff Pr[A ∩ B] = Pr[A] × Pr[B]
- Mutual independence (stronger than pairwise): every subset of the events is independent
Birthday Paradox
In a group of n people, Pr[some pair shares a birthday] reaches 50% when n ≈ 23. Birthday principle: expect a collision after about √(2 ln 2 · m) ≈ 1.18√m samples from m equally likely values (m = 365 for birthdays). → For a b-bit hash (m = 2^b values), expect a collision after roughly 2^(b/2) attempts.
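The exact collision probability is easiest to compute via its complement: the probability that all n samples are distinct is a product of shrinking fractions. A minimal Python sketch (function name illustrative) confirming that 23 is the crossover point for birthdays:

```python
def no_collision_prob(n: int, m: int = 365) -> float:
    """Pr[n uniform samples from m values are all distinct]
    = (m/m) * ((m-1)/m) * ... * ((m-n+1)/m)."""
    p = 1.0
    for i in range(n):
        p *= (m - i) / m
    return p

# Collision probability crosses 50% between 22 and 23 people.
assert 1 - no_collision_prob(22) < 0.5 < 1 - no_collision_prob(23)
```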
Random Variables and Distributions (Chapters 21-23)
Distributions
| Distribution | PMF | E[X] | Var[X] |
|---|---|---|---|
| Bernoulli(p) | Pr[X=1]=p, Pr[X=0]=1-p | p | p(1-p) |
| Binomial(n,p) | C(n,k)p^k(1-p)^(n-k) | np | np(1-p) |
| Geometric(p) | (1-p)^(k-1)p | 1/p | (1-p)/p² |
Linearity of Expectation
E[X₁ + X₂ + ... + Xₙ] = E[X₁] + E[X₂] + ... + E[Xₙ]
Works even when variables are not independent — extremely powerful.
Coupon collector: expected number of trials to collect all n coupons:
E[trials] = n × (1 + 1/2 + 1/3 + ... + 1/n) = n × H_n ≈ n ln n
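The n·H_n formula comes from linearity: the wait for the k-th new coupon is geometric with success probability (n−k+1)/n, so the expectations n/n + n/(n−1) + ... + n/1 sum to n·H_n. A quick simulation check (function name illustrative; tolerance and trial counts are arbitrary choices):

```python
import random

def coupon_trials(n: int) -> int:
    """Draw uniform coupons until all n types have been seen; return the draw count."""
    seen, trials = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))
        trials += 1
    return trials

random.seed(0)
n = 50
h_n = sum(1 / k for k in range(1, n + 1))
avg = sum(coupon_trials(n) for _ in range(2000)) / 2000

# Empirical mean should land close to the theoretical n * H_n (~225 for n = 50).
assert abs(avg - n * h_n) / (n * h_n) < 0.1
```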
Tail Bounds (Chapter 24)
Markov’s Inequality
For non-negative X: Pr[X ≥ t] ≤ E[X] / t
→ Crude but always applicable
Chebyshev’s Inequality
Pr[|X - μ| ≥ k·σ] ≤ 1/k²
→ Uses variance; much tighter than Markov
Chernoff Bounds
For sum of independent 0/1 variables with mean μ:
Pr[X ≥ (1+δ)μ] ≤ e^(-μδ²/3) for 0 < δ ≤ 1
→ Exponentially tight; most useful in practice
Union Bound: Pr[A₁ ∪ A₂ ∪ ... ∪ Aₙ] ≤ Pr[A₁] + Pr[A₂] + ... + Pr[Aₙ]
→ Combine with Chernoff to bound “bad event” probability across n trials
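The "crude → tighter → exponentially tight" ordering is concrete on an example: bounding Pr[X ≥ 75] for X ~ Binomial(100, 0.5), i.e. 75+ heads in 100 fair flips (μ = 50, σ² = 25, so 75 = (1+δ)μ with δ = 0.5 and is 5σ above the mean):

```python
import math

n, p, t = 100, 0.5, 75
mu = n * p                  # E[X] = 50
var = n * p * (1 - p)       # Var[X] = 25
delta = t / mu - 1          # 0.5

markov = mu / t                             # Markov:    Pr[X >= t] <= E[X]/t
chebyshev = var / (t - mu) ** 2             # Chebyshev: Pr[|X-mu| >= k*sigma] <= 1/k^2
chernoff = math.exp(-mu * delta ** 2 / 3)   # Chernoff form above, valid for 0 < delta <= 1

# ~0.667 vs 0.04 vs ~0.016: each bound is strictly tighter than the last.
assert markov > chebyshev > chernoff
```

(The true probability is far smaller still, about 4 × 10⁻⁷; even Chernoff is only an upper bound.)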