Mathematics for Computer Science
MIT's discrete mathematics course textbook — proof techniques (induction, contradiction), number theory (modular arithmetic, Fermat/Euler), graph theory (trees, coloring, planar, Hall's theorem), recurrences (Master Theorem, linear recurrences), counting (combinations, inclusion-exclusion, pigeonhole), probability (Bayes, independence, birthday paradox), random variables (binomial, geometric), expected value (linearity), and tail bounds (Markov, Chernoff).
- Apply induction (simple and strong) to prove algorithm correctness
- Use Fermat's Little Theorem and Euler's Theorem to compute modular inverses
- Analyze graphs: connectivity, trees, bipartiteness, planarity, Hall's theorem
- Solve divide-and-conquer recurrences with the Master Theorem
- Count arrangements using sum/product rules, combinations, inclusion-exclusion, pigeonhole
- Apply Bayes' theorem correctly to conditional probability problems
- Use linearity of expectation (works even for dependent variables)
- Apply tail bounds: Markov (crude), Chebyshev (uses variance), Chernoff (exponential, tightest)
- Explain the birthday paradox and its implications for hash collisions and cryptography
Install this skill and Claude can construct formal induction proofs for algorithm correctness, apply number-theoretic results for cryptographic modular arithmetic, solve recurrences with the Master Theorem, and select and apply the tightest tail bound for any probabilistic argument.
Provides the rigorous mathematical foundation that algorithm analysis, cryptography, and randomized systems require — separating engineers who can prove correctness from those who can only test it.
- Constructing a strong induction proof that a recursive divide-and-conquer algorithm maintains its invariant, then solving the recurrence with the Master Theorem
- Computing the expected number of insertions before a hash collision using the birthday paradox and recommending a minimum hash width for a given false-positive budget
- Using Fermat's Little Theorem to compute a modular inverse and walking through why RSA correctness depends on Euler's theorem applied to the product of two large primes
Mathematics for Computer Science Skill
Proof Techniques (Chapters 1-3)
Direct Proof
Prove P → Q by assuming P and deriving Q through logical steps.
Proof by Contradiction
Assume ¬Q, show it leads to contradiction:
Claim: √2 is irrational.
Assume √2 = p/q (rational, lowest terms).
Then 2 = p²/q² → p² = 2q² → p is even → p = 2k
→ 4k² = 2q² → q² = 2k² → q is even.
Contradiction: both even, but assumed lowest terms.
Induction Template
Claim: P(n) holds for all n ≥ 0.
Base case: Verify P(0) directly.
Inductive step: Assume P(n) (inductive hypothesis).
Prove P(n+1) using the hypothesis.
Strong Induction
Can assume P(0), P(1), …, P(n) to prove P(n+1). Useful when n+1 depends on earlier values, not just n.
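The recursive shape of strong induction shows up directly in code. A minimal Python sketch (the function name is illustrative): the classic claim "every integer ≥ 2 is a product of primes" is proved by strong induction, and the corresponding recursion relies on the claim holding for both smaller factors, not just for n − 1.

```python
from math import prod

def prime_factorization(n: int) -> list[int]:
    """Return a list of primes whose product is n (requires n >= 2).

    Mirrors the strong-induction proof: either n has no nontrivial
    divisor (so n is prime), or n = d * (n // d) and we invoke the
    hypothesis on BOTH smaller values, not just on n - 1.
    """
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return prime_factorization(d) + prime_factorization(n // d)
    return [n]  # no divisor found: n itself is prime

assert sorted(prime_factorization(60)) == [2, 2, 3, 5]
assert prod(prime_factorization(60)) == 60
```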
Number Theory (Chapters 4-5)
Modular Arithmetic
- a ≡ b (mod n) iff n divides (a − b)
- Addition, multiplication, and exponentiation all work mod n
gcd(a, b): Euclidean algorithm: gcd(a, b) = gcd(b, a mod b), base case gcd(a, 0) = a
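The Euclidean algorithm is a two-line loop; a minimal Python sketch:

```python
def gcd(a: int, b: int) -> int:
    """Euclidean algorithm: gcd(a, b) = gcd(b, a mod b); base case gcd(a, 0) = a."""
    while b != 0:
        a, b = b, a % b
    return a

assert gcd(252, 198) == 18
```

(Python's standard library already provides this as `math.gcd`.)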
Key Theorems
Fermat’s Little Theorem: if p is prime and p ∤ a, then a^(p-1) ≡ 1 (mod p)
→ Used to find multiplicative inverses: a^(-1) ≡ a^(p-2) (mod p)
Euler’s Theorem: a^φ(n) ≡ 1 (mod n) where φ(n) is Euler’s totient function
→ For prime p: φ(p) = p-1
→ For n = pq (distinct primes): φ(n) = (p-1)(q-1) (basis of RSA)
Fundamental Theorem of Arithmetic: every integer > 1 has a unique prime factorization.
Extended Euclidean Algorithm: finds x, y such that ax + by = gcd(a,b) — used to compute multiplicative inverses mod n.
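Both inverse-finding routes can be checked against each other in a few lines of Python. This sketch (function name illustrative) computes 7⁻¹ mod 101 via Fermat's Little Theorem and via the extended Euclidean algorithm; the Fermat route requires the modulus to be prime, while extended Euclid only needs gcd(a, n) = 1.

```python
def ext_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Extended Euclid: return (g, x, y) with a*x + b*y = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

p, a = 101, 7                       # p prime, p does not divide a
inv_fermat = pow(a, p - 2, p)       # Fermat: a^(p-2) mod p
g, x, _ = ext_gcd(a, p)             # works for any modulus coprime to a
inv_euclid = x % p

assert g == 1
assert inv_fermat == inv_euclid
assert (a * inv_fermat) % p == 1    # it really is the inverse
```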
Graph Theory (Chapters 6-7)
Definitions
- Graph G = (V, E): vertices V, edges E ⊆ V×V
- Degree: number of edges incident on a vertex; handshake lemma: Σ deg(v) = 2|E|
- Path: sequence of distinct vertices connected by edges
- Connected: a path exists between every pair of vertices
- Cycle: closed walk that returns to its starting vertex, repeating no other vertex
Trees
A tree is a connected graph with no cycles.
- n vertices, n-1 edges
- Any two vertices connected by exactly one path
- Removing any edge disconnects it; adding any edge creates a cycle
Spanning tree: subgraph that is a tree containing all vertices.
Coloring
- k-colorable: vertices can be colored with k colors such that no two adjacent vertices share a color
- Bipartite: 2-colorable; equivalent to having no odd cycles
- 4-Color Theorem: every planar graph is 4-colorable
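The "bipartite iff 2-colorable iff no odd cycle" equivalence is constructive: a breadth-first search either produces a valid 2-coloring or finds two adjacent vertices forced to the same color, witnessing an odd cycle. A minimal Python sketch (names illustrative, graph given as an adjacency dict):

```python
from collections import deque

def two_color(adj):
    """Return a 2-coloring {vertex: 0 or 1} of an undirected graph,
    or None if the graph contains an odd cycle (i.e. is not bipartite)."""
    color = {}
    for start in adj:                      # handle disconnected graphs
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]   # alternate colors by BFS layer
                    queue.append(v)
                elif color[v] == color[u]:
                    return None               # same-layer edge: odd cycle
    return color

square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}   # 4-cycle: bipartite
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}            # odd cycle: not
assert two_color(square) is not None
assert two_color(triangle) is None
```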
Euler’s Formula (Planar Graphs)
For a connected planar graph: V - E + F = 2 (V vertices, E edges, F faces including outer face)
Hall’s Marriage Theorem
A bipartite graph G with parts X and Y has a matching that saturates X (a perfect matching when |X| = |Y|) iff for every subset S ⊆ X: |N(S)| ≥ |S| (S has at least as many neighbors as members).
Sums and Asymptotics (Chapters 10-11)
Common Sums
Geometric sum: 1 + r + r² + ... + r^n = (r^(n+1) - 1) / (r - 1) for r ≠ 1
Powers of 2: 1 + 2 + 4 + ... + 2^n = 2^(n+1) - 1
Harmonic: 1 + 1/2 + 1/3 + ... + 1/n = Θ(ln n)
Quadratic: 1 + 4 + 9 + ... + n² = n(n+1)(2n+1)/6 = Θ(n³)
Asymptotic Notation
- f = O(g): f grows at most as fast as g
- f = Ω(g): f grows at least as fast as g
- f = Θ(g): f and g grow at the same rate
Big-O hierarchy (slowest → fastest growth):
1 < log n < √n < n < n log n < n² < n³ < 2^n < n!
Recurrences (Chapters 12-13)
Solving Divide-and-Conquer Recurrences
For T(n) = aT(n/b) + f(n):
Master Theorem (simplified):
If f(n) = O(n^(log_b(a) - ε)): T(n) = Θ(n^log_b(a)) — leaves dominate
If f(n) = Θ(n^log_b(a)): T(n) = Θ(n^log_b(a) · log n) — balanced
If f(n) = Ω(n^(log_b(a) + ε)): T(n) = Θ(f(n)) — root dominates (also requires the regularity condition a·f(n/b) ≤ c·f(n) for some c < 1)
Merge sort: T(n) = 2T(n/2) + Θ(n) → T(n) = Θ(n log n) (balanced case)
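For the common special case f(n) = Θ(n^d), the three cases reduce to comparing d against log_b(a). A minimal Python sketch of that comparison (function name illustrative; this handles only polynomial f, not the full theorem):

```python
import math

def master_theorem(a: float, b: float, d: float) -> str:
    """Classify T(n) = a*T(n/b) + Θ(n^d) under the simplified Master Theorem."""
    crit = math.log(a, b)             # critical exponent log_b(a)
    if math.isclose(d, crit):
        return f"Θ(n^{crit:g} · log n)"   # balanced case
    if d < crit:
        return f"Θ(n^{crit:g})"           # leaves dominate
    return f"Θ(n^{d:g})"                  # root dominates

assert master_theorem(2, 2, 1) == "Θ(n^1 · log n)"   # merge sort
assert master_theorem(4, 2, 1) == "Θ(n^2)"           # 4 half-size subproblems, linear merge
assert master_theorem(1, 2, 1) == "Θ(n^1)"           # one half-size subproblem, linear work
```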
Linear Recurrences
a_n = c₁a_(n-1) + c₂a_(n-2) + ... + c_k a_(n-k)
Solution: find roots of characteristic polynomial x^k - c₁x^(k-1) - ... - c_k = 0:
- Distinct roots r₁, r₂, …: a_n = A₁r₁^n + A₂r₂^n + ...
- A root r repeated m times contributes terms r^n, n·r^n, …, n^(m-1)·r^n
- Fit the constants A₁, A₂, … using initial conditions
Fibonacci: F(n) = F(n-1) + F(n-2) → characteristic: x² - x - 1 = 0 → roots φ = (1+√5)/2 ≈ 1.618, ψ = (1-√5)/2
Counting (Chapters 14-16)
Fundamental Rules
- Sum rule: if A and B are disjoint, |A ∪ B| = |A| + |B|
- Product rule: |A × B| = |A| × |B|
- Bijection rule: if bijection A → B exists, |A| = |B|
- Division rule: if k-to-1 function from A → B, |A| = k × |B|
Key Formulas
Permutations of n things, r at a time: P(n,r) = n!/(n-r)!
Combinations (choose r from n): C(n,r) = n! / (r! × (n-r)!) = "n choose r"
Arrangements with repeats: n! / (k₁! × k₂! × ... × kᵣ!) [Bookkeeper rule]
Inclusion-Exclusion
|A ∪ B| = |A| + |B| - |A ∩ B|
|A ∪ B ∪ C| = |A| + |B| + |C| - |A∩B| - |A∩C| - |B∩C| + |A∩B∩C|
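The three-set identity is easy to verify by brute force on concrete sets, here the multiples of 2, 3, and 5 below 30 (the sets are illustrative):

```python
# Brute-force check of inclusion-exclusion on three concrete sets.
A = set(range(0, 30, 2))   # multiples of 2 below 30
B = set(range(0, 30, 3))   # multiples of 3 below 30
C = set(range(0, 30, 5))   # multiples of 5 below 30

lhs = len(A | B | C)
rhs = (len(A) + len(B) + len(C)
       - len(A & B) - len(A & C) - len(B & C)
       + len(A & B & C))
assert lhs == rhs   # both count each element exactly once
```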
Pigeonhole Principle
If n+1 objects are placed in n bins, some bin contains ≥ 2 objects. Strong form: if N objects in k bins, some bin contains ≥ ⌈N/k⌉ objects.
Binomial Theorem
(x + y)^n = Σ C(n,k) × x^k × y^(n-k)
Probability (Chapters 18-20)
Four-Step Method
- Define sample space Ω (all equally likely outcomes, or weight by probability)
- Define event E ⊆ Ω
- Assign probabilities to outcomes
- Compute Pr[E] = Σ Pr[ω] for ω ∈ E
Conditional Probability and Bayes’ Theorem
Pr[A | B] = Pr[A ∩ B] / Pr[B]
Bayes' Theorem:
Pr[A | B] = Pr[B | A] × Pr[A] / Pr[B]
Medical test example: even a 99% accurate test on a rare disease (1 in 1000) gives <10% positive predictive value — because false positives dominate.
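The medical-test numbers follow directly from Bayes' theorem; a short Python check, assuming "99% accurate" means both sensitivity Pr[+ | disease] and specificity Pr[− | healthy] equal 0.99:

```python
prevalence = 1 / 1000     # Pr[disease]
sensitivity = 0.99        # Pr[positive | disease]
specificity = 0.99        # Pr[negative | healthy]

# Total probability of testing positive: true positives + false positives.
pr_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes: Pr[disease | positive] = Pr[positive | disease] * Pr[disease] / Pr[positive]
ppv = sensitivity * prevalence / pr_pos

assert ppv < 0.10   # about 9%: false positives from the healthy 99.9% dominate
```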
Independence
- A and B independent iff Pr[A ∩ B] = Pr[A] × Pr[B]
- Mutual independence (stronger than pairwise): every subset of the events is independent
Birthday Paradox
In a group of n people, Pr[some pair shares a birthday] reaches 50% when n ≈ 23. Birthday principle: expect a collision after about √(2 ln 2 · m) ≈ 1.18√m samples from m equally likely values (m = 365 for birthdays). → For a b-bit hash (m = 2^b values), expect a collision after roughly 2^(b/2) attempts.
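The exact collision probability is easiest to compute via its complement: the probability that all n samples are distinct is a product of shrinking fractions. A minimal Python sketch (function name illustrative) confirming that 23 is the crossover point for birthdays:

```python
def no_collision_prob(n: int, m: int = 365) -> float:
    """Pr[n uniform samples from m values are all distinct]
    = (m/m) * ((m-1)/m) * ... * ((m-n+1)/m)."""
    p = 1.0
    for i in range(n):
        p *= (m - i) / m
    return p

# Collision probability crosses 50% between 22 and 23 people.
assert 1 - no_collision_prob(22) < 0.5 < 1 - no_collision_prob(23)
```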
Random Variables and Distributions (Chapters 21-23)
Distributions
| Distribution | PMF | E[X] | Var[X] |
|---|---|---|---|
| Bernoulli(p) | Pr[X=1]=p, Pr[X=0]=1-p | p | p(1-p) |
| Binomial(n,p) | C(n,k)p^k(1-p)^(n-k) | np | np(1-p) |
| Geometric(p) | (1-p)^(k-1)p | 1/p | (1-p)/p² |
Linearity of Expectation
E[X₁ + X₂ + ... + Xₙ] = E[X₁] + E[X₂] + ... + E[Xₙ]
Works even when variables are not independent — extremely powerful.
Coupon collector: expected number of trials to collect all n coupons:
E[trials] = n × (1 + 1/2 + 1/3 + ... + 1/n) = n × H_n ≈ n ln n
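The n·H_n formula comes from linearity: the wait for the k-th new coupon is geometric with success probability (n−k+1)/n, so the expectations n/n + n/(n−1) + ... + n/1 sum to n·H_n. A quick simulation check (function name illustrative; tolerance and trial counts are arbitrary choices):

```python
import random

def coupon_trials(n: int) -> int:
    """Draw uniform coupons until all n types have been seen; return the draw count."""
    seen, trials = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))
        trials += 1
    return trials

random.seed(0)
n = 50
h_n = sum(1 / k for k in range(1, n + 1))
avg = sum(coupon_trials(n) for _ in range(2000)) / 2000

# Empirical mean should land close to the theoretical n * H_n (~225 for n = 50).
assert abs(avg - n * h_n) / (n * h_n) < 0.1
```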
Tail Bounds (Chapter 24)
Markov’s Inequality
For non-negative X: Pr[X ≥ t] ≤ E[X] / t
→ Crude but always applicable
Chebyshev’s Inequality
Pr[|X - μ| ≥ k·σ] ≤ 1/k²
→ Uses variance; much tighter than Markov
Chernoff Bounds
For sum of independent 0/1 variables with mean μ:
Pr[X ≥ (1+δ)μ] ≤ e^(-μδ²/3) for 0 < δ ≤ 1
→ Exponentially tight; most useful in practice
Union Bound: Pr[A₁ ∪ A₂ ∪ ... ∪ Aₙ] ≤ Pr[A₁] + Pr[A₂] + ... + Pr[Aₙ]
→ Combine with Chernoff to bound “bad event” probability across n trials
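The "crude → tighter → exponentially tight" ordering is concrete on an example: bounding Pr[X ≥ 75] for X ~ Binomial(100, 0.5), i.e. 75+ heads in 100 fair flips (μ = 50, σ² = 25, so 75 = (1+δ)μ with δ = 0.5 and is 5σ above the mean):

```python
import math

n, p, t = 100, 0.5, 75
mu = n * p                  # E[X] = 50
var = n * p * (1 - p)       # Var[X] = 25
delta = t / mu - 1          # 0.5

markov = mu / t                             # Markov:    Pr[X >= t] <= E[X]/t
chebyshev = var / (t - mu) ** 2             # Chebyshev: Pr[|X-mu| >= k*sigma] <= 1/k^2
chernoff = math.exp(-mu * delta ** 2 / 3)   # Chernoff form above, valid for 0 < delta <= 1

# ~0.667 vs 0.04 vs ~0.016: each bound is strictly tighter than the last.
assert markov > chebyshev > chernoff
```

(The true probability is far smaller still, about 4 × 10⁻⁷; even Chernoff is only an upper bound.)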