PageRank graph ranking algorithm implemented from the eigenvector formulation using power iteration — the same algorithm that powered early Google Search.
A from-scratch PageRank implementation using the column-normalised transition matrix and power iteration (d=0.85 damping factor). Handles dangling nodes via uniform teleportation and validates convergence via L1 norm.
- PageRank as stationary distribution: how much time a random walker spends on each page at equilibrium
- Dangling node problem: pages with no out-links cause probability mass to leak — solved by teleportation
- Power iteration convergence: ||r_new − r||₁ < tol, typically < 50 iterations for most graphs
- Generalisation: PageRank works on any directed graph (citation networks, social networks, protein interaction networks)
Python NumPy NetworkX graph-theory pagerank eigenvectors
- Convergence in 23 iterations on a 5-node test graph (tol=1e-8)
- Results match NetworkX networkx.pagerank() to 6 decimal places
PageRank (and its variants HITS, TrustRank) underpins web search ranking, academic citation analysis (Google Scholar), and recommendation systems (collaborative filtering via graph walks).
pip install -r requirements.txt
python pweek3.py