What:
PageRank: a way of signalling how important a webpage is (by Sergey Brin & Larry Page).
Your page’s rank is determined by the links pointing to you, each weighted by the PageRank of the linking page and split evenly across that page’s outbound links. It’s recursive!
How it works:
- Imagine a surfer landing on page A with 3 outbound links.
- The surfer randomly chooses one of them and repeats.
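- The surfer follows a link like this 85% of the time. The other 15% of the time, the surfer teleports to a random page on the internet.
- Teleporting keeps the surfer from getting stuck at dead ends (pages with no outbound links) or trapped in spam loops.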
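The random-surfer steps above can be sketched as a short iterative computation. This is a toy illustration, not Google's actual implementation: the function name `pagerank`, the `toy_web` graph, the iteration count, and the choice to spread a dead-end page's rank evenly over all pages are assumptions made for the example. Only the 85% / 15% split comes from the notes.

```python
def pagerank(links, damping=0.85, iterations=100):
    """links maps each page to the list of pages it links to."""
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with the surfer equally likely to be anywhere.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Teleport share: 15% of the time the surfer jumps to a random page.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # 85% of the time the surfer follows one outbound link at random.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dead end (assumption): treat it as linking to every page.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: A has 3 outbound links, D is a dead end.
toy_web = {
    "A": ["B", "C", "D"],
    "B": ["A"],
    "C": ["A"],
    "D": [],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

The scores always sum to 1, so each one can be read as the share of time the surfer spends on that page. A comes out on top here because both B and C pass their whole rank to it.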
Spiders:
- There’s a massive list of URLs waiting to be visited.
- We grab one, download the HTML, extract links, add them back to the list.
- We follow each site’s robots.txt rules to be polite.
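The spider loop above might look roughly like this. It is a minimal sketch using only the Python standard library. The names `crawl`, `LinkExtractor`, `fetch`, the `toy-spider` user agent, and the `example.test` pages are all made up for illustration. Downloading is passed in as a `fetch(url)` function so the example runs against an in-memory fake web instead of the real network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit
from urllib.robotparser import RobotFileParser


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl(seed, fetch, max_pages=100, user_agent="toy-spider"):
    """Breadth-first crawl; fetch(url) returns the page text or None."""
    frontier = deque([seed])  # the list of URLs waiting to be visited
    seen = {seed}             # every URL ever queued, to avoid repeats
    robots = {}               # one parsed robots.txt per site
    visited = []

    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        parts = urlsplit(url)
        site = f"{parts.scheme}://{parts.netloc}"

        # Be polite: load the site's robots.txt once, then obey it.
        if site not in robots:
            parser = RobotFileParser()
            body = fetch(site + "/robots.txt")
            parser.parse((body or "").splitlines())
            robots[site] = parser
        if not robots[site].can_fetch(user_agent, url):
            continue

        html = fetch(url)
        if html is None:
            continue
        visited.append(url)

        # Extract links, resolve them against this page, queue the new ones.
        extractor = LinkExtractor()
        extractor.feed(html)
        for href in extractor.hrefs:
            link = urljoin(url, href).split("#")[0]
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# Hypothetical in-memory web, so nothing is actually downloaded.
fake_web = {
    "http://example.test/robots.txt": "User-agent: *\nDisallow: /private/\n",
    "http://example.test/": '<a href="/a">a</a> <a href="/private/x">x</a>',
    "http://example.test/a": '<a href="/">home</a> <a href="b">b</a>',
    "http://example.test/b": "no links here",
    "http://example.test/private/x": "secret",
}
crawled = crawl("http://example.test/", fake_web.get)
print(crawled)
```

The page under `/private/` is queued but never downloaded, because robots.txt disallows it. A real spider would also need things this sketch leaves out, such as delays between requests, timeouts, and retry handling.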