Nowadays the Internet contains billions of web pages, and it is increasingly easy to create new ones, even for non-experts. With this huge number of web pages available, is it reasonable to think that all pages are equally "important"? This question is fundamental for web search engines, that is, web sites that search for web pages throughout the Internet based on some given criterion, usually the inclusion in the web page of a certain word or set of words.
In fact, in most cases the user who ran the search looks only at the first few results and ignores the remaining ones, often thousands of pages! How can we be sure that the first listed results of the web search are the most "important" ones, that is, the web pages most likely to correspond to what the user expects from the search?
The solution to this problem used by the popular web search engine Google assigns to each web page a numerical value, called its PageRank, that reflects its "importance". When a web search is performed, the results listed first are those with the highest PageRank.
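As a rough illustration of this idea, the sketch below keeps the pages that contain a given word and lists them from highest to lowest score. The page names, texts, and scores are made up for the example (they are not real PageRank values), and the `search` function is a hypothetical helper, not how any actual search engine is implemented.

```python
# Hypothetical pages: the URLs, texts, and scores are invented for illustration.
pages = [
    {"url": "a.example", "text": "linear algebra and matrix notes", "score": 0.12},
    {"url": "b.example", "text": "cooking recipes", "score": 0.40},
    {"url": "c.example", "text": "matrix eigenvalue tutorial", "score": 0.31},
    {"url": "d.example", "text": "matrix movie review", "score": 0.05},
]


def search(pages, word):
    """Return the URLs of pages whose text contains `word`, highest score first."""
    # Search criterion: the page must include the given word.
    matches = [p for p in pages if word in p["text"].split()]
    # Ranking: sort the matching pages by their numerical score, descending.
    ranked = sorted(matches, key=lambda p: p["score"], reverse=True)
    return [p["url"] for p in ranked]


results = search(pages, "matrix")
print(results)
```

Note that `b.example` has the highest score overall but does not appear in the results for "matrix": the score only orders the pages that already satisfy the search criterion.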
The following page explains:
- how to mathematically define the PageRank
- how to actually compute the PageRank
- how to interpret the PageRank as a probability
(*) This work was carried out under the guidance of Professor Maria Carvalho from the University of Porto, under a grant from the Calouste Gulbenkian Foundation to develop a project for the promotion of Mathematics at Atractor.
Difficulty level: University