21 March 2004 27 comments Mathematics, Python

There are many articles on the net about how the PageRank algorithm works that all copy from the original paper written by the very founders of Google Larry Page and Sergey Brin. Google itself also has a very good article that explain it with no formulas or numerical explanations. Basically PageRank is like social networks. If you're mentioned by someone important, your importance increases and the people you mention gets upped as well.

We recently had a coursework in discrete mathematics to calculate PageRank values for all web pages in a web matrix. To be able to do this you have to do many simplifications and you're limited in terms of complexity to keep it possible to do "by hand". I wrote a little program that calculates the PageRank for any web with no simplifications. The outcome is that I can quickly calculate the PageRank values for each page.

Here's how to use it:

```
from PageRank import PageRanker
web = ((0, 1, 0, 0),
(0, 0, 1, 0),
(0, 0, 0, 1),
(1, 0, 0, 0))
pr = PageRanker(0.85, web)
pr.improve_guess(100)
print pr.getPageRank()
```

Think of the entries in the matrix as A to D along every row and which page it has a link to along the column. In the above example it means that A has a link to B, B as a link to C, C has a link to D, D has a link to A.

The PageRank values when you run 100 iterations with no random jumps is:

```
[ 0.25 0.25 0.25 0.25 ]
```

Pretty obvious isn't it. One complication with the PageRank algorithm is that even if every page has an outgoing link, you don't always cover *everything* by just following links. That's why to sometimes need to random start over again from a randomly selected webpage. This is we we use 8.5 in the above example. That qualitativly means that there's a 15% chance that you randomly start on a random webpage and iterate from there.

Let's have a "more complex" web model:

```
web = ((0, 1, 0, 0),
(0, 0, 1, 0),
(0, 1, 0, 1),
(1, 1, 0, 0))
```

Running the algorithm again we find:

```
[ 0.14285725 0.28571447 0.28571419 0.2857141 ]
```

Notice how page B has the same PageRank as C and D even though page B has two links coming in to it. This is because it spreads it popularity to other pages. It also matters that the initial guess is that every page is equal initially.

Enough said, download the script yourself and make sure you have Python and the numarray Python module installed.

- Previous:
- What's so bad about HTML guys? 20 March 2004
- Next:
- EuroPython in Sweden, I should go 24 March 2004

- Related by Keyword:
- DoneCal on MumbaiMirror 03 February 2011
- How did Google do that? 14 July 2007
- Cheeky Explore (Dictionary of Everything) 26 May 2005
- PlogRank - my own PageRank application 21 May 2004
- Google PageRank matrix calculator (graphically) 19 May 2004

- Related by Text:
- jQuery and Highslide JS 08 January 2008
- I'm back! Peterbe.com has been renewed 05 June 2005
- Anti-McCain propaganda videos 12 August 2008
- I'm Prolog 01 May 2007
- Ever wondered how much $87 Billion is? 04 November 2003

Peterhttp://www.top25web.com/pagerank.php

AnonymousPeterGeorge HotellingPeterAnonymoushttp://www.pagerank.net/pagerank-checker/

MichelleAnonymoushttp://pagerank-checksum.homelinux.com/

MarryI have been given the task of getting links for our websites that have good page rank on the links directories.

In addition we have many categories so your site will be place on an appropriate page.

If you would like to trade links please send me your website details.

P.S.: I got your e-mail publicly listed on your webpage . Our apologies if you do not wish to take part in a link exchange.

Best Regards,

Marry Tailor

Peter BengtssonIf you're going to have a link exchange, do that with sites that mention similar keywords.

asrin _improve shouldn't

invariantmatrix multiply with _linkedmatrix and then add itself to _randmatrix ??

caykoylusandorfMahmoudplease tell me the no of iteration can be get the accurate result

and i want the example of use the eginevalue to determine the Page Rank

thanks

Mahmoudi want reaal data to compute Page Rank

MahmoudWing WongtermotermoHere is an implementation in pure python without using numaarray and the complete N*N matrix. Enjoy!

http://mechanicalpoetry.blogspot.com/2008/01/scalable-pagerank-in-pure-python-76.html

hakanhttp://www.pyromus.com

michaelhttp://www.birminghamfriends.co.uk

HectorI am from Mexico and also am speaking English, give please true I wrote the following sentence: "It just like any other kind of dating that goes on."

Thank :-) Hector.

Website Hosting IndiaThanks for sharing, but we have to install ruby on rails on server

Peter BengtssonZoltan Toth-CzifraPeter BengtssonAlex LecaI found this web app to predict in real time Google Page Rank based on the original Google algorithm:

http://www.webuka.com

Give it a try!