Run Example in Giraph: PageRank

Latest Update: Check Questions 4, 5, 7, and 8 for changes. Instead of using the internal InputFormat and OutputFormat of SimplePageRankComputation (which are currently buggy), I use others to make it work!

I’ve noticed an increase in views for the Shortest Paths example, so I decided to post my fairytale with PageRank as well. Please! Any suggestions, improvements, or positive/negative feedback about these posts are more than welcome! I will respect you till the end of time 😉

So, let’s ask ourselves.

~~~~~ Q#1: What’s the PageRank problem?

Problem Description: Assign a weight to each node of a graph that represents its relative importance inside the graph. PageRank usually refers to a set of webpages, and tries to measure which ones are the most important compared with the rest of the set. The importance of a webpage is measured by the number of incoming links, i.e. the references it receives from other webpages.
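To make the idea concrete, here is a minimal sketch in plain Python (not Giraph code). It assumes the usual damped iteration with a damping factor of 0.85, where each page splits its current weight evenly among the pages it links to, so a link from an important page counts for more than a link from an obscure one. The function name `pagerank`, its parameters, and the tiny `web` graph are made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return a weight per page; `links` maps a page to the pages it links to."""
    # Collect every page, including ones that only appear as link targets.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with the same weight everywhere.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small base weight (assumed damping scheme).
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's weight among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its weight to everyone.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" receives links from "a", "b" and "d".
web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(web)
most_important = max(ranks, key=ranks.get)
```

In this toy graph the page with the most incoming links, "c", ends up with the largest weight, and "a" comes second because its single incoming link comes from "c".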


Run Example in Giraph: Shortest Paths

When planning to run code in Giraph, I ask myself some questions. Once I have answered all of them, I move on to actually implementing and running the code (so I kinda discuss a lot with myself :p). Let’s have a look at this inner discussion while running the Shortest Paths problem.

~~~~~ Q#1: What’s the Shortest Path problem?

Problem Description: Find the shortest path between two vertices in a graph, so that the sum of the weights of the edges in the path is minimized. The example given in Giraph finds the shortest path from each vertex to the source vertex.

~~~~~ Q#2: How can this be implemented in Giraph?

Think “Pregely”: Since in Pregel the same code is executed in all vertices at the same time, we need to think as if we were a vertex.
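Thinking as a vertex can be sketched with a small plain-Python simulation (this is not the Giraph API, just an illustration under my own assumptions). Each vertex looks only at the messages it received, keeps the smallest distance it has seen, and, if that improved its value, tells its neighbours about it. The name `pregel_shortest_paths`, the `edges` layout, and the sample graph are hypothetical, and edge weights are assumed to be non-negative.

```python
import math


def pregel_shortest_paths(edges, source):
    """Simulate vertex-centric supersteps; `edges` maps vertex -> {neighbour: weight}."""
    # Collect every vertex, including ones that only appear as neighbours.
    vertices = set(edges)
    for neighbours in edges.values():
        vertices.update(neighbours)

    # Every vertex starts with an "infinite" distance.
    distance = {vertex: math.inf for vertex in vertices}

    # Superstep 0: only the source has something to process (distance 0).
    inbox = {source: [0.0]}
    supersteps = 0

    while inbox:
        outbox = {}
        # The same logic runs for every vertex that received messages.
        for vertex, messages in inbox.items():
            candidate = min(messages)
            if candidate < distance[vertex]:
                distance[vertex] = candidate
                # Tell each neighbour how far it is when reached through this vertex.
                for neighbour, weight in edges.get(vertex, {}).items():
                    outbox.setdefault(neighbour, []).append(candidate + weight)
        # Messages sent now are delivered in the next superstep.
        inbox = outbox
        supersteps += 1

    return distance, supersteps


# Hypothetical graph: "c" has an edge into "s" but cannot be reached from it.
graph = {"s": {"a": 1.0, "b": 4.0}, "a": {"b": 2.0}, "b": {}, "c": {"s": 1.0}}
distances, steps = pregel_shortest_paths(graph, "s")
```

The loop stops when no vertex sends a message any more, which mirrors the way a Pregel-style computation ends once all vertices have gone quiet.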