While trying to write a possible introduction for my thesis report, I ended up reading an interesting paper, Challenges in Parallel Graph Processing. According to the paper, graph problems have certain characteristics that hinder efficient parallelization. These are (I use the same names as the paper, for easier identification):
Data-driven computations: When a graph algorithm performs computations based on the structure of vertices and edges in the graph, these computations are called data-driven. Parallelizing them by partitioning can be intricate, since the structure of the computation depends on the data and changes as the computation proceeds.
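To make this concrete, here is a small Python sketch (my own illustration, not from the paper): in a level-synchronous BFS, the "work" at each step is the current frontier, whose size and shape depend entirely on the graph's data, so an even partitioning of the work cannot be planned in advance.

```python
def bfs_levels(adj, source):
    """Level-synchronous BFS. The work at each step is the current
    frontier, whose size depends entirely on the graph's structure
    and cannot be predicted (or partitioned evenly) a priori."""
    level = {source: 0}
    frontier = [source]
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adj.get(u, []):
                if v not in level:
                    level[v] = level[u] + 1
                    next_frontier.append(v)
        frontier = next_frontier  # the shape of the next step is data-driven
    return level

# A tiny example graph as an adjacency list (hypothetical data).
adj = {0: [1, 2], 1: [3], 2: [3], 3: [4]}
print(bfs_levels(adj, 0))  # {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}
```

Notice that the frontier grows and shrinks as the traversal unfolds; this is exactly the "structure of the computation" that changes with the data.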
Unstructured problems: Graph problems involve unstructured and highly irregular data. This irregularity makes good data partitioning difficult, which in turn limits parallelism and puts a barrier on scalability.
Poor locality: Graphs are usually too large to fit in one machine. Partitioning the graph and storing subgraphs on several machines is therefore essential, which implies the existence of cut edges between vertices lying in different partitions. This poor locality inevitably lowers performance.
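The cut edges mentioned above are easy to quantify. A minimal sketch (again my own example, not from the paper): given a mapping of vertices to machines, count the edges whose endpoints live on different machines; each of these implies communication during a parallel computation.

```python
def edge_cut(edges, part):
    """Count edges whose endpoints land in different partitions.
    `part` maps each vertex to its partition (machine) id."""
    return sum(1 for u, v in edges if part[u] != part[v])

# Hypothetical 6-vertex graph split across two machines.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5), (1, 4)]
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
print(edge_cut(edges, part))  # 3 cut edges: (2,3), (0,5), (1,4)
```

Partitioners try to minimize this number, but for highly connected graphs it cannot be driven to zero, which is why locality stays poor.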
High data access to computation ratio: Graph algorithms spend most of their time exploring the structure of a graph while performing relatively little computation per data access. As a result, runtime is often dominated by waiting for memory fetches.
In (hopefully) simpler and fewer words, graph problems are difficult to parallelize because: (a) the computations depend on the graph structure, which changes with each computation; (b) the data is hard to partition; (c) full data locality is not feasible, since the data is highly connected and any partitioning will leave some computations needing data they don't hold locally; (d) runtime is dominated by waiting to fetch data from other partitions.
In the next chapter, the authors give a list of parallel architectures and programming models, along with their strengths and drawbacks. Among them are the well-known MPI for distributed-memory machines, and OpenMP and POSIX threads for shared-memory computers.
Afterwards, they describe the hardware and software challenges in achieving high performance in parallel graph algorithms. These are:
- Task granularity: At what level should parallelism be introduced? Machines that allow fine-grained parallelism are more suitable for most graph operations.
- Memory contention: Does the system support a global address space? If so, processes may try to access the same memory location at the same time, reducing performance. One proposed solution is to treat the graph as a read-only object: the algorithm keeps its writes in local data structures, so memory contention arises only under specific conditions.
- Load balancing: Some processors may end up with more work to do than others. It is necessary to provide methods for re-assigning work among processors to preserve balance and improve performance.
- Simultaneous queries: In scenarios where several users access the graph and send queries at the same time, it is vital to have a system that can handle simultaneous queries rather than responding to a single query at a time.
- Software development: When developing graph-processing software, all of the above challenges should be taken into consideration. Additionally, it's no secret that the software should be flexible, extensible, portable and maintainable.
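The read-only-graph idea from the memory contention point can be sketched in a few lines of Python (my own toy illustration, assuming Python threads rather than the hardware the paper discusses): each worker reads the shared graph freely but accumulates its results in a private structure, so no two workers ever write to the same memory, and contention is confined to one final merge step.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# The graph (an edge list) is treated as read-only shared data.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]  # hypothetical graph

def local_degrees(chunk):
    """Compute vertex degrees for a chunk of edges, writing only
    into a worker-private Counter (no shared writes)."""
    local = Counter()
    for u, v in chunk:
        local[u] += 1
        local[v] += 1
    return local

chunks = [edges[:3], edges[3:]]
with ThreadPoolExecutor(max_workers=2) as pool:
    partials = list(pool.map(local_degrees, chunks))

degrees = sum(partials, Counter())  # single merge at the end
print(dict(degrees))
```

The same read-shared/write-local pattern shows up in many parallel graph frameworks, precisely because it sidesteps contention on the graph itself.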
The authors continue by presenting a mapping of graph algorithms to specific hardware and software approaches to parallel graph processing (Parallel BGL and MTGL).
I only posted what I found most interesting for my work; for the chapters I didn't detail, you can refer to the paper 🙂
Happy Reading! 😛