In this project, we developed an efficient algorithm to compute connected components in large graphs using the distributed memory parallel systems, based on the Shiloach-Vishkin PRAM algorithm. We propose an edge-based adaptation of this classic algorithm and optimize it to improve its practical efficiency in distributed systems. This algorithm is capable of finding connected components in large undirected graphs (> 50B edges) using thousands of cores. We also introduce a dynamic approach that analyzes the graph and selectively runs BFS algorithm for short diameter graph components. This algorithmic selection is done at run-time. Our work shows that this hybrid approach performs better than using one of these two methods. If the graph is small enough to fit in your single compute node, we recommend using recent shared-memory implementations instead as they perform better. See  for more details.
Getting the Code
You will need g++(version 4.9+), an MPI implementation and cmake to compile the code.
git clone --recursive https://github.com/ParBLiSS/parconnect.git
mkdir build_directory && cd build_directory
Above steps should generate an executable parconnect in the bin folder.
Executing the Code
Our implementation has in-built parallel graph generators and parsers for benchmarking. It has following three input modes, the first for de Bruijn graphs, the second for generic graph edgelist files and the third for synthetic kronecker graphs.
- de Bruijn graphs
mpirun -np 16 ./parconnect --input dbg --file test.fastq
The input here is a fastq format file, containing DNA sequences. We provide a sample file here for you to test the code. Large metagenomic datasets can be downloaded from the Joint Genome Institute or MG-RAST databases.
mpirun -np 16 ./parconnect --input generic --file test.list
Here the file is simply a list of edges, one edge per line. Each vertex id could be within 8-byte integer range. An example file can be downloaded here. We also provide easy access to two metagenomic graphs, M1 and M3 (from Jain et al. ) in the edge list format.
- Kronecker graphs
mpirun -np 16 ./parconnect --input kronecker --scale 18
This graph is generated internally following the graph500 specifications.
The timer for benchmarking starts right after loading the edges into memory and ends with computing connected component ids for each vertex. Also, we acknowledge the use of kmerind, mxx, plfit and CombBLAS libraries in our implementation.
NOTE: Our current implementation only runs when process count is a perfect square. This restriction is because the CombBLAS version we use for BFS runs on a square process grid only.
- Chirag Jain, Patrick Flick, Tony Pan, Oded Green and Srinivas Aluru. "An Adaptive Parallel Algorithm for Computing Connected Components." IEEE Transactions on Parallel and Distributed Systems (2017). Paper
- Patrick Flick, Chirag Jain, Tony Pan and Srinivas Aluru. "A parallel connectivity algorithm for de Bruijn graphs in metagenomic applications." Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, ACM, 2015. Paper SC16 Reproducibility Initiative Winner