Subhajit Sahu, a Ph.D student with Prof. Kishore Kothapalli participated in the 30th European Conference on Parallel and Distributed Computing (Euro-Par 2024) held in Madrid, Spain from 26 to 30 August. Subhajit Sahu presented a paper on DF* PageRank: Incrementally Expanding Approaches for Updating PageRank on Dynamic Graphs (Artifact), the authors of this paper are Subhajit Sahu; Kishore Kothapalli; Hemalatha Eedi, JNTUH College of Engineering Hyderabad; Sathya Peri, IIT Hyderabad.
Here is the summary of their research work as explained by the authors: PageRank is a widely used centrality measure that assesses the significance of vertices in a graph. Efficiently updating PageRank on dynamic graphs is essential for various applications due to the increasing scale of datasets. This paper introduces our Dynamic Frontier (DF) and Dynamic Frontier with Pruning (DF-P) approaches.4 Given a batch update comprising edge insertions and deletions, these approaches iteratively identify vertices likely to change their ranks with minimal overhead. On a server featuring a 64-core AMD EPYC-7742 processor, our approaches outperform Static and Dynamic Traversal PageRank by 5.2x/15.2x and 1.3x/3.5x respectively – on real-world dynamic graphs, and by 7.2x/9.6x and 4.0x/5.6x on large static graphs with random batch updates. Our approaches scale at a rate of 1.8x/1.7x for every doubling of threads.
Additionally, Subhajit presented papers in two workshops. The first workshop was High-Performance e-Science (HiPES 2024) where he presented a paper on GVE-LPA: Fast Label Propagation Algorithm (LPA) for Community Detection in the Shared Memory Setting. Here is the summary of the research work as explained by the authors – Subhajit Sahu; Kishore Kothapalli and Dip Sankar Banerjee, IIT Jodhpur:
Community detection is the problem of identifying natural divisions in networks. Efficient parallel algorithms for this purpose are crucial in various applications, particularly as datasets grow to substantial scales. This technical report presents an optimized parallel implementation of the Label Propagation Algorithm (LPA), a high speed community detection method, for shared memory multicore systems. On a server equipped with dual 16-core Intel Xeon Gold 6226R processors, our LPA, which we term as GVE-LPA, outperforms FLPA, igraph LPA, and NetworKit LPA by 139x, 97, 000x, and 40x respectively – achieving a processing rate of 1.4B edges/s on a 3.8B edge graph. In addition, GVE-LPA scales at a rate of 1.7x every doubling of threads.
The second workshop was on Asynchronous Many-Task systems for Exascale (AMTE-2024) where he presented a paper on GVEL: Fast Graph Loading in Edgelist and Compressed Sparse Row (CSR) formats. Here is the summary of the research work as explained by the authors – Subhajit Sahu and Prof. Kishore Kothapalli:
Efficient IO techniques are crucial in high-performance graph processing frameworks like Gunrock and Hornet, as fast graph loading can help minimize processing time and reduce system/cloud usage charges. This research study presents approaches for efficiently reading an Edgelist from a text file and converting it to a Compressed Sparse Row (CSR) representation. On a server with dual 16-core Intel Xeon Gold 6226R processors and MegaRAID SAS-3 storage, our approach, which we term as GVEL, outperforms Hornet, Gunrock, and PIGO by significant margins in CSR reading, exhibiting an average speedup of 78x, 112x, and 1.8x, respectively. For Edgelist reading, GVEL is 2.6x faster than PIGO on average, and achieves a Edgelist read rate of 1.9 billion edges/s. For every doubling of threads, GVEL improves performance at an average rate of 1.9x and 1.7x for reading Edgelist and reading CSR respectively.
Subhajit Sahu also presented a paper on Fast Parallel Approach for Neighborhood-based Link Prediction by Disregarding Large Hubs at Euro-Par 2024 – Ph.D Symposium. The chair of the symposium was Prof. Raffaele Montella. Here is the summary of the research work as explained by the authors Subhajit Sahu and Prof. Kishore Kothapalli:
Link prediction can help rectify inaccuracies in various graph algorithms, stemming from unaccounted-for or overlooked links within networks. However, many existing works use a baseline approach, which incurs unnecessary computational costs due to its high time complexity. Further, many studies focus on smaller graphs, which can lead to misleading conclusions. This submission introduces our parallel approach, called LHub, which predict links using neighborhood-based similarity measures on large graphs. LHub is a heuristic approach that disregards large hubs, based on the idea that high-degree nodes contribute little similarity among their neighbors. On a server equipped with dual 16-core Intel Xeon Gold 6226R processors, LHub is on average 1019x faster than not disregarding hubs, especially on web graphs and social networks, while maintaining similar prediction accuracy. Notably, LHub achieves a link prediction rate of 38.1M edges/s and improves performance at a rate of 1.6x for every doubling of threads.
Subhajit Sahu also participated in a session on Multidisciplinary, Domain-Specific and Applied Parallel and Distributed Computing which was chaired Prof. Thomas Ludwig.
Pictures of EuroPar-2024: https://www.flickr.com/photos/europar2024/albums/
September 2024