知方号

知方号

Distributed PageRank Computation: An Improved Theoretical Study

Abstract

PageRank is a classic measure that effectively evaluates the node importance in large graphs, and has been applied in numerous applications ranging from data mining, Web algorithms, recommendation systems, load balancing, search, and identifying connectivity structures. Computing PageRank for large graphs is challenging and this has motivated the studies of distributed algorithms to compute PageRank. Previously, little works have been spent on the distributed PageRank algorithms with provably desired complexity and accuracy. Given a graph with n nodes and if we model the distributed computation model as the well-known congested clique model, the state-of-the-art algorithm takes O(√logn) communication rounds to approximate the PageRank value of each node in G, with a probability at least 1−1/n. In this paper, we present improved distributed algorithms for computing PageRank. Particularly, our algorithm performs O(log log√n) rounds (a significant improvement compared with O(√logn) rounds) to approximate the PageRank values with a probability at least 1−1/n. Moreover, under a reasonable assumption, our algorithm also reduces the edge bandwidth (i.e., the maximum communication message size that can be exchanged through an edge during a communication round) by a O(logn) factor compared with the state-of-the-art algorithm. Finally, we show that our algorithm can be adapted to efficiently compute another variant of PageRank, i.e., the batch one-hop Personalized PageRanks, in O(log logn) communication rounds.

版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容, 请发送邮件至lizi9903@foxmail.com举报,一经查实,本站将立刻删除。