Lunch Virtual 11

Colloquium Virtual Complexity at C3-UNAM

Universities for Science Consortium



Nearly-optimal prediction of missing links in networks

Aaron Clauset

University of Colorado Boulder & Santa Fe Institute
@c3.unam YouTube chanel: CentrodeCienciasdelaComplejidadC3
Friday, March 12 at 1:00 PM
Contact: cgg@unam.mx



Abstract:
Predicting missing links in networks is a fundamental task in network analysis and modeling. However, current link prediction algorithms exhibit wide variations in their accuracy, and we lack a general understanding of which methods work better in which contexts. In this talk, I'll describe a novel meta-learning solution to this problem, which makes predictions that appear to be nearly optimal by learning to combine three classes of prediction methods: community detection algorithms, structural features like degrees and triangles, and network embeddings. We evaluate 203 component methods individually and in stacked generalization on (i) synthetic data with known structure, for which we analytically calculate the optimal link prediction performance, and (ii) a large corpus of 548 structurally diverse networks from social, biological, technological, information, economic, and transportation domains. Across settings, supervised stacking nearly always performs best and produces nearly-optimal performance on synthetic networks. Moreover, we show that accuracy saturates quickly, and near-optimal predictions typically requires only a handful of component methods. Applied to real data, we quantify the utility of each method on different types of networks, and then show that the difficulty of predicting missing links varies considerably across domains: it is easiest in social networks and hardest in technological networks. I'll close with forward-looking comments on the limits of predictability for missing links in complex networks and on the utility of stacked generalizations for achieving them.

Joint work with Amir Ghasemian, Homa Hosseinmardi, Aram Galstyan, and Edoardo Airoldi.

Short Bio:
Aaron Clauset is an Associate Professor in the Department of Computer Science and the BioFrontiers Institute at the University of Colorado Boulder, and is External Faculty at the Santa Fe Institute. He received a PhD in Computer Science, with distinction, from the University of New Mexico, a BS in Physics, with honors, from Haverford College, and was an Omidyar Fellow at the prestigious Santa Fe Institute. In 2016, he was awarded the Erdos-Renyi Prize in Network Science, and since 2017, he has been a Deputy Editor responsible for the Social, Computing, and Interdisciplinary Sciences at Science Advances.
Clauset is an internationally recognized expert on network science, data science, and machine learning for complex systems. His work has appeared in many prestigious scientific venues, including Nature, Science, PNAS, SIAM Review, Science Advances, Nature Communications, AAAI, and ICDM. His work has also been covered in the popular press by Quanta Magazine, the Wall Street Journal, The Economist, Discover Magazine, Wired, the Boston Globe and The Guardian.