Algorithms for influence maximization and seed minimization

Graph is a basic mathematical tool that models information about identities as well as their complex relationships from various real world problems. It has been found important applications in analysis on social networks, route planning, telecommunication, etc. In recent years, the complexity and sc...

Full description

Saved in:
Bibliographic Details
Main Author: Shi, Yanchen
Other Authors: Sun Aixin
Format: Theses and Dissertations
Language:English
Published: 2019
Subjects:
Online Access:https://hdl.handle.net/10356/104632
http://hdl.handle.net/10220/49508
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-104632
record_format dspace
spelling sg-ntu-dr.10356-1046322020-07-02T06:26:26Z Algorithms for influence maximization and seed minimization Shi, Yanchen Sun Aixin School of Computer Science and Engineering Engineering::Computer science and engineering::Information systems::Database management Graph is a basic mathematical tool that models information about identities as well as their complex relationships from various real world problems. It has been found important applications in analysis on social networks, route planning, telecommunication, etc. In recent years, the complexity and scale of real world graphs have increased dramatically. In particular, international social networks can comprise of hundreds of millions of users and up to billions of relationships. Thus, even algorithms with decent time or space complexities meet challenges dealing with large-scale networks for the queries like influence maximization and seed minimization on social networks. In this thesis, we investigate two problems on large scale networks for aforementioned applications, i.e., influence maximization and seed minimization. Given G a social network, M a probabilistic propagation model, and a small number k > 0, the influence maximization problem aims to find the largest expectation of the number of influenced nodes that k nodes can trigger under this pre-defined model M. This problem is derived from viral marketing, where a company gives away free samples to a small number of influential individuals in order to create a cascade of adoption via word-of-mouth effect. This study proposes a two-phase approach Influence Maximization via Martingales (IMM) that meets both practical efficiency and theoretical guarantees. In particular, IMM returns an (1 − 1/e − ε)-approximate solution with at least 1 − n ^(−ℓ) probability in an O((k+ℓ)(n+m)logn/ε^2 ) running time. IMM is further extended to fit in triggering model and time-continuous model. We experimentally evaluate IMM with the state-of-the-art benchmarks under several diffusion models and parameter settings, using large networks with up to 1.4 billion edges. The experimental results show that our approach consistently outperforms the states of the art in terms of efficiency. The seed minimization problem is a variant problem of the influence maximization with the same origin from advertising. Given a social network G and a covering threshold t, the seed minimization problem is aimed to find a seed set S that has an expected influence nodes not less than t·n and minimizes the size of S. Compared to the influence maximization that maximizes the influence given a certain budget, the seed minimization problem hopes to retrench the expense to the minimum number while keeping the influence above a predefined threshold. To solve the problem, we propose GSM, a greedy algorithm with tight approximations, high generalization and easy implementations. In particular, it yields a ⌈(1 + ϕ)log(tn)⌉-approximate solution with at least 1 − n ^(−ℓ) probability, where ℓ and ϕ are both tunable. We experimentally evaluate GSM in several settings of both t and β, and it is often orders of magnitude faster compared to the traditional greedy benchmark MINTSS. GSM also gives an impressive performance on a large graph Twitter with more than a billion edges. Master of Engineering 2019-08-01T04:48:58Z 2019-12-06T21:36:34Z 2019-08-01T04:48:58Z 2019-12-06T21:36:34Z 2019 Thesis Shi, Y. (2019). Algorithms for influence maximization and seed minimization. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/104632 http://hdl.handle.net/10220/49508 10.32657/10220/49508 en 104 p. application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Engineering::Computer science and engineering::Information systems::Database management
spellingShingle Engineering::Computer science and engineering::Information systems::Database management
Shi, Yanchen
Algorithms for influence maximization and seed minimization
description Graph is a basic mathematical tool that models information about identities as well as their complex relationships from various real world problems. It has been found important applications in analysis on social networks, route planning, telecommunication, etc. In recent years, the complexity and scale of real world graphs have increased dramatically. In particular, international social networks can comprise of hundreds of millions of users and up to billions of relationships. Thus, even algorithms with decent time or space complexities meet challenges dealing with large-scale networks for the queries like influence maximization and seed minimization on social networks. In this thesis, we investigate two problems on large scale networks for aforementioned applications, i.e., influence maximization and seed minimization. Given G a social network, M a probabilistic propagation model, and a small number k > 0, the influence maximization problem aims to find the largest expectation of the number of influenced nodes that k nodes can trigger under this pre-defined model M. This problem is derived from viral marketing, where a company gives away free samples to a small number of influential individuals in order to create a cascade of adoption via word-of-mouth effect. This study proposes a two-phase approach Influence Maximization via Martingales (IMM) that meets both practical efficiency and theoretical guarantees. In particular, IMM returns an (1 − 1/e − ε)-approximate solution with at least 1 − n ^(−ℓ) probability in an O((k+ℓ)(n+m)logn/ε^2 ) running time. IMM is further extended to fit in triggering model and time-continuous model. We experimentally evaluate IMM with the state-of-the-art benchmarks under several diffusion models and parameter settings, using large networks with up to 1.4 billion edges. The experimental results show that our approach consistently outperforms the states of the art in terms of efficiency. The seed minimization problem is a variant problem of the influence maximization with the same origin from advertising. Given a social network G and a covering threshold t, the seed minimization problem is aimed to find a seed set S that has an expected influence nodes not less than t·n and minimizes the size of S. Compared to the influence maximization that maximizes the influence given a certain budget, the seed minimization problem hopes to retrench the expense to the minimum number while keeping the influence above a predefined threshold. To solve the problem, we propose GSM, a greedy algorithm with tight approximations, high generalization and easy implementations. In particular, it yields a ⌈(1 + ϕ)log(tn)⌉-approximate solution with at least 1 − n ^(−ℓ) probability, where ℓ and ϕ are both tunable. We experimentally evaluate GSM in several settings of both t and β, and it is often orders of magnitude faster compared to the traditional greedy benchmark MINTSS. GSM also gives an impressive performance on a large graph Twitter with more than a billion edges.
author2 Sun Aixin
author_facet Sun Aixin
Shi, Yanchen
format Theses and Dissertations
author Shi, Yanchen
author_sort Shi, Yanchen
title Algorithms for influence maximization and seed minimization
title_short Algorithms for influence maximization and seed minimization
title_full Algorithms for influence maximization and seed minimization
title_fullStr Algorithms for influence maximization and seed minimization
title_full_unstemmed Algorithms for influence maximization and seed minimization
title_sort algorithms for influence maximization and seed minimization
publishDate 2019
url https://hdl.handle.net/10356/104632
http://hdl.handle.net/10220/49508
_version_ 1681057707213193216