If we can try to convince a subset of individuals to adopt a new product or innovation, and the goal is to trigger a large cascade of further adoptions, which set of individuals should we target? Motivated by the design of viral marketing strategies, this question has attracted tremendous attention in different areas of computer science, especially a branch of artificial intelligence called computational social choice.
Consider a graph G=(V,E) and an initial profile where each node is active or inactive. The graph G represents a social network, where a node is an individual and an edge between two nodes corresponds to a relation between the respective individuals, e.g. friendship or common interests. The state of a node indicates whether it is positive/negative about a product or it is aware/unaware of a piece of information. Consider a threshold function T which assigns an integer value between 1 and d(v) to each node v in V, where d(v) is the degree of v. The threshold of node v is the number of nodes which must be active in the neighborhood of v to change its state from inactive to active.
Consider the following iterative process. In each time step, the states of inactive nodes (if any) are updated according to the following rule: each inactive node becomes active if the number of active nodes in its neighborhood is not smaller than its threshold (and the active nodes remain active). The process runs until no more node can switch from inactive to active.
Target Set Selection Problem. A node set S is called a target set whenever the following holds: If all nodes in S are initially active, then all nodes become active by the end of the process. What is the minimum size of a target set for a given graph G?
The Target Set Selection Problem is known to be NP-Complete and hard to approximate. However, several approximation and randomized algorithms have been proposed for the problem. Most of these algorithms have been introduced and studied from a theoretical perspective with the main focus on the approximation guarantee and time/space complexity of the algorithm in the worst case scenarios. The goal of this project is to approach the problem from a more experimental and practical perspective.
Task 1. Implement the aforementioned approximation/randomized algorithms and compare their performance on different graph structures, in particular graph data from online social platforms such as Twitter and Facebook. (There is a rich set of such graph data publically available which can be used.)
Task 2. Propose heuristic algorithms (potentially based on improvement or combination of the previous algorithms) which outperform the prior algorithms on real-world social graph data.
Note. The goal of the project can be adjusted to match the student’s background and interests. The project can be taken as a 6, 12, or 24 unit course.
- A good understanding of algorithm design techniques, complexity classes, and graph theory.
- Good programming skills.
- Have taken/taking at least two of the following courses: COMP1100, MATH1005, COMP1600, COMP3600, COMP4600, MATH2301
ANU website: https://comp.anu.edu.au/people/ahad-zehmakan/
Personal website: https://sites.google.com/view/ahadnzehmakan
Please feel free to contact me if you have any questions. I would be glad to have a discussion on whether this project suits you.