Shiri Chechik
Krill Prize 2017
Tel Aviv University
Dr. Shiri Chechik
Much of the research done on the foundations of algorithms, assumes that the algorithm has full knowledge of the input data and on future changes to the data. In spite of the theoretical appeal and simplicity of this setting, the assumption that the algorithm has “full knowledge” does not always hold. Uncertainty and partial knowledge arise in many settings, for example:
- where the data is very large (e.g. so called “large data” derived from internet transactions activity, where merely reading the entire data once is impossible, and only sampling is possible),
- where the data is not static, i.e., changes occur over time (e.g., social networks where information is fluid, and often time critical),
- where processing of the data is distributed over computation nodes, and each node has only local information. The large body of work in the literature on streaming, dynamic, online, fault-tolerant, and distributed algorithms, in fact, deals with variations of the cases described
Graph data is by and large the most ubiquitous form of data used in practice. Graphs are used for representing networks (e.g. social networks, semantic networks, communication networks and more general flow networks, epidemiology networks, biologic interaction networks), as well as complicated linear algebraic operators (matrices), pairwise similarity relations between abstract objects and so on. In spite of the importance of graph data in the algorithm design community, our current understanding of these problems in the streaming, dynamic, fault tolerant and distributed algorithm setting is very incomplete. In particular:
- There are major discrepancies between the efficiency of deterministic and randomized
- The situation concerning amortized individual worst case guarantees is not well understood.
- It is unclear how performance changes as a function of the E.g., what type of changes are considered in dynamic settings? How much information may be transmitted in distributed settings, etc.
Dealing with such “partial knowledge” and uncertainty settings, while maintaining the quality of the results, raises complex issues that often add new and non-trivial mathematical challenges. I plan to study algorithmic questions at the interface with graph theory, data structures and distributed computing. In particular, I am interested in exploring the limits of uncertainty environments and determine what can and cannot be achieved in this setting. Designing solutions that handle uncertainty and partial knowledge is a huge field that covers many different dis_ciplines, such as distributed networks, biology (where often the network is too big), online algorithms, streaming, etc. I plan to focus on three main fundamental areas, where I have identified gaps in our understanding and where I believe that progress will have a significant impact. The first area relates to dynan1ic algorithms, that is, designing algorithms that maintain some key functionality or property of the graph, while the graph itself changes over time. The second area concerns distributed graph algorithms and in particular symmetry breaking problems (e.g., coloring, maximal independent sot, etc.). The third involves social networks, and in particular designing efficient algorithms for extracting useful information from social networks. I will next describe in more detail some of the problems I plan to study.
Dynamic algorithms In many real-world applications, including communication networks, VLSI design, graphics, and assembly planning, graphs may change over time. This raises the essential need to design solutions that can efficiently adapt to these changes, e.g., GPS navigation systems are required to provide alternative routes in case of congestion, the internet needs to adapt to new users and new pages and to be resilient to potential failures.
In the last three decades there has been a growing interest in dynamic graph algorithms. The trivial approach to solving problems in changing circumstances is to solve the problem from scratch subsequent to every change. E.g., when road congestion levels change, one can recompute the new shortest path, given the current state of affairs. If the new input is somehow close to the previous input, one suspects that this should involve less computation than the trivial approach of forgetting everything and starting from scratch. In fact, this has been shown to be true in a wide (and growing) variety of problems, some of which are described hereinafter.
I plan to study the fundamental dynamic graph algorithms with large discrepancy between the state of the art upper and lower bounds. A partial list of these problems includes: distance oracles, dynamic single source shortest path and all pairs shortest path, decremental single-source reachability (SSR), decremental strongly connected components (SCC), and dynamic connectivity for undirected graphs , etc.
Distributed computing and Symmetry breaking In the last three decades there was a huge shift from central-based computing to distributed-based computing. We are surrounded by distributed networks and they play a central part of our everyday life, that includes, the internet, cellular networks, World Wide Web, wireless sensor networks, aircraft control systems and many more. Even biological systems, ranging from the molecular to the organism level, are distributer.d.
Distributed Computing is a massive and growing field of study. It deals with settings in which many processors
work in parallel, communicating and coordinating their actions by passing messages. These processors work together in order to achieve a common goal. Symmetry brealdng sits at the heart of distributed computing theory and is arguably one of the most important problem in distributed network computing. The input iu this setting is a distributed graph, where each node in the graph represents a processor and each processor (node) can transfer messages to only its neighbors in the graph. Symmetry breaking concerns with studying problems such as Maximal Independent Set (MIS), Maximal Matching (MM), and proper graph Coloring. In the MIS problem, for example, after the algorithm ends, each node !mows if it is in the MIS or not and the set of nodes that are in the MIS forms a valid MIS in the input graph. The main question in this framework is whether these tasks can be solved locally, that is, by exchanging messages between nodes at short distance in the graph.
Besides their theoretical appeal, symmetry breaking algorithms are also used as fundamental building block in many applications. (e.g. coloring has been used for scheduling in wireless networks, MIS has been used for resource scheduling for parallel threads in a multi-core environment). This area has been a subject of intensive research since the beginning of the 80s. However, despite extensive study, some major problems remain open and the gaps between upper and lower bounds are in many cases exponential.
Social Networks Recently, social networks have gained significant popularity, attracting millions of users, on a daily-basis. The significant popularity of these sites and the enormous information they capture, make the task of studying and analyzing these networks extremely important. In a preliminary study, we address the question of how well can one estimate the similarities of two nodes by observing only the connections of a social network.
I plan to continue pursuing a better understanding of social networks, and analyzing both its theoretical and practical aspects. I think that this area is still in its infancy and there are many directions I wish to explore. In particular, I am interested in studying efficient algorithms for extracting useful information from social networks, e.g., finding similarities between users, detecting communities, identifying influential people or suspicious activity within the network. Such information can be used to optimize a variety of applications including search engines and advertising delivery systems. I am also interested in studying the behavior and propagation mechanisms of information spreading processes in social networks. More specifically, my research aims to identify basic parameters (of the society, the network, and the type of information) that govern the speed, efficiency, and shape of the propagation, as well as to develop mechanisms that control or manipulate these processes, e.g., change their speed or shape.