Selected Presentations


Inventing Algorithms via Deep Learning,

Abstract:Deep learning is a part of daily life, owing to its successes in computer vision and natural language processing. In these applications, the success of the model-free deep learning approach can be attributed to a lack of a (mathematical) generative model. In yet other applications, the data is generated by a simple model and performance criterion mathematically clear and training samples infinitely abundant, but the space of algorithmic choices is enormous (example: chess). Deep learning has recently shown strong promise in these problems too (example: alphazero). In this talk, we study two such canonical problems of great scientific and engineering interest through the lens of deep learning, specifically reliable communication over noisy media where we successfully revisit classical open problems in information theory; we show that creatively trained and architected neural networks can beat state of the art on the AWGN channel with noisy feedback by a 100 fold improvement in bit error rate. Apart from the obvious practical value, this study of mathematically precise problems sheds light on the mysteries of deep learning methods: training example choices, architectural design decisions and loss function/learning methodologies.
Talks at: CISS plenary at Princeton (March 2018), MIT (April 2018), Michigan and Stanford (May 2018).
1. Deepcode: Feedback Codes via Deep Learning, NIPS '18,
2. Communication Algorithms via Deep Learning, ICLR, '18,

Learning in Gated Neural Networks,

Abstract: Mixture-of-Experts (MoE) is a widely popular neural network architecture and is a basic building block of highly successful modern neural networks, for example, Gated Recurrent Units (GRU) and Attention networks. However, despite the empirical success, finding an efficient and provably consistent algorithm to learn the parameters remains a long standing open problem for more than two decades. In this talk, we introduce the first algorithms that learns the true parameters of a MoE model for a wide class of non-linearities with global consistency guarantees.
Talk at: Princeton (May 2018).
1. Learning One-hidden-layer Neural Networks under General Input Distributions, preprint,
2. Breaking the gridlock in Mixture-of-Experts: Consistent and Efficient Algorithms, preprint,



Geometries of Word Embeddings,

Abstract: Real-valued word vectors have transformed NLP applications; popular examples are word2vec and GloVe, recognized for their ability to capture linguistic regularities via simple geometrical operations. In this talk, we demonstrate further striking geometrical properties of the word vectors. First we show that a very simple, and yet counter-intuitive, postprocessing technique, which makes the vectors "more isotropic", renders off-the-shelf vectors even stronger. Second, we show that a sentence containing a target word is well represented by a low rank subspace; subspaces associated with a particular sense of the target word tend to intersect over a line (one-dimensional subspace). We harness this Grassmannian geometry to disambiguate (in an unsupervised way) multiple senses of words, specifically so on the most promiscuously polysemous of all words: prepositions. A surprising finding is that rare senses, including idiomatic/sarcastic/metaphorical usages, are efficiently captured. Our algorithms are all unsupervised and rely on no linguistic resources; we validate them by presenting new state-of-the-art results on a variety of multilingual benchmark datasets.
Talks at: MIT (April, 2017), Berkeley, Stanford and Princeton (May 2017) and EPFL (June 2017).
1. Geometry of Compositionality, AAAI '17,
2. Geometry of Polysemy, ICLR, '17,
3. Representing Sentences as Low-rank subspaces, ACL '17,
4. Prepositions in Context, EMNLP '18,

Anonymity in the Bitcoin Peer-to-Peer Network,

Abstract: Bitcoin enjoys a public perception of being a privacy-preserving financial system. In reality, Bitcoin has a number of privacy vulnerabilities, including the well-studied fact that transactions can be linked through the public blockchain. More recently, researchers have demonstrated deanonymization attacks that exploit a lower-layer weakness: the Bitcoin peer-to-peer (P2P) networking stack. In particular, the P2P network currently forwards content in a structured way that allows observers to deanonymize users by linking their transactions to the originating IP addresses. In this work, we first demonstrate that current protocols exhibit poor anonymity guarantees, both theoretically and in practice. Then, we consider a first-principles redesign of the P2P network, with the goal of providing strong, provable anonymity guarantees. We propose a simple networking policy called Dandelion, which achieves nearly-optimal anonymity guarantees at minimal cost to the network's utility. We present a practical implementation of Dandelion, whose source code is available at
1. Dandelion: Redesigning the Bitcoin Network for Anonymity, Sigmetrics '17,
2. Anonymity Properties of the Bitcoin P2P Network, NIPS '17,

Tutorial: Information Limits on Finding and Hiding Message Sources on Networks: Social Media and Cryptocurrencies,

Abstract: Broadcasting is a common theme in networks, from the spread of diseases to the propagation of viral tweets. There is a long history of modeling and analyzing the spread of epidemics, where network and spreading models are typically derived from the real world. Problems of interest include analyzing the speed of contagion, computing a disease’s probability of extinction, and identifying the patient zero for an epidemic. Renewed interest has emerged in this topic with the advent of online networks like social media and cryptocurrencies, which ask users to broadcast content over a computer network. However, these new applications differ from epidemics in one key respect: they require the source of broadcasted messages to remain anonymous. Reasons for anonymity range from freedom of speech to financial privacy. With this added constraint comes a key advantage: network designers are often able to control the network structure and/or propagation models in computer networks. This has led to a complementary line of work that attempts to design networking stacks that inherently protect the anonymity of a message source.
In this tutorial, we present recent results on source identification and hiding in the context of applications that broadcast information over networks. We discuss models and results studied by the information theory, physics, and computer science communities. We then provide a gentle introduction to methods for source identification, including the theoretical tools needed to study these problems. We conclude by discussing recent methods for designing networks that inherently hide the source of a message. The tutorial will bring together tools from different areas like branching processes, measure concentration, and extreme value theory, all in the context of cutting-edge technologies like cryptocurrencies and anonymous social media.
Tutorial at IEEE International Symposium on Information Theory (ISIT) 2017.