PhD Fellowship Summit 2016
August 23 - 24


Location: 1897 Landings Drive, Mountain View, CA 94043

What: A small, intimate summit with ample opportunity for discussion. Our goals are to build stronger ties between Google and rising stars in their fields, and to build a roadmap for possible future collaborations.


Schedule (subject to change)

Monday, August 22                                                                                                 

7:00 pm - 9:00 pm: Welcome Reception at Avatar Hotel Pool

Tuesday, August 23                                                                                                 
08:30 am: Shuttles leave The Avatar (we leave at 8:30 am sharp! Please be outside at 8:25 am)

09:30 am: Welcome, John Giannandrea, Head of Research & Machine Intelligence

09:45 am: Keynote: Jeff Dean, Google Senior Fellow

10:15 am: Keynote, Q&A

10:30 am: Break 

11:00 am: Machine Perception, Kevin Murphy, Research Scientist

11:45 am: Lunch

01:00 pm: Gmail Performance, Rebecca Isaacs, Research Scientist

01:45 pm: Technical Infrastructure at Google, Amin Vahdat, Google Fellow

02:30 pm: Break

02:45 pm: Speech Recognition, Francoise Beaufays, Principal Scientist

03:30 pm: Deep Learning & Music, Doug Eck, Senior Staff Research Scientist

04:30 pm: Break

04:45 pm: Internship Projects - Applying research at scale, Jakub Konecny & Amy Zhang

05:30 pm: Poster & Beer

06:30 pm: Dinner

08:30 pm: Shuttle leaves for The Avatar
 
Wednesday, August 24                                                                                              
08:30 am: Shuttles leave The Avatar

09:00 am: Check-in & Group Assignments 

09:15 am: Leave for breakouts

09:30 am: Breakouts

11:30 am: Return to Google - LMK2-1-Hudson
 
11:30 am: Lunch inside Hudson

12:00 pm: Panel Discussion during lunch

01:00 pm: Send off Message, Maggie Johnson, Director of Education and University Relations

01:30 pm: Shuttles leave for SFO


Googlers

Umber Ahmed
Domagoj Babic
Jon Barron
Francoise Beaufays
Kshipra Bhawalkar
Mariko Cates Barajas 
Yuning Chai
Charles Chen
Sunny Consolvo
Vijay D'Silva
Alex Davies
Jeff Dean
Carole Dulong
Doug Eck 
Ulfar Erlingsson
Cristi Estan
Ian Fischer
John Giannandrea
Maya Gupta
Hakan Hacigumus
Steven Hand
Matthew Henderson
Rebecca Isaacs
Viren Jain
Maggie Johnson
Michael Johnson
Jakub Konecny
Thomas Leung
David Lowe
Mohammad Mahdian
Dana Movshovitz-Attias
Kevin Murphy
Sobhan Naderi Parizi
Daniel Nanas
Paul Natsev
Kyle Nesbit
Peter Norvig
Abhijit Ogale
Caroline Pantofaru
Jack Poulson
Hayes Raffle
Michael Rennacker
Patrick Riley
Daniel Russell
Negar Saei
Melanie Saldana
Mitja Trampus
Amin Vahdat
Brent Welch
John Wilkes
Jeff Wohlgemuth
Cong Yu
Amy Zhang
Xiao Zhang

PhD Fellows 
Aaron Parks, Mobile, University of Washington
Abhijnan Chakraborty, Social, Indian Institute of Technology Kharagpur, India
Amy Xian Zhang, HCI, Massachusetts Institute of Technology
Andrew Crotty, Systems, Brown University
Aron Monszpart, Machine Perception, University College London
Arvind Neelakantan, Machine Learning, University of Massachusetts, Amherst
Arvind Satyanarayan, HCI, Stanford University
Avisek Lahiri, Machine Perception, Indian Institute of Technology Kharagpur
Bahar Salehi, NLP, University of Melbourne
Bo Xin, Machine Learning, Peking University
Cameron Po-Hsuan Chen, Computational Neuroscience, Princeton University
Carl Vondrick, Machine Perception, Massachusetts Institute of Technology
Carl-Johann Simon-Gabriel, Statistics, Max Planck Institute Tübingen
Chia-Yin Tsai, Machine Perception, Carnegie Mellon University
Damian Vizar, Security, EPFL CS
Daniel Jaymin Mankowitz, Machine Learning, Technion - Israel Institute of Technology
Eugen Beck, Machine Perception, RWTH Aachen - Human Language Technology and Pattern Recognition
Gabriel Reyes, HCI, Georgia Institute of Technology
George Prekas, Systems, EPFL
Grace Lindsay, Computational Neuroscience, Columbia University
Himanshu Jain, Machine Learning, Indian Institute of Technology, Delhi
Ilias Marinos, Systems, University of Cambridge - Computer Laboratory
Ionel Gog, Systems, University of Cambridge
Jakub Konecny, Algorithms, University of Edinburgh
Jana Giceva, Systems, ETH Zurich
Jinpeng Wang, NLP, Peking University, China
Jose Camacho Collados, NLP, Sapienza - Università di Roma
Josip Djolonga, Statistics, ETH Zurich
Jungdam Won, Robotics, Seoul National University
Kartik Nayak, Security, University of Maryland, College Park
Kay Ousterhout, Systems, University of California, Berkeley
Keerti Choudhary, Algorithms, IIT Kanpur
Koki Nagano, HCI, University of Southern California
Lei Kang, Mobile, University of Wisconsin
Lucas Maystre, Machine Learning, EPFL CS
Ludwig Schmidt, Machine Learning, Massachusetts Institute of Technology
Łukasz Mazurek, Security, University of Warsaw
Marcelo Sousa, SWE, University of Oxford
Martino Sorbaro Sindaci, Computational Neuroscience, The University of Edinburgh
Nadav Cohen, Machine Learning, Hebrew University
Nicolas Papernot, Security, Pennsylvania State University, University Park
Ohad Fried, Graphics, Princeton University
Olivier Bachem, Machine Learning, ETH Zurich CS
Osbert Bastani, SWE, Stanford University
Palash Dey, Algorithms, Indian Institute of Science, Bangalore
Pallavi Maiya H P, Programming Languages, Indian Institute of Science
Qian Ge, Systems, UNSW Australia, Data61
Rad Niazadeh, Market Algorithms, Cornell University
Riley Spahn, Privacy, Columbia University
Roee Litman, Machine Learning, Tel-Aviv University
Sadra Yazdanbod, Market Algorithms, Georgia Tech
Sandy Heydrich, Market Algorithms, Saarland University - Saarbrucken GSCS
Saurabh Gupta, Machine Perception, UC Berkeley
Shandian Zhe, Machine Learning, Purdue University, West Lafayette
Siqi Liu, Computational Neuroscience, University of Sydney
Siva Reddy, NLP, University of Edinburgh
Suining He, Mobile Computing, The Hong Kong University of Science and Technology
Tauhidur Rahman, Mobile, Cornell University
Thang Bui, Speech, University of Cambridge
Tianqi Chen, Machine Learning, University of Washington
Victoria Caparrós, Systems, ETH Zurich
Wei Liu, Machine Perception, University of North Carolina at Chapel Hill
Xiang Ren, SWE, University of Illinois, Urbana-Champaign
Xingyu Zeng, Machine Perception, The Chinese University of Hong Kong
Yu-Wei Chao, Machine Perception, University of Michigan, Ann Arbor
Yuhao Zhu, Mobile, University of Texas, Austin
Yuxin Chen, Machine Learning, ETH Zurich
Yves-Laurent Kom Samo, Machine Learning, University of Oxford
Zhenzhe Zheng, Mobile Computing, Shanghai Jiao Tong University



Amy Zhang
Wikum: Bridging Threaded Discussion and Wikis Using Recursive Summarization
Long and deeply threaded discussions abound on the internet today, on topics ranging from political issues to group decision-making. However, such discussions are difficult to navigate or get an overview of, due to their large volume and the many back-and-forth exchanges and digressions.
We introduce a novel artifact called a summary tree which incorporates short summaries of subthreads of discussion directly into the main discussion threading structure at different levels.
To create this artifact, we introduce a process of recursive summarization, where participants working in small doses can build upon previously written summaries of smaller sub-threads to summarize ever-greater portions. We build a tool called Wikum, which implements this process as well as additional design decisions to aid in quality summary creation, such as encouraging the use of citations within summaries. Evaluations found that Wikum was overall faster and easier to use than a control condition using Google Docs.
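
As a rough illustration of the recursive structure (not Wikum's actual implementation; the class and function names here are hypothetical), a summary tree can be built bottom-up over a comment tree:

```python
# Toy sketch of recursive summarization over a discussion tree.
# Names (Comment, summarize) are hypothetical, not Wikum's real API.

class Comment:
    def __init__(self, text, replies=None):
        self.text = text
        self.replies = replies or []
        self.summary = None  # filled in bottom-up by summarize()

def summarize(node, max_len=80):
    """Summarize a subthread by building on its children's summaries.

    In Wikum a human editor writes each summary in small doses; here
    truncation of the concatenated text stands in for that editing step.
    """
    child_summaries = [summarize(reply, max_len) for reply in node.replies]
    combined = " ".join([node.text] + child_summaries)
    node.summary = combined[:max_len]
    return node.summary

# A small thread: one root comment with two replies, one of them nested.
thread = Comment("Should we adopt policy X?", [
    Comment("Yes, it reduces cost.", [Comment("Only if usage stays flat.")]),
    Comment("No, the risks outweigh the savings."),
])
summary = summarize(thread)
```

The key property the sketch shows is that each node's summary depends only on its own text plus the already-written summaries of its subthreads, so larger summaries can be composed from smaller ones.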

Aron Monszpart
SMASH: Physics-­guided Reconstruction of Collisions from Videos
Knowing the physical properties of real-life objects is important in settings such as scene understanding, robotics, or animation. Internal properties (e.g. elasticity) can be measured for some materials in controlled environments with sensitive sensors and customized, expensive setups. Objects occurring in real life, however, have arbitrary shapes and may be stiff or fragile, making these measurements much harder to perform. Our work investigates how purely observing objects interacting allows us to reason about some of their internal physical properties.

Bahar Salehi
Language Independent Analysis of Multiword Expressions
Multiword expressions (MWEs) are combinations of words with lexical, syntactic or semantic idiosyncrasy. Among the interesting features of MWEs, their semantic idiosyncrasy (or compositionality) has been of particular interest to NLP researchers. An MWE is fully compositional if its meaning is predictable from its component words, and it is non-compositional (or idiomatic) if not. For example, with shoot the breeze, we have semantic idiosyncrasy, as the meaning of "to chat" in usages such as "It was good to shoot the breeze with you" cannot be predicted from the meanings of the component words shoot and breeze. In this study, I will introduce our language-independent methods to predict the degree of semantic compositionality of MWEs.

Bo Xin
Maximal Sparsity with Deep Networks
The iterations of many sparse estimation algorithms are comprised of a fixed linear filter cascaded with a thresholding nonlinearity, which collectively resemble a typical neural network layer. Consequently, a lengthy sequence of algorithm iterations can be viewed as a deep network with shared, hand-crafted layer weights. It is therefore quite natural to examine the degree to which a learned network model might act as a viable surrogate for traditional sparse estimation in domains where ample training data is available. While the possibility of a reduced computational budget is readily apparent when a ceiling is imposed on the number of layers, our work primarily focuses on estimation accuracy. In particular, it is well-known that when a signal dictionary has coherent columns, as quantified by a large RIP constant, then most tractable iterative algorithms are unable to find maximally sparse representations. In contrast, we demonstrate both theoretically and empirically the potential for a trained deep network to recover minimal l0-norm representations in regimes where existing methods fail. The resulting system is deployed on a practical photometric stereo estimation problem, where the goal is to remove sparse outliers that can disrupt the estimation of surface normals from a 3D scene.
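
The iteration-as-layer view can be sketched with classic ISTA for the lasso problem (a standard algorithm used here as an illustrative stand-in; a learned variant would replace the hand-crafted filters W and S below with trained weights):

```python
import numpy as np

def soft_threshold(z, tau):
    # The thresholding nonlinearity: plays the role of a network activation.
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def ista(A, y, lam, n_layers=50):
    """ISTA for min_x 0.5*||y - Ax||^2 + lam*||x||_1.

    Each iteration is a fixed linear filter (W, S) followed by soft
    thresholding, so n_layers iterations read as an n_layers-deep network
    with shared, hand-crafted weights.
    """
    L = np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the gradient
    W = A.T / L                             # input filter
    S = np.eye(A.shape[1]) - (A.T @ A) / L  # recurrent filter
    x = np.zeros(A.shape[1])
    for _ in range(n_layers):
        x = soft_threshold(S @ x + W @ y, lam / L)
    return x

# Recover a 3-sparse signal from 30 random measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 60))
x_true = np.zeros(60)
x_true[[3, 17, 42]] = [1.5, -2.0, 1.0]
y = A @ x_true
x_hat = ista(A, y, lam=0.1, n_layers=1000)
```

In the regime the abstract discusses (coherent dictionaries, large RIP constant), iterations like these fail to find maximally sparse solutions, which is what motivates training the layer weights instead of fixing them.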

Cameron Po-Hsuan Chen
An fMRI shared response model
Multi-subject fMRI data is critical for evaluating the generality and validity of findings across subjects, and its effective utilization helps improve analysis sensitivity. We develop a shared response model for aggregating multi-subject fMRI data that accounts for different functional topographies among anatomically aligned datasets. Our model demonstrates improved sensitivity in identifying a shared response for a variety of datasets and anatomical brain regions of interest. Furthermore, by removing the identified shared response, it allows improved detection of group differences. The ability to identify what is shared and what is not shared opens the model to a wide range of multi-subject fMRI studies.

Chia-Yin Tsai
Unconstrained Shape and Reflectance Estimation from Attributes of Light Paths
Images alone are poorly suited to inferring the properties of a scene, including its shape, reflectance, and composition. This is especially true for scenes that interact with light in complex ways, since many of these interactions are high-dimensional functions whose properties are not fully captured by the 2-dimensional image. The objective of my research is to develop novel computational imaging architectures and associated inference methods for shape and reflectance estimation by studying characterizations of light that go beyond images. My research is based on a simple yet powerful hypothesis: the laws of image formation are significantly simplified when we consider light paths and their properties as opposed to images. We support this hypothesis with two methods. First, we use the deflection of light rays to recover the shape of transparent objects. Second, we use the time of flight and light transport of light rays to study the shape and reflectance of complex objects.

Daniel Jaymin Mankowitz
Adaptive Skills, Adaptive Partitions (ASAP)
My recent work, 'Adaptive Skills, Adaptive Partitions (ASAP)' (http://arxiv.org/abs/1602.03351), enables an artificially intelligent agent to learn skills (for example, running and shooting for goal in a soccer game). The agent also learns when to use these skills. This is achieved using my Reinforcement Learning-based ASAP framework.

Jana Giceva
Basslet: rethinking the OS stack for parallel data processing
We revisit the problem of scheduling parallel data processing workloads on modern hardware, in the light of new operating system and application designs. Basslet uses a novel approach that partitions a multicore machine into a control plane, for the execution of regular application threads, and a dynamic compute plane which runs customized lightweight OS stack(s) for particular classes of data processing workloads. In this poster we show how such a separation allows us to specialize the compute plane for parallel data analytics by integrating a customized scheduler into the OS kernel.

Himanshu Jain
No title
The choice of the loss function is critical in extreme multi-label learning, where the objective is to annotate each data point with the most relevant subset of labels from an extremely large label set. Unfortunately, existing loss functions, such as the Hamming loss, are unsuitable for learning, model selection, hyperparameter tuning and performance evaluation. We address this issue by developing propensity scored losses which: (a) prioritize predicting the few relevant labels over the large number of irrelevant ones; (b) do not erroneously treat missing labels as irrelevant but instead provide unbiased estimates of the true loss function even when ground truth labels go missing under arbitrary probabilistic label noise models; and (c) promote the accurate prediction of infrequently occurring, hard to predict, but rewarding tail labels. We also propose the PfastreXML algorithm which efficiently scales to large datasets with up to 9 million labels, 70 million points and 2 million dimensions and which gives significant improvements over the state-of-the-art.

Our results also apply to tagging, recommendation and ranking which are the motivating applications for extreme multi-label learning. They generalize previous attempts at deriving unbiased losses under the restrictive assumption that labels go missing uniformly at random from the ground truth. Furthermore, they provide a sound theoretical justification for popular label weighting heuristics used to recommend rare items.
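
The propensity-scoring idea, dividing each observed label's contribution by its probability of being observed, can be sketched for a precision@k-style metric (an illustrative toy; the exact losses and their normalization follow the paper, not this sketch):

```python
import numpy as np

def psp_at_k(scores, observed, propensities, k=3):
    """Propensity-scored precision@k.

    observed[l] is 1 if label l was observed as relevant for this point;
    propensities[l] is the probability that a truly relevant label l gets
    observed at all. Dividing by the propensity up-weights rare (tail)
    labels and corrects for labels missing from the ground truth. Note the
    raw value can exceed 1; in practice it is often normalized by its
    maximum achievable value.
    """
    topk = np.argsort(scores)[::-1][:k]   # k highest-scored labels
    return np.sum(observed[topk] / propensities[topk]) / k

scores = np.array([0.9, 0.8, 0.3, 0.7, 0.1])       # classifier scores
observed = np.array([1.0, 0.0, 1.0, 1.0, 0.0])     # observed relevance
prop = np.array([0.9, 0.9, 0.2, 0.5, 0.3])         # tail labels: low propensity
val = psp_at_k(scores, observed, prop, k=3)
```

Compared with plain precision@k, a correct prediction of a rarely observed label (low propensity) contributes more, which is exactly the bias correction property (b) describes.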

Jungdam Won
Shadow Theatre: Discovering Human Motion from a Sequence of Silhouettes
Shadow theatre is a genre of performance art in which the actors are only visible as shadows projected on the screen. The goal of this study is to generate animated characters whose shadows match a sequence of target silhouettes. This poses several challenges. The motions of multiple characters are carefully coordinated to form a target silhouette on the screen, and each character's pose should be stable, balanced, and plausible. The resulting character animation should be smooth and coherent spatially and temporally. We formulate the problem as nonlinear constrained optimization with objectives designed to generate plausible human motions. Our optimization algorithm was primarily inspired by the heuristic strategies of professional shadow theatre actors. Their know-how was studied and then incorporated into our optimization formulation. We demonstrate the effectiveness of our approach with a variety of target silhouettes and 3D fabrication of the results.

Nadav Cohen
On the Expressive Power of Deep Learning: A Tensor Analysis
We derive an equivalence between convolutional networks, the most successful deep learning architecture to date, and tensor decompositions. The equivalence is used to analyze the expressive properties of such networks, settling old conjectures as well as proving new and surprising results. We show that with linear activation and product pooling, almost all functions realized by a deep network require exponential size in order to be realized (or approximated) by a shallow network. Surprisingly, the result no longer holds when the activation and pooling operators are switched to ReLU and max/average respectively. This suggests that in terms of expressiveness, the most popular type of convolutional network is inferior to an alternative "arithmetic circuit" variant, which has recently been implemented and is showing promising results in practice. We focus on the latter, extending the analysis beyond separation of depths. Specifically, we study expressible functions in terms of their ability to model correlation between regions of the input. We find that this ability is only achievable through depth, and that a deep network's pooling geometry selects which correlations can be modeled, thereby controlling the inductive bias.

The poster is based on a series of papers recently presented in COLT, ICML and CVPR, as well as a new arXiv preprint. Joint work with Or Sharir and Amnon Shashua.

Nicolas Papernot
The Limitations of Machine Learning in Adversarial Settings
Machine learning models, including deep neural networks, were shown to be vulnerable to adversarial examples: subtly (and often humanly indistinguishably) modified malicious inputs crafted to compromise the integrity of their outputs. Adversarial examples thus enable adversaries to manipulate system behaviors. Potential attacks include attempts to control the behavior of vehicles, have spam content identified as legitimate content, or have malware identified as legitimate software. In fact, the feasibility of misclassification attacks based on adversarial examples has been shown for image, text, and malware classifiers.

Furthermore, adversarial examples that affect one model often affect another model, even if the two models have different architectures (neural network, support vector machine, nearest neighbor, ...) or were trained on different training sets, so long as both models were trained to perform the same task. An attacker may therefore train their own substitute model, craft adversarial examples against the substitute, and transfer them to a victim model, with very little information about the victim. The attacker need not even collect a training set to mount the attack: a demonstrated technique uses the victim model as an oracle to label a synthetic training set for the substitute, effectively allowing adversaries to target remotely hosted classifiers.

This poster covers several adversarial example crafting algorithms operating under varying threat models and application domains, as well as defenses proposed to mitigate adversarial examples. Such defenses include label smoothing during training, training on adversarial examples, and defensive distillation.
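
As a minimal sketch of one crafting algorithm, the fast gradient sign method perturbs an input in the direction of the sign of the loss gradient; shown here on logistic regression rather than a deep network, with made-up weights (both are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.standard_normal(10)   # "trained" weights: random stand-ins here
b = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, eps):
    """Fast gradient sign perturbation of input x for true label y.

    For logistic regression with cross-entropy loss, the gradient of the
    loss w.r.t. the input is (p - y) * w, so the attack only needs its sign.
    Each feature moves by exactly eps, keeping the change small per pixel.
    """
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

x = rng.standard_normal(10)
y = 1.0 if sigmoid(w @ x + b) >= 0.5 else 0.0   # model's own prediction
x_adv = fgsm(x, y, eps=0.5)
p_before = sigmoid(w @ x + b)
p_after = sigmoid(w @ x_adv + b)                # confidence pushed away from y
```

The same eps-bounded perturbation applied to a deep network requires backpropagating to the input, but the structure of the attack is identical.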

Xiang Ren
Label Noise Reduction in Entity Typing by Heterogeneous Partial-Label Embedding
Current systems of fine-grained entity typing use distant supervision in conjunction with existing knowledge bases to assign categories (type labels) to entity mentions. However, the type labels so obtained from knowledge bases are often noisy (i.e., incorrect for the entity mention’s local context). We define a new task, Label Noise Reduction in Entity Typing (LNR), to be the automatic identification of correct type labels (type-paths) for training examples, given the set of candidate type labels obtained by distant supervision with a given type hierarchy. A novel framework, called PLE, is proposed to jointly embed entity mentions, text features and entity types into the same low-dimensional space, where objects whose types are semantically close have similar representations. Then we estimate the type-path for each training example in a top-down manner using the learned embeddings. Our experiments on three public typing datasets demonstrate the effectiveness and robustness of PLE, with an average of 25% improvement in accuracy compared to the next best method.

Shandian Zhe
No title
My research mainly includes two aspects: 1) Bayesian sparse learning with spike-and-slab priors, such as sparse group selection with structure constraint, fast Laplace approximation and scalable online inference; 2) Scalable nonparametric tensor factorization, such as distributed infinite tucker decomposition, online decomposition with joint latent cluster discovery and distributed, key-value-free Gaussian process factorization.

Yuxin Chen
Near-optimal Adaptive Information Acquisition: Theory and Applications
Sequential information gathering, i.e., selectively acquiring the most useful data, plays a key role in interactive machine learning systems. This problem has been studied in the context of Bayesian active learning and experimental design, decision making, optimal control and numerous other domains. In this work, we focus on a class of information gathering tasks, where the goal is to learn the value of some unknown target variable through a sequence of informative, possibly noisy tests. In contrast to prior work, we focus on the challenging, yet practically relevant setting where test outcomes can be conditionally dependent given the hidden target variable. Under such assumptions, common heuristics, such as greedily performing tests that maximize the reduction in uncertainty of the target, often perform poorly.

We propose a class of novel, computationally efficient active learning algorithms, and prove strong theoretical guarantees that hold with correlated, possibly noisy tests. Rather than myopically optimize the value of a test (which, in our case, is the expected reduction in prediction error), at each step, our algorithms pick the test that maximizes the gain in a surrogate objective, which is adaptive submodular. This property enables us to utilize an efficient greedy optimization while providing strong approximation guarantees. We demonstrate our algorithms in several real-world problem instances, including a touch-based location task on an actual robotic platform, and an active preference learning task via pairwise comparisons.
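
The myopic baseline the abstract contrasts with, greedily picking the test with the largest expected reduction in uncertainty about the target, can be sketched as follows (an illustrative toy with made-up test likelihoods, not the adaptive submodular surrogate from the work):

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def expected_info_gain(prior, lik):
    """Expected entropy reduction for one test.

    prior[h] is the belief over hypotheses; lik[o, h] = P(outcome o | h).
    """
    gain = entropy(prior)
    for o in range(lik.shape[0]):
        p_o = lik[o] @ prior                  # marginal prob of outcome o
        if p_o > 0:
            post = lik[o] * prior / p_o       # Bayes update given outcome o
            gain -= p_o * entropy(post)
    return gain

def pick_test(prior, tests):
    # Myopic choice: maximize one-step expected information gain.
    return int(np.argmax([expected_info_gain(prior, lik) for lik in tests]))

# Three hypotheses; test 0 is pure noise, test 1 separates h0 from {h1, h2}.
prior = np.array([1 / 3, 1 / 3, 1 / 3])
tests = [
    np.array([[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]),   # rows: outcomes
    np.array([[0.9, 0.1, 0.1], [0.1, 0.9, 0.9]]),
]
best = pick_test(prior, tests)
```

With conditionally independent tests this greedy rule works well; the abstract's point is that it can fail badly once outcomes are correlated given the target, which motivates optimizing an adaptive submodular surrogate instead.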


Hotel

What hotel will I be staying at?
If you requested a hotel and we confirmed your stay, you'll be staying at The Avatar Hotel.

What if I want to extend my stay?
You are welcome to stay at the hotel longer, but at your own expense. Google will cover accommodation for the nights confirmed by email. You'll be responsible for any nights outside of the program dates.

What time is check-in and check-out?
Check-in time is at 3:00 pm and check-out is at 11:00 am. Rooms may not be available on the day of arrival prior to the Hotel's stated check-in time; however, every effort will be made to accommodate early arrivals.

Is internet available in the guest rooms?
Yes, there is free wi-fi on the property. 

Is there parking available at the hotel?
Yes, there is parking at the hotel. 

Travel Expenses

Will I need any money for the hotel?
Yes, when you check in at the hotel, your credit card will be required as a room deposit and for hotel incidentals (room service, mini bar, telephone calls, etc.). At check-out you'll have the opportunity to change the method of payment, either by presenting a different credit card or by paying cash. The hotel doesn't accept personal checks. Google will cover the cost of the room and taxes only.

Will I need money for anything else?
Google will cover the cost of airfare, hotel accommodation for those confirmed, one shuttle to SFO on August 24 at 1:30 pm, and meals at specified times in the program. You'll be responsible for all other expenses.