Alexander Toshev

Research Scientist / Tech Lead

Robotics @ Google

toshev at google / alex.t.toshev at gmail

Google AI Profile, Google Scholar


I am a Research Scientist at Google AI, leading research efforts in Computer Vision, Machine Learning, and more recently Robotics.

Most recently, I have been working on building perceptual capabilities for autonomous agents, initiating and leading an effort on Robot Navigation (Semantic Navigation, Social Navigation, Point Navigation, etc.), as part of the robotics research effort at Google.

Prior to that, I worked extensively on a wide range of computer vision problems. Notable achievements:

  • human pose estimation: first deep learning based approach, SOTA results over the years

  • object detection: first deep learning based approach, SOTA results ca 2015, widely deployed at Google

  • language and computer vision: co-initiated a stream on language and vision in the computer vision community, one of the first works on neural image captioning

Academic Activities

Symposium on Social Navigation Benchmarking, Feb 2022.

CVPR'20, CVPR'21, CVPR'22 (in prep) Workshop on Embodied AI

CVPR'19 Workshop on Deep Learning for Semantic Visual Navigation

Area Chair, CVPR 2017, CVPR 2020, ECCV 2020, NIPS 2021, ECCV 2022

Program committee, CVPR, ICCV, ECCV, NIPS

Recent Talks

Georgia Tech / Google Robotics Workshop, May 2021.

iGibson Sim2Real Challenge, Embodied AI Workshop, CVPR 2020.

Robot Learning Workshop, NSF & Lehigh University, 2019.



M. Ahn, A. Brohan, N. Brown, Y. Chebotar, O. Cortes, B. David, Ch. Finn, K. Gopalakrishnan, K. Hausman, A. Herzog, D. Ho, J. Hsu, J. Ibarz, B. Ichter, A. Irpan, E. Jang, R. Jauregui Ruano, K. Jeffrey, S. Jesmonth, N. J Joshi, R. Julian, D. Kalashnikov, Y. Kuang, K.-H. Lee, S. Levine, Y. Lu, L. Luu, C. Parada, P. Pastor, J. Quiambao, K. Rao, J. Rettinghouse, D. Reyes, P. Sermanet, N. Sievers, Cl. Tan, A. Toshev, V. Vanhoucke, F. Xia, T. Xiao, P. Xu, S. Xu, M. Yan, Do As I Can, Not As I Say: Grounding Language in Robotic Affordances, In Submission, 2022.

Haresh Karnan, Anirudh Nair, Xuesu Xiao, Garrett Warnell, Soeren Pirk, Alexander Toshev, Justin Hart, Joydeep Biswas, Peter Stone, Socially Compliant Navigation Dataset (SCAND): A Large-Scale Dataset of Demonstrations for Social Navigation, IROS, 2022.

Soeren Pirk, Edward Lee, Xuesu Xiao, Anthony Francis, Leila Takayama, Alexander Toshev, A Protocol for Evaluating Social Navigation Policies, ICRA Workshop on Social Robot Navigation: Advances and Evaluation, 2022.

Dhruv Shah, Peng Xu, Yao Lu, Ted Xiao, Alexander Toshev, Sergey Levine, Brian Ichter, Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning, ICLR, 2022.

Ayzaan Wahid, Austin Stone, Kevin Chen, Brian Ichter, Alexander Toshev, Learning Object-conditioned Exploration using Distributed Soft Actor Critic, CoRL 2020.

Dhruv Batra, Aaron Gokaslan, Aniruddha Kembhavi, Oleksandr Maksymets, Roozbeh Mottaghi, Manolis Savva, Alexander Toshev, Erik Wijmans, ObjectNav Revisited: On Evaluation of Embodied Agents Navigating to Objects, position paper, 2020.

Fei Xia, Chengshu Li, Or Litany, Roberto Martin-Martin, Alexander Toshev, Silvio Savarese, ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation, 2020.

Sören Pirk, Karol Hausman, Alexander Toshev, Mohi Khansari, Modeling Long-horizon Tasks as Sequential Interaction Landscapes, CoRL 2020.

Fei Xia, William Chen, Chengshu Li, Priya Kasimbeg, Micael Tchampi, Alexander Toshev, Roberto Martin-Martin, Silvio Savarese, Interactive Gibson: A Benchmark in Navigation in Cluttered Environments, RA-Letters, 2020.

Kuan Fang, Alexander Toshev, Silvio Savarese, Li Fei-Fei, Scene Memory Transformer for Embodied Agents in Long Horizon Tasks, CVPR 2019.

Ayzaan Wahid, Alexander Toshev, Marek Fiser, Edward Lee, Long Range Neural Navigation Policies for the Real World, IROS 2019.

Arsalan Mousavian, Alexander Toshev, Marek Fiser, Jana Kosecka, James Davidson, Visual Representations for Semantic Target Driven Navigation, ICRA 2019.

Fereshteh Sadeghi, Alexander Toshev, Eric Jang, Sergey Levine, Sim2Real Viewpoint Invariant Visual Servoing by Recurrent Control, CVPR 2018.

Language and Vision

Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan, Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning Challenge, IEEE Transactions on PAMI, 2017.

Junhua Mao, Jonathan Huang, Alexander Toshev, Oana Camburu, Alan L Yuille, Kevin Murphy, Generation and Comprehension of Unambiguous Object Descriptions, CVPR 2016.

Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan, Show and tell: A neural image caption generator, CVPR 2015 (oral, 3100+ citations).

Human Pose Estimation

AJ Piergiovanni, Anelia Angelova, Alexander Toshev, Michael S. Ryoo, Adversarial Generative Grammars for Human Activity Prediction, ECCV 2020.

George Papandreou, Tyler Zhu, Nori Kanazawa, Alexander Toshev, Jonathan Tompson, Chris Bregler, Kevin Murphy, Towards accurate multi-person pose estimation in the wild, CVPR 2017 (best on human pose estimation on COCO).

Georgia Gkioxari, Alexander Toshev, Navdeep Jaitly, Chained Predictions Using Convolutional Neural Networks, ECCV 2016.

Alexander Toshev, Christian Szegedy, DeepPose: Human Pose Estimation via Deep Neural Networks, CVPR 2014 (oral, 1300+ citations).

Benjamin Sapp, Alexander Toshev, Ben Taskar, Cascaded Models for Articulated Pose Estimation, ECCV 2010.

Object Detection

Etienne Pot, Alexander Toshev, Jana Kosecka, Self-supervisory Signals for Object Discovery and Detection, 2018.

Dumitru Erhan, Christian Szegedy, Alexander Toshev, Dragomir Anguelov, Scalable Object Detection Using Deep Neural Networks, CVPR 2014 (700+ citations).

Christian Szegedy, Alexander Toshev, Dumitru Erhan, Deep Neural Networks for Object Detection, NIPS 2013 (800+ citations).


AJ Piergiovanni, Anelia Angelova, Alexander Toshev, Michael S Ryoo, Evolving Space-Time Neural Architectures for Videos, In Submission, 2019.

Yair Movshovitz-Attias, Alexander Toshev, Thomas K Leung, Sergey Ioffe, Saurabh Singh, No Fuss Distance Metric Learning via Proxies, ICCV 2017.

Jonathan Krause, Benjamin Sapp, Andrew Howard, Howard Zhou, Alexander Toshev, Tom Duerig, James Philbin, Li Fei-Fei, The Unreasonable Effectiveness of Noisy Data for Fine-Grained Recognition, ECCV 2016.

Yunchao Gong, Yangqing Jia, Thomas Leung, Alexander Toshev, Sergey Ioffe, Deep Convolutional Ranking for Multi-label Image Annotation, ICLR 2013

Alexander Toshev, Philippos Mordohai, Ben Taskar, Detecting and Parsing Architecture at City Scale from Range Data, CVPR 2010.

Alexander Toshev, Ben Taskar, Kostas Daniilidis, Object Detection via Boundary Structure Segmentation, CVPR 2010.

Alexander Toshev, Ameesh Makadia, Kostas Daniilidis, Shape-based object recognition in videos using 3D synthetic object models, CVPR 2009.

Alexander Toshev, Jianbo Shi, Kostas Daniilidis, Image Matching via Saliency Region Correspondences, CVPR 2007 (oral).

Alexander Toshev, Submodular Function Minimization, University of Pennsylvania, 2010.

Patents


Distance Metric Learning Using Proxies, Yair Movshovitz-Attias, Thomas Leung, Sergey Ioffe, Saurabh Singh, Alexander Toshev, US Patent 10,387,749, 2019.

Generating natural language descriptions of images, Samy Bengio, Oriol Vinyals, Alexander Toshev, Dumitru Erhan, US Patent 9,858,524, 2018.

Automatic translation of digital graphic novels, Greg Don Hartrell, Debajit Ghosh, Matthew William Vaughan-Vail, John Michael Rivlin, US Patent 9,881,003, 2018.

Sublinear time classification via feature padding and hashing, Sergey Ioffe, Alexander Toshev, US Patent 9,940,552, 2018.

Ranking approach to train deep neural nets for multilabel image annotation, Yunchao Gong, King Hong Thomas Leung, Alexander Toshev, Sergey Ioffe, US Patent 9,552,549, 2017.

Object detection using deep neural networks, Christian Szegedy, Dumitru Erhan, Alexander Toshev, US Patent 9,275,308, 2016.

System and method for using segmentation to identify object location in images, Vivek Kwatra, Jay Yagnik, Alexander Toshev, US Patent 9,483,701, 2016.

Object recognition, Alexander Toshev, King Hong Thomas Leung, Jiwoong Jack Sim, US Patent 8,942,468, 2015.

Perceptually-driven representation for object recognition, Alexander Toshev, Jay Yagnik, Vivek Kwatra, US Patent 9,008,356, 2015.

Discriminitive learning for object detection, Dragomir Anguelov, Alexander Toshkov Toshev, Deva K Ramanan, Xiangxin Zhu, US Patent 9,098,741, 2015.

System and method for exploiting segment co-occurrence relationships to identify object location in images, Vivek Kwatra, Jay Yagnik, Alexander Toshev, Poonam Suryanarayan, US Patent 8,768,048, 2014.

Segmentation-based feature pooling for object models, Alexander Toshev, Jay Yagnik, Vivek Kwatra, US Patent 8,467,607, 2013.


Dhruv Shah, Student at UC Berkeley, co-advised with Brian Ichter

Fei Xia, Robotics @ Google

Joe Campbell, Postdoc at CMU

Chengshu Li, Student at Stanford University

Kevin Chen, Apple

Fereshteh Sadeghi, DeepMind

Arsalan Mousavian, NVidia Robotics, co-advised with Jana Kosecka

Oana-Maria Camburu, Postdoc at Oxford University

Georgia Gkioxari, FAIR, co-advised with Navdeep Jaitly

Andre Araujo, Google, co-advised with Sergey Ioffe

Jonathan Krause, Google, co-advised with Howard Zhou

Kota Yamaguchi, Assist. Prof. at Tohoku University

Ling-Ling Tao, Facebook AI

Yunchao Gong, Verkada

Jack Sim, Waymo, co-advised with Thomas Leung