I am a Research Scientist at Google AI, leading research efforts in Computer Vision, Machine Learning, and more recently Robotics.
Most recently, I have been working on building perceptual capabilities for autonomous agents, initiating and leading an effort on Robot Navigation (Semantic Navigation, Social Navigation, Point Navigation, etc.), as part of the robotics research effort at Google.
Prior to that, I worked extensively on a wide range of computer vision problems. Notable achievements:
human pose estimation: first deep learning based approach, SOTA results over the years
object detection: first deep learning based approach, SOTA results ca. 2015, widely deployed at Google
language and computer vision: co-initiated a stream on language and vision in the computer vision community, one of the first works on neural image captioning
Symposium on Social Navigation Benchmarking, Feb 2022.
Area Chair, CVPR 2017, CVPR 2020, ECCV 2020, NIPS 2021, ECCV 2022
Program committee, CVPR, ICCV, ECCV, NIPS
Georgia Tech / Google Robotics Workshop, May 2021.
iGibson Sim2Real Challenge, Embodied AI Workshop, CVPR 2020.
Robot Learning Workshop, NSF & Lehigh University, 2019.
M. Ahn, A. Brohan, N. Brown, Y. Chebotar, O. Cortes, B. David, Ch. Finn, K. Gopalakrishnan, K. Hausman, A. Herzog, D. Ho, J. Hsu, J. Ibarz, B. Ichter, A. Irpan, E. Jang, R. Jauregui Ruano, K. Jeffrey, S. Jesmonth, N. J Joshi, R. Julian, D. Kalashnikov, Y. Kuang, K.-H. Lee, S. Levine, Y. Lu, L. Luu, C. Parada, P. Pastor, J. Quiambao, K. Rao, J. Rettinghouse, D. Reyes, P. Sermanet, N. Sievers, Cl. Tan, A. Toshev, V. Vanhoucke, F. Xia, T. Xiao, P. Xu, S. Xu, M. Yan, Do As I Can, Not As I Say: Grounding Language in Robotic Affordances, In Submission, 2022.
Haresh Karnan, Anirudh Nair, Xuesu Xiao, Garrett Warnell, Soeren Pirk, Alexander Toshev, Justin Hart, Joydeep Biswas, Peter Stone, Socially Compliant Navigation Dataset (SCAND): A Large-Scale Dataset of Demonstrations for Social Navigation, In Submission, 2022.
Soeren Pirk, Edward Lee, Xuesu Xiao, Anthony Francis, Leila Takayama, Alexander Toshev, A Protocol for Evaluating Social Navigation Policies, In Submission, 2022.
Dhruv Shah, Peng Xu, Yao Lu, Ted Xiao, Alexander Toshev, Sergey Levine, Brian Ichter, Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning, ICLR, 2022.
Dhruv Batra, Aaron Gokaslan, Aniruddha Kembhavi, Oleksandr Maksymets, Roozbeh Mottaghi, Manolis Savva, Alexander Toshev, Erik Wijmans, ObjectNav Revisited: On Evaluation of Embodied Agents Navigating to Objects, position paper, 2020.
Fei Xia, William Chen, Chengshu Li, Priya Kasimbeg, Micael Tchampi, Alexander Toshev, Roberto Martin-Martin, Silvio Savarese, Interactive Gibson: A Benchmark in Navigation in Cluttered Environments, RA-Letters, 2020
Kuan Fang, Alexander Toshev, Silvio Savarese, Li Fei-Fei, Scene Memory Transformer for Embodied Agents in Long Horizon Tasks, CVPR 2019.
Ayzaan Wahid, Alexander Toshev, Marek Fiser, Edward Lee, Long Range Neural Navigation Policies for the Real World, IROS 2019.
Arsalan Mousavian, Alexander Toshev, Marek Fiser, Jana Kosecka, James Davidson, Visual Representations for Semantic Target Driven Navigation, ICRA 2019.
Fereshteh Sadeghi, Alexander Toshev, Eric Jang, Sergey Levine, Sim2Real Viewpoint Invariant Visual Servoing by Recurrent Control, CVPR 2018.
Language and Vision
Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan, Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning Challenge, IEEE Transactions on PAMI, 2017.
Junhua Mao, Jonathan Huang, Alexander Toshev, Oana Camburu, Alan L Yuille, Kevin Murphy, Generation and Comprehension of Unambiguous Object Descriptions, CVPR 2016.
Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan, Show and tell: A neural image caption generator, CVPR 2015 (oral, 3100+ citations).
Human Pose Estimation
George Papandreou, Tyler Zhu, Nori Kanazawa, Alexander Toshev, Jonathan Tompson, Chris Bregler, Kevin Murphy, Towards accurate multi-person pose estimation in the wild, CVPR 2017 (best on human pose estimation on COCO).
Georgia Gkioxari, Alexander Toshev, Navdeep Jaitly, Chained Predictions Using Convolutional Neural Networks, ECCV 2016.
Alexander Toshev, Christian Szegedy, DeepPose: Human Pose Estimation via Deep Neural Networks, CVPR 2014 (oral, 1300+ citations).
Benjamin Sapp, Alexander Toshev, Ben Taskar, Cascaded Models for Articulated Pose Estimation, ECCV 2010.
Etienne Pot, Alexander Toshev, Jana Kosecka, Self-supervisory Signals for Object Discovery and Detection, 2018.
Dumitru Erhan, Christian Szegedy, Alexander Toshev, Dragomir Anguelov, Scalable Object Detection Using Deep Neural Networks, CVPR 2014 (700+ citations).
Christian Szegedy, Alexander Toshev, Dumitru Erhan, Deep Neural Networks for Object Detection, NIPS 2013 (800+ citations).
AJ Piergiovanni, Anelia Angelova, Alexander Toshev, Michael S Ryoo, Evolving Space-Time Neural Architectures for Videos, In Submission, 2019.
Yair Movshovitz-Attias, Alexander Toshev, Thomas K Leung, Sergey Ioffe, Saurabh Singh, No Fuss Distance Metric Learning via Proxies, ICCV 2017.
Jonathan Krause, Benjamin Sapp, Andrew Howard, Howard Zhou, Alexander Toshev, Tom Duerig, James Philbin, Li Fei-Fei, The Unreasonable Effectiveness of Noisy Data for Fine-Grained Recognition, ECCV 2016.
Yunchao Gong, Yangqing Jia, Thomas Leung, Alexander Toshev, Sergey Ioffe, Deep Convolutional Ranking for Multi-label Image Annotation, ICLR 2013
Alexander Toshev, Philippos Mordohai, Ben Taskar, Detecting and Parsing Architecture at City Scale from Range Data, CVPR 2010.
Alexander Toshev, Ben Taskar, Kostas Daniilidis, Object Detection via Boundary Structure Segmentation, CVPR 2010.
Alexander Toshev, Ameesh Makadia, Kostas Daniilidis, Shape-based object recognition in videos using 3D synthetic object models, CVPR 2009.
Alexander Toshev, Jianbo Shi, Kostas Daniilidis, Image Matching via Saliency Region Correspondences, CVPR 2007 (oral).
Alexander Toshev, Submodular Function Minimization, University of Pennsylvania, 2010.
Distance Metric Learning Using Proxies, Yair Movshovitz-Attias, Thomas Leung, Sergey Ioffe, Saurabh Singh, Alexander Toshev, US Patent 10,387,749, 2019.
Generating natural language descriptions of images, Samy Bengio, Oriol Vinyals, Alexander Toshev, Dumitru Erhan, US Patent 9,858,524, 2018.
Automatic translation of digital graphic novels, Greg Don Hartrell, Debajit Ghosh, Matthew William Vaughan-Vail, John Michael Rivlin, US Patent 9,881,003, 2018.
Sublinear time classification via feature padding and hashing, Sergey Ioffe, Alexander Toshev, US Patent 9,940,552, 2018.
Ranking approach to train deep neural nets for multilabel image annotation, Yunchao Gong, King Hong Thomas Leung, Alexander Toshev, Sergey Ioffe, US Patent 9,552,549, 2017.
Object detection using deep neural networks, Christian Szegedy, Dumitru Erhan, Alexander Toshev, US Patent 9,275,308, 2016.
System and method for using segmentation to identify object location in images, Vivek Kwatra, Jay Yagnik, Alexander Toshev, US Patent 9,483,701, 2016.
Object recognition, Alexander Toshev, King Hong Thomas Leung, Jiwoong Jack Sim, US Patent 8,942,468, 2015.
Perceptually-driven representation for object recognition, Alexander Toshev, Jay Yagnik, Vivek Kwatra, US Patent 9,008,356, 2015.
Discriminative learning for object detection, Dragomir Anguelov, Alexander Toshkov Toshev, Deva K Ramanan, Xiangxin Zhu, US Patent 9,098,741, 2015.
System and method for exploiting segment co-occurrence relationships to identify object location in images, Vivek Kwatra, Jay Yagnik, Alexander Toshev, Poonam Suryanarayan, US Patent 8,768,048, 2014.
Segmentation-based feature pooling for object models, Alexander Toshev, Jay Yagnik, Vivek Kwatra, US Patent 8,467,607, 2013.
Dhruv Shah, Student at UC Berkeley
Fei Xia, Robotics @ Google
Joe Campbell, Postdoc at CMU
Chengshu Li, Student at Stanford University
Kevin Chen, Apple
Fereshteh Sadeghi, DeepMind
Arsalan Mousavian, NVIDIA Robotics
Oana-Maria Camburu, Postdoc at Oxford University
Georgia Gkioxari, FAIR
Andre Araujo, Google
Jonathan Krause, Google
Kota Yamaguchi, Assist. Prof. at Tohoku University
Ling-Ling Tao, Facebook AI
Yunchao Gong, Verkada
Jack Sim, Waymo