Order based on talk time
Title: NeuroVIO: A Spiking-Hybrid Neuromorphic Computing Framework for Multimodal Pose Estimation
Abstract: This talk is about bio-inspired computing using spiking neural networks for autonomous navigation. The talk presents NeuroVIO, a hybrid end-to-end architecture that integrates conventional and spiking neural networks for multimodal visual-inertial odometry in underwater mobile robots. NeuroVIO addresses the need for energy-efficient and accurate pose estimation in underwater mobile robots. In our approach, a CNN backbone extracts visual features from successive frames and converts them into time-encoded sequences, which are processed by adaptive leaky-integrate-and-fire neurons with learnable thresholds. Concurrently, inertial measurements are encoded via an SNN feature extractor. Fused features pass through an LSTM to capture temporal dependencies, and a spiking regression head predicts the six-dimensional pose vector. Evaluated on the AQUALOC dataset, NeuroVIO’s adaptive-threshold variant reduces energy consumption by 80.4% relative to its non-spiking counterpart while preserving pose estimation accuracy. The experimental results demonstrate that integrating neuromorphic paradigms into resource-limited marine robotics platforms enhances the autonomy of underwater robots in exploration tasks.
Bio: Jorge Dias, PhD (1994), Dr Habil (2011). Jorge Dias is a Professor at Khalifa University in Abu Dhabi, UAE, and Deputy Director of the Center for Autonomous Robotic Systems, where he leads research and academic activities in Autonomous Robotic Systems. He holds a Habilitation/Doctor of Science (Agregação) and a Ph.D. in Electrical Engineering from the University of Coimbra, Portugal. His expertise lies in the areas of Artificial Perception, Computer Vision, and Robotic Vision, with significant contributions to the field since 1984. He was a pioneer in integrating multi-sensing and multimodal aspects into Artificial Perception, notably in Visual-Inertial sensing [2006, 2007] and Visual-Auditory sensing [2008]. His remarkable contributions include the synthesis of cognitive models in robots, utilizing a Bayesian framework for multimodal artificial perception, with key applications in active attention [2008], navigation [2005], and multi-robot cooperation [2009]. His seminal book, “Probabilistic Approaches for Robotic Perception” [2014], encapsulates his pioneering research on Bayesian techniques in computational intelligence. His most recent research delves into Neuromorphic Computing for Artificial Intelligence and Artificial Perception [2023, 2024], pushing forward the boundaries of science in these fields. Jorge has been the Principal Investigator for several international research projects, primarily in Europe and the Middle East, focusing on Artificial Perception for Autonomous Robotic Systems. Professor Dias’s scholarly contributions include over 370 publications across international journals, books, and conference proceedings, establishing him as an active researcher in the fields of Robotics and Artificial Perception.
Jorge Dias also served as Vice-President of Instituto Pedro Nunes (IPN), a technology transfer institution associated with the University of Coimbra, from June 2008 to June 2011. He was the founding Director of the Laboratory of Systems and Automation at Instituto Pedro Nunes (IPN), Portugal.
Title: Where Perception Isn't Enough: Anticipation, Guidance, and Reasoning Over Video
Abstract: TBD
Bio: Lorenzo Torresani is a professor and the President Joseph E. Aoun Chair in the Khoury College of Computer Sciences at Northeastern University, based in Boston. Upon joining Khoury College in 2025, Torresani intends to build a research lab focusing on the creation of “perceptual AI assistants”: AI agents that use wearable cameras to observe, understand, and assist humans in daily tasks. He is fascinated by the way that vision is effortless for humans yet challenging for machines, and he wants to create multimodal video understanding systems that go beyond recognizing user actions to discern how an activity is being performed. He teaches courses on deep learning, AI, and computer vision. Torresani’s research has been widely recognized, including with a National Science Foundation CAREER Award, a Google Faculty Research Award, three Facebook Faculty Awards, and a Fulbright US Scholar Award. He has published 80 conference papers, including one that has been cited more than 10,000 times, and holds 11 patents. Before joining Northeastern University, Torresani served as a Research Director at Meta, designing and open-sourcing influential models and benchmarks for video understanding. He also spent more than a decade as a professor of computer science at Dartmouth College.
Title: Disentangling Appearance, Geometry, Motion, and Location for Video Modeling, Understanding, and Generation
Abstract: A video is more than a sequence of images. It captures how the world evolves through motion, geometry, viewpoint, and spatial context over time. Unlike a single snapshot, video encodes not only what a scene looks like, but also where it is, how it changes, and how observers move through it. These cues are fundamental to applications such as visual odometry, localization, scene understanding, autonomous navigation, and video generation. Rather than treating video simply as a 3D block of pixels, this talk presents a disentangled representation perspective for video modeling, where appearance, geometry, motion, semantics, and location-related cues are explicitly separated and jointly modeled. By disentangling camera motion from object motion, and scene geometry from appearance variation, we can learn representations that are more interpretable, controllable, and robust across viewpoints, environments, and temporal dynamics. Such representations naturally bridge modern generative video models with classical geometric vision problems, including ego-motion estimation, trajectory reasoning, spatially grounded forecasting, and location-aware scene understanding. I will discuss recent advances in learning structured video representations for realistic video generation, motion forecasting, controllable visual editing, and geometry-aware video modeling, highlighting how these ideas can benefit location-driven vision applications.
Bio: Dr. Junsong Yuan is Professor and Director of the Visual Computing Lab at the Department of Computer Science and Engineering (CSE), State University of New York (SUNY) at Buffalo, USA. Before joining SUNY Buffalo, he was Associate Professor (2015-2018) and Nanyang Assistant Professor (2009-2015) at Nanyang Technological University (NTU), Singapore. He obtained his Ph.D. from Northwestern University, M.Eng. from National University of Singapore, and B.Eng. from Huazhong University of Science and Technology. He received the Chancellor's Award for Excellence in Scholarship and Creative Activities from SUNY, the Faculty Innovation Award from SONY, the Nanyang Assistant Professorship from NTU, the Outstanding EECS Ph.D. Thesis award from Northwestern University, the Best Paper Award from IEEE Trans. on Multimedia, and was named Distinguished Lecturer by the IEEE Signal Processing Society. He is currently Editor-in-Chief of Journal of Visual Communication and Image Representation (JVCI), and serves as Associate Editor of IEEE Trans. on Pattern Analysis and Machine Intelligence (T-PAMI), IEEE Trans. on Image Processing (T-IP), IEEE Trans. on Circuits and Systems for Video Technology (T-CSVT), Computer Vision and Image Understanding (CVIU), and Machine Vision and Applications (MVA). He also serves as General/Program Co-chair of ICASSP/ICME/ICIP and Area Chair for CVPR, ICCV, ECCV, ICML, AAAI, NeurIPS, ACM MM, etc. He is a Fellow of IEEE and IAPR, and Distinguished Member of ACM.
Title: TBD
Abstract: TBD
Bio: Jose M. Alvarez is a research director at NVIDIA, leading the Autonomous Vehicle Applied Research team. The team maximizes the impact of the latest research advances on the AV product. Research areas include model-centric and data-centric deep learning toward more efficient and scalable systems. Jose completed his Ph.D. in computer science in Barcelona, specializing in road-scene understanding for autonomous driving when datasets were very limited. He also worked as a postdoctoral researcher at NYU under Yann LeCun.
Title: From Geometry to Semantics: Toward Efficient Autonomous Navigation in the Wild
Abstract: This talk presents a unified perspective on the emerging concept of spatial intelligence for embodied autonomous systems. We discuss recent progress in integrating geometric perception, semantic understanding, domain adaptation, traversability prediction, and visual navigation into cohesive decision-making frameworks. The presentation highlights how domain shifts across environments and sensing conditions fundamentally challenge existing perception systems, motivating new approaches for uncertainty-aware and context-aware spatial reasoning. Through examples spanning off-road navigation, forest robotics, and real-world field deployment, the talk demonstrates how combining semantic and geometric reasoning can enable more trustworthy autonomous behavior. We further explore lightweight image-space planning and Pareto-optimal visual navigation strategies that allow robots to navigate complex environments without heavy reliance on explicit mapping.
Bio: Dr. Lantao Liu is an Associate Professor and Founding Director of Robotics in the Luddy School of Informatics, Computing, and Engineering at Indiana University Bloomington. He directs research in autonomous robotics, robot learning, embodied AI, and field robotics applications for autonomous systems operating across air, ground, and marine environments. His work spans both single-robot and multi-robot systems, with applications in smart transportation, high-speed autonomous racing, remote sensing, infrastructure inspection, environmental monitoring, and search-and-rescue operations. At Indiana University, Dr. Liu led the development of the university’s cross-school Robotics B.S. degree program, helping establish a multidisciplinary robotics education and research ecosystem spanning computer science, AI, and systems engineering. Prior to joining Indiana University, he was a Postdoctoral Research Associate at the University of Southern California (2015–2017) and a Postdoctoral Fellow at the Robotics Institute at Carnegie Mellon University (2013–2015). He received his Ph.D. in Computer Science and Engineering from Texas A&M University in 2013.