SRI Vision and Learning
News
NSF Project on Understanding STEM Interactions chosen for Facilitators' Choice Showcase
https://stemforall2021.videohall.com/presentations#/winners/id=jc
https://stemforall2021.videohall.com/presentations/2147
SRI's Multimodal Content Recommendation Commercialized by SRI Spin-off Vitrina
https://medium.com/dish/vitrina-ai-the-future-of-video-licensing-transactions-a4874355ce03
SRI Food Recognition Technology Commercialized by SRI Spin-off Passio
https://blog.myfitnesspal.com/meal-scan/
SRI's Food Recognition Paper from 2015
https://pubmed.ncbi.nlm.nih.gov/25901024/
SRI's Driver Monitoring System wins AutoTech Breakthrough Award in the Sensor Systems category
https://www.sri.com/announcements/sri-selected-as-autotech-breaktrhough-award-winner/
https://autotechbreakthrough.com/2020-winners/
Recent Publications
Kamran Alipour, Arijit Ray, Xiao Lin, Michael Cogswell, Jurgen Schulze, Yi Yao, Giedrius Burachas, Improving Users' Mental Model with Attention-directed Counterfactual Edits, 2021 Applied AI Letters (Wiley) [pdf]
Arijit Ray, Michael Cogswell, Xiao Lin, Kamran Alipour, Ajay Divakaran, Yi Yao, Giedrius Burachas, Knowing What VQA Does Not: Pointing to Error-Inducing Regions to Improve Explanation Helpfulness, 2021 Applied AI Letters (Wiley), [pdf] [arXiv] [Project Page]
Sujeong Kim, Abhinav Garlapati, Jonah Lubin, Amir Tamrakar, Ajay Divakaran, Towards Understanding Confusion and Affective States Under Communication Failures in Voice-Based Human-Machine Interaction, 2021 9th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)
Sujeong Kim, Amir Tamrakar, "How to best say it?: Translating Directives in Machine Language into Natural Language in the Blocks World," arXiv 2021
https://arxiv.org/abs/2107.06886
Pritish Sahu, Michael Cogswell, Sara Rutherford-Quach, Ajay Divakaran
Comprehension Based Question Answering using Bloom's Taxonomy, 6th Workshop on Representation Learning for NLP, 2021
https://arxiv.org/abs/2106.04653
Dual-Key Multimodal Backdoors for Visual Question Answering, arXiv (TrojAI)
The success of deep learning has enabled advances in multimodal tasks that require non-trivial fusion of multiple input domains. Although multimodal models have shown potential in many problems,...
Towards Solving Multimodal Comprehension, arXiv
This paper targets the problem of procedural multimodal machine comprehension (M3C). This task requires an AI to comprehend given steps of multimodal instructions and then answer questions....
MISA: Online Defense of Trojaned Models using Misattributions, arXiv (TrojAI)
Recent studies have shown that neural networks are vulnerable to Trojan attacks, where a network is trained to respond to specially crafted trigger patterns in the inputs in specific and...
A. Som, S. Kim, B. Lopez-Prado, S. Dhamija, N. Alozie, A. Tamrakar. "Automated Student Group Collaboration Assessment and Recommendation System Using Individual Role and Behavioral Cues". Frontiers in Computer Science, 2021.
A. Som, S. Kim, B. Lopez-Prado, S. Dhamija, N. Alozie, A. Tamrakar. "Towards Explainable Student Group Collaboration Assessment Models Using Temporal Representations of Individual Student Roles". Educational Data Mining (EDM) Conference, 2021.
A. Som, S. Kim, B. Lopez-Prado, S. Dhamija, N. Alozie, A. Tamrakar. "A Machine Learning Approach to Assess Student Group Collaboration Using Individual Level Behavioral Cues". European Conference on Computer Vision Workshops (ECCVW), 2020.
WACV 2022
Challenges in Procedural Multimodal Machine Comprehension: A Novel Way
Pritish Sahu, Karan Sikka, Ajay Divakaran
https://openaccess.thecvf.com/content/WACV2022/html/Sahu_Challenges_in_Procedural_Multimodal_Machine_Comprehension_A_Novel_Way_To_WACV_2022_paper.html
Pritish Sahu, Karan Sikka, Ajay Divakaran
Towards Multimodal Comprehension
ICCV 2021
Yunye Gong, Xiao Lin, Yi Yao, Thomas G. Dietterich, Ajay Divakaran, Melinda Gervasio
Confidence Calibration for Cross-Domain Generalization under Covariate Shift (To appear at ICCV 2021)
Xiao Lin, Meng Ye, Yunye Gong, Giedrius Buracas, Nikoletta Basiou, Ajay Divakaran, Yi Yao
Modular Adaptation for Cross-Domain Few-Shot Learning
Arijit Ray, Michael Cogswell, Xiao Lin, Kamran Alipour, Ajay Divakaran, Yi Yao, Giedrius Burachas
Knowing What VQA Does Not: Pointing to Error-Inducing Regions to Improve Explanation Helpfulness
Karan Sikka, Indranil Sur, Susmit Jha, Anirban Roy, Ajay Divakaran
Detecting Trojaned DNNs Using Counterfactual Attributions, arXiv:2012.02275
Jihua Huang, Amir Tamrakar, "ACE-Net: Fine-Level Face Alignment through Anchors and Contours Estimation"
Karan Sikka, Jihua Huang, Andrew Silberfarb, Prateeth Nayak, Luke Rohrer, Pritish Sahu, John Byrnes, Ajay Divakaran, Richard Rohwer, "Zero-Shot Learning with Knowledge Enhanced Visual Semantic Embeddings," arXiv:2011.10889
Meng Ye, Xiao Lin, Giedrius Burachas, Ajay Divakaran, Yi Yao, "Hybrid Consistency Training with Prototype Adaptation for Few-Shot Learning," arXiv submission, November 2020
https://arxiv.org/abs/2011.10082
ACM MM 2020
RGB2LIDAR: Towards Solving Large-Scale Cross-Modal Visual Localization
https://arxiv.org/abs/2009.05695
Best paper candidate at ACM MM 2020.
HAI 2020
Sujeong Kim, David Salter, Luke DeLuccia, Amir Tamrakar. 2020. Study on Text-based and Voice-based Dialogue Interfaces for Human-Computer Interactions in a Blocks World. In Proceedings of the 8th International Conference on Human-Agent Interaction (HAI ’20), November 10–13, 2020, Virtual Event, NSW, Australia. ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/3406499.3418754
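Anirudh Som, Sujeong Kim, Bladmir Lopez-Prado, Svati Dhamija, Nonye Alozie, Amir Tamrakar, "A Machine Learning Approach to Assess Student Group Collaboration Using Individual Level Behavioral Cues," European Conference on Computer Vision (ECCV) Workshops, 2020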
Amir Tamrakar
News
SRI's Driver Monitoring System wins AutoTech Breakthrough Award in the Sensor Systems category
https://www.sri.com/announcements/sri-selected-as-autotech-breaktrhough-award-winner/
https://autotechbreakthrough.com/2020-winners/
https://scholar.google.com/citations?user=nBUpZ-EAAAAJ&hl=en
Recent Publications
Jihua Huang, Amir Tamrakar, "ACE-Net: Fine-Level Face Alignment through Anchors and Contours Estimation"
HAI 2020
Sujeong Kim, David Salter, Luke DeLuccia, Amir Tamrakar. 2020. Study on Text-based and Voice-based Dialogue Interfaces for Human-Computer Interactions in a Blocks World. In Proceedings of the 8th International Conference on Human-Agent Interaction (HAI ’20), November 10–13, 2020, Virtual Event, NSW, Australia. ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/3406499.3418754
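Anirudh Som, Sujeong Kim, Bladmir Lopez-Prado, Svati Dhamija, Nonye Alozie, Amir Tamrakar, "A Machine Learning Approach to Assess Student Group Collaboration Using Individual Level Behavioral Cues," European Conference on Computer Vision (ECCV) Workshops, 2020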
Alozie, N., Dhamija, S., McBride, E., & Tamrakar, A. (2020, June). Automated collaboration assessment using behavioral analytics. International Conference of the Learning Sciences (ICLS), Nashville, TN
Alozie, N., McBride, E., Dhamija, S., & Tamrakar, A. (2020, April). Collaboration Conceptual Model to Inform the Development of Machine Learning Models Using Behavioral Analytics. San Francisco, CA: American Education Research Association (AERA)
Incorporating Conversational AI in Automotive Applications
https://www.youtube.com/watch?v=bKJideEmyss
SMILEE developed under DARPA CwC
http://aclweb.org/anthology/N18-5018
Two-minute short version (the actual submission): https://youtu.be/2iM5t7cpua0
Director's cut (longer version): https://youtu.be/hfE7j7PRWro
Aesop developed under DARPA CwC by Mohamed Amer
Aesop human-computer storytelling
Yi Yao
https://scholar.google.com/citations?user=iD6QaXcAAAAJ&hl=en
ICLR 2020 Workshop on Integration of Deep Neural Models and Differential Equations.
Progressive Growing of Neural ODEs
http://arxiv.org/abs/2003.03695
WACV 2019
Pallabi Ghosh, Yi Yao, Larry S. Davis, Ajay Divakaran:
Stacked Spatio-Temporal Graph Convolutional Networks for Action Segmentation.
https://arxiv.org/pdf/1811.10575v1.pdf
WACV poster presentation at 24:25
https://www.youtube.com/watch?v=zZDhauFsOUo
NeurIPS 2019 (Work done by Mohamed Amer under DARPA XAI)
https://arxiv.org/abs/1905.02850
Karan Sikka
Dual-Key Multimodal Backdoors for Visual Question Answering, arXiv (TrojAI)
The success of deep learning has enabled advances in multimodal tasks that require non-trivial fusion of multiple input domains. Although multimodal models have shown potential in many problems,...
Towards Solving Multimodal Comprehension, arXiv
This paper targets the problem of procedural multimodal machine comprehension (M3C). This task requires an AI to comprehend given steps of multimodal instructions and then answer questions....
MISA: Online Defense of Trojaned Models using Misattributions, arXiv (TrojAI)
Recent studies have shown that neural networks are vulnerable to Trojan attacks, where a network is trained to respond to specially crafted trigger patterns in the inputs in specific and...
SRI's Multimodal Content Recommendation Commercialized by SRI Spin-off Vitrina
https://medium.com/dish/vitrina-ai-the-future-of-video-licensing-transactions-a4874355ce03
Karan Sikka talks about embedding knowledge in machine learning models
https://www.youtube.com/watch?v=YPrXavrlqzs
Xiao Lin
Xiao Lin talks about artificial intelligence and brain-to-brain data transfer
https://www.youtube.com/watch?v=NMDEs1DybXU
https://scholar.google.com/citations?user=zSIbUH4AAAAJ&hl=en
https://filebox.ece.vt.edu/~linxiao/
Anirban Roy
https://scholar.google.com/citations?user=N9eSuR4AAAAJ&hl=en
Jesse Hostetler
4th Lifelong Machine Learning Workshop at ICML 2020
Raghavan, A., Hostetler, J., Sur, I., Rahman, A., & Divakaran, A. (2020). Lifelong Learning using Eigentasks: Task Separation, Skill Acquisition, and Selective Transfer. 4th Lifelong Machine Learning Workshop, Proceedings of the 37th International Conference on Machine Learning (ICML), PMLR, 8.
Paper link:
https://openreview.net/pdf?id=SD7m4B3kGiQ
Video for the paper:
https://www.youtube.com/watch?v=IsO2Yz4z43Q
Jesse Hostetler talks about Lifelong Learning and getting Robots to dream
https://www.youtube.com/watch?v=S5Co1T_uuDE
https://jhostetler.github.io/
https://scholar.google.com/citations?user=ngJLn9EAAAAJ&hl=en
Meng Ye
Meng Ye, Xiao Lin, Giedrius Burachas, Ajay Divakaran, Yi Yao, "Hybrid Consistency Training with Prototype Adaptation for Few-Shot Learning," arXiv submission, November 2020
https://arxiv.org/abs/2011.10082
https://scholar.google.com/citations?user=YMeDRE8AAAAJ&hl=en
https://sites.google.com/site/mengye1225/
Yunye Gong
Yunye Gong talks about applying physics-based models to deep learning
https://www.youtube.com/watch?v=lxdp6d_Ih94&t=318s
https://doerschuklab.bme.cornell.edu/people/yunye-gong/
https://www.linkedin.com/in/yunye-gong-192b5629
https://dblp.uni-trier.de/pers/hd/g/Gong:Yunye
Jihua Huang
Jihua Huang, Amir Tamrakar, "ACE-Net: Fine-Level Face Alignment through Anchors and Contours Estimation"
Karan Sikka, Jihua Huang, Andrew Silberfarb, Prateeth Nayak, Luke Rohrer, Pritish Sahu, John Byrnes, Ajay Divakaran, Richard Rohwer, "Zero-Shot Learning with Knowledge Enhanced Visual Semantic Embeddings," arXiv:2011.10889
Arijit Ray
https://arijitray1993.github.io/
https://filebox.ece.vt.edu/~ray93/
Arijit Ray, Karan Sikka, Ajay Divakaran, Stefan Lee, Giedrius Burachas, Sunny and Dark Outside?! Improving Answer Consistency in VQA through Entailed Question Generation, 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP 2019), also at CVPR-W 2019 VQA and Visual Dialog Workshop, [arXiv], [bibTex] [Data]
Arijit Ray, Yi Yao, Rakesh Kumar, Ajay Divakaran, Giedrius Burachas, Can You Explain That: Lucid Explanations Help Human-AI Collaborative Image Retrieval, 2019 AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2019)
Julia Kruk
Julia Kruk talks about the evolution of human communication through social media
https://www.youtube.com/watch?v=1yQ3-fN9HV4
EMNLP 2019
Multi-modal Document Intent in Instagram Posts
https://arxiv.org/abs/1904.09073
Demo Video