SRI Vision and Learning

News


NSF Project on Understanding STEM Interactions chosen for Facilitators' Choice Showcase

https://stemforall2021.videohall.com/presentations#/winners/id=jc

https://stemforall2021.videohall.com/presentations/2147


SRI's Multimodal Content Recommendation Commercialized by SRI Spin-off Vitrina

https://medium.com/dish/vitrina-ai-the-future-of-video-licensing-transactions-a4874355ce03


SRI Food Recognition Technology Commercialized by SRI Spin-off Passio

https://blog.myfitnesspal.com/meal-scan/

SRI's Food Recognition Paper from 2015

https://pubmed.ncbi.nlm.nih.gov/25901024/



SRI's Driver Monitoring System wins AutoTech Breakthrough Award in the Sensor Systems category

https://www.sri.com/announcements/sri-selected-as-autotech-breaktrhough-award-winner/

https://autotechbreakthrough.com/2020-winners/


Recent Publications

Kamran Alipour, Arijit Ray, Xiao Lin, Michael Cogswell, Jurgen Schulze, Yi Yao, Giedrius Burachas, Improving Users' Mental Model with Attention-directed Counterfactual Edits, 2021 Applied AI Letters (Wiley), [pdf]


Arijit Ray, Michael Cogswell, Xiao Lin, Kamran Alipour, Ajay Divakaran, Yi Yao, Giedrius Burachas, Knowing What VQA Does Not: Pointing to Error-Inducing Regions to Improve Explanation Helpfulness, 2021 Applied AI Letters (Wiley), [pdf] [arXiv] [Project Page]


Sujeong Kim, Abhinav Garlapati, Jonah Lubin, Amir Tamrakar, Ajay Divakaran, Towards Understanding Confusion and Affective States Under Communication Failures in Voice-Based Human-Machine Interaction, 2021 9th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)


Sujeong Kim, Amir Tamrakar, "How to best say it?: Translating Directives in Machine Language into Natural Language in the Blocks World," arXiv, 2021

https://arxiv.org/abs/2107.06886

Pritish Sahu, Michael Cogswell, Sara Rutherford-Quach, Ajay Divakaran

Comprehension Based Question Answering using Bloom's Taxonomy, 6th Workshop on Representation Learning for NLP, 2021

https://arxiv.org/abs/2106.04653


Dual-Key Multimodal Backdoors for Visual Question Answering, arXiv (TrojAI)

Towards Solving Multimodal Comprehension, arXiv

MISA: Online Defense of Trojaned Models using Misattributions, arXiv (TrojAI)

A. Som, S. Kim, B. Lopez-Prado, S. Dhamija, N. Alozie, A. Tamrakar. "Automated Student Group Collaboration Assessment and Recommendation System Using Individual Role and Behavioral Cues". Frontiers in Computer Science, 2021.

A. Som, S. Kim, B. Lopez-Prado, S. Dhamija, N. Alozie, A. Tamrakar. "Towards Explainable Student Group Collaboration Assessment Models Using Temporal Representations of Individual Student Roles". Educational Data Mining (EDM) Conference, 2021.

A. Som, S. Kim, B. Lopez-Prado, S. Dhamija, N. Alozie, A. Tamrakar. "A Machine Learning Approach to Assess Student Group Collaboration Using Individual Level Behavioral Cues". European Conference on Computer Vision Workshops (ECCVW), 2020.


WACV 2022

Challenges in Procedural Multimodal Machine Comprehension: A Novel Way

Pritish Sahu, Karan Sikka, Ajay Divakaran

https://openaccess.thecvf.com/content/WACV2022/html/Sahu_Challenges_in_Procedural_Multimodal_Machine_Comprehension_A_Novel_Way_To_WACV_2022_paper.html

Pritish Sahu, Karan Sikka, Ajay Divakaran

Towards Multimodal Comprehension

Abstract

PDF

ICCV 2021

Yunye Gong, Xiao Lin, Yi Yao, Thomas G. Dietterich, Ajay Divakaran, Melinda Gervasio

Confidence Calibration for Domain Generalization under Covariate Shift (ICCV 2021)

Abstract

PDF

https://openaccess.thecvf.com/content/ICCV2021/html/Gong_Confidence_Calibration_for_Domain_Generalization_Under_Covariate_Shift_ICCV_2021_paper.html



Xiao Lin, Meng Ye, Yunye Gong, Giedrius Burachas, Nikoletta Basiou, Ajay Divakaran, Yi Yao

Modular Adaptation for Cross-Domain Few-Shot Learning

Abstract

PDF



Karan Sikka, Indranil Sur, Susmit Jha, Anirban Roy, Ajay Divakaran,

Detecting Trojaned DNNs Using Counterfactual Attributions, arXiv:2012.02275

PDF

Jihua Huang, Amir Tamrakar, "ACE-Net: Fine-Level Face Alignment through Anchors and Contours Estimation," arXiv:2012.01461

PDF

Karan Sikka, Jihua Huang, Andrew Silberfarb, Prateeth Nayak, Luke Rohrer, Pritish Sahu, John Byrnes, Ajay Divakaran, Richard Rohwer, "Zero-Shot Learning with Knowledge Enhanced Visual Semantic Embeddings," arXiv:2011.10889

PDF


Meng Ye, Xiao Lin, Giedrius Burachas, Ajay Divakaran, Yi Yao, "Hybrid Consistency Training with Prototype Adaptation for Few-Shot Learning," arXiv submission, November 2020

https://arxiv.org/abs/2011.10082

PDF


ACM MM 2020

RGB2LIDAR: Towards Solving Large-Scale Cross-Modal Visual Localization

https://arxiv.org/abs/2009.05695

Best paper candidate at ACM MM 2020.

HAI 2020

Sujeong Kim, David Salter, Luke DeLuccia, Amir Tamrakar. 2020. Study on Text-based and Voice-based Dialogue Interfaces for Human-Computer Interactions in a Blocks World. In Proceedings of the 8th International Conference on Human-Agent Interaction (HAI ’20), November 10–13, 2020, Virtual Event, NSW, Australia. ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/3406499.3418754


A Machine Learning Approach to Assess Student Group Collaboration Using Individual Level Behavioral Cues

Anirudh Som, Sujeong Kim, Bladmir Lopez-Prado, Svati Dhamija, Nonye Alozie, Amir Tamrakar, European Conference on Computer Vision (ECCV) Workshops, 2020

Paper

Amir Tamrakar


News

SRI's Driver Monitoring System wins AutoTech Breakthrough Award in the Sensor Systems category

https://www.sri.com/announcements/sri-selected-as-autotech-breaktrhough-award-winner/

https://autotechbreakthrough.com/2020-winners/


https://scholar.google.com/citations?user=nBUpZ-EAAAAJ&hl=en

https://www.forbes.com/sites/stevetengler/2020/07/14/you-scared-bro-maybe-your-autonomous-car-should-ease-your-fears/#6de5f9c3d693

Recent Publications

Jihua Huang, Amir Tamrakar, "ACE-Net: Fine-Level Face Alignment through Anchors and Contours Estimation," arXiv:2012.01461

PDF

HAI 2020

Sujeong Kim, David Salter, Luke DeLuccia, Amir Tamrakar. 2020. Study on Text-based and Voice-based Dialogue Interfaces for Human-Computer Interactions in a Blocks World. In Proceedings of the 8th International Conference on Human-Agent Interaction (HAI ’20), November 10–13, 2020, Virtual Event, NSW, Australia. ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/3406499.3418754



A Machine Learning Approach to Assess Student Group Collaboration Using Individual Level Behavioral Cues

Anirudh Som, Sujeong Kim, Bladmir Lopez-Prado, Svati Dhamija, Nonye Alozie, Amir Tamrakar, European Conference on Computer Vision (ECCV) Workshops, 2020

Paper

Alozie, N., Dhamija, S., McBride, E., & Tamrakar, A. (2020, June). Automated collaboration assessment using behavioral analytics. International Conference of the Learning Sciences (ICLS), Nashville, TN

Alozie, N., McBride, E., Dhamija, S., & Tamrakar, A. (2020, April). Collaboration Conceptual Model to Inform the Development of Machine Learning Models Using Behavioral Analytics. San Francisco, CA: American Education Research Association (AERA)

Incorporating Conversational AI in Automotive Applications

https://www.youtube.com/watch?v=bKJideEmyss


SMILEE developed under DARPA CwC

http://aclweb.org/anthology/N18-5018

This is the short, 2-minute version made for the actual submission: https://youtu.be/2iM5t7cpua0

This is the director’s cut (the longer version): https://youtu.be/hfE7j7PRWro


Aesop developed under DARPA CwC by Mohamed Amer

Aesop human-computer storytelling

https://youtu.be/A461L17s2f0


Yi Yao

https://scholar.google.com/citations?user=iD6QaXcAAAAJ&hl=en


ICLR 2020 Workshop on Integration of Deep Neural Models and Differential Equations.

Progressive Growing of Neural ODEs

http://arxiv.org/abs/2003.03695

WACV 2019

Pallabi Ghosh, Yi Yao, Larry S. Davis, Ajay Divakaran:

Stacked Spatio-Temporal Graph Convolutional Networks for Action Segmentation.

https://arxiv.org/pdf/1811.10575v1.pdf

WACV poster presentation at 24:25

https://www.youtube.com/watch?v=zZDhauFsOUo

NeurIPS 2019 (Work done by Mohamed Amer under DARPA XAI)

https://arxiv.org/abs/1905.02850



Karan Sikka

http://ksikka.com/


Dual-Key Multimodal Backdoors for Visual Question Answering, arXiv (TrojAI)

Towards Solving Multimodal Comprehension, arXiv

MISA: Online Defense of Trojaned Models using Misattributions, arXiv (TrojAI)


SRI's Multimodal Content Recommendation Commercialized by SRI Spin-off Vitrina

https://medium.com/dish/vitrina-ai-the-future-of-video-licensing-transactions-a4874355ce03


Karan Sikka talks about embedding knowledge in machine learning models

https://www.youtube.com/watch?v=YPrXavrlqzs


Xiao Lin

Xiao Lin talks about artificial intelligence and brain-to-brain data transfer

https://www.youtube.com/watch?v=NMDEs1DybXU

https://scholar.google.com/citations?user=zSIbUH4AAAAJ&hl=en

https://filebox.ece.vt.edu/~linxiao/


Anirban Roy

https://scholar.google.com/citations?user=N9eSuR4AAAAJ&hl=en


Jesse Hostetler

4th Lifelong Machine Learning Workshop at ICML 2020

Raghavan, A., Hostetler, J., Sur, I., Rahman, A., & Divakaran, A. (2020). Lifelong Learning using Eigentasks: Task Separation, Skill Acquisition, and Selective Transfer. 4th Lifelong Machine Learning Workshop, Proceedings of the 37th International Conference on Machine Learning (ICML), PMLR, 8.

Paper link:

https://openreview.net/pdf?id=SD7m4B3kGiQ

Video for the paper:

https://www.youtube.com/watch?v=IsO2Yz4z43Q

Jesse Hostetler talks about Lifelong Learning and getting Robots to dream

https://www.youtube.com/watch?v=S5Co1T_uuDE

https://jhostetler.github.io/

https://scholar.google.com/citations?user=ngJLn9EAAAAJ&hl=en


Meng Ye

Meng Ye, Xiao Lin, Giedrius Burachas, Ajay Divakaran, Yi Yao, "Hybrid Consistency Training with Prototype Adaptation for Few-Shot Learning," arXiv submission, November 2020

https://arxiv.org/abs/2011.10082

PDF


https://scholar.google.com/citations?user=YMeDRE8AAAAJ&hl=en

https://sites.google.com/site/mengye1225/


Yunye Gong

Yunye Gong talks about applying physics-based models to deep learning

https://www.youtube.com/watch?v=lxdp6d_Ih94&t=318s


https://doerschuklab.bme.cornell.edu/people/yunye-gong/

https://www.linkedin.com/in/yunye-gong-192b5629

https://dblp.uni-trier.de/pers/hd/g/Gong:Yunye




Jihua Huang

Jihua Huang, Amir Tamrakar, "ACE-Net: Fine-Level Face Alignment through Anchors and Contours Estimation," arXiv:2012.01461

PDF

Karan Sikka, Jihua Huang, Andrew Silberfarb, Prateeth Nayak, Luke Rohrer, Pritish Sahu, John Byrnes, Ajay Divakaran, Richard Rohwer, "Zero-Shot Learning with Knowledge Enhanced Visual Semantic Embeddings," arXiv:2011.10889

PDF



Arijit Ray

https://arijitray1993.github.io/

https://filebox.ece.vt.edu/~ray93/

  • Arijit Ray, Karan Sikka, Ajay Divakaran, Stefan Lee, Giedrius Burachas, Sunny and Dark Outside?! Improving Answer Consistency in VQA through Entailed Question Generation, 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP 2019), also at CVPR-W 2019 VQA and Visual Dialog Workshop, [arXiv] [bibTex] [Data]

  • Arijit Ray, Yi Yao, Rakesh Kumar, Ajay Divakaran, Giedrius Burachas, Can You Explain That: Lucid Explanations Help Human-AI Collaborative Image Retrieval, 2019 AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2019)


Julia Kruk


https://www.juliakruk.com


Julia Kruk talks about the evolution of human communication through social media

https://www.youtube.com/watch?v=1yQ3-fN9HV4


EMNLP 2019

Multi-modal Document Intent in Instagram Posts

https://arxiv.org/abs/1904.09073

Demo Video

https://youtu.be/kwx3dquSQ7M
