Publications

Google Scholar

When to Think Fast and Slow? AMOR: Adaptive Entropy Gate for Hybrid Models

AMOR is a hybrid recurrent-attention architecture that uses predictive uncertainty to decide when attention is needed. By applying attention only to difficult tokens, it improves efficiency, robustness, and long-context reasoning, and it matches or outperforms existing hybrid models while applying attention to only ~22% of tokens.

Paper

Code

H. Zheng & C. Shani (ArXiv, 2026)
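
The gating idea in the AMOR summary above can be illustrated with a minimal sketch. It assumes the gate compares the Shannon entropy of each token's predictive distribution to a fixed threshold; the function names, the threshold value, and the toy distributions are hypothetical, and the paper's gate is described as adaptive, so this is not the authors' implementation.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def attention_mask(token_probs, threshold):
    """Mark tokens whose predictive entropy exceeds the threshold.

    Assumption: a fixed threshold stands in for the adaptive gate.
    """
    return [entropy(p) > threshold for p in token_probs]


# Toy next-token distributions from a hypothetical recurrent backbone.
token_probs = [
    [0.97, 0.01, 0.01, 0.01],  # confident prediction
    [0.25, 0.25, 0.25, 0.25],  # maximally uncertain prediction
    [0.70, 0.10, 0.10, 0.10],  # moderately uncertain prediction
    [0.90, 0.05, 0.03, 0.02],  # confident prediction
]

mask = attention_mask(token_probs, threshold=1.0)
fraction = sum(mask) / len(mask)
print(mask, fraction)
```

In this toy setup only the uniform distribution crosses the threshold, so attention would be spent on one token in four while the remaining tokens stay on the recurrent path.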

Learning Concepts, Not Tokens: Self-Supervised Semantic Alignment for Language Models

Inspired by how humans reason in terms of concepts rather than exact words, we introduce a self-supervised training framework that teaches language models to predict semantically equivalent token groups instead of single tokens. This concept-based supervision improves alignment with human semantic judgments and strengthens downstream reasoning and representation quality while preserving strong language modeling performance.

Paper

Code

C. Zhang, D. Jurafsky & C. Shani (ArXiv, 2026)
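
One way to picture the shift from token-level to concept-level supervision described above is to score the probability mass the model places on a whole group of equivalent tokens. The sketch below assumes a summed-probability negative log-likelihood; the vocabulary, the probabilities, and the concept group are invented for illustration and are not taken from the paper.

```python
import math


def token_loss(probs, target):
    """Standard negative log-likelihood of a single target token."""
    return -math.log(probs[target])


def concept_loss(probs, concept):
    """Negative log-likelihood of the total mass on a group of equivalent tokens.

    Assumption: the concept objective sums probabilities over the group.
    """
    return -math.log(sum(probs[token] for token in concept))


# Hypothetical next-token distribution over a four-word vocabulary.
probs = {"sofa": 0.30, "couch": 0.45, "chair": 0.20, "table": 0.05}

loss_token = token_loss(probs, "sofa")
loss_concept = concept_loss(probs, {"sofa", "couch"})
print(loss_token, loss_concept)
```

Under the token-level loss the model is penalized for preferring "couch" over "sofa"; under the concept-level loss both count as correct, so the penalty is smaller.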

From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning

Using an Information Bottleneck framework, we compare how humans and language models organize concepts and find that models prioritize efficient compression over semantic richness. While models broadly capture human category structure, they lose fine-grained contextual nuance that humans preserve, revealing fundamental differences between artificial and human conceptual representations.

Paper

Poster

C. Shani, L. Soffer, D. Jurafsky, Y. LeCun & R. Shwartz-Ziv (ICLR, 2026)

Beyond Tokens: Concept-Level Training Objectives for LLMs

Inspired by how humans abstract meaning beyond exact wording, we propose concept-level training objectives for language models that group semantically equivalent words under shared concepts. By shifting supervision from tokens to concepts, our approach improves robustness, semantic generalization, and downstream performance while reducing the tendency to overfit to surface form.

Paper

Code

L. Iyer, P. Somani, A. Guo, D. Jurafsky & C. Shani (EACL, 2026)

Divergent Processing Strategies in Automatic Speech Recognition

We introduce Architectural Fingerprinting to compare how speech models process information across different architectures. Our analysis shows that Conformers tend to resolve speech categories early in the network, while Transformers integrate information more gradually in deeper layers, revealing distinct inductive biases that align with different application needs.

Paper

Code

N. Roll, P. Bhalerao, M. Bartelds, A. Pawar, Y. Tatsumi, T. Ogunremi, C. Shani, C. Graham, M. Sumner & D. Jurafsky (ArXiv, 2026)

Labeling Messages as AI-Generated Does Not Reduce Their Persuasive Effects

We study how people respond to AI-generated political content and find that labeling content as AI-generated has little effect on its persuasiveness, perceived accuracy, or likelihood of being shared. Despite recognizing and trusting the labels, participants were influenced by AI-generated messages similarly to human-written ones, suggesting that transparency labels alone may not meaningfully reduce the impact of AI-generated information.

Paper

Longer version

I. Gallego, C. Shani, W. Shi, F. Bianchi, I. Gainsburg, D. Jurafsky & R. Willer (PNAS Nexus, 2026)

One Joke to Rule Them All? On the (Im)possibility of Generalizing Humor

We investigate whether language models can generalize across different types of humor rather than learning each humor style separately. By training on diverse humor datasets, we show that models develop transferable humor understanding that improves performance on unseen humor types, suggesting that humor may share deeper common mechanisms despite its wide variety.

Paper

Code

M. Turgeman, C. Shani & D. Shahaf (CHum, 2026)

Cooking Up Creativity: Enhancing LLM Creativity through Structured Recombination

We introduce a cognitively inspired framework for enhancing creativity in language models by manipulating structured representations of ideas rather than surface-level text. Applied to recipe generation, our approach produces outputs that are significantly more novel and diverse than those of GPT-4o, demonstrating the potential of concept-level recombination for creative AI.

Paper

Code

M. Mizrahi, C. Shani, G. Stanovsky, D. Jurafsky & D. Shahaf (TACL, 2026)

Rethinking Word Similarity: Semantic Similarity through Classification Confusion

We introduce Word Confusion, a new measure of semantic similarity that captures the context-dependent and asymmetric nature of meaning by modeling how often words are confused in classification tasks. Inspired by cognitive theories of human similarity judgments, our approach moves beyond static embedding similarity and enables more nuanced analysis of language change, polysemy, and cultural meaning.

Paper

Code

K. Zhou, H. Gao, S. Chen, D. Edelstein, D. Jurafsky & C. Shani (NAACL, 2025)
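
The asymmetry that Word Confusion is said to capture can be seen in a small sketch. It assumes similarity is read off as the share of one word's instances that a classifier labels as another word; the confusion counts below are invented, and the function name is hypothetical rather than taken from the released code.

```python
def confusion_similarity(confusions, source, target):
    """Share of `source` instances that the classifier labeled as `target`."""
    row = confusions[source]
    total = sum(row.values())
    return row.get(target, 0) / total if total else 0.0


# Hypothetical confusion counts: true word -> predicted word -> count.
confusions = {
    "sofa": {"sofa": 70, "couch": 25, "chair": 5},
    "couch": {"couch": 80, "sofa": 10, "chair": 10},
}

sofa_to_couch = confusion_similarity(confusions, "sofa", "couch")
couch_to_sofa = confusion_similarity(confusions, "couch", "sofa")
print(sofa_to_couch, couch_to_sofa)
```

Because each score is normalized by its own row, the two directions can differ, unlike cosine similarity between static embeddings.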

Bridged Clustering for Representation Learning: Semi-Supervised Sparse Bridging

We introduce Bridged Clustering, a semi-supervised learning framework that learns predictive mappings from largely unpaired input and output data. By independently discovering structure in each domain and connecting them with only a small number of labeled examples, our approach provides an interpretable, label-efficient alternative to traditional supervised and transport-based methods.

Paper

Code

P. Ye, C. Shani & E. Vitercik (ArXiv, 2025)
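
The Bridged Clustering recipe above (cluster each domain on its own, then link clusters with a few labeled pairs) can be sketched on 1-D toy data. The choice of k-means, the majority-vote bridge, and every name and number below are assumptions made for illustration, not details from the paper.

```python
from collections import Counter, defaultdict


def nearest(value, centers):
    """Index of the centroid closest to `value`."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


def kmeans_1d(values, centers, iterations=10):
    """Tiny 1-D k-means; `centers` holds the initial centroids."""
    centers = list(centers)
    for _ in range(iterations):
        groups = defaultdict(list)
        for value in values:
            groups[nearest(value, centers)].append(value)
        # Keep the old centroid when a cluster receives no points.
        centers = [
            sum(groups[i]) / len(groups[i]) if groups[i] else center
            for i, center in enumerate(centers)
        ]
    return centers


def fit_bridge(pairs, x_centers, y_centers):
    """Majority-vote map from input cluster to output cluster."""
    votes = defaultdict(Counter)
    for x, y in pairs:
        votes[nearest(x, x_centers)][nearest(y, y_centers)] += 1
    return {cx: counts.most_common(1)[0][0] for cx, counts in votes.items()}


def predict(x, x_centers, y_centers, bridge):
    """Predict the centroid of the output cluster bridged to x's cluster."""
    return y_centers[bridge[nearest(x, x_centers)]]


# Unpaired data in each domain, plus only two labeled pairs.
unpaired_x = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
unpaired_y = [19.0, 20.0, 21.0, 99.0, 100.0, 101.0]
x_centers = kmeans_1d(unpaired_x, centers=[0.0, 6.0])
y_centers = kmeans_1d(unpaired_y, centers=[0.0, 120.0])
bridge = fit_bridge([(1.0, 100.0), (5.0, 20.0)], x_centers, y_centers)
prediction = predict(1.05, x_centers, y_centers, bridge)
print(bridge, prediction)
```

The bridge here is a plain lookup table from input cluster to output cluster, which is what makes the mapping easy to inspect.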

Measuring Mental Health Variables in Computational Research: Toward Validated, Dimensional, and Transdiagnostic Approaches

We argue that computational mental health research often relies on flawed measures of psychological disorders, limiting the validity of its findings. We advocate for the use of validated, dimensional, and transdiagnostic measures that better reflect modern psychological science, providing recommendations for building more reliable and clinically meaningful AI systems.

Paper

Presentation

C. Shani & E. Stade (CLPsych, 2025)

Toward Concept-Aware Large Language Models

Inspired by the central role of concepts in human cognition, we investigate how well large language models represent conceptual knowledge and explore methods for making them concept-aware. Our results show that incorporating concepts into language model training and inference improves alignment with human intuition and increases robustness, highlighting a promising direction for more human-like AI.

Paper

Code

C. Shani, J. Vreeken & D. Shahaf (EMNLP, 2023)

FAME: Flexible, Scalable Analogy Mappings Engine

Inspired by the human ability to reason through analogy, we develop a framework that automatically discovers analogical mappings using only the names of entities. By leveraging commonsense knowledge, the system can solve complex and partial analogies, generate human-like extensions, and provide interpretable explanations for its reasoning, achieving performance that rivals or exceeds humans on several analogy tasks.

Paper

Code

S. Jacob, C. Shani & D. Shahaf (EMNLP, 2023)

The Lean Data Scientist: Recent Advances toward Overcoming the Data Bottleneck

As modern machine learning becomes increasingly dependent on large labeled datasets, we present a taxonomy of methods for overcoming the data bottleneck. By organizing approaches from across the field, this work provides a practical guide for researchers and highlights promising alternatives to annotation-heavy learning.

Paper

Poster

C. Shani, Y. Zarecki & D. Shahaf (CACM, 2023)

Evaluating Response Generation to Playful Shopping Requests

We explore how AI assistants can respond playfully to humorous and irrational shopping requests. By combining commonsense knowledge with response generation, our approach produces more appropriate and engaging replies than neural language models alone, highlighting the importance of commonsense reasoning for natural human-AI interaction.

Paper

Code

N. Shapira, O. Kalinsky, A. Libov, C. Shani & S. Tolmach (ECIR, 2023)

“Alexa, Do You Want to Build a Snowman?” Characterizing Playful Requests to Conversational Agents

To help conversational agents engage more naturally with users, we investigate playful interactions such as jokes, teasing, and personal questions. Drawing on humor theory and real-world Alexa data, we develop a taxonomy of playful requests that provides a foundation for building more engaging and human-like conversational systems.

Paper

Presentation

C. Shani, A. Libov, S. Tolmach, L. Lewin-Eytan, Y. Maarek & D. Shahaf (CHI, 2022)

Can Computers Understand Humor?

This article introduces kids to artificial intelligence and explores why humor is easy for humans but difficult for computers. Through the lens of jokes and machine learning, it explains how researchers are teaching computers to recognize and generate humor.

Paper

C. Shani & D. Shahaf (Frontiers for Young Minds, 2022)

How Did This Get Funded?! Automatically Identifying Quirky Scientific Achievements

We introduce the task of automatically identifying humorous scientific papers, inspired by the Ig Nobel Prize. By combining insights from psychology, linguistics, and natural language processing, we develop models that can detect funny research articles at scale, offering new tools for studying humor computationally.

Paper

Code

C. Shani, N. Borenstein & D. Shahaf (ACL, 2021)

Language (Re)modeling: Towards Embodied Language Understanding

Inspired by theories of embodied cognition, this paper argues that language understanding should be grounded in mental simulation, metaphor, and interaction with the world rather than text alone. We propose a roadmap for building AI systems that learn and reason more like humans, with improved interpretability, efficiency, and generalization.

Paper

Presentation

R. Tamari, C. Shani, T. Hope, M. R.L. Petruck, O. Abend & D. Shahaf (ACL, 2020)

Invited Talks

From Tokens to Thoughts: Teaching LLMs to Understand Concepts (Executive Code 2025)

Computational Humor Is No Joke (AI week 2023)

Towards Concept-Aware Large Language Models (AI week 2024, HUJI NLP retreat 2023, Nvidia's NLP-IL 2023, Stanford's ML Seminar 2023, John Snow Labs' NLP Summit, Stanford Data Science)

News Articles (in Hebrew)

YNET 2021

YNET 2023

Haaretz 2023