Text embeddings: Word embeddings, Sentence embeddings, Text retrieval, Text embedding benchmarks
Cross-lingual transfer: Multilingual transformer models, Transfer learning, Parameter-efficient fine-tuning, Tokenization issues
Bitext mining: Large-scale parallel data for machine translation, Contrastive learning
Neural machine translation: LSTMs vs. Transformers, Massively multilingual machine translation, Transfer learning
Machine translation evaluation and metrics: Lexical-based metrics, Embedding-based metrics
Question answering: Reading comprehension, Open retrieval QA
Participatory research: Scaling NLP to several low-resource languages, Role of participatory/grassroots research, NLP democratization
Large language models: Ingredients of modern LLM architectures, Effect of scaling, Quality of multilingual pre-training data, Language identification at scale
Multilingual LLM evaluations: Classical NLP tasks, Text generation tasks, Reasoning tasks, Knowledge-intensive tasks
Post-training: Post-training methods, Role of synthetic data
Multimodal and multicultural LLMs: VLMs, Making VLMs multicultural
Multilingual safety: Jailbreaking LLMs, Cultural bias of LLMs
Multilingual speech representations: Transformer for speech, Speech evaluations
Automatic speech recognition and translation: Automatic speech recognition, Speech-to-text translation, Simultaneous speech-to-speech translation
Text-to-speech: Single-speaker vs. multi-speaker TTS, AudioLLM for TTS