This LLM Cohort kicked off on January 10, 2025, and is expected to run until March 14, 2025 (approximate end date).
We are offering two tracks that run simultaneously, each with its own weekly session.
Here are the details of the tracks:
TRACK 1
Multilingual Long Context - Enhancing Processing with Advanced Techniques
Weekly Sessions: Saturday 8 am PT
How can advanced positional encoding methods (RoPE, NoPE, LongRoPE) and hybrid models that combine Transformers with State Space Models (SSMs) improve the processing and understanding of long-context sequences in natural language processing tasks?
Processing long-context sequences efficiently and effectively is a critical challenge in natural language processing (NLP). Many real-world applications, such as document summarization, long-form question answering, dialogue systems, and genomic sequence analysis, require models that can understand and reason over extended contexts. Traditional Transformers face limitations due to their quadratic computational complexity with respect to sequence length and their diminishing ability to capture long-range dependencies. By enhancing long-context processing, we can develop models that are more scalable, efficient, and capable of handling a broader range of tasks involving long sequences, thereby pushing the boundaries of what current NLP models can achieve.
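To make the positional-encoding part of this question concrete, below is a minimal NumPy sketch of Rotary Position Embedding (RoPE) as introduced in the RoFormer paper. The base of 10000 follows that paper; the function name and shapes are illustrative and not tied to any particular library.

```python
# Minimal sketch of Rotary Position Embedding (RoPE), assuming an
# interleaved even/odd pairing of dimensions and base 10000 (RoFormer).
import numpy as np

def rope_rotate(x, base=10000.0):
    """Apply rotary position embedding to x of shape (seq_len, head_dim)."""
    seq_len, dim = x.shape
    assert dim % 2 == 0, "head_dim must be even"
    # Per-pair rotation frequencies: theta_i = base^(-2i/dim)
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)           # (dim/2,)
    angles = np.arange(seq_len)[:, None] * inv_freq[None, :]   # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]                            # even/odd pairs
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Because each position gets its own rotation, the dot product q·k after RoPE
# depends only on the relative offset between positions; length-extension
# schemes such as LongRoPE build on this by rescaling the frequencies.
q = rope_rotate(np.random.randn(8, 64))
k = rope_rotate(np.random.randn(8, 64))
scores = q @ k.T  # attention logits with relative-position information baked in
```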
Resources:
Rotary Position Embedding (RoPE):
Transformers without positional encoding (NoPE):
LongRoPE:
SSMs:
S4: [2111.00396] Efficiently Modeling Long Sequences with Structured State Spaces
S5: [2208.04933] Simplified State Space Layers for Sequence Modeling
[2404.16112] Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges || GitHub
TRACK 2
Evaluating Multilingual Long-Context Generation and Reasoning
Weekly Sessions: Friday 10 am PT
What is the question we want to answer?
How can multilingual large language models (LLMs) be optimized for advanced comprehension and insight derivation in complex, long-context tasks across languages, particularly in domains like healthcare, finance, and legal systems, where accurate, contextually relevant responses are critical? Additionally, what are the current capabilities of existing LLMs in handling these tasks, and can we develop a data-creation pipeline to build a new long-context benchmark?
Multilingual systems are crucial to enabling equitable access and broader adoption of AI across diverse languages, supporting users in contexts where technical, financial, medical, and legal information must be presented and understood in native languages. Many sectors, such as healthcare and finance, often require complex analyses or detailed insights that are deeply embedded within extensive and contextually rich documents. These LLMs must be capable of understanding long-context information to meet user needs accurately, especially when handling nuanced linguistic and contextual requirements across multiple languages.
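As a starting point for the data-creation pipeline mentioned above, here is a small, illustrative Python sketch of how a single long-context benchmark item could be constructed and scored. The item format, the `generate` placeholder (standing in for whichever LLM API the cohort uses), and the exact-match scoring are assumptions for discussion, not a prescribed design.

```python
# Illustrative sketch of building and scoring one long-context QA item.
# The dataclass fields, `build_item`, and `score_item` are hypothetical names.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class LongContextItem:
    language: str        # e.g. "hi", "sw", "es"
    context: str         # long document(s), potentially very long
    question: str
    reference_answer: str

def build_item(language: str, filler_docs: List[str], fact: str,
               question: str, answer: str) -> LongContextItem:
    """Embed a target fact inside a long stretch of distractor documents."""
    midpoint = len(filler_docs) // 2
    docs = filler_docs[:midpoint] + [fact] + filler_docs[midpoint:]
    return LongContextItem(language, "\n\n".join(docs), question, answer)

def score_item(item: LongContextItem, generate: Callable[[str], str]) -> bool:
    """Query the model and check whether the reference answer appears in its response."""
    prompt = f"{item.context}\n\nQuestion ({item.language}): {item.question}"
    response = generate(prompt)
    return item.reference_answer.lower() in response.lower()
```

A real pipeline would likely draw distractor and target documents from domain corpora (healthcare, finance, legal), add human verification per language, and replace exact match with graded metrics such as F1 or LLM-as-judge scoring for generation-style tasks.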
Resources:
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
L-Eval: Instituting Standardized Evaluation for Long Context Language Models
∞BENCH: Extending Long Context Evaluation Beyond 100K Tokens
XL2Bench: A Benchmark for Extremely Long Context Understanding with Long-range Dependencies
Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA
LongGenBench: Benchmarking Long-Form Generation in Long Context LLMs
SESSION RECORDINGS:
10 January 2025
17 January 2025
18 January 2025
25 January 2025
25 January 2025
31 January 2025
7 February 2025
15 February 2025
22 February 2025
15 March 2025