NLP

Channel: #nlp

Co-leads


Goal: 

This group focuses on NLP and aims to encourage and enable members to share research ideas and collaborate.


Logistics:

Organizational Spreadsheet (C4AI NLP Reading Group!): everyone is welcome to add papers to the paper bank and to volunteer to present any paper listed there.

Occurrences: 1-hour weekly, Saturdays at 1pm ET: https://meet.google.com/mse-htip-kxw

Recent Recordings

Arnav Singhvi - DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelin (2024-03-30 10:05 GMT-7)

Arnav Singhvi presents their work on "DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines" 

C4AI - NLP Reading Group (2024-03-16 10:05 GMT-7)

Mamba: Zero to Hero

Anshuman Suri - Do Membership Inference Attacks Work on Large Language Models? (NLP) (2024-03-02 10:02 GMT-8)

Anshuman Suri - Do Membership Inference Attacks Work on Large Language Models?

C4AI - NLP Reading Group (2024-02-17 10:06 GMT-8)

February 17, 2024

C4AI - NLP Reading Group (2024-01-06 10:04 GMT-8)

January 5, 2024

C4AI - NLP Reading Group (2023-11-18 10:04 GMT-8)
C4AI - NLP Reading Group (2023-11-04 10:05 GMT-7)
C4AI - NLP Reading Group (2023-10-28 10:03 GMT-7)
C4AI - NLP Reading Group (2023-10-21 10:04 GMT-7)
C4AI - NLP Reading Group (2023-10-14 10:02 GMT-7)
C4AI - NLP Reading Group (2023-09-30 18:07 GMT+1)
C4AI - NLP Reading Group (2023-09-23 10:03 GMT-7)
C4AI - NLP Reading Group (2023-09-16 10:06 GMT-7)
C4AI - NLP Reading Group (2023-09-09 10:06 GMT-7)
C4AI - NLP Reading Group (2023-08-26 10:05 GMT-7)
C4AI - NLP Reading Group (2023-08-19 10_06 GMT-7).mp4
C4AI - NLP Reading Group (2023-08-12 10_05 GMT-7) (2).mp4
C4AI - NLP Reading Group (2023-08-05 10_06 GMT-7) (1).mp4

Session led by @ashkey1900! Topic: "Your spouse needs professional help: Determining the Contextual Appropriateness of Messages through Modeling Social Relationships" (https://arxiv.org/abs/2307.02763)

C4AI - NLP Reading Group (2023-07-29 13:04 GMT-4)
C4AI - NLP Reading Group (2023-07-22 13:06 GMT-4)
C4AI - NLP Reading Group (2023-07-15 13:07 GMT-4)
C4AI - NLP Reading Group (2023-06-24 13:03 GMT-4)
C4AI - NLP Reading Group (2023-06-17 13:04 GMT-4)
C4AI - NLP Reading Group (2023-06-10 13:03 GMT-4)
C4AI - NLP Reading Group (2023-06-03 13:03 GMT-4)

@hails leads a semi-social session where we talk about various publicly available LLM training libraries

C4AI - NLP Reading Group (2023-05-27 13:05 GMT-4)

SLiC-HF: Sequence Likelihood Calibration with Human Feedback

C4AI - NLP Reading Group (2023-05-06 13:05 GMT-4)
C4AI - NLP Reading Group (2023-04-22 13:03 GMT-4)

Arun presents on Evaluating Verifiability in Generative Search Engines

C4AI - NLP Reading Group (2023-04-08 13:05 GMT-4)

Hailey presents on The Quantization Model of Neural Scaling

C4AI - NLP Reading Group (2023-03-11 13_04 GMT-5).mp4
C4AI - NLP Reading Group (2023-02-25 13_06 GMT-5).mp4

Large Language Models as Knowledge Bases

C4AI - NLP Reading Group (2023-02-18 13_06 GMT-5).mp4

MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers, with Irem

C4AI - NLP Reading Group (2023-02-11 13_03 GMT-5).mp4

Training Trajectories of Language Models Across Scales

Materials from all past sessions

Organizational Spreadsheet

C4AI LLM Reading Group!

Large Language Model Paper List

(the following list was shared by Arun on Discord, 29/08/2022)

### Attention

#### Original

- Original Attention paper - https://arxiv.org/abs/1409.0473

- Transformers paper - https://arxiv.org/abs/1706.03762


### Variants

#### Decoder-only

- GPT-3 - https://arxiv.org/abs/2005.14165

- PaLM - https://arxiv.org/abs/2204.02311


#### Encoder Decoder

- T5 - https://arxiv.org/abs/1910.10683

- T0 - https://arxiv.org/abs/2110.08207

- EncDec vs Dec - https://arxiv.org/abs/2204.05832


#### Retrieval

- RETRO - https://arxiv.org/abs/2112.04426

- KNN-LM - https://openreview.net/forum?id=HklBjCEKvH


#### Sparse LMs

- Switch Transformers - https://arxiv.org/abs/2101.03961


### Scaling Laws

- Kaplan Scaling Laws - https://arxiv.org/abs/2001.08361

- Routed LM Scaling Laws - https://arxiv.org/abs/2202.01169

- Chinchilla - https://arxiv.org/abs/2203.15556

- Scaling Transformers efficiently - https://arxiv.org/abs/2109.10686


### Multimodal

- CLIP - https://arxiv.org/abs/2103.00020


### Inference Strategies

#### Candidate Generation 

- Beam Search

- Top-k - https://arxiv.org/abs/1805.04833

- Top-p sampling - https://arxiv.org/abs/1904.09751


#### Re-rankers

- RL Human Feedback - https://arxiv.org/abs/2009.01325

- Decision Transformer - https://arxiv.org/abs/2106.01345


### Training

- Megatron LM/GPT-J - https://arxiv.org/abs/1909.08053


### Evaluation

- Efficiency Misnomer - https://arxiv.org/abs/2110.12894

- Grokking - https://arxiv.org/abs/2201.02177


### Data

- The Pile - https://arxiv.org/abs/2101.00027