[2023.03.17] Sangam Lee (presentation link) Keyword: Text-to-Text framework, LLM, Google T5
• Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Journal of Machine Learning Research 21 (2020) (link)
[2023.03.24] Taeyup Noh (presentation link) Keyword: Reinforcement learning with prompt, LLM, OpenAI InstructGPT
• Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems 35 pre-proceedings (2022) (link)
[2023.03.31] Nayoung Kim (presentation link) Keyword: PLM, Memorization, Generalization
• Memorisation versus Generalisation in Pre-trained Language Models. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (2022) (link)
[2023.05.12] Dareen Eom (presentation link) Keyword: Fourier Transforms, Efficient transformer, FNet
• FNet: Mixing Tokens with Fourier Transforms. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4296-4313 (2022) (link)
[2023.05.19] Chaiho Shin (presentation link) Keyword: Text as visual recognition task, ViT-MAE, PIXEL
• Language Modelling with Pixels. Eleventh International Conference on Learning Representations (2023) (link)
[2023.05.26] Yena Park (presentation link) Keyword: Prompt Tuning, Scaling
• The Power of Scale for Parameter-Efficient Prompt Tuning. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 3045–3059 (2021) (link)
[2023.06.16] Sangam Lee (presentation link) Keyword: Retrieval augmented language model, Black-Box
• REPLUG: Retrieval-Augmented Black-Box Language Models. arXiv preprint (2023) (link)
[2024.07.24] Chaiho Shin (presentation link) Keyword: RAG, attention-based retrieval
• DRAGIN: Dynamic retrieval augmented generation based on the real-time information needs of large language models. The 62nd Annual Meeting of the Association for Computational Linguistics (2024) (link)
[2024.08.05] Dareen Eom (presentation link) Keyword: Medical Lab test data, tabular data, Medical Diagnostics
• DictLLM: Harnessing Key-Value Data Structures with Large Language Models for Enhanced Medical Diagnostics. The 62nd Annual Meeting of the Association for Computational Linguistics (2024) (link)
[2024.08.12] Kyuhee Lim (presentation link) Keyword: User-oriented Embedding, Question Answering
• Answer is All You Need: Instruction-following Text Embedding via Answering the Question. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, pages 459-477 (2024) (link)
[2024.08.19] Jiwon Kim (presentation link) Keyword: Knowledge Graph, LLM, QA
• MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, pages 10370-10388 (2024) (link)
[2024.08.26] Chaiho Shin (presentation link) Keyword: RAG, SLM-based retrieval
• Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models. The Twelfth International Conference on Learning Representations (2024) (link)
[2024.09.06] Dareen Eom (presentation link) Keyword: Grammatical error correction, General language model
• Detection-Correction Structure via General Language Model for Grammatical Error Correction. The 62nd Annual Meeting of the Association for Computational Linguistics (2024) (link)
[2024.09.13] Kyuhee Lim (presentation link) Keyword: Text-to-SQL, SFT with synthetic dataset
• Synthesizing Text-to-SQL Data from Weak and Strong LLMs. The 62nd Annual Meeting of the Association for Computational Linguistics (2024) (link)
[2024.09.20] Jiwon Kim (presentation link) Keyword: Synthetic Data, Text Embeddings, LLM
• Improving Text Embeddings with Large Language Models. The 62nd Annual Meeting of the Association for Computational Linguistics (2024) (link)
[2024.09.27] Chaiho Shin (presentation link) Keyword: RALM, noise robust RAG, adversarial training
• Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training. The 62nd Annual Meeting of the Association for Computational Linguistics (2024) (link)
[2024.10.04] Dareen Eom (presentation link) Keyword: LLM pretraining, instruction tuning, fine-tuning
• Rho-1: Not All Tokens Are What You Need. arXiv preprint (2024) (link)
• NEFTune: Noisy Embeddings Improve Instruction Finetuning. The Twelfth International Conference on Learning Representations (2024) (link)
[2024.10.25] Chaiho Shin (presentation link) Keyword: Multilingual medical LLM
• Towards building multilingual language model for medicine. Nature Communications 15, 8834 (2024) (link)
[2024.11.01] Dareen Eom (presentation link) Keyword: Medical Report Generation Evaluation, LLM, Factual Consistency
• MRScore: Evaluating Medical Report with LLM-based Reward System. International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 283-292) (2024) (link)
• AlignScore: Evaluating Factual Consistency with A Unified Alignment Function. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 11328-11348) (2023) (link)
[2024.11.08] Kyuhee Lim (presentation link) Keyword: Knowledge graph, Synthetic data generation, QA
• KI-MAG: A knowledge-infused abstractive question answering system in medical domain. Neurocomputing, 571, 127141 (2024) (link)
[2024.11.15] Jiwon Kim (presentation link) Keyword: Knowledge graph, LLM, Medical QA
• DALK: Dynamic Co-Augmentation of LLMs and KG to answer Alzheimer’s Disease Questions with Scientific Literature. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (2024) (link)
• KG-Rank: Enhancing Large Language Models for Medical QA with Knowledge Graphs and Ranking Techniques. Proceedings of the 23rd Workshop on Biomedical Natural Language Processing (2024) (link)
[2024.12.06] Chaiho Shin (presentation link) Keyword: LLM, Synthetic Data Generation
• Two Directions for Clinical Data Generation with Large Language Models: Data-to-Label and Label-to-Data. Findings of the Association for Computational Linguistics: EMNLP 2023 (2023) (link)
• Constructing synthetic datasets with generative artificial intelligence to train large language models to classify acute renal failure from clinical notes. Journal of the American Medical Informatics Association (2024) (link)
• Generating synthetic clinical text with local large language models to identify misdiagnosed limb fractures in radiology reports. Artificial Intelligence in Medicine (2024) (link)
[2024.12.13] Kyuhee Lim (presentation link) Keyword: Clinical prediction, ClinicalBench benchmarking, Performance comparison between LLM and ML
• ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction? arXiv preprint (2024) (link)
[2024.12.20] Dareen Eom (presentation link) Keyword: medical record generation, LLM fine-tuning, Data-to-Text, Text summarization
• Data augmented large language models for medical record generation. Applied Intelligence (2024) (link)