Program

Day and Time: March 3rd, 2025, 9:00 am - 5:00 pm

Location: Room 119A, Pennsylvania Convention Center, Philadelphia, PA, USA

Event times shown in the schedule are local times in Philadelphia, PA, USA.

9:00-9:10 Opening Remarks (Laure Berti-Equille and Shiqiang Wang)

9:10-9:35 Invited Talk 1: GneissWeb: Open Innovation for Advancing Training Data Quality (David Cox, IBM)

9:35-10:00 Invited Talk 2: Better, Safer, and More Data for Foundation Model Training (Daniel Li, Meta)

10:00-10:30 Contributed Oral Presentation Session 1 (3 papers)

10:30-10:45 Poster Session 1 (5 posters)

10:45-11:00 Break

11:00-11:25 Invited Talk 3: Building Secure RAG Applications with Open LLM Models (Timothy Spann, Snowflake)

11:25-11:50 Invited Talk 4: Building Code Models with Reasoning (Baishakhi Ray, Columbia University) 

11:50-12:10 Contributed Oral Presentation Session 2 (2 papers)

12:10-12:40 Poster Session 2 (10 posters)

12:40-14:00 Lunch

14:00-14:25 Invited Talk 5: Can Generative AI be Egalitarian? (Shimei Pan & James Foulds, University of Maryland, Baltimore County)

14:25-14:50 Invited Talk 6: Annotating Common Crawl for Good (Greg Lindahl, Common Crawl Foundation) - Canceled

14:50-15:20 Contributed Oral Presentation Session 3 (3 papers)

15:20-15:50 Poster Session 3 (10 posters)

15:50-16:00 Break

16:00-16:55 Panel Discussion: What's Next in Data for LLMs?

16:55-17:00 Best Paper Announcement & Closing


List of Accepted Papers and Oral Presentations

All accepted papers are presented as posters. In addition, selected top-rated papers will also be presented orally.


Selected Oral Presentations


All Accepted Papers