My name is Roee Shraga (רועי שרגא) and I am an assistant professor in the CS department and the DS program at WPI.

Before that I was a Postdoc at the Data Lab at Khoury College of Computer Sciences, Northeastern University, Boston, MA. I received my PhD in 2020 in the area of Data Science from the Technion – Israel Institute of Technology.


🎉 [July 24] Our resource paper on "A Generative Benchmark Creation Framework for Detecting Common Data Table Versions" was accepted to CIKM24

🎉 [July 24] Happy to announce that my project on "Improving the Utilization of Humans in Data Integration and Discovery" will be supported by NSF's Computer and Information Science and Engineering Research Initiation Initiative (CRII)

🎉 [July 24] Two papers accepted to the TaDA workshop at VLDB24

📢 [June 24] Gave an invited guest lecture at Hult International Business School on "Humans in a World Ruled by Machines".

📢 [Apr. 24] I will present my work on Semantic Version Management in Data Lakes at the upcoming NEDB Day!

🎉 [Mar. 24] Our paper "Gen-T: Table Reclamation in Data Lakes" was accepted to ICDE 2024. Joint work with Grace Fan and Renée Miller from Northeastern University.

🎉 [Feb. 24] Our paper "SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines" was accepted to ESWC 2024. Joint work with Alexander Brinkmann and Christian Bizer from University of Mannheim.

📢 [Dec. 23] I am co-organizing HILDA 2024. CFP is out on the website.

📝 [Dec. 23] Our paper on "The Battleship Approach to the Low Resource Entity Matching Problem" is out and will be presented at SIGMOD 2024

📁 [Jun. 23] The WDC Block Benchmark is available online

📝 [Jun. 23] Grateful to join the DEI@DB initiative

🎉 [Jun. 23] The FIRST edition of PACMMOD is out with our papers

📝 FlexER: Flexible Entity Resolution for Multiple Intents

📝 SANTOS: Relationship-based Semantic Table Union Search

We will present them soon at SIGMOD 2023, see you in Seattle 🍻

📝 [April 23] Check out our new article in Computer Magazine describing our vision of "One Algorithm to Rule Them All" in Data Integration

📢 [March 23] Aamod and Grace will present some of our recent work at NEDB Day, good luck!!

📝 [March 23] Our DKE paper is available online

🎉 [Feb. 23] Our demo "DIALITE: Discover, Align and Integrate Open Data Tables" was accepted to SIGMOD 2023

🎉 [Jan. 23] Our paper on "Explaining Dataset Changes for Semantic Data Versioning with Explain-Da-V" was accepted to PVLDB and will be presented at VLDB 2023 (paper, github, technical report)

🎉 [Nov. 22] Our paper on "Integrating Data Lake Tables" was accepted to PVLDB and will be presented at VLDB 2023 (paper, github)

🎉 [Aug. 22] Two new papers accepted to SIGMOD 2023 

📝 [Jun. 22] Presented my paper on "HumanAL: Calibrating Human Matching Beyond a Single Task" at HILDA 2022 (preprint, video), co-located with SIGMOD 2022

📢 [May 22] Gave a talk for Michael Cafarella's group at MIT CSAIL.

Older Updates:

💬 [Dec. 21] Invited to the PC of SIGIR 2022 (Full Papers)

📢 [Nov. 21] Gave a colloquium talk at the Data and Web Science Group at the University of Mannheim (invited by Han van der Aa).

📝 [Nov. 21] Our paper "From Limited Annotated Raw Material Data to Quality Production Data: A Case Study in the Milk Industry" was presented at CIKM 2021 (paper, technical report, video)

💬 [Oct. 21] Invited to the PC of VLDB 2022 (Demonstration Track)

💬 [Sep. 21] Invited to the PC of ICDE 2022 (Demonstration Track)

📝 [Aug. 21] Our paper "PoWareMatch: a Quality-aware Deep Learning Approach to Improve Human Schema Matching" was accepted to the ACM Journal of Data and Information Quality, Special Issue on Deep Learning for Data Quality (preprint)

📢 [Jun. 21] Gave a talk at the Data Management Seminar at Tel Aviv University (hosted by Daniel Deutch).

📢 [May 21] Gave a talk for Roi Reichart's NLP group, Technion.

📢 [May 21] Gave a talk for the AI and People (APPL) group, Technion (hosted by Ofra Amir).

📢  [Apr. 21] Gave a talk at the Computer Science Faculty, Technion (hosted by Benny Kimelfeld).

📢  [Apr. 21] Gave a talk at the Cognitive Robotics Lab, Technion (hosted by Erez Karpas).

🪑  [Apr. 21] Chaired the Indexing session at ICDE 2021.

📝 [Apr. 21] Presented our paper "Learning to Characterize Matching Experts" at ICDE 2021 (paper, video).

📢  [Apr. 21] Gave a guest lecture in the Data Modeling and Database Design class at the University of Toronto.

📢  [Mar. 21] Gave a talk at the Data Lab Seminar at Northeastern University.

📝 [Mar. 21] "ADaMaP: Automatic Alignment of Relational Data Sources using Mapping Patterns" was accepted to CAiSE 2021

📝 [Dec. 20] ACM SIGMOD blog posted our vision on Humans' Role in-the-Loop.

📝 [Oct. 20] Presented our paper at ICPM 2020 (paper, video).

📝 [Aug. 20] Presented two papers at VLDB 2020:

📝 Research Paper (paper, video).

🎓 PhD Workshop (paper, video).

🐄 [Aug. 20] Presented at the FoodIoT Big Data Seminar. Check out the video (it is in Hebrew, sorry...).

📝 [July 20] Presented our paper at SIGIR 2020 (paper, video).

📝 [June 20] Presented our demo paper at SIGMOD 2020 (paper, video).