Symposium on Data Markets (SDM)

In conjunction with the 49th International Conference on Very Large Data Bases (VLDB) 

August 28, 2023

Vancouver, Canada

News

Program


All times are in Vancouver time (PDT).

Overview

We are entering a brave new world ushered in by the global digital transformation. With an estimated 2.5 quintillion bytes of data or more produced every day, data is becoming an independent and strategic asset. Sharing and reusing data is key to the new norm of the data economy. Data markets of various forms have appeared, aiming to make access to data a commodity. The main idea of these markets is to facilitate interaction between data providers (e.g., individuals or organizations that possess data in diverse domains and wish to offer it to other interested parties) and data consumers who are interested in obtaining data to accomplish certain tasks, such as training new machine learning models, increasing the accuracy of existing ones, and conducting statistical estimation. Since such platforms aim to adopt the characteristics of a market, data exchange carries an underlying cost (e.g., monetary value). The advent of such markets can be viewed as an initial step toward enabling efficient trading of data, with enormous social and economic benefits. The design of the operating principles, market mechanisms, and trading strategies of such markets (to name a few topics) remains an open research direction that involves multiple research communities, such as economics, AI, and data science. Recently, a number of works addressing different aspects of data markets have already appeared in the database and broader data science communities.


Data markets are still new to many people. The mechanisms and the potential challenges and benefits are far from clear.  The objective of this (mini) symposium is to promote this emerging direction and bring together researchers and practitioners to report and discuss the fundamental issues related to data markets.  The symposium takes on a novel format that consists of short keynotes, fireside chats, and a panel surrounding two major topics:

Invited Talks

Designing Data Markets: Platforms for the Data Economy

Michael J. Franklin, University of Chicago

Abstract: While it is widely acknowledged that data is one of the most valuable commodities of the 21st century,  the development of market platforms for data is still in its early stages.   This talk will look at several different types of data markets to see if we can identify some common requirements and functions and use those as the motivation for the design of an architectural framework for such platforms.   Particular attention will be paid to areas where data management technologies such as those of interest to the VLDB community can play a role.    The goal is to identify new opportunities for data systems research as well as to develop ways to help users and organizations unlock the value and potential of their data resources to enable data-driven discovery.

Bio: MICHAEL J. FRANKLIN is the Morton D. Hull Distinguished Service Professor of Computer Science and Sr. Advisor to the Provost for Computing and Data Science at the University of Chicago.   At Chicago he was Liew Family Chair of the Computer Science Department and is a co-founder of the Data Science Institute.   Previously he was Thomas M. Siebel Professor of Computer Science at the University of California, Berkeley where he also served a term as Chair of the Computer Science Division.  As Co-Director of the Algorithms, Machines and People Laboratory (AMPLab) he was one of the original creators of Apache Spark, a leading open source platform for advanced data analytics and machine learning that was initially developed at the lab.  He is a Member of the American Academy of Arts and Sciences and is a Fellow of the ACM and the American Association for the Advancement of Science.  He received the 2022 ACM SIGMOD Systems Award with the team that developed Spark, and is a two-time recipient of the ACM SIGMOD “Test of Time” award. He holds a Ph.D. in Computer Sciences from the Univ. of Wisconsin (1993).

Data Discoverability: The Promise of a Data Market

Georgia Koutrika, Athena Research Center

Abstract: Data is considered the 21st century's most valuable commodity. Nevertheless, existing systems are falling behind in bridging the gap between data and humans, making data accessible and useful only to the few. Data marketplaces are promising to help bridge this gap by offering access to a multitude of data. However, potential data consumers come to confront crude search interfaces that impede rather than promote data discoverability. In this talk, we will scratch the surface of the data search problem and give a flavor of an exciting research territory that can help unleash the potential of data markets.

Bio: Georgia Koutrika is Research Director at Athena Research Center in Greece. Previously, she worked in multiple roles at HP Labs, IBM Almaden, and Stanford. Her work focuses on interactive and intelligent data exploration and data management systems, and combines advanced data analysis, deep learning, and natural language processing techniques. Her work has been incorporated in commercial products, described in 14 granted patents and 26 patent applications in the US and worldwide, and published in more than 110 papers in top-tier conferences and journals.

Georgia is an ACM Senior Member and an IEEE Senior Member. She is a member of the VLDB Endowment Board of Trustees and the PVLDB Advisory Board. She is co-EiC of the VLDB Journal, PC co-chair for VLDB 2023, and co-EiC of the Proceedings of the VLDB Endowment (PVLDB). Georgia recently received the EDBT 2023 Test-of-Time award. She was general chair for ACM SIGMOD 2016 and has served in various other organizational roles, including EDBT 2023 and ICDE 2021 sponsorship chair. She is the chair of the ACM Europe Working Group on Seasonal Schools.

Data Markets: What Can Go Wrong?

H. V. Jagadish, University of Michigan

Abstract: Unlike most physical goods, data has some interesting properties. For example, someone may have a privacy interest in the data. For another example, replication is possible essentially for free. Whereas there is a “law of diminishing returns” for the utility we ascribe to most material possessions, data appears to follow a reverse “law of increasing returns”. As we develop markets for data, and technologies to support these markets, it is important that we keep in mind these surprising properties. Otherwise, we may see undesirable behaviors in data markets.

Bio: H. V. Jagadish is Edgar F Codd Distinguished University Professor and Bernard A Galler Collegiate Professor of Electrical Engineering and Computer Science at the University of Michigan in Ann Arbor, and Director of the Michigan Institute for Data Science.  Prior to 1999, he was Head of the Database Research Department at AT&T Labs, Florham Park, NJ.

Professor Jagadish is well known for his broad-ranging research on information management, and has over 200 major papers and 38 patents, with an H-index of 101. He is a fellow of the ACM, "The First Society in Computing," (since 2003) and of AAAS (since 2018). He currently chairs the board of the Academic Data Science Alliance and previously served on the board of the Computing Research Association (2009-2018). He has been an Associate Editor for the ACM Transactions on Database Systems (1992-1995), Program Chair of the ACM SIGMOD annual conference (1996), Program Chair of the ISMB conference (2005), a trustee of the VLDB (Very Large DataBase) foundation (2004-2009), Founding Editor-in-Chief of the Proceedings of the VLDB Endowment (2008-2014), and Program Chair of the VLDB Conference (2014). Since 2016, he has been Editor of the Springer (previously Morgan & Claypool) “Synthesis” Lecture Series on Data Management. Among his many awards, he won the David E Liddle Research Excellence Award (at the University of Michigan) in 2008, the ACM SIGMOD Contributions Award in 2013, and the Distinguished Faculty Achievement Award (at the University of Michigan) in 2019. His popular MOOC on Data Science Ethics is available on both EdX and Coursera.

More to come ... Stay tuned! 

Fireside Chats

Unlocking the Value of Personal Data in Trustworthy Data Markets

Yang Cao, Hokkaido University

Yang Cao is an Associate Professor at Hokkaido University. He obtained his Ph.D. in Informatics from Kyoto University in 2017. His research focuses on security & privacy, data management, and machine learning. He received the IEEE Computer Society Japan Chapter Young Author Award 2019, and the Database Society of Japan Kambayashi Young Researcher Award 2021. His research projects have been supported by various organizations, including JSPS, JST, MSRA, KDDI, LINE, and WeBank.

Selling Insights in Data Markets

Alekh Jindal, SmartApps

Alekh Jindal is Co-founder and CEO at SmartApps, a generative AI company that helps turn data into intelligence. Previously, he was Founding Chief Architect, CTO, and board member at Keebo. Before that, he built ML-driven databases in Microsoft Azure and managed the Redmond team of Gray Systems Lab. Alekh received his bachelor's degree from IIT Kanpur, his master's from Max Planck, and his PhD from Saarland University, and was a postdoc at MIT CSAIL. He has published more than 80 papers, filed 15 patents, and received 4 best paper awards at CIDR, SIGMOD, and VLDB.

The Future of Data Sharing and Collaboration: a Perspective from Databricks

Zaheera Valani, Databricks

Zaheera Valani is a Senior Director of Engineering at Databricks. She leads the Databricks SQL Experience and Databricks Ecosystem teams, as well as the team that recently launched the Databricks Marketplace. Prior to Databricks, Zaheera was the Vice President of Product and Engineering at Tableau, leading the Data Management organization. She started out her career as a software engineer on Microsoft Excel. She is passionate about data, analytics, and engineering, and has grown teams and shipped widely adopted data and analytics products during her ~20-year career in technology.

Organizers

Related Events