EVP, Oracle Database Technologies, Oracle
SVP, Mission-Critical Data and AI Engine, Oracle
Abstract: The "Object-Relational Impedance Mismatch" has been a multi-decade problem for developers, and past solutions have all involved tradeoffs that compromise either efficiency or consistency. JSON Relational Duality is a breakthrough capability that combines the simplicity and developer convenience of the Document model with the power, efficiency and composability of the Relational model. This session will provide an overview of JSON Relational Duality and the benefits of using JSON documents as the access format while using the relational model as the storage format.
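The core idea of the abstract can be sketched in a few lines: the application reads and writes a single JSON document, while the data is stored as normalized relational rows. The sketch below is purely illustrative (hypothetical `to_rows`/`to_doc` helpers and field names, not Oracle's API) and only shows the round-trip property a duality view provides.

```python
# Illustrative sketch (not Oracle's API) of the duality idea: the app sees one
# JSON document; storage stays as normalized relational rows.

def to_rows(doc):
    """Shred a JSON-style order document into normalized 'tables'."""
    order_row = {"order_id": doc["order_id"], "customer": doc["customer"]}
    item_rows = [
        {"order_id": doc["order_id"], "sku": it["sku"], "qty": it["qty"]}
        for it in doc["items"]
    ]
    return order_row, item_rows

def to_doc(order_row, item_rows):
    """Reassemble the document view from the relational rows."""
    return {
        "order_id": order_row["order_id"],
        "customer": order_row["customer"],
        "items": [{"sku": r["sku"], "qty": r["qty"]}
                  for r in item_rows if r["order_id"] == order_row["order_id"]],
    }

doc = {"order_id": 7, "customer": "Ada",
       "items": [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}]}
order_row, item_rows = to_rows(doc)
assert to_doc(order_row, item_rows) == doc  # round-trip: document == view over rows
```

Because the rows are normalized, the same storage can back many different document shapes, which is what lets the relational model stay the single source of truth.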
Bio: As executive vice president of Oracle Database Technologies, Juan Loaiza is responsible for driving the innovation and development of Oracle's flagship database products, which are at the core of many of the world's largest and most critical IT systems. Juan's influence extends beyond Oracle. He has authored more than 130 patents, shaping the industry’s understanding of database management, and contributes to industry standards and database research. Juan has been at Oracle since 1988 and reports directly to Oracle CTO and founder Larry Ellison. He holds BS and MS degrees in computer science from the Massachusetts Institute of Technology. In his free time, Juan is an active supporter of more than 20 organizations around the world that work to conserve wildlife and wild places.
Bio: Tirthankar Lahiri is Senior Vice President of Mission-Critical Data and AI Engines at Oracle. He has a B.Tech in Computer Science from IIT Kharagpur and an MS in Electrical Engineering from Stanford University, and holds 71 patents.
Abstract: As database workloads increasingly move into large shared-nothing cloud datacenters, the bits storing operational data, analytical tables, streams, etc. all sit together on the same disks in the cloud. This creates new opportunities to unify the capabilities of operational and analytical systems, while being mindful of “one size fits all” pitfalls. I’ll discuss how Databricks and Neon are exploring this opportunity with Lakebase, an architecture for OLTP DBMSes that leverages open formats and cloud object stores to also enable efficient analytics on the same data and easy interop between the two worlds. Furthermore, since it wouldn’t be a 2025 talk without AI, I’ll explain how agents are changing the demand on both types of systems, seemingly resulting in more “analytics-like” workloads on OLTP databases and more “OLTP-like” workloads on analytical ones, primarily by issuing much larger numbers of small exploratory queries. These trends create many exciting new challenges for the research community.
Bio: Matei Zaharia is an Associate Professor of EECS at Berkeley and a Cofounder and CTO of Databricks. He started the Apache Spark open source project during his PhD at UC Berkeley in 2009, and has worked broadly on other widely used data and AI software, including Delta Lake, MLflow, Dolly and ColBERT. He currently works on a variety of research projects in cloud computing, database management, AI and information retrieval. Matei’s research was recognized through the 2014 ACM Doctoral Dissertation Award, an NSF CAREER Award, and the US Presidential Early Career Award for Scientists and Engineers (PECASE).
CTO, Databricks, Associate professor, UC Berkeley
Software Engineer, Meta
Abstract: AI has become the main consumer of data management. This brings new data and query shapes and an accelerator-first data center. At the same time, demands on interactive analytics and ETL keep growing.
To keep innovating in execution, Velox needs a holistic understanding of the data value chain. For this, we are developing Verax, Velox's query optimization companion. Verax targets traditional SQL workloads as well as the warehouse-to-GPU data path in ML training. As we move up toward the application, we also move down to the metal. Velox Wave is Velox's whole-stage GPU offload. Velox is further used as a front end to NVIDIA’s libcudf and the Neuroblade SQL FPGA.
Velox's place at the crossroads of these developments makes the present time a unique opportunity and challenge. We will be sharing insights from the road so far and giving a peek at upcoming developments.
Bio: Orri Erling co-founded the Velox composable query execution project at Meta. Prior to this, he worked on Google's F1, and before that he created OpenLink Virtuoso, a relational/graph store best known for its applications in linked data and knowledge graphs. His current work centers on building a query optimizer companion to Velox and moving compute to an accelerator-first data center as AI becomes the first consumer of data management.
Abstract: Data streaming is rapidly emerging as a mainstream platform for integrating data in real time, with Apache Kafka becoming the de facto standard for this architecture. Many event-driven applications have been built on Kafka to enable businesses to respond to real time events as they unfold. In this talk, I will explain how data streaming not only powers operational applications but also enhances analytics and generative AI by shifting processing closer to the source. I’ll also share our vision for what constitutes a complete and modern data streaming platform.
Bio: Jun Rao is a co-founder of Confluent, a company that provides a data streaming platform based on Apache Kafka. Before Confluent, Jun Rao was a senior staff engineer at LinkedIn where he led the development of Kafka. Before LinkedIn, Jun Rao was a researcher at IBM’s Almaden research center, where he conducted research on database and distributed systems. Jun Rao is a committer of Apache Kafka and Apache Cassandra.
Co-founder, Confluent
SVP, Mission-Critical Data and AI Engine, Oracle
VP, Development, Transactions and App-Dev Technologies, Oracle
Abstract: Enterprise apps are far too complex and too interdependent to be directly generated by AI. They have stringent data security, reliability, and evolvability requirements that must be implemented correctly and without hallucination, involving many millions of lines of code and many thousands of persistent data structures. Such an app, if directly generated by AI, could never be verified to be correct! Enter GenDev - a new AI-assisted development method that uses AI in bounded and verifiable ways to correctly generate and evolve ultra-complex enterprise apps. With GenDev, developers first describe the application's high-level intent, then determine the different modules and the data needs of each module, and then define the data model and the exact data interfaces required by each module. The GenDev process iteratively creates a declarative specification of the application, including its data model and data interfaces. At each step, this declarative specification is verifiable and editable by the end user. The final declarative specification can be converted programmatically into JSON Duality views and security policies for the app. JSON Duality views serve as the data access interface to data stored in normalized relational tables. Duality views ensure that applications built using them are evolvable, secure, and reliable, since they decouple the data access model from the storage model and effectively encapsulate the data access intent of each app module.
Bio: Tirthankar Lahiri is Senior Vice President of Mission-Critical Data and AI Engines at Oracle. He has a B.Tech in Computer Science from IIT Kharagpur and an MS in Electrical Engineering from Stanford University, and holds 71 patents.
Ajit Mylavarapu is Vice President of the Transactions and Application Development Technologies group for Oracle Database. His group is responsible for the core transaction engine of the Oracle Database, as well as for building technologies into the Oracle Database that enable app developers to use GenAI to build enterprise-grade apps. Ajit has 16 years of experience in the database industry.
Hydro, Datathink and General Purpose Programming
Bio: Joseph M. Hellerstein's work focuses on data-centric systems and the way they drive computing. He is the Jim Gray Professor of the Graduate School at the University of California, Berkeley, and a Distinguished Scientist at Amazon Web Services. Recognitions of his research contributions include the ACM SIGMOD Codd Innovations Award, the ACM Fellow and Alfred P. Sloan Research Fellow awards, and seven "Test of Time" awards for his research papers. He co-founded RunLLM, a vendor of AI assistants, and Trifacta, the pioneering company in Data Wrangling. Hellerstein has taught thousands of Berkeley students and advised over 30 Ph.D. students, three of whom were recognized with the ACM SIGMOD Jim Gray Dissertation Award.
SagaLLM: Context Management, Validation, and Transaction Guarantees for Multi-Agent LLM Planning
Abstract: This talk introduces SagaLLM, a structured multi-agent architecture designed to address four foundational limitations of current LLM-based planning systems: unreliable self-validation, context loss, lack of transactional safeguards, and insufficient inter-agent coordination. While recent frameworks leverage LLMs for task decomposition and multi-agent communication, they often fail to ensure consistency, rollback, or constraint satisfaction across distributed workflows. SagaLLM bridges this gap by integrating the Saga transactional pattern with persistent memory, automated compensation, and independent validation agents. It leverages LLMs' generative reasoning to automate key tasks traditionally requiring hand-coded coordination logic, including state tracking, dependency analysis, log schema generation, and recovery orchestration. Although SagaLLM relaxes strict ACID guarantees, it ensures workflow-wide consistency and recovery through modular checkpointing and compensable execution. Empirical evaluations across planning domains demonstrate that standalone LLMs frequently violate interdependent constraints or fail to recover from disruptions. In contrast, SagaLLM achieves significant improvements in consistency, validation accuracy, and adaptive coordination under uncertainty—establishing a robust foundation for real-world, scalable LLM-based multi-agent systems.
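The Saga transactional pattern that SagaLLM builds on can be shown in miniature: each step pairs a forward action with a compensation, and a failure triggers compensations for the completed steps in reverse order. The sketch below uses hypothetical names and is not the paper's implementation; it only illustrates the compensation discipline the abstract describes.

```python
# Minimal Saga sketch (hypothetical, not SagaLLM's code): each step is an
# (action, compensation) pair; on failure, completed steps are compensated
# in reverse order, restoring workflow-wide consistency without strict ACID.

def run_saga(steps):
    """steps: list of (action, compensation) callables.
    Returns the number of steps completed, or compensates and re-raises."""
    done = []
    try:
        for action, compensate in steps:
            action()
            done.append(compensate)
    except Exception:
        for compensate in reversed(done):  # roll back in reverse order
            compensate()
        raise
    return len(done)

log = []
ok = [(lambda: log.append("book"), lambda: log.append("unbook")),
      (lambda: log.append("pay"), lambda: log.append("refund"))]
assert run_saga(ok) == 2 and log == ["book", "pay"]

log.clear()
def fail():
    raise RuntimeError("payment declined")
bad = [(lambda: log.append("book"), lambda: log.append("unbook")),
       (fail, lambda: log.append("refund"))]
try:
    run_saga(bad)
except RuntimeError:
    pass
assert log == ["book", "unbook"]  # completed step was compensated after failure
```

SagaLLM's contribution, per the abstract, is to have LLMs generate and validate pieces of this coordination logic (state tracking, log schemas, recovery orchestration) that would otherwise be hand-coded.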
Bio: Edward Y. Chang is an adjunct professor of Computer Science at Stanford University (2019–present) and co–Editor-in-Chief of ACM Books (effective Dec. 2025). He previously served as President of HTC DeepQ Healthcare (2012–2021) and as a Director of Research at Google (2006–2012), where he led work on scalable machine learning, web-scale image annotation, and data-centric AI, and sponsored the ImageNet project. Earlier, he was a tenured faculty member in ECE at UC Santa Barbara (1999–2006) and a visiting professor at UC Berkeley (2017–2020). Chang earned an M.S. in Computer Science and a Ph.D. in Electrical Engineering from Stanford. He is an ACM and IEEE Fellow, with honors including the NSF CAREER Award, Google Innovation Award, ACM SIGMM Test of Time Award, and a US$1M XPRIZE for AI-driven diagnosis. He is the author of several books, most recently Multi-LLM Agent Collaborative Intelligence: The Path to Artificial General Intelligence (March 2024; revised October 2025).
LLM-powered Data Tooling: the Next Frontier
Abstract: LLMs are changing the world, but how can they help with data processing? In this talk, we discuss ongoing work in the EPIC Data Lab at Berkeley to rethink the end-to-end data lifecycle, now with LLMs in the mix. We describe our scalable, efficient, and usable text data processing system stack, aka our document "stack stack", as well as a couple of projects that are having impact across a number of real-world domains. We'll also briefly touch upon our future research vision around better supporting agentic workloads.
Bio: Aditya Parameswaran is an Associate Professor in EECS at UC Berkeley, and a co-director of the EPIC Data Lab. Aditya has published 100+ papers overall at top venues across multiple disciplines, with multiple best paper awards; just this year, his papers have appeared at the top DB (VLDB, SIGMOD), AI (ICLR, NAACL), and HCI (UIST, CSCW) venues. Multiple open-source tools developed in his group have received thousands of GitHub stars (including Modin, Lux, IPyFlow, and DocETL) and have been downloaded tens of millions of times overall across a spectrum of industries. His research was commercialized as a startup, Ponder, in 2021, where he served as Co-founder and President, before its acquisition by Snowflake. Aditya has received the Alfred P. Sloan Research Fellowship, VLDB Early Career Award, the NSF CAREER Award, and the TCDE Rising Star Award, along with other recognitions. His website is at http://adityagp.net
Scalable Secure-Oblivious Computation for Next Generation Private Systems, Databases, and AI
Abstract: Any privacy-preserving computation on encrypted data that relies solely on encryption can leak significant information about the plaintext input through leakage-abuse attacks. Industrial approaches that support confidential computing through hardware enclaves are susceptible to side-channel attacks; however, hardware enclaves remain an affordable and practical foundation for privacy-preserving computation. Oblivious primitives are a powerful cryptographic tool that, when combined with hardware enclaves, can mitigate leakage-abuse and software side-channel attacks.
Oblivious primitives find applications in various areas, including Signal’s contact discovery, Anonymous Key Transparency, end-to-end encrypted email search, differential privacy in the shuffle model, large-scale software monitoring (e.g., Google’s Prochlo), private federated learning/computation (Apple’s Private Cloud Compute), Data Clean Rooms, LLM privacy/private RAG, Google’s Privacy Sandbox, and broader confidential computing efforts (including Oracle’s). In this short-talk, we discuss our recent hardware-enclave-based oblivious primitives that scale private computations to terabyte-sized inputs—far exceeding the previous state-of-the-art 100MB-4GB range. Our scalable oblivious primitives include a high-throughput oblivious key-value store (used in Signal’s contact discovery, SOSP’21), a low-latency approach (PVLDB’24), the most scalable oblivious sort and shuffle (IEEE SP’24), and the first scalable oblivious filter, group-by, join approaches (USENIX’25).
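The essence of an oblivious primitive is data-independent control flow: the result is computed by arithmetic masking rather than branching on a secret, so the execution path leaks nothing about the condition. The toy sketch below is illustrative only (real oblivious primitives, like those in the talk, must be built from constant-time machine operations inside an enclave; Python offers no timing guarantees).

```python
# Toy illustration of oblivious (branchless) selection and swap over 32-bit
# values. The control flow never depends on the secret cond_bit; the choice
# is made entirely by arithmetic masking. Illustrative only: Python gives no
# constant-time guarantees, unlike real enclave-based oblivious primitives.

MASK32 = 0xFFFFFFFF

def oblivious_select(cond_bit, a, b):
    """Return a if cond_bit == 1 else b, without branching on cond_bit."""
    mask = -cond_bit & MASK32              # all-ones when cond_bit == 1, else 0
    return (a & mask) | (b & ~mask & MASK32)

def oblivious_swap(cond_bit, a, b):
    """Swap a and b when cond_bit == 1, with no secret-dependent branch."""
    diff = (a ^ b) & (-cond_bit & MASK32)  # a^b when swapping, else 0
    return a ^ diff, b ^ diff

assert oblivious_select(1, 10, 20) == 10
assert oblivious_select(0, 10, 20) == 20
assert oblivious_swap(1, 3, 9) == (9, 3)
assert oblivious_swap(0, 3, 9) == (3, 9)
```

Oblivious sorts, shuffles, and joins like those cited above compose such compare-and-swap steps in fixed, input-independent patterns (e.g., sorting networks), which is what makes their memory access traces safe to observe.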
Bio: Ioannis Demertzis is an Assistant Professor in the Computer Science and Engineering Dept. at the University of California, Santa Cruz. His research focuses on applied cryptography, security & privacy, and secure databases/systems. His work has been published at top security, system and database conferences including USENIX, CRYPTO, NDSS, S&P, SIGMOD, SOSP, PVLDB and TODS. He is the recipient of the ACM SIGSAC Doctoral Dissertation Award Runner-up, Distinguished Dissertation Award of ECE (University of Maryland), and the Symantec Research Labs Graduate Fellowship. Before joining UCSC, he was a Postdoctoral Researcher at the EECS Dept. of UC Berkeley hosted by Prof. Raluca Ada Popa. He received his Ph.D. from the ECE Dept. of the University of Maryland, College Park advised by Prof. Charalampos Papamanthou. He obtained his ECE Diploma and M.Sc at the Technical University of Crete, under the supervision of Minos Garofalakis.
C. Mohan (Moderator)
Dr. C. Mohan is Distinguished Professor of Science at Hong Kong Baptist University, Distinguished Visiting Professor at Tsinghua University, and a member of the inaugural Board of Governors of Digital University Kerala. He retired in 2020 as an IBM Fellow after 38.5 years at IBM Almaden Research Center, where his pioneering work in databases, blockchain, AI, and related areas influenced numerous products, standards, and the academic community. He is best known for inventing the ARIES database recovery algorithms and the Presumed Abort commit protocol. An IBM, ACM, and IEEE Fellow, Mohan has received honors including the ACM SIGMOD Edgar F. Codd Innovations Award, VLDB 10-Year Best Paper Award, and election to both the US and Indian National Academies of Engineering. He holds 50 patents and has served as consultant to Google and Microsoft. A prolific speaker, he has delivered talks in 43 countries and served on numerous editorial and advisory boards. More details at https://bit.ly/CMwIkP and his homepage at https://bit.ly/CMoDUK.
Panelists:
Dhruba Borthakur (OpenAI)
Dhruba Borthakur is the Technical Lead of the data infrastructure team at OpenAI. He co-founded Rockset, a search database that powers AI applications at OpenAI. Dhruba was the founding engineer of the RocksDB database at Facebook and one of the founding engineers of the Hadoop File System at Yahoo. Dhruba was also an early contributor to the open source Apache HBase project. Previously, he was a senior engineer at Veritas Software, where he was part of a team responsible for the development of VxFS and the Veritas SanPointDirect storage system; was the cofounder of Oreceipt.com, an ecommerce startup based in Sunnyvale; and was a senior engineer at IBM-Transarc Labs, where he contributed to the development of the Andrew File System (AFS). Dhruba holds an MS in computer science from the University of Wisconsin-Madison and a BS in computer science from BITS Pilani, India. He has 69 issued patents.
Matei Zaharia (Databricks & UC Berkeley)
Matei Zaharia is an Associate Professor of EECS at Berkeley and a Cofounder and CTO of Databricks. He started the Apache Spark open source project during his PhD at UC Berkeley in 2009, and has worked broadly on other widely used data and AI software, including Delta Lake, MLflow, Dolly and ColBERT. He currently works on a variety of research projects in cloud computing, database management, AI and information retrieval. Matei’s research was recognized through the 2014 ACM Doctoral Dissertation Award, an NSF CAREER Award, and the US Presidential Early Career Award for Scientists and Engineers (PECASE).
Yannis Papakonstantinou (Google)
Yannis is a Google Cloud Distinguished Engineer in the Data Cloud organization. He technically leads innovative features and products in support of GenAI and agentic applications over Data Cloud services. In addition, he advises many initiatives at the intersection of GenAI and databases. He is also an Adjunct Professor of Computer Science at UCSD and has held roles at Databricks, AWS, and as CEO/Chief Scientist of Enosys Software, which developed an early Enterprise Information Integration platform later acquired by BEA. He holds a Ph.D. in Computer Science from Stanford, and his research spans query processing, semistructured data, and AI, with over 120 publications and 21,000+ citations.
Avrilia Floratou (Microsoft)
Avrilia Floratou is a Principal Engineering Manager in Microsoft Azure Data, where she leads the development of AI components for the Fabric data analytics platform. Her current work centers on building intelligent agents that enable business users to derive insights from their data, as well as designing services for AI-driven data transformation and extraction.
Previously, she led a research team at Microsoft’s Gray Systems Lab, exploring topics ranging from applying AI to data management to improving system performance and scalability. In this role, she collaborated closely with product teams like Fabric and SQL Server to translate research advances into real-world product impact. Prior to Microsoft, she spent three years at IBM Almaden Research Center, contributing to research on SQL-on-Hadoop and scalable analytics.
Avrilia holds a PhD and MS in Computer Science from the University of Wisconsin–Madison and a Bachelor’s degree from the University of Athens.
Shasank Chavan (Oracle)
Shasank Chavan is the Vice President of the In-Memory, Data and AI Engines group at Oracle. He leads an organization of brilliant engineers working on the nexus between AI systems and modern databases. His team is responsible for developing the next-generation AI-centric data storage engine, designed for in-memory OLTP, Analytics and AI Vector Search capabilities to power real-time AI intelligence in database systems. Shasank earned his BS/MS in Computer Science at the University of California, San Diego. He has accumulated 50+ patents over a span of 27 years working on systems software technology.