PhD Summer School on Process Mining

Bolzano, 29 June - 3 July 2026

Program

Monday June 29

Module 1: Introduction

09:00 – 10:30 Unlocking Process Intelligence: A 360 Degree Introduction to Process Mining (Wil van der Aalst)

📜Description: Process mining serves as the essential bridge between data science and process science, utilizing event data to reveal how operational processes actually function in the real world. This session explores the core pillars of the field—discovery, conformance, and performance—demonstrating how organizations can automatically transform "digital footprints" into actionable insights and process improvements. By examining diverse process modelling notations such as BPMN and Petri nets alongside event data standards such as XES and OCEL, you will learn to navigate end-to-end processes, eliminate bottlenecks, and ensure compliance.

🛠️Technical requirements:

ProM (https://promtools.org/). An introductory video: https://youtu.be/w7dKYFdR9C0. Please install ProM 6.15 or ProM Lite 1.4. ProM requires Java 8. If you encounter issues, you may use the Docker setup described in the video.
Academic Edition of Celonis (https://signup.celonis.com/). An introductory video: https://youtu.be/w1M-kHVpvBU
Download the datasets from https://www.vdaalst.com/courses/pmdata2026.zip. Please ensure that you can load them in ProM and Celonis and perform basic analyses (e.g., discovering DFGs or BPMN models), as demonstrated in the videos.

📚Slides: part-1

11:00 – 12:30 From Event Logs to Process Logic: The Basics of Process Discovery (Wil van der Aalst)

📜Description: Process discovery automatically constructs models from event data, yet it must navigate the complex trade-offs between recall, precision, generalization, and simplicity. This session introduces the Directly-Follows Graph (DFG) as a scalable baseline and demonstrates how advanced filtering techniques are essential for isolating dominant behavior from infrequent "noise". We will dive into the core foundations of both bottom-up discovery—uncovering local patterns through the Alpha algorithm—and top-down discovery, which uses inductive mining to recursively decompose logs into sound-by-construction process trees.
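As a rough illustration of the DFG baseline and frequency-based filtering mentioned above, the following minimal sketch counts directly-follows pairs in a toy log and drops infrequent arcs. It is not taken from the lecture materials; the log, the function names, and the `min_count` threshold are assumptions made for illustration only.

```python
from collections import Counter

def directly_follows_graph(log):
    """Count how often activity a is directly followed by activity b."""
    dfg = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg

def filter_dfg(dfg, min_count):
    """Keep only arcs observed at least min_count times (hypothetical threshold)."""
    return {arc: n for arc, n in dfg.items() if n >= min_count}

# Toy log (hypothetical): each trace is the activity sequence of one case.
log = [
    ["register", "check", "pay"],
    ["register", "check", "pay"],
    ["register", "pay", "check"],
]

dfg = directly_follows_graph(log)
# Arcs seen only once are treated as infrequent behavior and removed.
frequent = filter_dfg(dfg, min_count=2)
print(frequent)
```

Simple arc filtering of this kind can disconnect the graph or distort it in other ways, which is one reason more careful filtering techniques are needed in practice.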

🛠️Technical requirements:

ProM (https://promtools.org/). An introductory video: https://youtu.be/w7dKYFdR9C0. Please install ProM 6.15 or ProM Lite 1.4. ProM requires Java 8. If you encounter issues, you may use the Docker setup described in the video.
Academic Edition of Celonis (https://signup.celonis.com/). An introductory video: https://youtu.be/w1M-kHVpvBU
Download the datasets from https://www.vdaalst.com/courses/pmdata2026.zip. Please ensure that you can load them in ProM and Celonis and perform basic analyses (e.g., discovering DFGs or BPMN models), as demonstrated in the videos.

📚Slides: part-2

Module 2: Procedural Models

14:00 – 15:30 Introduction to Petri Nets (Robin Bergenthum)

📜Description: This lecture introduces Petri nets as a formalism for modelling and analysing distributed systems, with applications in business process modelling and workflow analysis. It covers both structural and behavioural analysis methods, including properties such as boundedness, reachability, liveness, coverability, as well as P- and T-invariants. Furthermore, the lecture explains how Petri nets model local states, resources, conflicts, and concurrency. Different semantic representations of system behaviour are presented, including step semantics, partially ordered runs, and branching processes, each capturing different behavioural aspects. These concepts provide the formal foundations for the analysis, model checking, verification, and synthesis of distributed systems.

📚Slides: part-1 and part-2

16:00 – 17:30 Modelling with Workflow Nets (Robin Bergenthum)

📜Description: This lecture focuses on the modelling and analysis of business processes using workflow nets. Building on the foundations introduced in Session 1, the lecture introduces important workflow net concepts and correctness properties, including final states and soundness. The lecture revisits the modelling of control-flow patterns using concepts such as repeated labels and silent transitions and discusses the relationship between workflow nets and BPMN. Furthermore, it presents important extensions of workflow nets, including guarded transitions, stochastic behaviour, and data-aware nets, which are required to represent practical business process scenarios, particularly in the context of process mining.

Tuesday June 30

Module 3: Procedural Approaches

09:00 – 10:30 Model Discovery (Boudewijn van Dongen)

📜Description: This session focuses on discovering process models from event data. It introduces the main ideas behind transforming event logs into concrete models, revisits key quality metrics (generalization, precision, recall, and simplicity), and discusses intermediate representations such as directly-follows graphs and relational models. Core discovery algorithms are covered, including the Alpha Miner, Heuristic Miner, process trees, and their relationship to sound workflow nets. The session also presents the Inductive Miner, its extensions, and briefly discusses how it can be applied to discover object-centric Petri nets using the split–discovery–glue approach. Relevant tools are introduced.

11:00 – 12:30 Conformance Checking (Boudewijn van Dongen)

📜Description: This session introduces conformance checking as a key process mining task. It covers token-replay–based techniques and alignment-based methods. The session also provides a quick overview of potential use cases as well as tool support for conformance checking.

Module 4: Procedural Simulation & Enhancement

14:00 – 15:30 Process Simulation (Anna Kalenkova)

📜Description: This session introduces process simulation and the simulation models used in the context of process mining. It covers established approaches, including discrete event simulation and data-driven simulation. The session also discusses how to model resources, time, queues, and other performance-relevant aspects.

📚Slides: here

16:00 – 17:30 Process Enhancement, What-If Analysis, and Decision Support (Arik Senderovich)

📜Description: Building on the preceding session on process simulation, this lecture will focus on how process mining and simulation can be used for process enhancement and operational decision support. The session will be structured around data stories from real cases, mainly from healthcare applications, showing how event-log-based findings such as bottlenecks, waiting times, resource constraints, and slow variants can be translated into intervention hypotheses and evaluated through what-if analysis. I will also demonstrate a real tool for simulation, what-if analysis, and optimization, illustrating how process mining can move from retrospective diagnosis toward comparing feasible operational changes before implementation.

📚Slides: here

Wednesday July 1

Module 5: Declarative Models & Approaches

09:00 – 10:30 Foundations of Declarative Process Models (Claudio di Ciccio)

📜Description: This session provides a brief recap of LTLf and introduces the core properties of declarative process specifications. It discusses notions of equivalence and presents DFA/NFA-based analysis techniques for DECLARE. The session also discusses consistency checking of declarative models (e.g., via their reduction to LTLf satisfiability). Relevant tool support will be covered as well.

🛠️Technical requirements:

RuM (https://rulemining.org/) and MINERful (https://github.com/Process-in-Chains/MINERful) tools will be required during this session.

📚Slides: here

11:00 – 12:30 Discovery of Declarative Models (Claudio di Ciccio)

📜Description: This session focuses on discovering declarative specifications from event data. It introduces algorithms for the automated extraction of Declare constraints, discusses quality metrics for discovered declarative specifications, and examines the redundancy and vacuity problems. The session also briefly presents DCR graphs as a hybrid declarative–procedural approach, illustrates recent advancements in the field, and covers relevant tools.
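To make the vacuity problem concrete, here is a minimal sketch that checks the Declare constraint Response(a, b) on finite traces and separates vacuous from non-vacuous satisfaction. It does not reproduce the algorithms or metric definitions of RuM or MINERful; the toy log, the function names, and the two ratios returned are assumptions for illustration.

```python
def response_holds(trace, a, b):
    """Response(a, b): every occurrence of a is eventually followed by b."""
    pending = False
    for activity in trace:
        if activity == a:
            pending = True      # an obligation to see b is now open
        elif activity == b:
            pending = False     # b discharges all open obligations
    return not pending

def satisfaction_ratios(log, a, b):
    """Return (share of traces satisfying, share satisfying non-vacuously)."""
    satisfied = [t for t in log if response_holds(t, a, b)]
    # A trace without any a satisfies Response(a, b) only vacuously.
    non_vacuous = [t for t in satisfied if a in t]
    return len(satisfied) / len(log), len(non_vacuous) / len(log)

# Toy log (hypothetical).
log = [
    ["a", "c", "b"],        # satisfied, constraint activated
    ["c", "d"],             # satisfied only vacuously
    ["a", "c"],             # violated
    ["a", "b", "a", "b"],   # satisfied, constraint activated
]

print(satisfaction_ratios(log, "a", "b"))
```

On this log, three of four traces satisfy the constraint, but only two do so with the constraint actually activated, which shows how counting vacuous satisfactions can inflate a quality measure.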

🛠️Technical requirements:

RuM (https://rulemining.org/) and MINERful (https://github.com/Process-in-Chains/MINERful) tools will be required during this session.

📚Slides: here

14:00 – 15:30 Conformance Checking of Declarative Models (Andrea Marrella)

📜Description: This session covers conformance checking of declarative process models. It introduces methods based on finite-state automata to formalize the trace alignment problem and explores AI-driven techniques for solving it efficiently and for monitoring declarative models at run-time. Relevant tool support will also be covered.

🛠️Technical requirements:

RuM (https://rulemining.org/) will be required during this session.

Module 6: Methodology

16:00 – 17:30 Experiments in Process Mining (Jana Rehse)

📜Description: This session focuses on the methodological aspects of (algorithmic) process mining research. It introduces reliability and validity as two basic principles of scientific enquiry and explores how they can be threatened when conducting experiments on process mining algorithms. It then presents a structured methodology for designing and conducting such experiments that researchers can use to mitigate those threats. This methodology is accompanied by a practical checklist to support the planning, execution, and evaluation of experiments in a reliable and valid manner.

📚Slides: here

Social Dinner

19:30 Social dinner at Fink Gasthaus

📜Description: Just come here and enjoy the dinner!

Thursday July 2

Module 7: Large-Scale Processing & Visualization

09:00 – 10:30 Distributed Process Mining (Olaf Landsiedel)

📜Description: The lecture introduces distributed process mining as a response to the limits of traditional process mining, where IoT data from smart factories, cities, health, transportation, and similar domains is too distributed, heterogeneous, fast, voluminous, and privacy-sensitive for central event logs. The core idea is to “bring algorithms to the data”: local nodes mine partial process fragments or footprint matrices, which are then aggregated into global process models without moving all raw data centrally. The approach is extended to distributed conformance checking and evaluated on real-world datasets, showing that predecessor-based querying and batching can substantially reduce communication while preserving useful process-model fitness.
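The idea of bringing algorithms to the data can be sketched as follows: each node computes directly-follows counts on its own traces, and only these small summaries are merged centrally. This is a deliberately simplified illustration, not the approach presented in the lecture. In particular, it assumes that all events of a case are recorded on a single node, so no predecessor-based querying between nodes is needed; the node data and function names are hypothetical.

```python
from collections import Counter

def local_directly_follows(traces):
    """Run on each node: summarize local traces as directly-follows counts."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts

def aggregate(local_counts):
    """Run centrally: merge the per-node summaries without seeing raw events."""
    total = Counter()
    for counts in local_counts:
        total.update(counts)
    return total

# Hypothetical data held by two nodes (assumption: each case lives on one node).
node_a = [["start", "weld", "inspect"], ["start", "weld", "inspect"]]
node_b = [["start", "paint", "inspect"], ["start", "weld", "inspect"]]

global_dfg = aggregate([local_directly_follows(node_a),
                        local_directly_follows(node_b)])
print(global_dfg)
```

Under the stated assumption, the merged counts equal those obtained from a central log, while only the aggregated counts leave each node.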

📚Slides: here

11:00 – 12:30 Explainable Process Mining (Agnes Koschmider)

📜Description: This session introduces the concept of explainability in process mining, with a particular focus on data quality of event logs. The first part of the presentation addresses how to systematically derive explainability in terms of transparency. For this purpose, a pipeline for transforming unstructured data into process mining-ready event logs will be summarized. Each step of the pipeline involves assumptions and transformations that can significantly influence the resulting process model. The second part of the session focuses on data quality in event logs, presenting a data quality improvement process. Finally, we clarify the important distinction between outliers and noise in event log data. While both deviate from expected patterns, noise typically refers to random errors or irrelevant data points, whereas outliers may represent rare but valid process behavior. Understanding this distinction is crucial for deciding whether such data should be removed, corrected, or further analyzed.

📚Slides: here

14:00 – 15:30 Streaming Process Mining (Andrea Burattin)

📜Description: This session presents techniques that show how streaming process mining supports real-time insights from continuously generated data. The session starts with a general description of the motivations behind the need for streaming process mining. It continues with an overview of a possible framework for classifying streaming techniques, including one concrete example for each class. The session concludes with a hands-on overview of the current capabilities of pyBeamline, a Python library for streaming process mining.

🛠️Technical requirements:

Basic knowledge of Python, basic experience with notebooks.
pyBeamline framework from https://beamline.cloud/pybeamline/installation/.

📚Slides: here

16:00 – 17:30 Process Mining & Visual Analytics (Jana Rehse)

📜Description: This session focuses on the interplay of process mining and visual analytics (VA), an interdisciplinary field that combines data visualization, analytical techniques, and interactive interfaces to help people make sense of complex and often large datasets. It introduces the foundations of visualization and VA, including the underlying cognitive principles, as well as procedure models and frameworks that can be used to design and analyze visualizations. It then inspects existing visualizations in the process mining field to identify both strengths and areas for improvement.

📚Slides: here

Friday July 3

Module 8: Complex Process Processing & Privacy

09:00 – 10:30 Object-Centric Process Mining (Dirk Fahland)

📜Description: Processes and event data in practice typically operate over multiple related objects. This session explores how the behavior of such processes is recorded, how to extract event logs, and how flattening into classical logs introduces errors. We then study Object-Centric Event Data (OCED) models that avoid these errors and how to build OCED from source data. We discuss the principles for generalizing classical process mining techniques to Object-Centric Process Mining (OCPM) techniques over OCED for data exploration, process discovery, and performance analysis. We show how OCED allows integration of source data with PM results, enabling new techniques such as actor and queue analysis, and new industrially relevant use cases.
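The errors introduced by flattening can be seen on a very small example. The sketch below flattens a toy object-centric log onto one object type at a time. The event structure is a simplified stand-in for illustration and does not follow the OCEL or OCED schemas; all names and data are hypothetical.

```python
from collections import defaultdict

# Toy object-centric events (hypothetical): each event refers to several objects.
events = [
    {"activity": "place order", "objects": {"order": ["o1"], "item": ["i1", "i2"]}},
    {"activity": "pick item",   "objects": {"item": ["i1"]}},
    {"activity": "pick item",   "objects": {"item": ["i2"]}},
    {"activity": "ship order",  "objects": {"order": ["o1"], "item": ["i1", "i2"]}},
]

def flatten(events, object_type):
    """Build a classical log using the objects of one type as case identifiers."""
    cases = defaultdict(list)
    for event in events:
        for obj in event["objects"].get(object_type, []):
            cases[obj].append(event["activity"])
    return dict(cases)

by_item = flatten(events, "item")
by_order = flatten(events, "order")

# Flattening on items replicates the single "place order" event into both cases.
place_order_count = sum(t.count("place order") for t in by_item.values())
print(by_item, by_order, place_order_count)
```

Flattening on items counts "place order" twice although it happened once, while flattening on orders drops the "pick item" events entirely. Both views misrepresent the process.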

📚Slides: here

📚Book chapter on Event Knowledge Graphs (EKGs): here

11:00 – 12:30 Abstraction in Processes (Xixi Lu)

📜Description: This session introduces event abstraction in process mining as a way to manage the complexity of real-life event data and process models. Event data are often recorded at different levels of granularity, while process mining techniques require a level of detail that is both analytically meaningful and understandable to users. The lecture discusses how event abstraction can bridge low-level recorded events and high-level process logic, including techniques for deriving higher-level activities from event data using patterns or process knowledge. The session shows how event abstraction can be used to improve the interpretability and quality of process mining results and support more effective analysis of complex processes.

14:00 – 15:30 Foundations of Privacy-Enhancing Technologies (Florian Tschorsch)

📜Description: This lecture will provide the foundations of privacy-enhancing technologies by alternating between offensive and defensive perspectives. On the attack side, it will cover techniques such as re-identification; on the defense side, it will introduce mechanisms such as k-anonymity, l-diversity, and differential privacy. Students will develop an understanding of the fundamental privacy–utility trade-off. The session will conclude with a discussion of privacy metrics in process mining, thereby preparing the ground for the next session.
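As a small illustration of two of the defensive mechanisms named above, the following sketch measures the k-anonymity and the distinct l-diversity of an already generalized table. It is not taken from the lecture; the records, column names, and function names are hypothetical, and the sketch only measures the properties rather than enforcing them.

```python
from collections import defaultdict

def group_by_quasi_identifiers(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        groups[key].append(record)
    return groups

def k_anonymity(records, quasi_identifiers):
    """Largest k such that every record is indistinguishable from k-1 others."""
    groups = group_by_quasi_identifiers(records, quasi_identifiers)
    return min(len(group) for group in groups.values())

def l_diversity(records, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values within any group."""
    groups = group_by_quasi_identifiers(records, quasi_identifiers)
    return min(len({r[sensitive] for r in group}) for group in groups.values())

# Hypothetical, already generalized records.
records = [
    {"age": "30-39", "zip": "391**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "391**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "391**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "391**", "diagnosis": "flu"},
]

k = k_anonymity(records, ["age", "zip"])
l = l_diversity(records, ["age", "zip"], "diagnosis")
print(k, l)
```

The table is 2-anonymous, yet its l-diversity is 1: anyone known to be in the 40-49 group is revealed to have the same diagnosis. This is the kind of gap that motivates moving from k-anonymity to stronger notions.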

📚Slides: here

16:00 – 17:30 Event Log Anonymization and Inference Attacks (Stephan Fahrenkrog-Petersen)

📜Description: The community has developed a plethora of event log anonymization techniques that provide common privacy guarantees. The lecture will give an overview of the existing techniques and include a deep dive into the anonymization techniques implemented in PM4Py: SaCoFa, SaPa, and PRIPEL. Further, we will provide an outlook on inference attacks on process science artifacts such as process models and schedules. This outlook will demonstrate how inference attacks enable attackers to gain information from indirect data.

📚Slides: here
