Enforcing the Commitment ordering (or Commit ordering; CO) history (transaction schedule) property or its generalization Vote ordering (VO) locally in every transactional object guarantees both Global serializability and automatic resolution of Global deadlocks in a distributed environment with multiple autonomous databases and other transactional objects that may use each any concurrency control (CC) technique. In the presence of CO any incompatibility between databases' local precedence orders, which is the cause of Global serializability violation, generates a voting-deadlock in the Atomic commit protocol (ACP; of any type, e.g., from a simple hand-shake to the much more reliable Two-phase commit (2PC) and Paxos protocols). The voting-deadlock prevents the violation from happening. ACP resolves every such voting-deadlock by aborting a transaction with a missing (blocked) vote. Such abort resolves the incompatibility and guarantees Global serializability. With such mechanism no database needs CC information of another database, and thus no distribution of such information is needed (database autonomy). Distribution of CC information (e.g., precedence relations, locks, timestamps, and tickets) is the impediment that makes all other known CC mechanisms ineffective when distributed (e.g., limited by lack of scalability). Also locking-based Global deadlocks generate voting-deadlocks and are resolved automatically by the same voting-deadlock resolution mechanism, an important side-benefit of CO. CO can be enforced both locally and globally by effective algorithms that only order commit events and do not interfere with the databases' other CC mechanisms' operations.
The popular Strong strict two-phase locking (SS2PL) protocol is a variant of CO (a blocking and constraining special case) and inherits many of CO's qualities. It has been the de-facto standard for (SS2PL based) multi-database Global serializability since the 1980s (before the discovery of CO in 1990; but the automatic Global deadlock resolution for SS2PL has been overlooked in the research literature). As a result any CO compliant database system can transparently join any SS2PL based Global serializability solution and maintain Global serializability.
CO also has generalizing variants (super-sets) with similar global behavior and more concurrency. CO and its variants are transparently interoperable, and also in a mixed environment with database systems of different variants both Global serializability and automatic Global deadlock resolution are guaranteed. In the case where no CC information distribution exists, the set comprising CO and its variants (this set is called Generalized CO, GCO, or Vote ordering, VO) is the necessary condition for guaranteeing Global serializability. Thus this set probably provides the only existing general solution for Global serializability that scales in number of databases and computer network size (the maximal set that scales ).
1. Conflicts and their graphs - A unified conflict theory
A transaction is in a conflict with another transaction, if the transaction requests (issues, invokes) access operation to a database object being already accessed by the other transaction, and the two access operations are not commutative.
Note: This definition is different from the common definition of conflict: Traditionally a conflict happens not upon request but rather upon being materialized by actual access (e.g., Bernstein 1987, Weikum 2001).
Conflicts are fundamental for Concurrency control (CC) mechanisms. A mechanism is categorized by the way it handles conflicts. Its type is:
Note: These definitions are equivalent to the common definitions of the respective terms.
The (common) conflict graph of a history (transactions' schedule) is the directed graph of the materialized conflicts among transactions
Unaborted transactions are nodes. An aborted transaction is removed from the graph. Conflicts are directed edges. An edge exists from a transaction with earlier operation to the one with the later conflicting operation (two transactions may be connected by multiple edges).
The Conflict serializability theorem (e.g., Bernstein 1987)
A history has the Conflict serializability (CSR) property if and only if the projection of its conflict graph on committed transactions (i.e., the conflict graph of committed transactions only; the commit projection of the graph) is cycle free.
A precedence partial order of committed transactions in a history is induced by the commit projection of the conflict graph when it is cycle-free. I.e., if committed transaction T2 is in a materialized conflict with committed T1, then (there exists an edge from T1 to T2, and) T1 precedes T2.
The (common) wait-for graph of a history is the directed graph of the non-materialized conflicts among transactions
Unaborted transactions are nodes. Conflicts are directed edges. An edge exists from a transaction with later operation to the one with the earlier conflicting operation (opposite to the time-order convention of the conflict graph).
The fundamental data-structure: The augmented conflict graph – A directed graph of all the conflicts
Transactions are nodes; conflicts are directed edges. It is the union of the conflict graph and the reversed edge (due to the common time-order convention) wait-for graph.
Cycles in the augmented conflict graph:
2. Commitment ordering
Partial orders compatibility - Two partial orders over elements of the same set (here Transactions) are compatible if for every two elements, e.g., T1, T2, if their ordered pairs belong to both partial orders, then these elements' orders are the same in both pairs (e.g., when the pair (T1<T2) belongs to both partial orders (compatible), Vs. the case when (T1<T2) belongs to one partial order, and (T2<T1) belongs to the other (incompatible)).
Commitment ordering (or Commit ordering; CO) - A property of a history where the chronological order of transactions' commit events is compatible with the precedence order of respective (committed) transactions (Raz 1990-1994); i.e., for every committed T1, T2, if (T1 precedes T2) then (T1 commits before T2 commits).
Note: If transaction T2 is in a materialized conflict with T1, or in other words T1 precedes T2 in the conflict graph, than T1 commits before T2 does; this implies that when enforcing CO any cycle in the precedence graph through T1 and T2 is guaranteed to be broken by abort, to have a well defined precedence order, and that if T2 commits, its commit is blocked until T1 commits or aborts).
Note: CO is described in (Fekete 1988), (Fekete 1990) and (Lynch 1993) with the name Dynamic atomicity in the framework of Abstract Data Types (prior to CO). A mechanism for global serializability that uses local dynamic atomicity is described there, similarly to a result here.
The CO and Conflict-serializability theorem
Every CO (compliant) history is Conflict-serializability (CSR) compliant.
To enforce CO every cycle in the conflict graph (with materialized conflicts) needs to be broken before all its transactions are committed (see the note above). Thus no cycle of committed transactions can be generated, and the related history is CSR. [ ]
The CO deadlock theorem
When CO is being enforced every sustained cycle in the Augmented conflict graph represents a deadlock.
An edge from transaction T1 to T2 means that T2 is in a conflict with T1. If the conflict is non-materialized then the processing of T2 is blocked, typically until T1 ends. If it is a materialized conflict then when T2 reaches the ready state (its processing has ended) its commit is blocked until T1 ends by enforcing CO. A cycle of blocked transactions is a deadlock. [ ]
The following is an immediate corollary:
The CO locking-based deadlock theorem
When enforcing CO a cycle with at least one non-materialized conflict represents a locking-based conflict.
Such cycle is not a cycle in the (regular) conflict graph, and at least one blocking by a lock has occurred. [ ]
The CO commit-delay theorem
Every CSR history can be made CO compliant by delaying commit events, and without aborting any transaction.
In a CSR history the committed transactions generate a cycle-free conflict graph which induces a precedence order. Starting from the initial transactions in the precedence order and following the order, for any pair (T1 precedes T2), the commit of T2 is delayed to be after the commit of T1. The resulting history is CO compliant. [ ]
Generic local CO algorithm – Guarantees generating CO
compliant histories only. Covers the entire CO class of histories.
Data structure: Conflict graph
A ready-to-commit transaction (its execution has ended) is chosen and committed. Then all the undecided transactions on in-coming edges in the conflict graph to that transaction need to be aborted (the abort set for that transaction). Aborted transactions are removed from the graph. A committed transaction with and empty abort set (the result of both commits and aborts, if not empty upon ready-to-commit) is removed from the graph.
Without a persistent (local) cycle through the committed transaction the abort set eventually becomes empty by delaying the commit for a sufficient long time (by transaction in it ending before the commit; when empty no abort occurs upon commit; in accordance with the theorem above). Cycles can be detected by a cycle detection algorithm to determine that the abort set cannot become empty. Alternatively the time to abort the transactions in a non-empty abort set can be determined heuristically by a time-out (with the possibility of an unnecessary abort(s) when no cycle exists, and a too short time-out).
Note: For SS2PL (see below) an abort set is always empty before commit, and this algorithm is irrelevant.
Generic enhanced local CO algorithm – Guarantees
both CO and deadlock resolution.
Augmented conflict graph
Same algorithm as the previous one, with the different data structure. However here the transactions' execution on an all non-materialized conflicts cycle (as for every cycle for SS2PL) is blocked without reaching the ready-to-commit state, and cycle detection should be invoked if such is suspected, and breaking it by abort if detected, or, alternatively, abort a transaction by timeout.
Note: For SS2PL (see below) all conflicts are non-materialized, and the algorithm reduces to (common) wait-for graph cycle handling.
3. Global commitment ordering and voting-deadlocks
Distributed (global) transaction – A transaction that spans two or more database systems.
Every distributed atomic transaction needs to end with an Atomic commitment protocol (ACP) to correctly determine whether to commit or abort all its distributed parts. ACP can be a simple hand-shake (a simple type of One-phase commit protocol), which may be used in a reliable environment, or where all protocol participant usually fail together (e.g., on a same computer, or same integrated circuit), or more reliable protocols (with more overhead), like the Two-phase commit (2PC) and Paxos protocols). Thus an ACP is a needed component supported in every environment with distributed transactions.
Distributed (global) cycle – A cycle in one of the graphs above that spans two or more database systems (data is assumed to be partitioned among database systems, and each conflict (edge) belongs to one database system).
Database systems typically schedule transactions and their operations independently. Any attempt to exchange concurrency control (CC) information (e.g., precedence relations, locks, timestamps, etc.) to coordinate their precedence orders typically results in substantial performance hit. With distributed transactions and no CC coordination precedence orders incompatibilities occur, resulting in Global serializability violations when the incompatibilities occur among committed transactions (among precedence orders projections over committed transactions).
The local precedence orders compatibility lemma:
If the precedence partial orders (with respective cycle-free conflict graphs commit projections) of two databases or more in a distributed environment are incompatible (no global partial order can embed all of them together) then the global conflict graph (unified graph of all databases) has a distributed cycle (which is the reason for Global serializability violation when all the cycle's transactions are committed).
Note: No precedence partial orders compatibility coordination is possible among autonomous databases. Lack of coordination, due to excess communication needed, is the common case, but usually the probability of incompatibility can be made low (but not zero) by known database and transaction design heuristics.
Vote ordering strategy – Each database system votes in the Atomic Commitment
Protocol (ACP; where the database is a participant; of any type) in a chronological order that is in accordance (compatible) with the local precedence (partial) order. A vote out-of-order is blocked (until an order is reached, or the respective voted on transaction is aborted). The voting mechanism is allowed to exploit all the parallel voting that the local precedence partial order allows, which is typically large.
The CO vote ordering strategy (CO enforcement) theorem:
In a multi-database environment in which each database orders local commit events in a chronological order compatible with the local precedence order, enforcing the vote ordering strategy in each database is a necessary and sufficient condition for guaranteeing (enforcing) CO both locally in each database and globally in the environment (without it CO may be violated).
The following corollary is concluded:
The CO and Global serializability theorem
When local CO is enforced (including the necessary and sufficient vote ordering strategy) in each database system in a distributed environment, them Global (conflict) serializability is guaranteed.
Global CO implies Global serializability. [ ]
A vote blocking can happen either by lock blocking (for a non-materialized conflict; blocked transaction's ready state cannot be reached, and thus no vote can happen) or by the vote ordering strategy (for a materialized conflict; the transaction in conflict is in ready state with the vote blocked by the vote ordering strategy).
Voting-deadlock – A situation when voting cannot happen because of mutual vote blocking of two distributed
transactions or more. ACP eventually aborts a distributed transaction in the deadlock with a
missing vote, the voting-deadlock is resolved, and the remaining transactions in the deadlock can be voted on and end normally.
The CO voting-deadlock theorem:
If CO exist locally in each database (with the necessary and sufficient vote ordering strategy), then a voting-deadlock occurs if and only if a distributed cycle in the augmented conflict graph is generated and persists.
Distributed CO algorithm – consists of enforcing
CO locally in each database (with the necessary and sufficient vote ordering strategy). The Atomic commitment protocol (ACP; any; which always exists for atomicity of distributed transactions) is an integral part of the Distributed CO algorithm. It guarantees both distributed (global) serializability and automatic resolution of locking-based distributed (global) deadlocks for the following reasons:
Note: The only distributed environment support needed to implement the CO solution (algorithm) for global serializability is (distributed) ACP which already exists for atomicity of distributed transactions. The specific CO implementation (including the vote ordering strategy) is local to each database system and can be one of many CO variants in each database (see below). ACP views the same (regular) voting behavior from all the variants, and as usual needs to abort a transaction when concluding that one of its votes is missing. This resolves any voting deadlock.
4. Commitment ordering variants
Major CO variants:
Strong strict two-phase locking - SS2PL: This is the de-facto standard for Global serializability (for SS2PL databases) since the 1980s. It is blocking and constraining, and CO relaxes it significantly. The name SS2PL is from (Raz 1992) and its earlier versions. SS2PL was called in the past Strict 2PL, e.g., (Bernstein 1987, page 57).
Strict CO - SCO: CO and Strictness are combined. Strictness is widely utilized for efficient database recovery from failure (Raz 1991).
Optimistic CO - OCO (covers the entire CO class of histories, and thus not a real special case, but rather a convenient name for CO when "optimistic" is emphasized)
History class (property) containment (Raz 1992)
Inherently-blocking properties have to utilize non-materialized conflicts (i.e., they need either pessimistic or semi-optimistc mechanisms)
Generalizations: More available local information allows to relax the local CO constraints and increase concurrency, while still achieving Global serializability.
Extended CO - ECO: Local conflict serializability, CSR, is assumed as part of it; local transactions (confined to a single database) are identified; now the vote ordering strategy means that commit order of distribute (global, but not local) transactions only is compatible with the local precedence order; local transactions can commit at any chronological order (Raz 1993a). This means that even commit of local transactions is uninterrupted (with CO commit delay may occur).
Multi-version CO - MVCO: Commutativity based conflicts are extended to handle Multi-version data. Single-version CO, the common CO, is a special case of MVCO (Raz 1993b).
Extended Multi-version CO - EMVCO: MVCO and ECO are combined.
Note: ECO in its original article (Raz 1993a) is defined to be separate from local serializability. Here local serializability is part of ECO (and GCO; see below; the separation does not provide any benefit, but rather a little complicates the presentation of the generalizations).
The SCO Vs. SS2PL performance theoremFor a sufficiently large transaction load, with transactions with read-write conflicts (virtually almost every load with sufficiently many read and write operations), the average transaction completion time with SCO is always shorter than with SS2PL, and thus SCO provides better transaction throughput than SS2PL (in short, when sufficiently many read-write transactions are present then SCO almost always provides better performance than SS2PL).
Note: For the other conflict types. write-write and write-read, SS2PL and SCO have the same blocking behavior. Locking overhead is comparable.
read-write conflict: SS2PL Vs. SCO (Raz 1991)
t is the time duration for T2 to become ready after the w2(x) request is executed. The duration of T2 with SCO is always shorter than with SS2PL
Generalized CO (GCO; or Vote ordering, VO) is the set that contains CO and all its generalizing variants (e.g., all the above variants) that can achieve (with the necessary and sufficient vote ordering strategy) Distributed (and Global) serializability with local concurrency control (CC) information only (no ACP participant needs any CC information of another participant: we say that each participant has the Generalized autonomy property).
Note: The name Vote ordering (VO) well describes the GCO behavior: Only local serializability (by any mechanism) and the vote ordering strategy (which is in fact ordering the voting according to the local precedence order) need to be maintained.
Distributed GCO (VO) algorithm - consists of enforcing GCO (VO) locally in each database (with the necessary and sufficient vote ordering strategy). Identical to the Distributed CO algorithm above.
Note: APC views exactly the same voting behavior from CO and all its variants.
The CO variants interoperability theorem:
The last theorem above can be rephrased as follows and summarize the subject more compactly:
The Generalized CO (GCO, or Vote ordering, VO) theorem
Proof outline (following Raz 1993a):
With the way transactional systems typically operate (invoking transaction operations dynamically with no advance knowledge about them) this theorem positions GCO (VO) as probably the upper limit (maximal) history property for scalable (in number of database systems and computer network size) Distributed/Global serializability solutions. At the moment GCO contains CO, ECO, MVCO, and EMVCO (which contains all the others, so currently practically GCO=EMVCO), but new GCO compliant generalizing CO variants may be discovered (possibly with a new generalization of conflict serializability, similarly to the generalization for multi-version in Raz 1993b).
Other observations/corollaries of the theorem above:
Alan Fekete, Nancy Lynch, Michael Merritt, William Weihl (1988): Commutativity-based locking for nested transactions (PDF) MIT, LCS lab, Technical report MIT/LCS/TM-370, August 1988.
Alan Fekete, Nancy Lynch, Michael Merritt, William Weihl (1990): "Commutativity-based locking for nested transactions" (PDF), Journal of Computer and System Sciences (JCSS), Volume 41 Issue 1, pp 65-156, August 1990, Academic Press, Inc. doi>10.1016/0022-0000(90)90034-I.
Yoav Raz (1990): The Principle of Commitment Ordering, or Guaranteeing Serializability in a Heterogeneous Environment of Multiple Autonomous Resource Managers Using Atomic Commitment, DEC-TR 841, Digital Equipment Corporation, November 1990.
Yoav Raz (1991): Locking Based Strict Commitment Ordering, or How to improve Concurrency in Locking Based Resource Managers, DEC-TR 844, Digital Equipment Corporation, December 1991.
Yoav Raz (1992): "The Principle of Commitment Ordering, or Guaranteeing Serializability in a Heterogeneous Environment of Multiple Autonomous Resource Managers Using Atomic Commitment". Proceedings of the Eighteenth International Conference on Very Large Data Bases (VLDB), pp. 292-312, Vancouver, Canada, August 1992.
Yoav Raz (1993a): "Extended Commitment Ordering or Guaranteeing Global Serializability by Applying Commitment Order Selectivity to Global Transactions", Proceedings of the Twelfth ACM Symposium on Principles of Database Systems (PODS), Washington, DC, pp. 83-96, May 1993 (also DEC-TR 842, November 1991).
Yoav Raz (1993b): "Commitment Ordering Based Distributed Concurrency Control for Bridging Single and Multi Version Resources", Proceedings of the Third IEEE International Workshop on Research Issues on Data Engineering: Interoperability in Multidatabase Systems (RIDE-IMS), Vienna, Austria, pp. 189-198, April 1993.(also DEC-TR 853, July 1992).
Nancy Lynch, Michael Merritt, William Weihl, Alan Fekete (1993):Atomic Transactions In Concurrent and Distributed Systems, Morgan Kauffman (Elsevier), August 1993, ISBN 978-1-55860-104-8, ISBN 1-55860-104-X
Yoav Raz (1994): "Serializability by Commitment Ordering", Information Processing Letters (IPL), Volume 51, Number 5, pp. 257-264, September 1994 (Received 3 August 1991).
Gerhard Weikum, Gottfried Vossen (2001): Transactional Information Systems, Elsevier, ISBN 1-55860-508-8
© 2011 Yoav Raz
Initial version: June 2009
Pages of Yoav Raz >