- May 2011 marks 20 years since the first Commitment ordering (CO) patent filing and CO's public disclosure. CO has been quite unknown for years and largely misunderstood. In recent years CO utilization has increased considerably. This page provides resources that can help a wider audience of software developers, beyond academic database researchers, to understand CO and utilize it for guaranteeing Global serializability in distributed (multi process/thread) environments. CO is the only serializability based general concurrency control method that allows building large-scale reliable transactional-based distributed Web applications (in an environment with shared resources, even for atomic transactions with only two processes/participants). CO allows optimistic implementations. In most cases atomic transactions are preferred over other non-transactional ad-hoc methods for reliability.
- Generalized CO (VO; which includes CO and all its variants) is necessary for guaranteeing Global serializability under the following two assumptions:
1. Utilization of atomic transactions for reliability (in a failure prone environment).
2. Requiring database concurrency control autonomy in a distributed environment with multiple databases and other transactional objects
(i.e., databases do not communicate concurrency control information to each other).
Necessity in VO means that
if VO is violated under autonomy (independence of external concurrency control information), then serializability cannot be guaranteed
(may be violated, which typically quickly happens);
If VO is violated under local (defined by the boundary of an atomic commitment protocol (ACP) participant) serializability,
then autonomy (of the ACP participant) cannot be guaranteed (may be violated).
The necessity follows mathematically from the above assumptions.
- The CO patents are about to expire, in 2014, which will put CO in the public domain.
Commitment ordering (or Commit ordering; CO) is a general concurrency control (Serializability) class of techniques that I discovered in 1990, while at Digital Equipment Corporation (DEC). It provides an effective general solution to the Global Serializability Problem. This problem in the area of transactional systems has been characterized as Open problem still in October 1991, and has been considered unsolvable also years after I publicly disclosed CO in May 1991. CO is also an effective general Distributed serializability technique, i.e., it allows the effective distribution of every other known concurrency control technique. CO is the only known general serializability method that provides global and distributed serializability without the costly distribution of concurrency control information (e.g., local precedence relations, locks, timestamps, tickets, etc.), and thus provides unbounded scalability in network size and number of database systems. It also provides automatic global deadlock resolution (a fact true also in the common all-SS2PL environment, but unnoticed in the research literature). This makes CO especially effective in highly distributed environments, e.g., as with Cloud computing.
Major conclusions of the CO research:
1. CO can made any concurrency control mechanism serializable (serializability compliant) without interfering with the mechanism's scheduling methods.
2. For guaranteeing Global serializability and automatic Global deadlock resolution
serializability mechanism can be used locally in each local
transactional object/database system (not necessarily made CO compliant; a
complete heterogeneity). In addition to any given serializability mechanism only a certain vote ordering strategy (voting according to local precedence order) needs to be maintained
locally (for distributed transactions), which makes each local
transactional object/database system Vote ordering (VO) compliant, and the entire environment Global VO compliant (with Global serializability and automatic Global deadlock resolution).
- Special cases: When only a single concurrency control (serializability) method is utilized locally for all objects. I.e., the techniques allow the effective
distribution (low overhead, scalable, with neither mechanism's interference nor even necessarily making it CO compliant) of every
known serializability method while
guaranteeing Distributed/Global serializability and automatic Global deadlock resolution.
Evidently CO has been misunderstood by some researchers for years (see Quotations in Wikipedia - Global serializability), as I found a while ago, to my surprise, after years of not following the related research area. With all the interest earlier in Global serializability problem it is quite strange that almost immediately after I introduced the CO solution it has almost "disappeared," or often mentioned with accompanying incorrect information when mentioned. The Global serializability problem has continued to be considered "unsolvable" by some, with explanations and excuses as before. However the quite intensive rate of publications about the subject of concurrency control of databases in conferences and journals has declined quickly. Proposed solutions still have continued to appear, but usually not in major publications. SS2PL has practically been quite satisfactory for distribution for years with a quite limited utilization in "shared nothing" environments (without a distributed lock manager; e.g., in a common (SS2PL based) multi-database system with a transactional environment that provides 2PC). As a result people from other areas, different from databases, looking for not-SS2PL Global serializability solutions (e.g., optimistic), have not known about CO, and often ended with ineffective proposals based on other earlier database techniques. Another reason is the fast technological development of computing: With limited distribution over fast, low-latency networks the difference between local and distributed CC is shrinking, i.e., local techniques are useful over increasing sizes of computing environments and geographical distances. However, with distribution beyond a certain modest size only CO (including SS2PL) scales, also for using possibly needed optimistic methods. For example, with Cloud computing local methods cannot support the high distribution, and CO can provide substantial benefits. Current wisdom proposes ways to avoid distributed atomic transactions, but in many cases (applications) distributed transactions are unavoidable.
Commitment ordering has been quite widely known inside the transaction processing and databases communities at DEC since 1990 (primarily pursued in order to acquire buy-in for utilization in products and approval for patenting; DEC was largely the "decision-by-consensus company" at its lower-medium organizational levels). It was under company confidentiality due to patenting processes of CO methods for guaranteeing global serializability (it had an email distribution of approximately 40 people, probably too many to keep a secret about a technique with a self-explanatory name in a company of ~130,000...). Intended trial and possible utilization of SCO (replacement for SS2PL) in Rdb, DEC's database system, which was sold to Oracle - I do not know if ever happened. CO was disclosed outside of DEC by lectures and technical reports' submission and distribution to leading database researches (tens) in May 1991, immediately after its first patent filing. In September 1991 it was presented during a poster session in the Asilomar High Performance Transaction Systems (HPTS} workshop in Pacific Grove, California. It was later presented with articles at VLDB 92 (CO), PODS 93 (ECO), and RIDE-IMS 93 (MVCO). An additional article (submitted Aug. 91) appeared in Sep. 94 in the journal IPL (see references below).
The discovery of CO resulted from my attempt to find the Global serializability "secret" of SS2PL, a hidden property (if existed), and by that to generalize SS2PL (SS2PL is constraining, which often negatively impacts performance). At that time SS2PL was the only available solution and the de-facto standard for Global serializability. The consensus among the database experts was that "nothing more (beyond SS2PL) can be done." I first generalized it slightly to a property I named "Separation" (SEP), thought this was the end of the way, and wrote a paper (by Phil Bernstein's suggestion, to prove my claims, which turned out to be the right thing to do). Later, after finding a mistake in SEP's necessity proof and being mad about the broken proof and confused for three days, I suddenly "saw" CO which corrected the proof! CO generalized SEP, was the necessary property, and everything came into place. "Commitment ordering" was the natural name for the new property. It was simple, beautiful, non-blocking, surprisingly general, and a new paradigm in concurrency control (serializability). It provided a general effective, scalable solution for the global serializability problem. Then Extended SEP was generalized into Extended CO (ECO). Algorithms emerged as part of the CO discovery, where SEP algorithms turned into CO algorithms. As a matter of fact the correct necessary (CO) algorithm for the necessity proof was discovered just before the resulting property (CO) was formulated. A generalization of conflicts and CO for multi-version (MVCO) followed. The combination of CO and all its generalizations that do not need the distribution of CC information (e.g., local precedence relations, locks, timestamps, tickets, etc.) are now called Vote ordering (VO), or Generalized CO (GCO). As ECO, it consists of Local serializability and a Vote ordering strategy. VO achieves Global serializability and Automatic global deadlock resolution also with no need in local CO (but rather local serializability).
Atomic commit protocol (ACP) has been needed to bring Global serializability also for SS2PL based environments (to sync the end of a distributed transaction). After explicitly being mentioned in the first CO publication, the need in ACP has been repeated in many places, and ACP (specifically, Two-phase commit protocol - 2PC) alone has become in time the folklore Global serializability solution among practitioners, forgetting that assuming the popular SS2PL has been the key in existing solutions (that do not use CO in its more general forms). However, another aspect, CO's side-benefit of automatic Global deadlock resolution in SS2PL based environments, has never been mentioned in the literature until today (except in the CO publications), though global deadlocks have been a quite intensive research area.
Remarks about distributed VO, CO, and SS2PL (also named Rigorous scheduling, and previously Strict 2PL): Without ACP and a related vote ordering strategy the scalable distributed CO (and distributed VO) solution cannot be explained in general, and cannot work at all for autonomous database systems. SS2PL is the only known CO variant for which the vote ordering strategy is automatic (implied). Thus Global SS2PL can be trivially explained without knowing about CO, and indeed this has been noticed earlier, and SS2PL has been utilized for Global serializability in multi-database systems also several years before the discovery of CO in 1990. Global SS2PL has become the de-facto standard for Global serializability, and is such until today. The added value of CO (VO): Every CO (VO) compliant database system (with any CC mechanism, that may outperform SS2PL in some situations, especially optimistic or semi-optimistic variants; CO compliance is easy to enforce; VO requires only local serializability and Vote ordering strategy) can join any such SS2PL based distributed solution and maintain Global serializability. With the needed (for CO, VO) Vote ordering strategy also Local recoverability (strictnes) implies Global recoverability (strictness).
Wikipedia - Commitment ordering is probably the broadest and most comprehensive existing article on CO (with no mathematical notation, but quite mathematical) .
Theory of Commitment Ordering on this site is a theory summary.
See also History of Commitment Ordering on this site.
Other background information can be found in
DEC internal memo on 11-16-1990: Call for patenting.
The final definition of Database (concurrency control) autonomy was given shortly later.
Last version of the full Technical Report (working document) as of 1995, still with typos.
Note: This last version of the paper includes summaries of all the CO technical reports, except ECO.
New concepts introduced in the paper:
- Commitment ordering (CO) and the role of atomic commitment in Global CO
- Extended CO (ECO)
- Using the name Strong Strict 2PL (SS2PL), and making a distinction from Strict 2PL (S2PL)
- Strict CO (SCO) and its concurrency and performance advantage over SS2PL
- Multi-version CO (MVCO): Conflicts are extended for multi-version to extend conflict serializability, and an underlying theory is given.
- Database (Concurrency control - CC) autonomy
- Database extended knowledge autonomy
- Local and distributed CO algorithms
- When CO is a necessary condition for Global serializability (Modular serializability)
- Commit order coordinator (COCO) and its architecture
- Guaranteeing a property (precise definition)
- Inherently blocking property
- Commit decision delegation (CDD; precise definition)
- Commit decision delegation dependency condition (CDDDC; primarily utilized as a vote ordering strategy in 2PC for CO)
- Local cycle Vs. Global cycle in the serializability graph
- Automatic resolution of voting deadlocks (global cycles in the augmented conflict graph; see below) by atomic commitment, when CO is maintained, both for maintaining global serializability and resolving global data-access locking deadlocks.
- Common schedule properties that exist globally when applied locally (at the presence of atomic commitment with CDDDC)
New terms and concepts added later for simplifying presentation and clarity (see also The Theoory of Commitment Ordering):
- Voting deadlock, a generalization of the common data-access locking global deadlock, when atomic commitment is present
- Augmented conflict graph, the union of the conflict graph with the reversed-edge wait-for graph. When enforcing CO cycles characterize deadlocks: either potential serializability violations or data access related deadlocks. Global cycles characterize voting deadlocks
- Unified conflict theory: Materialized conflicts and non-materialized conflicts; . A conflicting operation blocked by a lock, issued by transaction T and released by T at its end, generates a non-materialized conflict. A cycle in the augmented conflict graph with at least one non-materialized conflict is not a cycle in the regular conflict graph and does not affect serializability; however, any cycle represents a deadlock
- Generic local enhanced CO algorithm, which uses the Local augmented conflict graph (rather than the regular Local conflict graph, as in the original Generic local CO algorithm), and locally both guarantees serializability, and resolves (local) deadlocks (Vs. only serializability in the original algorithm)
- Database generalized autonomy: Generalizes autonomy (which includes extended knowledge autonomy) by allowing any local information to be used (not using CC information from another participant in Atomic commitment). This is a necessary condition for database systems to generate a unified schedule with Global serializability that captures both CO, its generalizing variants ECO and MVCO, and possibly additional unknown useful generalizing variants. The necessity is proven the same way as for (regular) autonomy and CO. It assumes using a generalized CDDDC condition defined with a transitive conflict (a conflict graph path between two voted-on (global) transactions), e.g., as in ECO, and a general (commutativity based) conflict, e.g., as in MVCO (October 2010). This can be summarized in the concept of
- Genaralized Commitment ordering (GCO, or Vote ordering (VO)) : This property exists whenever Generalized CDDDC and serializability exist locally. It contains CO, ECO and MVCO and possibly other interesting generalizing CO variants (still unknown). It is a necessary condition for Global serializability when Generalized autonomy (not relying on CC information from other ACP participants) exits (January 2011)
A conference version of the ECO technical report (cannot find the full TR):
A conference version of the MVCO technical report (cannot find the full TR):
The following are the CO publications:
- Yoav Raz (1992): "The Principle of Commitment Ordering, or Guaranteeing Serializability in a Heterogeneous Environment of Multiple Autonomous Resource Managers Using Atomic Commitment." Proceedings of the Eighteenth International Conference on Very Large Data Bases (VLDB), pp. 292-312, Vancouver, Canada, August 1992. (also DEC-TR 841, Digital Equipment Corporation, November 1990)
- Download/view the VLDB 1992 article (PDF)
- Yoav Raz (1994): "Serializability by Commitment Ordering." Information Processing Letters (IPL), Volume 51, Number 5, pp. 257-264, September 1994. (Received August 1991)
- Yoav Raz (2009): Theory of Commitment Ordering - Summary GoogleSites - Site of Yoav Raz. Retrieved 1 Feb, 2011.
- Yoav Raz (1990): On the Significance of Commitment Ordering - Call for patenting, Memorandum, Digital Equipment Corporation, November 1990.
- Yoav Raz (1991b): "The Commitment Order Coordinator (COCO) of a Resource Manager, or Architecture for Distributed Commitment Ordering Based Concurrency Control", DEC-TR 843, Digital Equipment Corporation, December 1991.
- Yoav Raz (1991c): "Locking Based Strict Commitment Ordering, or How to improve Concurrency in Locking Based Resource Managers", DEC-TR 844, December 1991.
- Yoav Raz (1993a): "Extended Commitment Ordering or Guaranteeing Global Serializability by Applying Commitment Order Selectivity to Global Transactions." Proceedings of the Twelfth ACM Symposium on Principles of Database Systems (PODS), Washington, DC, pp. 83-96, May 1993. (also DEC-TR 842, November 1991)
- Yoav Raz (1993b): "Commitment Ordering Based Distributed Concurrency Control for Bridging Single and Multi Version Resources." Proceedings of the Third IEEE International Workshop on Research Issues on Data Engineering: Interoperability in Multidatabase Systems (RIDE-IMS), Vienna, Austria, pp. 189-198, April 1993. (also DEC-TR 853, July 1992)
On May 18, 2011 I found (by Google alert for "Dynamic atomicity") that the Dynamic atomicity property in a form equivalent to the CO property appeared prior to CO in:
- Alan Fekete, Nancy Lynch, Michael Merritt, William Weihl (1988): Commutativity-based locking for nested transactions (PDF) MIT, LCS lab, Technical report MIT/LCS/TM-370, August 1988. (Found 5/18/2011 by Google alert to "Dynamic atomicity")
- Alan Fekete, Nancy Lynch, Michael Merritt, William Weihl (1990): "Commutativity-based locking for nested transactions" (PDF), Journal of Computer and System Sciences (JCSS), Volume 41 Issue 1, pp 65-156, August 1990, Academic Press, Inc. doi>10.1016/0022-0000(90)90034-I
The above definition of Dynamic atomicity is different from the one I have been referred to and referencing (Weihl 1989) as prior art since I heard about Dynamic atomicity from Maurice Herlihy (at DEC at that time; after he saw my article) in December 1990 (or later?). This definition defines a special case of CO, which is difficult to implement as an effective general mechanism, and I have not suspected that two definitions (by the same people) have existed at approximately the same period (though it could happen due to the time needed to publish a journal article). The original Dynamic atomicity definition I knew (and its related optimality result) is from:
- William Edward Weihl (1984): Specification and Implementation of Atomic Data Types (PDF), Ph.D. Thesis, MIT-LCS-TR-314, March 1984, MIT, LCS lab.
- William Edward Weihl (1989): "Local atomicity properties: modular concurrency control for abstract data types", ACM Transactions on Programming Languages and Systems (TOPLAS), Volume 11, Issue 2, April 1989, pp. 249 - 282, ISSN:0164-0925