Operational Transformation

Back to I was in a startup company two years ago, we were going to develop an online video chatroom over WebRTC with a collaborative real-time editor to share codes or thoughts for online interviews. Starting to build the editor from scratch seemed quite challenging for us at that time, so we chose etherpad-lite, an open-source project derived from Google Wave(Apache Wave for now), since we didn’t have too many choices. After deploying etherpad-lite to our servers, we could hardly make the synchronizations smoothly enough, and the merging strategies kept bothering us too. Soon we degraded the editor with a speed-over-merging solution: synchronize the local changes to the server ASAP, solve the conflicts with the so-called last-write-wins(LWW) strategy at the server side, and then broadcast the latest version of text to all the other clients. The only problem is that this approach may cause data loss while two or more users concurrent typing at the same time, but it is a compromise we can put up with.

However, the most interesting part is how collaborative editing systems work. In this post, we’ll discuss several characteristics of collaborative editing systems, as well as the models to ensure the documents are consistent. Then we’ll describe the operational transformation algorithm in detail, and finally we’ll survey several approaches to operational transformation.

Collaborative editing systems

Collaborative editing systems like Google Doc provide an editor with a background synchronization mechanism, which allows simultaneous editing of a document. Basically, there are three main characteristics of a collaborative editing system:

  1. Real-time: the editor should response ASAP while typing, optimally, it should have a response time like a local editor;
  2. Distributed: each user owns a separated editor which may get separated geographically and be connected by different networks with non-deterministic latency.
  3. Unconstrained: users are allowed to concurrently modify any part of the document at any time.

How about the traditional ways?

The traditional web approaches of solving conflicts(w/o data loss, of course) are basically pessimistic and optimistic locking.

Pessimistic locking

Every time before updating the document, each client needs to acquire an exclusive lock first. While a client is holding the exclusive lock, the other clients have to wait until the lock is released, this may make the synchronization significantly slow down, thus the editor would not be real-time.

Optimistic locking

Optimistic locking actually does not acquire any lock. Before updating, each transaction verifies that no other transaction has modified the resource it has just read. If the check reveals conflicting modifications, the updating transaction rolls back. However, for collaborative editing systems, the conflicts may occur quite often, and “roll back”s are not acceptable.

In the next sections, we will discuss the algorithms that operational transformation uses which make rollbacks unnecessary for resolving conflicts. But first, we need to get familiar with the concepts of consistency models.

Consistency models

In distributed systems, a consistency model is a contract between the system and the developer. The system guarantees that if the programmer follows the rules, memory will be consistent and the results of reading, writing, or updating memory will be predictable. Consistency models define rules for the apparent order and visibility of updates. For collaborative editing systems, several consistency models have been proposed. We here discuss one of the most commonly referenced models by Sun et al. AKA the CCI Model, in where three consistency properties are grouped together:

  1. Convergence: ensures the replicated copies of the shared document be identical at all sites, this property guarantees that the document will be correct at the end of any particular editing time period.
  2. Causality preservation: ensures the execution order of causally dependent operations be the same as their natural cause-effect order during the process of collaboration. The causal relationship between two operations is defined formally by Lamport’s “happened-before” relation. When two operations are not causally dependent, they are concurrent. Two concurrent operations can be executed in different order on two different document copies. For example, given two operations O_{A} and O_{B}, if O_{A} \rightarrow O_{B}(O_{A} causally occurred before O_{B}), then O_{A} is executed before O_{B} at each site. This property guarantees that the document will be correct at any point during editing.
  3. Intention preservation: ensures that the effect of executing an operation on any document state be the same as the intention of the operation. The intention of an operation O is defined as the execution effect which can be achieved by applying O on the document state from which O was generated. This property is similar to the convergence property, but additionally deals with the situation where operations are submitted from two different initial document states.

Operational transformation

Each client(including the server in centralized OT system) in collaboration systems utilizing OT stores a local copy of the document on their own. OT actually doesn’t allow one to directly modify the local copy, instead, any modifications to the local copy have to be performed in terms of operations, which will be sent one by one to the server. After an operation is confirmed by the server, it will be transformed and propagated to all the clients to maintain a synchronized state. In case there are concurrent operations applying to a local copy, all the operations have to be performed in an FIFO queue.

Data models

The data models define the way data in a document are organized, different OT systems may have different data models. For example, the most basic data model is a single linear address space(like a string). For advanced data models, we may need more complex operations such as moving data between memory addresses. For example, JSON-based rich text editors like Quill can have edit operations on particular lookup paths of a JSON object.

For simplicity, in this post we assume the data is stored as a single string.

Operation models

Operation models define the set of operations that can be directly transformed by OT functions. There are two basic approaches to supporting complex application-level operations for manipulating shared documents:

  • Primitive operation model: the OT system has an operation model consisting of a small number of primitive operations. In this approach, transformation functions are defined at the primitive operation level, which is simple and transformation functions so designed are reusable among different applications. Here we define two types of operations:
    • delete(Integer N): delete one character at the index N;
    • insert(Integer N, String S): insert string S at the index N;
  • Application-specific operation model: the OT system has an operation model consisting of a large number of application operations. In this approach, transformation functions are defined at the application operation level, transformation functions so designed are not reusable for different applications(like Google Wave).

For an application with N different operations, N × N transformation functions are needed for supporting this application in both ways.

Basic example

Now imagine that there are two sites working on a shared document represented by a sequence of characters. These characters are addressed from 0 to the end of the document. Initially, both copies hold the string "efecte". User 1 executes operation o_{1} = insert(1, ``f") to insert the character "f" at position 1. Concurrently, user 2 performs o_{2} = delete(5) to delete the character "e" at position 5. When o_{1} is received and executed on site 2, it produces the expected string "effect". But, when o_{2} is received on site 1, it does not take into account that o_{1} has been executed before it and it produces the string "effece". The result at site 1 is different from the result of site 2 and it apparently violates the intention of o_{2} since the last character "e", which was intended to be deleted, is still present in the final string. Consequently, we obtain a divergence between the two sites. It should be pointed out that even if a serialization protocol was used to require that all sites execute o_{1} and o_{2} in the same order to obtain an identical result "effece", this identical result is still inconsistent with the original intention of o_{2}.

To maintain convergence, our algorithm would execute a variant of o, denoted by {o}'(called transformation of o) that intuitively intends to achieve the same effect as o. We denote this algorithm by a function T. When o_{2} is received on site 1, o_{2} needs to be transformed according to o_{1} as follows: T((delete(5), insert(1, ``f")) = delete(6). The deletion position of o_{2} is incremented because o_{1} has inserted a character at position 1, which is before the character deleted by o_{2}. Next, o_{2} is executed on site 1. In the same way, when o_{1} is received on site 2, it is transformed as follows: T(insert(1, ``f"), delete(5)) = insert(1, ``f"); o_{1} remains the same because "f" is inserted before the deletion position of o_{2}.

Operation transformation

Operation transformation relies on satisfying several properties that ensure that the system is correct at each point. These are lower-level properties than the consistency model described previously, but not all properties are satisfied by each control algorithm, and where these properties are ensured may differ between each one. These transformation properties are often divided up into two different types: convergence properties and inverse properties.

If an OT system is to support particular functionality, then it must be able to support certain transformation properties. For group editing and consistency maintenance, the system must support a transformation function known as Inclusion Transformation (IT). For group undo, where the effect of a previously executed operation is undone at all sites, and all operations executed after it are all re-transformed, the system must support another transformation function known as Exclusion Transformation (ET).

  1. Inclusion transformation: IT\left ( O_{A}, O_{B} \right ), which transforms operation O_{A} against another operation O_{B} in such a way that the impact of O_{B} is effectively included.
  2. Exclusion transformationET\left ( O_{A}, O_{B} \right ), which transforms operation O_{A} against another operation O_{B} in such a way that the impact of O_{B} is effectively excluded.

For example, Ellis et al. extended the insert function with signature i(p, c, s) where:

  • p = position of the character to be inserted;
  • c = character to insert;
  • s = site priority of the client, which is always different from the others.

the inclusion transformation function in Ellis’s algorithm can be described as:

IT(i(p_{1}, c_{1}, s_{1}), i(p_{2}, c_{2}, s_{2})) = \left\{\begin{matrix} i(p_{1}, c_{1}, s_{1}) & if(p_{1} < p_{2}) \vee (p_{1} = p_{2} \wedge c_{1} \neq c_{2} \wedge s_{1} < s_{2}) \\ i(p_{1} + 1, c_{1}, s_{1}) & if(p_{1} > p_{2}) \vee (p_{1} = p_{2} \wedge c_{1} \neq c_{2} \wedge s_{1} > s_{2}) \\ Nop() & Otherwise \end{matrix}\right. IT(i(p_{1}, c_{1}, s_{1}), delete(p_{2})) = \left\{\begin{matrix} i(p_{1}, c_{1}, s_{1}) & if(p_{1} < p_{2}) \\ i(p_{1} - 1, c_{1}, s_{1}) & Otherwise \end{matrix}\right. IT(delete(p_{1}), i(p_{2}, c_{2}, s_{2})) = \left\{\begin{matrix} delete(p_{1}) & if(p_{1} < p_{2}) \\ delete(p_{1} + 1) & Otherwise \end{matrix}\right. IT(delete(p_{1}), delete(p_{2})) = \left\{\begin{matrix} delete(p_{1}) & if(p_{1} < p_{2}) \\ delete(p_{1} - 1) & if(p_{1} > p_{2}) \\ Nop() & Otherwise \end{matrix}\right.

Besides Ellis’s algorithm, several other operation transformation algorithms have also be proposed including Ressel’s algorithmSun’s algorithm, Suleiman’s algorithm and Imine’s algorithm. Notice that some OT systems use both IT and ET functions, while some use only IT functions.

Transformation properties

Convergence properties must generally be satisfied by all control algorithms, as part of the convergence guarantee in the consistency model. The following two properties are related to achieving convergence:

CP1: Given a state S, and two operations O_{A} and O_{B}:

\begin{aligned} \text{If } & O'_{A} = IT\left ( O_{A}, O_{B} \right ) \text{ and } O'_{B} = IT\left ( O_{B}, O_{A} \right ) \\ \text{Then: } & S \circ O_{A} \circ O'_{B} = S \circ O_{B} \circ O'_{A} \end{aligned}

Here O_{A} \circ O_{B} denotes the sequence of operations containing O_{A} followed by O_{B}. CP1 is required only if the OT system allows any two operations to be executed in different orders.

CP2: Given a state S, and three operations O, O_{A} and O_{B}:

\begin{aligned} \text{If } & O'_{A} = IT\left ( O_{A}, O_{B} \right ) \text{ and } O'_{B} = IT\left ( O_{B}, O_{A} \right ) \\ \text{Then: } & IT\left (IT\left ( O, O_{A} \right ) , O'_{B}\right ) = IT\left (IT\left ( O, O_{B} \right ) , O'_{A} \right ) \end{aligned}

CP2 is required only if the OT systems allows two operations be IT-transformed in two different document states (or contexts).

Similarly, there are three inverse properties that need to be satisfied to support the “group undo” operation.

IP1: Given a state S, and sequence of operations O \circ \overline{O}:

S \circ O \circ \overline{O} = S

IP2: Given any operation O, and a pair of operations O_{x} and \overline{O_{x}}:

IT\left (IT\left ( O, O_{x} \right ), \overline{O_{x}} \right ) = IT\left ( O, I \right ) = 0

IP3: Given a state S, and two operations O_{A} and O_{B}:

\begin{aligned} \text{If } & O'_{A} = IT\left (O, O_{B} \right ), \text{ and} \\ & O'_{B} = IT\left (O_{B}, O_{A} \right ), \text{ and} \\ & \overline{O_{A}}' = IT\left (\overline{O_{A}}, O'_{B} \right ) \\ \text{Then: } & \overline{O_{A}}' = \overline{O_{A}'} \\ \text{Or: } & IT\left ( \overline{O_{A}}, O'_{B} \right ) = \overline{IT\left ( O_{A}, O_{B} \right )} \end{aligned}

Comparison of control(integration) algorithms

The OT control algorithm is the main high-level algorithm governing the collaboration functions that are available to the system. This algorithm controls the time/space complexity of the system, handles ordering of the operations, and processes the incoming operations into the modified transformed operations. We here compare several typical implementations of OT control algorithms.

dOPT

GROVE(GRoup Outline Viewing Editor) invented dOPT(distributed Operation Transformation) which consists of two components: one is the state vector timestamping scheme for ensuring the precedence property, and the other is the dOPT algorithm for ensuring the convergence property. The basic idea of the dOPT algorithm is that when an operation satisfies the precedence condition for execution, it is transformed against independent operations in the log(which saves executes operations in the order of their execution) in such a way that executions of the same set of properly transformed independent operations in different orders produce identified document states, thus ensuring the convergence property. For m operations, there is an m x m matrix of the inclusive transformation resultant functions, for each pair of operations.

Although dOPT is simple and satisfies many of the correctness properties, it fails in the case of the so-called dOPT puzzle, when an operation is concurrent with two or more other operations from different sites.

Jupiter

Jupiter, a multi-user remote collaboration virtual world developed at Xerox PARC by Nichols et al., addressed some of the issues found in dOPT. One of the major changes in Jupiter was to have a centralized coordination server that uses a change propagation algorithm to keep the clients updated and in check.

The optimistic concurrency techniques are used in the individual client-server links, where the synchronization algorithm in the individual client-server links is very similar to the dOPT algorithm.

The major changes in Jupiter, in comparison with dOPT, is that instead of a transformation matrix, Jupiter uses a function called xform, which takes a pair of client and server operation requests, and transforms them as a new pair of operations that lead the client and server operations to the same final state.

The xform technique is successful while the client and server are in the same starting state, but if the client and server diverge too much, then the operation must look back into its past history to properly calculate the converging operations. This is often done by keeping track of operation revision history, but not directly transforming saved messages, like dOPT does, which causes the incorrect scenario. Instead, Jupiter calculates past converging operations, even if it does not apply them, in order to generate the correct recent transformed operation.

Google Wave OT

Google Wave OT was a modification on top of the Jupiter algorithm. The main changes to the Jupiter control algorithm are mainly in the client/server communication protocol, as well as optimizing the transformation functions for batch updates.

Instead of sending client operations to the server whenever new operations are requested, the Google Wave OT client must wait for a server response before sending any more operations. In the meantime, the operations are composed together into a single buffer, and then sent together once the server has processed the last batch of operations and converged.

This technique serves two major purposes. First, as the server acknowledges the client before any new operations proceed, the client is capable of predicting the operation path that the server will take, and thus always send operations to the server that are always on the path. This simplifies the server implementation significantly, as the server only needs to keep track of its own history. Instead of taking a quadratic O(h^{2}) space for history on every possible path, it only takes up a linear O(h^{2}) space on the server for history, where h is the number of previous operations (and possible causal operations)required to calculate the correct OT path.

Finally, the transformation algorithm handles operations in streams, rather than as single discrete operations. These operation sequence streams are guaranteed to be in order, as well as linear, so the streaming transformations can be performed in linear time.

ABT

A more modern protocol for OT is based on the admissibility-based transformation (ABT) framework. TIPS, an implementation of ABT, builds on top of existing HTTP protocols, and also uses a centralized server.

Clients in TIPS are able to join or leave a session at any time, and the clients also independently decide to sync with the server. As with Google Wave OT, operations are buffered and synchronized with the server only at some particular server-determined interval. This is useful, as it can be easily adapted to JavaScript-based long polling methods, which are the traditional method of maintaining an open connection over the web.

Once the server has asked all of the clients to synchronize, each client that has responded will have their operations applied to an n-way merge algorithm, which will create a sequential set of operations as output. The clients receive this sequential set of operations in another interval, which is adapted by them through another algorithm, ITSQ, which performs the inclusion transformation step that updates the document correctly.

The benefit of TIPS over existing protocols is that it is more capable of allowing clients to dynamically enter and exit a particular OT session, as well as being more robust when dealing with client or network failures.

Leave a Reply

Your email address will not be published. Required fields are marked *

*