Anomalies

Reactive, local-first applications can deliver a fast and responsive user experience. However, it’s important to make sure that interactions feel natural and enjoyable by avoiding anomalies. Espeically when you have user actions coming from different sources.

The following table summarises the anomalies that can occur with different database technologies and illustrates how ElectricSQL’s model of Rich-CRDTs with transactional causal+ consistency (TCC+):

  1. avoids aborts and flickering that are common with strongly consistency systems; and
  2. prevents anomalies that are possible under pure CRDT and OT systems

Summary table

Anomaly OT CRDTs TCC+ & Rich-CRDTs Serializability
Causality violation Single object Single object No No
User intention violation Some Some Some No
Cumulative effects Yes Yes No No
Flickering Some Some No Yes (aborts)
Broken referential integrity Yes Yes No No
Ownership clash Yes Yes No No
Long-lived actions interference Yes Yes No No
Duplication of elements Yes Yes No No
Tree cycles Yes Yes No No
Convergence inconsistencies Yes Yes Some Not applicable
Ghost movements Yes Yes Yes Yes
Proximity bias Not applicable Not applicable Randomised Yes
Offline interactions Some Offline-first Online Online

User experience anomalies

Causality violation

Problem:
Operations become visible before their dependencies are also made visible, influencing the user’s action. I.e.: the user acts on information that has already been superceeded.

Example:
Elements in a creativity application are organised by color; user changes the color of an element and arranges accordingly. Operations are delivered out of order. Another user sees that the element was positioned in the wrong group (according to the color) and moves the element to a different group.

Example:
User changes the access rights for some document, removing the participants that have lower clearance level. Subsequently modified the content of that document with sensitive data. If these two changes are delivered out of order, a user with lower clearance level might be able to see the sensitive information.

Mitigation:
Latency of operations is so low that the user observing the operations out of order doesn’t have enough time to act before the operation out of order is delivered.

Solution:
Transactional Causal Consistency+, the strongest consistency model under High Availability, which prevents reordering of causal-related operations.

Illustration of a causality violation anomaly.

Violation of user intent

Problem:
The conflict-resolution strategy applied to solve conflicting updates leaves the object in a state that does not reflect the intention of individual changes.

Example:
A user inserts a word at the beginning or the end of a sentence, while another user changes the style of the same sentence. Merging the changes results in an inconsistent style for the sentence.

Mitigation:
Capturing user intention is a domain-specific problem. Many times arbitration can only go so far as preserving the intention of one of the users (if they are conflicting)

Solution:
develop data-types that have more complex internal representation of the data they are used to represent. ElectricSQL provides off the shelf rich data-types tailored for specific use cases, like collaborative text edition and other productivity tools.

Cumulative effects

Problem:
Cumulating the effects of individual operation might result in amplification of the intended changes. This is a particular case of failing to preserve user intention.

Example:
two users modify some object’s property concurrently, for example enlarging a shape, and the semantics of the merge is to sum the deltas. The shape size will end larger than each user intended. When trying to correct the issue, the reverse situation might happen again.

Mitigation:
Cumulative effects are acceptable when latency is low enough that users perceive that there are concurrent modifications to the same object.

Solution:
Prevent multiple users from modifying the same property of an object concurrently.

Flickering

Problem:
Objects a user is working on are modified by other users concurrently. Changes are applied locally immediately, but some of them might conflict with changes made by other users. The issues arises when one of the client applies unconfirmed changes that will be later undone by conflict arbitration.

Example:
Two users set the color of the same object concurrently. They apply the change immediately in the local state and send the update to the server. A server receives the first update and propagates it to all all users. The clients receiving the update from the server apply it, possibly overwriting the color set locally. When the second operation arrives at the server, conflict resolution might chose this operation as the winner and propagate it to clients. This will result in flickering effect for the users that already applied the first update.

Illustration of a flickering anomaly.

Example:
Consider a collaborative kanban board application. Two users move the same card to different columns. While the system does not arbitrate the winning move, users might temporarily see the card in the position picked by the other user, before the system decides that the winning position is the other one, and move the card again.

Mitigation:
Delay applying conflicting operations until the server arbitrates the result.

Solution:
ElectricSQL doesn’t use a central authority to arbitrate conflict-resolution, instead every server is able to merge incoming operation in local state deterministically. CRDTs provide decentralised conflict resolution rules that ensure that the result of merging conflicting updates is independent of the order in which conflicting operations are applied (given that they respect causality in the case of operation-based CRDTs).

Referential Integrity

Problem:
Concurrent operation might inadvertently destroy the target in one side of a relationship and the inconsistency is not detected until the element is dereferenced.

Example:
In a Kanban board, a user moves a card into a column and concurrently another user removes that column. After merging the changes, the card that was moved has no assigned column and can possibly be lost.

Illustration of a referential integrity anomaly.

Example:
In a role playing game (“RPG”), a user equips an item and concurrently sells it. The user may receive the proceeds of the sale while the item remains equipped but no longer exists in the inventory. It is possible to do an integrity check to unequip the item, but the developer has to provide the code for that.

Illustration of a referential integrity anomaly.

Mitigation:
Use a stronger consistency model that ensures sequential operation execution.

Solution:
ElectricSQL prevents referential integrity violation through reservations and compensations.

Ownership clash

Problem:
Concurrent operations set the value of a field. Arbitration picks a single winning value, but during the fork different replicas allow operations based on the different values set.

Example:
two concurrent transactions set different owners for an entity. Divergence is only reconciled asynchronously, meanwhile the (two) owner(s) of the entity executed operations that should only be allowed by a single owner.

Mitigation:
coordinate concurrent operation execution for each record/attribute.

Solution:
Use Rich-CRDT to enforce uniqueness constraint efficiently.

Cursor ghost movements

Problem:
The edition cursor position might change when incoming updates are applied and no longer match the input device position on screen.

Example:
A user select an entry to edit in a todo list by clicking on it. Meanwhile, another user adds a new entry to the list. When the new entry is added to the todo list of the first user, the element he is editing will change position, while the mouse pointer will remain in the same position he clicked.

Example:
A user wants to edit some paragraph in a text block and positions the screen on the beginning of that paragraph. Another user is adding text before that paragraph making the paragraph the first user focused to move down in the screen.

Mitigation:
Track the cursor position using hints from the context and reposition the screen when new changes affect the focus of the user

Solution:
Tracking cursor movements is a difficult problem because it crosses the boundaries of two layers, visualisation and data management. Mitigation is the only solution we are aware of.

Long-lived actions

Problem:
Operations that require multiple user interactions might see their changes being altered by concurrent operations that were execute during the duration of the whole operation.

Example:
User executes a copy-paste operation but takes a few seconds between copying the elements and pasting them in another location. Some user deletes one of the copied element concurrently. The first user finally pastes the copied elements, but the copied elements no longer include the deleted element. Conversely, if the element is re-created it might also be confusing for the user that deleted it.

Mitigation:
Handle copy-and-paste in a transaction and abort one of the conflicting operations (i.e. the entire copy-and-paste or the conflicting update to the copy set)

Solution:
ElectricSQL allows the developer to select the conflict-resolution policy that suits the situation best, either prioritising the recreation of the concurrently-removed elements (add-wins), or ensuring that they are deleted (remove-wins).

Duplication of elements

Problem:
Concurrent operations that modify the relations of some element result in the duplication of that element.

Example:
Two cut-and-paste operations of the same element result in two copies of that element, violating uniqueness.

Example:
Concurrent reparenting an element in a graph results in duplication of that element.

Mitigation:
Use locking to prevent concurrent operations on a single element.

Solution:
Rich CRDTs provide uniqueness invariant through the use of reservations, allowing operations that are safe to execute concurrently over a single object.

No cycles

Problem:
moving elements in a tree might result in cycles.

Example:
A node A is moved under B, while B is moved under A, creating a cycle. The same problem can happen with a larger tree.

Example:
In a structured collaboration application, one container element might be moved inside another container and vice versa, concurrently.

Mitigation:
Lock the root of the tree to prevent concurrent moves.

Solution:
Use reservations to allow moves within subtrees without blocking operations on other parts of the tree. Requires a cycle-detection mechanism to issue reservations safely.

Offline interaction

Problem:
working on a document offline for long periods might result in divergence difficult to reconcile.

Example:
Users use a spreadsheet to collect data points in one sheet and calculate summary info in another sheet. The template of the datapoints might be modified, affecting the columns that should be used to make the calculations. Meanwhile, the user working offline on the summary sheet might add new calculations referring to the old column positions.

Solution:
Minimize divergence by synching replicas frequently; give feedback about the status of the connection.

Proximity bias

Problem:
The arbitration of concurrent operations over a competing resource favours clients that are nearer a server.

Example:
Two players in a competitive game try to collect the same item in the scenario. The user that is closer to the server systematically succeeds collecting the item.

Mitigation:
don’t use timestamp-based conflict-resolution strategies.

Solution:
Use Rich-CRDTs that introduce pseudo-randomness in the presence of conflicts. E.g. a reservations system that manages reservations across multiple regions; a CRDT with pseudo-random conflict arbitration.

Conflict-resolution consistency

Problem:
Correctness of application depends on having effects of operations being applied atomically in a transaction. Conflict arbitration might break the implicit invariants enforced by a transaction.

Example:
consider a numerical equality, X = Y. A transaction sets the two values, for instance TxA: X = 1, Y = 1 and TxB: X = 2, Y = 2. The conflict resolution strategy for X and Y are set differently, suppose that one picks the maximum, while the other picks the minimum value. Being unable to ensure coherence between the conflict resolutions strategies applied to X and Y will break the invariant X = Y.

Mitigation:
can’t ensure this invariant without atomic operations.

Solution:
Use transactions and carefully consider conflict resolution rules; declare invariant and let the system figure out the appropriate conflict-resolution rules to be applied for each value.

Database anomalies

Phantom reads

Problem:
Non-serialisable database anomaly where one transaction computes a collection of records, while another concurrently adds/removes a record to that collection and commits. A re-read of the computed collection in the same transaction will result in a different result.

Example:
A user counts all comments per user on a post and another user adds a new comment. If the first user tries to retrieve the comments for that user, the result set won’t match the size returned in the first read.

Solution:
Range locks, or similarly with reservations.

Stickiness

Problem:
Reconnecting to a different server does not reflect last changes executed in the session.

Example:
A user makes some comments on a video-editing platform; The server where he is connected fails and the connection is handed over to a different server. After refreshing the page, the user doesn’t see his comments.

Solution:
Ensure selected server for failover includes latest client writes, or cache unstable operations on the client.

Write-skew (short fork)

Problem:
Two transactions don’t have write-write conflict but are not serialisable.

Example:
T1 reads something modified by T2 and T2 reads something modified by T1, but the transactions don't conflict

Solution:
Issue is possible under SI and TCC+.

Long fork

Problem:
Two concurrent transactions writing to different objects commit, subsequent transactions might see the effects of one of the transaction, but not the other.

Example:
See problem definition.