tags : Representing Time and Date, Distributed Systems, crdt, Consensus Protocols

Overview

  • Time & State: https://queue.acm.org/detail.cfm?id=2745385 (There is No Now)
  • Logical clocks, vector clocks, hybrid logical clocks, snapshots
    • They form the basis of
      • crdt
      • version vectors in NoSQL databases
      • snapshot reads and commits in distributed SQL databases build upon.

FAQ

About state and Time

To read

  • “relativity means there is no such thing as simultaneity” used as an argument that synchronized clocks cannot exist.
    • This is a misunderstanding
      • the equations of special and general relativity provide exact equations for time transformations, and it is possible to define any number of sensible, globally-synchronized time clocks.
      • Consistency constraints that refer to these clocks will depend on the choice of clock
      • The good news is that
        • For all intents and purposes, clocks on earth are so close to each other’s velocities and accelerations that errors are much smaller than side-channel latencies
        • Many of the algorithms for ensuring real-time bounds in asynchronous networks use causal messages to enforce a real-time order, and the resulting order is therefore invariant across all reference frames.
  • GPS clocks are changing databases (again) - YouTube

Monotonic?

Atomic clocks?

Lamport Clocks (Total order)

Vector Clocks (Partial order)

Timestamps and Databases

TODO Learning Syllabus

  • Time: Abstract
  • Clocks: Implementation

Part 1: Time - Abstract Concept in Distributed Systems

Fundamental Time Concepts

  • Time as a partial ordering of events in distributed systems
  • Formal definition of concurrent events: events that are neither “happened before” nor “happened after” each other
  • Synchrony models and their precise definitions:
    • Synchronous: bounded message delay and bounded process execution speed
    • Asynchronous: unbounded message delay and/or process execution speed
    • Partially synchronous: eventually bounded message delay and process execution speed
  • The impossibility of perfect global time in distributed systems (theoretical proof)

Causality and Ordering

  • Lamport’s happens-before relation (→): formal definition and properties
    • If a and b are events in the same process, and a occurs before b, then a → b
    • If a is the sending of a message and b is the receipt of that message, then a → b
    • If a → b and b → c, then a → c (transitivity)
  • Concurrent events: a ∥ b ⟺ ¬(a → b) ∧ ¬(b → a)
  • Causal history: C(e) = {e’ | e’ → e} ∪ {e}
  • Formal proof that causality establishes a partial, not total, order

Consistency Models Based on Time Ordering

  • Linearizability: formal definition as a history H that can be extended to a history H’ such that:
    • Complete(H’) is equivalent to some legal sequential execution S
    • If op1 completes before op2 begins in H, then op1 appears before op2 in S
  • Sequential consistency: formal definition and distinction from linearizability
  • Causal consistency: formal definition using happens-before relation
  • PRAM consistency (Pipelined RAM): formal definition
  • Eventual consistency: formal definition including convergence, conflict resolution

Time in Coordination Problems

  • Time’s role in the Two Generals Problem (formal impossibility proof)
  • Time assumptions in the Byzantine Generals Problem
  • Time’s role in distributed deadlock detection
  • Formal treatment of timeouts as imperfect failure detectors
  • Time and leader election (impossibility in purely asynchronous systems)

The FLP Impossibility Result

  • Precise statement of the FLP result: In an asynchronous system with even one faulty process, there is no deterministic algorithm that solves consensus
  • Detailed analysis of time’s role in this impossibility
  • Implications for real-world distributed system design
  • Connection to other impossibility results (CAP theorem, Two Generals)

Part 2: Clocks - Mechanisms for Tracking Time

Physical Clock Systems

  • Oscillator-based physical clocks: crystal oscillators, atomic clocks
  • Precise definitions of clock terminology:
    • Clock drift: deviation of a clock from perfect time
    • Clock skew: difference between two clocks at a given moment
    • Time-scale offset: constant difference between two clocks
    • Frequency offset: difference in the rate of clock advancement
  • Hardware clocks:
    • Timer interrupts and their limitations
    • High-precision timers (HPET)
    • Time Stamp Counter (TSC) in modern CPUs
  • Temperature and aging effects on physical clocks (quantitative analysis)
  • Formal specification of clock synchronization problem:
    • External synchronization: alignment with an external standard
    • Internal synchronization: alignment among system clocks

Clock Synchronization Protocols

  • Network Time Protocol (NTP):
    • Stratum hierarchy and algorithm details
    • Error estimation techniques
    • Security vulnerabilities
  • Precision Time Protocol (PTP/IEEE 1588):
    • Hardware vs. software timestamping
    • Best master clock algorithm
    • Boundary and transparent clocks
  • Marzullo’s algorithm for combining multiple time sources
  • Optimal clock synchronization algorithms under byzantine conditions
  • Cristian’s algorithm: mathematical formulation and error bounds
  • Berkeley algorithm: detailed protocol steps and coordinator election
  • Reference broadcast synchronization (RBS) for wireless networks

Logical Clocks

  • Lamport clocks:
    • Formal definition: C(a) < C(b) if a → b
    • Clock update rules:
      • Before executing an event, process Pi increments Ci: Ci = Ci + 1
      • When sending a message m, process Pi sets m’s timestamp ts(m) = Ci
      • Upon receiving message m, process Pj sets Cj = max(Cj, ts(m)) + 1
    • Limitations: possibility of C(a) < C(b) when a ∥ b
  • Lamport timestamps as an implementation of monotonic time
  • Mathematical proof that Lamport clocks capture causal precedence but not concurrency

Vector Clocks

  • Vector clock formal definition: VC(e) = [t₁, t₂, …, tₙ] where tᵢ represents local logical time at process i
  • Vector clock rules:
    • Initially, VC[j] = 0 for all j
    • Before each event at process i: VC[i] = VC[i] + 1
    • When sending a message m, process i includes current VC with m
    • When process j receives message m from process i with vector timestamp VC_m:
      • VC[k] = max(VC[k], VC_m[k]) for all k ≠ j
      • VC[j] = VC[j] + 1
  • Comparison of vector timestamps: VC(a) < VC(b) iff ∀k: VC(a)[k] ≤ VC(b)[k] and ∃k: VC(a)[k] < VC(b)[k]
  • Mathematical proof that vector clocks accurately capture both happens-before and concurrent relationships
  • Space and message overhead: O(n) where n is the number of processes
  • Optimizations: compression techniques, relevant subset tracking

Advanced Logical Clock Systems

  • Version vectors:
    • Distinction from vector clocks: only track object dependencies, not process events
    • Update rules for read/write operations
    • Conflict detection algorithm
  • Dotted version vectors: extension for tracking causal history in dynamo-style systems
  • Interval tree clocks: O(log n) space complexity
  • Matrix clocks: tracking transitive dependencies
  • Plausible clocks: probabilistic approaches to causality tracking
  • Hybrid logical clocks (HLC): combining physical and logical time
    • Formal definition: HLC(e) = [l.e, c.e] where l.e is the logical component and c.e is the physical component
    • Update rules integrating physical clock readings
    • Error bounds and formal guarantees

Modern Clock Systems in Industry

  • Google’s TrueTime API:
    • Interval-based time representation: [earliest, latest]
    • GPS and atomic clock integration
    • Error bound guarantees and wait-out-uncertainty approach
  • Amazon Time Sync Service:
    • Architecture and failover mechanisms
    • Integration with AWS services
  • CockroachDB’s hybrid clock implementation:
    • Integration with transaction ordering
    • Bounded uncertainty handling
  • Spanner’s implementation of external consistency using TrueTime
  • Facebook’s time synchronization architecture for datacenters

Part 3: Applications and Advanced Topics

Distributed Snapshots

  • Chandy-Lamport snapshot algorithm:
    • Formal definition of a consistent global state
    • Marker sending rules
    • State and message recording rules
    • Proof of correctness
  • Mattern’s algorithm using vector clocks
  • Lai-Yang algorithm for non-FIFO channels
  • Alagar-Venkatesan algorithm using logical time
  • Snapshot for termination detection

Time in Consensus Protocols

  • Paxos and time:
    • Unbounded delays vs. liveness
    • Round numbers as logical time
    • Multi-Paxos and leader leases
  • Raft:
    • Leader election timeouts
    • Term numbers as logical clock values
    • Heartbeats and time assumptions
  • Viewstamped Replication:
    • View changes and logical time
    • Recovery protocol time assumptions
  • Byzantine Fault Tolerance protocols:
    • PBFT view changes and timeout mechanisms
    • Honey Badger BFT’s asynchronous approach

Implementing Causal Broadcast

  • Schiper-Eggli-Sandoz (SES) causal broadcast protocol using vector clocks
  • Birman-Schiper-Stephenson protocol
  • Optimized causal broadcast using dependency tracking
  • Implementation challenges and optimizations
  • Formal verification of causal broadcast protocols

Time in Distributed Databases

  • Timestamp-based concurrency control:
    • Thomas write rule: discard writes with older timestamps
    • Multiversion concurrency control (MVCC) using timestamps
  • Transaction ordering using hybrid clocks
  • Causal consistency implementation using vector clocks
  • Last-writer-wins conflict resolution with vector clocks
  • Read-your-writes consistency implementation

Time and Failure Detection

  • Unreliable failure detectors: completeness and accuracy properties
  • ◊S (Eventually Strong) failure detector
  • Heartbeat-based failure detection with adaptive timeouts
  • Phi Accrual failure detector: probabilistic approach
  • Gossip-based failure detection
  • The relationship between failure detection and consensus

Advanced Research Topics

  • Quantum clock synchronization algorithms
  • Self-stabilizing clock synchronization
  • Non-blocking atomic commit protocols based on logical time
  • Time in large-scale geo-distributed systems
  • Machine learning approaches to adaptive timeout calculation
  • Time in Byzantine environments with adversarial nodes

Practical Projects and Implementation Exercises

  1. Implement and visualize vector clocks in a distributed simulator
  2. Measure clock drift and skew in a real multi-node system
  3. Build a causally consistent distributed key-value store
  4. Implement the Chandy-Lamport distributed snapshot algorithm
  5. Create a logical time-based debugging tool for distributed systems
  6. Develop a custom clock synchronization protocol for a specific network topology
  7. Implement a Byzantine-tolerant clock synchronization system
  8. Build a consensus protocol with configurable time assumptions