The first time someone explained vector clocks to me, I came away thinking they were a clever trick for ordering writes in a Dynamo-style database. That is true, but it sells them short. Vector clocks are interesting because they force you to be precise about a question you usually avoid: which of these two events knew about which?

Happens-before, restated

Given two events a and b, exactly one of these is true: a → b, b → a, or they are concurrent. Scalar Lamport timestamps collapse cases 1 and 2 onto a total order but lose case 3 entirely. If a has timestamp 4 and b has timestamp 7, you cannot tell whether b actually saw a or whether they are concurrent events that happened to land at those numbers.

Vector clocks preserve the partial order exactly. If neither VC(a) < VC(b) nor VC(b) < VC(a), the events are concurrent. That third case is the whole point.

Two things that took me too long to internalize

  1. Vector clocks tell you that two events are concurrent. They do not tell you what to do about it. The merge policy is a separate decision.
  2. Comparison cost is O(N) in replicas. For systems with churn, this is the actual reason people move to dotted version vectors, not theoretical elegance.

I keep coming back to vector clocks because they are a small, sharp tool for asking a question most systems pretend does not exist.