Restore silently drops index entries when backup read_ts regresses (Zero state loss) #9706

@matthewmcneely

Description

Summary

If Zero's state is wiped or rebuilt while an existing backup chain is still active, the cluster's timestamp counter regresses. Subsequent incremental backups are accepted silently and written with read_ts values lower than those of earlier entries in the same chain. On restore, the reduce phase keeps only the highest-version KV per key, so newer postings (committed at the regressed, lower timestamps) are silently dropped, while their data-posting siblings sometimes survive. As a result, indexes and type metadata lose entries even though the underlying data is still present.
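The mechanism can be illustrated with a small, self-contained simulation. This is only a sketch of the "keep the highest version per key" rule described above, not Dgraph's restore code: the `KV` type, the key strings, and the timestamp values are all made up for illustration.

```python
from typing import NamedTuple


class KV(NamedTuple):
    key: str          # hypothetical key name, not Dgraph's real key encoding
    version: int      # commit timestamp the entry was written at
    value: frozenset  # posting list contents (uids or a scalar value)


def reduce_highest_version(entries):
    """Keep only the entry with the highest version for each key."""
    kept = {}
    for kv in entries:
        current = kept.get(kv.key)
        if current is None or kv.version > current.version:
            kept[kv.key] = kv
    return kept


# Full backup taken before Zero's state was lost: timestamps are high.
full_backup = [
    KV("index:type:Person", 5000, frozenset({"0x1", "0x2"})),
]

# Incremental backup taken after the timestamp counter regressed.
# The index key already exists in the chain; the data key is new.
incremental_backup = [
    KV("index:type:Person", 120, frozenset({"0x1", "0x2", "0x3"})),
    KV("data:0x3:name", 118, frozenset({"Carol"})),
]

restored = reduce_highest_version(full_backup + incremental_backup)

# The older index entry wins because its version is numerically higher,
# so uid 0x3 is missing from the index while its data entry survives.
print(sorted(restored["index:type:Person"].value))
print("data:0x3:name" in restored)
```

In this toy chain the index key is present in both backups, so the stale copy at version 5000 shadows the newer one at version 120. The data key for the new node appears only once and therefore survives, which matches the symptom of data being present while the index no longer points at it.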

Impact

  • Silent index corruption on restore. Common queries (type(X), eq(field, value) on indexed fields) return incomplete results.
  • The corruption is invisible until someone uses the affected query path: has() and direct uid lookups still work.
  • Triggered by routine operational events: Zero pod recreation without persistent storage, namespace teardown + restore, manual zw/ cleanup, accidental helm uninstall followed by reinstall.
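Because the corruption stays invisible until an affected query runs, one way to catch the condition earlier is to check that read_ts never decreases along a backup chain. The sketch below assumes the chain has already been loaded into a list of dictionaries, oldest first, each with a `read_ts` key. That input shape and the `find_read_ts_regressions` helper are hypothetical and would need adapting to the real manifest layout.

```python
def find_read_ts_regressions(chain):
    """Return (position, highest_seen, read_ts) for every chain entry
    whose read_ts is lower than the highest read_ts seen before it."""
    regressions = []
    highest_seen = None
    for position, entry in enumerate(chain):
        read_ts = entry["read_ts"]
        if highest_seen is not None and read_ts < highest_seen:
            regressions.append((position, highest_seen, read_ts))
        else:
            highest_seen = read_ts
    return regressions


# Illustrative chain: Zero's state is lost after the second backup.
chain = [
    {"type": "full", "read_ts": 5000},
    {"type": "incremental", "read_ts": 5200},
    {"type": "incremental", "read_ts": 120},
    {"type": "incremental", "read_ts": 340},
]

for position, highest_seen, read_ts in find_read_ts_regressions(chain):
    print(f"entry {position}: read_ts {read_ts} is below {highest_seen}")
```

Every entry written after the regression is flagged until the counter climbs past the earlier maximum again, since any of those entries can lose to an older KV during the reduce phase.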
