Skip to main content
Sanity Check

August 30, 2023 · 1 min read · 269 words

SC 016 — Data Repair Work

Identifies eight distinct ways data projects break and argues that understanding failure modes enables better maintenance strategies. Key solutions: deprecation processes for metric definition chan…

Data repair on cloud infra

Summary

Identifies eight distinct ways data projects break and argues that understanding failure modes enables better maintenance strategies. Key solutions: deprecation processes for metric definition changes (old and new coexist temporarily), alerts without automatic syncing for schema changes (avoids cost surprises), and surrogate keys with uniqueness/null tests for granularity issues.

Core thesis: stability builds trust, and trust creates opportunities for value creation.

Key Arguments

  • Data maintenance is undervalued but foundational — you can’t create value on an unstable platform
  • Metric definition changes need deprecation windows, not hard cuts
  • Schema change alerts should inform, not auto-remediate (cost control)
  • Surrogate keys with proactive testing prevent granularity issues before they cascade
  • Stability → trust → opportunity is the progression

Writing Style Notes

Published during Hurricane Idalia evacuation — the personal stakes contrast with the methodical technical content. Practical and specific. The “repair work” framing elevates maintenance from chore to craft.

Connections