Hacker News

cpard, today at 5:20 AM

Row-level diffs and summary stats are both diffs over values: they can tell you that something changed, but not whether the *meaning* has changed. What I'm working on is surfacing how the meaning changes.

The questions I'd like the diff to answer are more like: will the grain go from one-row-per-user to one-row-per-user-per-day? Will a key stop being unique? Will a join start fanning out and quietly double a measure? Will something additive become non-additive?
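To make the fan-out case concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` and `accounts` tables and their rows are hypothetical, made up purely for illustration. Once `account_id` stops being unique in `accounts`, the join silently inflates the sum without any error.

```python
import sqlite3

# Hypothetical tables: account 11 appears twice in accounts, so
# account_id is no longer a unique key on that side of the join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, account_id INTEGER, amount REAL);
CREATE TABLE accounts (account_id INTEGER, region TEXT);
INSERT INTO orders VALUES (1, 10, 100.0), (2, 11, 50.0);
INSERT INTO accounts VALUES (10, 'EU'), (11, 'US'), (11, 'APAC');
""")

# The measure computed on the base table.
true_total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# The same measure after the join: order 2 matches two account rows,
# so its amount is counted twice.
joined_total = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN accounts a ON a.account_id = o.account_id
""").fetchone()[0]

print(true_total, joined_total)  # 150.0 200.0
```

A value diff on the joined output would show that the total moved, but not that the cause is a key losing uniqueness upstream.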

This is a diff over structure, but that structure is latent in the transformation that produces the data. To make things harder, with a declarative language (e.g. SQL) the code doesn't even describe how things get done, only what the output should be.

What I've ended up doing is recovering the structure from the code by analyzing it, and then using *cheap* profiling rather than a full row compare.
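As a rough idea of what a cheap profile can look like (a sketch under my own assumptions, not the actual tool): comparing a row count with a distinct-key count is enough to tell whether a candidate grain holds, with no row-by-row compare. The helper `key_profile` and the `users_before` / `users_after` tables are hypothetical names.

```python
import sqlite3

def key_profile(conn, table, key_cols):
    """Compare total rows with distinct key combinations for a candidate grain.

    Identifiers are interpolated directly into the SQL, so this is only
    suitable for trusted table and column names.
    """
    cols = ", ".join(key_cols)
    rows = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    distinct = conn.execute(
        f"SELECT COUNT(*) FROM (SELECT DISTINCT {cols} FROM {table})"
    ).fetchone()[0]
    return {"rows": rows, "distinct_keys": distinct, "unique": rows == distinct}

# Hypothetical before/after outputs of the same model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users_before (user_id INTEGER, day TEXT, events INTEGER);
CREATE TABLE users_after (user_id INTEGER, day TEXT, events INTEGER);
INSERT INTO users_before VALUES (1, '2024-01-01', 5), (2, '2024-01-01', 3);
INSERT INTO users_after VALUES
    (1, '2024-01-01', 2), (1, '2024-01-02', 3), (2, '2024-01-01', 3);
""")

before = key_profile(conn, "users_before", ["user_id"])
after = key_profile(conn, "users_after", ["user_id"])
after_daily = key_profile(conn, "users_after", ["user_id", "day"])

print(before["unique"], after["unique"], after_daily["unique"])
```

Here `user_id` is a valid grain before the change but not after it, while `(user_id, day)` is, which is the one-row-per-user to one-row-per-user-per-day shift. Each check costs two aggregate queries per candidate key.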

As an example, the output of my equivalent of an impact sub-command would be something like: "this change makes account_id non-unique three models downstream".