Health metrics are badly tarnished by a lack of proper context. Unsurprisingly, you can't reliably take a concept as broad as health and reduce it to a number. We see the same arguments over and over with body fat percentage, VO2 max estimates, BMI, lactate threshold, resting heart rate, HRV, and more. These are all useful metrics, but each of them needs to be considered in its proper context.
This article gave an LLM a bunch of health metrics and then asked it to reduce them to a single score, didn't tell us any of the actual metric values, and then compared that to a doctor's opinion. Why anyone would expect these to align is beyond my understanding.
The most obvious thing that jumps out to me is that I've noticed doctors generally, for better or worse, consider "health" much differently than the fitness community does. It's different toolsets and different goals. If this person's VO2 max estimate was under 30, that's objectively a poor VO2 max by most standards, and an LLM trained on the internet's entire repository of fitness discussion is likely going to give this person a bad score in terms of cardio fitness. But a doctor who sees a person come in who isn't complaining about anything in particular, moves around fine, doesn't have risk factors like age or family history, and has good metrics on a blood test is probably going to say they're in fine cardio health regardless of what their wearable says.
I'd go so far as to say this is probably the case for most people. Your average person is in really poor fitness-shape but just fine health-shape.
> But a doctor who sees a person come in who isn't complaining about anything in particular, moves around fine, doesn't have risk factors like age or family history, and has good metrics on a blood test is probably going to say they're in fine cardio health regardless of what their wearable says.
This is true of many metrics and even lab results. Good doctors will counsel you and tell you that the lab results are just one metric and one input. The body acclimates to its current conditions over time, and quite often achieves homeostasis.
My grandma lived for years with an SpO2 in the 90-95% range as measured by pulse oximetry, but this was just one metric measured with one method. It didn't mean her blood oxygen was actually dropping repeatedly; it just meant that her body wasn't particularly suited to pulse oximetry.
> This article gave an LLM a bunch of health metrics and then asked it to reduce them to a single score, didn't tell us any of the actual metric values, and then compared that to a doctor's opinion. Why anyone would expect these to align is beyond my understanding.
This gets to one of LLMs' core weaknesses: they blindly respond to your requests and rarely push back against the premise behind them.
Measuring metrics is easy; it's the algorithm on the backend that matters.
There's a reason why Oura rings are expensive, and it's not the hardware - you can get similar stuff for 50€ on AliExpress.
But none of them predicted my Covid infection days in advance. Oura did.
A device like the Apple Watch that's on you 24/7 is good with TRENDS, not absolute measurements. It can tell you whether your heart rate, blood oxygen or something else is statistically higher or lower than your own usual baseline. For absolute measurements it's OK, but not exact.
And from that we can make educated guesses on whether a visit to a doctor is necessary.
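To make the trends-vs-absolute point concrete, here's a rough sketch of the idea. This is NOT how Apple or Oura actually do it (their algorithms aren't public); the function name, the made-up resting heart rate readings, and the 2-standard-deviation cutoff are all my own assumptions for illustration.

```python
from statistics import mean, stdev


def deviation_from_baseline(history, today):
    """Return how many standard deviations `today` sits from the
    wearer's own recent baseline (a plain z-score)."""
    spread = stdev(history)
    if spread == 0:
        raise ValueError("history has no variation, cannot compute a z-score")
    return (today - mean(history)) / spread


# Hypothetical resting heart rate readings (bpm) over two weeks.
resting_hr = [58, 57, 59, 58, 60, 57, 58, 59, 58, 57, 59, 58, 58, 60]

z = deviation_from_baseline(resting_hr, today=66)
flagged = abs(z) > 2.0  # arbitrary cutoff, chosen only for this example

# A sensor that consistently reads 3 bpm too high gives the same z-score,
# because the constant offset cancels out against the baseline.
biased_history = [reading + 3 for reading in resting_hr]
z_biased = deviation_from_baseline(biased_history, today=66 + 3)

print(f"z = {z:.1f}, flagged = {flagged}")
```

The last part is the whole point: a constant measurement error wrecks the absolute number but leaves the deviation from your own baseline untouched, which is why a cheap-ish sensor can still be useful for spotting that something changed.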
> I'd go so far as to say this is probably the case for most people. Your average person is in really poor fitness-shape but just fine health-shape.
Modern medicine has failed to move into the era of subtlety and small problems and many people suffer as a result. Fitness nerds and general non-scientists fill the gap poorly so we get a ton of guessing and anecdotal evidence and likely a whole lot of bad advice.
Doctors won't say there's a problem until you're SICK, usually pretty late in the process, when there's not a lot of room left to make improvements.
At the same time, doctors won't do anything if you're 5% off optimal, but they'll happily prescribe a medicine with 10 side effects to improve one symptom that's 50% off optimal. And unless you're dying or have something really straightforward wrong with you, doctors don't do much at all besides giving you a sedative and/or a stimulant.
Doctors don't know what to do with small problems because they're barely studied, and the people who DO try to do something don't do it scientifically.
The problem is that the product itself invites the wrong expectation.
Many of those metrics are population or sampling measures and are confounded by many factors at an individual level. The most notorious is BMI: it is practically a category error to infer someone's health or risk from their individual BMI, and yet doing so remains widespread amongst people who are supposed to know better.
Instrumentation and testing become primarily useful at an individual level to explain or investigate someone's disease or disorder, or to screen for major risk factors, and the hazards and consequences of unnecessary testing outweigh the benefits in all but a few cases. In those few cases, your GP and/or government will (or should) routinely screen those at actual risk, which is why I pooped in a jar last week and mailed it.
An athlete chasing an ever-better VO2max or FTP hasn't necessarily got it wrong, however. We can say something like, "Bjorn Daehlie’s results are explained by extraordinary VO2max", with an implication that you should go get results some other way because you're not a five-sigma outlier. But at the pointy end of elite sport, there's a clear correlation between marginal improvement in certain measures and competitive outcomes, and if you don't think a difference of 0.01 sec between first and third matters then you've never stood on a podium. Or worse, next to one. When mistakes are made and performance deteriorates, it's often due to chasing the wrong metric(s) for the athlete at hand, which is generally a failure of coaching.