Hacker News

elzbardico · today at 1:42 AM · 3 replies

LLMs are not a mythical universal machine learning model that you can feed any input and have it magically do the same thing a specialized ML model could do.

You can't feed an LLM years of time-series meteorological data and expect it to work as a specialized weather model, and you can't feed it years of medical time series and expect it to behave like a model specifically trained and validated on that kind of data.

An LLM generates a stream of tokens. If you feed it a giant set of CSVs and it wasn't RL'd to do something useful with them, it will just try to make whatever sense of them it can and generate output that most probably has no strong numerical relationship to your data. It will simulate an analysis; it won't perform one.

You may have a giant context window, but attention is sparse: the attention mechanism doesn't see your whole data at once. It can do some simple comparisons, like figuring out that if I say my current blood pressure is 210/180 I should call an ER immediately. But once I send it a time series of my twice-daily blood-pressure measurements for the last 10 years, it can't make any real sense of it.

Indeed, it would have been better for the author to ask the LLM to generate a Python notebook to do some data analysis, then run the notebook and share the result with the doctor.
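A minimal sketch of the kind of notebook code meant here: instead of asking the model to "read" the raw CSV, have it emit deterministic analysis code. The column names and synthetic readings below are illustrative assumptions, not data from the post.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for ~10 years of twice-daily systolic readings,
# with a small upward drift plus measurement noise.
rng = np.random.default_rng(0)
days = np.arange(365 * 10)
systolic = 120 + 0.002 * days + rng.normal(0, 8, size=days.size)
df = pd.DataFrame({"day": days, "systolic": systolic})

# Deterministic trend estimate: least-squares slope, converted to mmHg/year.
slope_per_day, intercept = np.polyfit(df["day"], df["systolic"], 1)
slope_per_year = slope_per_day * 365
print(f"Estimated trend: {slope_per_year:+.2f} mmHg/year")
```

Unlike tokens sampled from a model, the slope here is an actual numerical property of the data, which is the point of the notebook approach.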


Replies

rfw300 · today at 1:52 AM

This is true as a technical matter, but this isn't a technical blog post! It's a consumer review, and when companies ship consumer products, the people who use them can't be expected to understand failure modes that are not clearly communicated to them. If OpenAI wants regular people to dump their data into ChatGPT for Health, the onus is on them to make it reliable.

Deklomalo · today at 12:15 PM

You state a lot of things without testing them first?

An LLM has structures in its latent space that allow it to do basic math, and it has seen enough data that it probably has structures for detecting basic trends too.

An LLM doesn't just generate a stream of tokens. It generates an embedding and searches/does something in its latent space, then returns tokens.

And you don't even know what LLM interfaces do in the background. Gemini creates sub-agents. There could easily already be a 'trend detector' in there.

I even did a test: I generated random data with a trend and fed it to ChatGPT. The output was very coherent and correct.
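The commenter doesn't share their setup, but a test like this could be sketched as follows: generate noisy data with a known slope, compute a deterministic baseline answer, and then compare whatever the LLM reports against it. The sample size, slope, and noise level are assumptions for illustration.

```python
import random

# Noisy series with a known upward trend to paste into a chat interface.
random.seed(42)
n = 200
true_slope = 0.5
series = [true_slope * i + random.gauss(0, 5) for i in range(n)]

# Baseline: ordinary least-squares slope, computed directly, to judge
# whether the LLM's description of the trend is actually right.
xs = range(n)
mean_x = sum(xs) / n
mean_y = sum(series) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, series)) \
    / sum((x - mean_x) ** 2 for x in xs)
print(f"Recovered slope: {slope:.2f} (true: {true_slope})")
```

Having a ground-truth slope is what makes this a test rather than just an impression of coherent output.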

protocolture · today at 3:15 AM

This LLM is advertising itself in a medical capacity. You aren't wrong, but the customer has been fed the wrong set of expectations. It's the fault of the tool's marketing.