> measure when a deep learning system is making stuff up or hallucinating
That's a great problem to solve! (Maybe I'm biased, since this is my primary research direction.) One popular approach is OOD detection, but it has always seemed ill-posed to me. My colleagues and I have been approaching this from a more fundamental direction using measures of model misspecification, but this is admittedly niche because it is very computationally expensive. It could still be a while before a breakthrough comes from any direction.
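For reference, the most common OOD-detection baseline is just thresholding the model's own confidence (the max-softmax-probability baseline of Hendrycks & Gimpel). Here's a minimal sketch, not our misspecification approach; the threshold and toy logits are made up for illustration:

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ood_score(logits):
    # Low max class probability => low confidence => more likely OOD.
    return 1.0 - softmax(logits).max(axis=-1)

def flag_ood(logits, threshold=0.5):
    # Threshold is arbitrary here; in practice it's tuned on held-out
    # in-distribution data for a target false-positive rate.
    return ood_score(logits) > threshold

# Toy usage: a confident prediction vs. a near-uniform one.
confident = np.array([8.0, 0.1, 0.2])
uncertain = np.array([1.0, 0.9, 1.1])
print(flag_ood(np.stack([confident, uncertain])))  # [False  True]
```

Part of why this feels ill-posed: the score only reflects the model's self-reported confidence, which can be high even on inputs far from the training distribution.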
Could you elaborate on what you mean by OOD detection seeming ill-posed?
> Could still be a while before a breakthrough comes from any direction.
A solution would be valuable enough that getting significant funding to work on it is probably possible, especially with all the money being thrown at AI.