I'm not an expert, but my understanding is that transformer-based models simply can't do some of those things; it isn't really how they work.
Especially something like expressing a certainty %: you might be able to get it to output one, but it's just making the number up. LLMs are incredibly useful (I use them every day), but you'll always have to check important output.
Yeah, I have seen multiple people use this certainty % thing, but it's terrible. A percentage is something calculated mathematically, and these models cannot do that.
Potentially they could figure something out by comparing next-token probabilities, but those aren't surfaced in the chat interfaces most people use, and they certainly aren't fed back into the model's own output.
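For what it's worth, here's a minimal sketch of what "looking at next-token probabilities" could mean if you run an open-weight model locally with Hugging Face transformers. The model choice (gpt2) and the prompt are just illustrative assumptions, and a near-tie between top candidates only tells you about the next token, not the whole answer:

```python
# Sketch: inspect next-token probabilities of a locally run open-weight model.
# Model name and prompt are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Q: Is the Great Wall of China visible from the Moon? A:"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# Probability distribution over the *next* token only
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

# Show the top candidates and how much probability mass each gets.
# A dominant token hints at "confidence"; a near-tie hints at uncertainty,
# but only for this single token position.
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r:>12}  {prob.item():.3f}")
```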
Instead, people should just ask it to explain BOTH sides of an argument, or explain why something is BOTH correct and incorrect. That way you see how it can hallucinate in either direction and get to make up your own mind about the correct outcome.
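Something like this, roughly. I'm using the OpenAI Python SDK here as one example client, but any chat-capable LLM API works the same way; the model name, claim, and prompt wording are just assumptions:

```python
# Sketch: ask the model to argue both sides of a claim, then judge for yourself.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
claim = "The Great Wall of China is visible from the Moon."

def ask(instruction: str) -> str:
    # Send a single-turn chat request and return the model's reply text.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": instruction}],
    )
    return response.choices[0].message.content

for_case = ask(f"Explain why the following claim is correct: {claim}")
against_case = ask(f"Explain why the following claim is incorrect: {claim}")

# Read the two answers side by side; whichever argument actually holds up
# is something you decide, not the model.
print("--- Case FOR ---\n", for_case)
print("--- Case AGAINST ---\n", against_case)
```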