You can just tell the agent to do exactly that

wagwang • today at 1:43 AM • 3 replies

Replies

I've had various agents backed by various models ignore the shit out of various rules and requests at varying rates, but they all do it.

When you point it out: "Oh yes, I did do that, which is contrary to the rules, request, <whatever>... Anyway..."


D-Machine • today at 6:28 AM

Except you can't be sure it isn't producing nonsense when you do this, and generally the model(s) will be overconfident. This has been studied, see e.g. https://openreview.net/pdf?id=E6LOh5vz5x

    > An alternative way to obtain uncertainty estimates from LLMs is to prompt them directly. One benefit of this approach is that it requires no access to the internals of the model. However, this approach has produced mixed results: LLMs can sometimes verbalize calibrated confidence levels (Lin et al., 2022a; Tian et al., 2023), but can also be highly overconfident (Xiong et al., 2024). Interestingly, Xiong et al. (2024) found that LLMs typically state confidence values in the range of 80-100%, usually in multiples of 5, potentially in imitation of how humans discuss confidence levels. Nevertheless, prompting strategies remain an important tool for uncertainty quantification, along with measures based on the internal state (such as MSP).
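
To make "overconfident" concrete, here is a minimal sketch of how you could check verbalized confidence against graded outcomes using expected calibration error (ECE). This is not the method from the linked paper; the function name, the bin count, and the sample numbers are all made up for illustration, and it assumes you have already logged a stated confidence in [0, 1] and a correct/incorrect grade for each answer.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between stated confidence and observed accuracy.

    confidences: stated confidence per answer, each in [0, 1]
    correct:     whether each answer was actually right (bool)
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one sample")

    # Group samples into equal-width confidence bins.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for samples in bins:
        if not samples:
            continue
        avg_conf = sum(c for c, _ in samples) / len(samples)
        accuracy = sum(1 for _, ok in samples if ok) / len(samples)
        ece += (len(samples) / total) * abs(avg_conf - accuracy)
    return ece


# Hypothetical log: the model says "90% sure" every time but is right once in four.
stated = [0.9, 0.9, 0.9, 0.9]
graded = [True, False, False, False]
print(expected_calibration_error(stated, graded))
```

A result near 0 means the stated numbers track reality. A large value driven by a cluster of 80-100% claims is the overconfidence pattern the quote describes.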

alterom • today at 2:08 AM

>You can just tell the agent to do exactly that

You can.

It just won't do it.

