Hacker News

osti, yesterday at 5:38 PM

Meh, I feel that the car wash test is probably the worst of all those LLM test questions. The question is basically logically inconsistent and expects the model to work around the inconsistency.


Replies

gs17, yesterday at 6:09 PM

It seems like a fine question to me. If the question is "logically inconsistent" (IMO it's more that it's vague if you don't say why you're going there), then we want a model to either ask for the clarification it needs to give a correct answer, or give an answer that outlines the different cases. Some models even fail when the prompt says that you need to wash your car.
