Flere-Imsaho
21 hours ago
If a model responded with "I don't know the answer to that", then that would be far more useful. Is anyone actually working on models that are trained to admit not knowing an answer to everything?
spmurrayzzz
21 hours ago
The problem with this approach is that it does not generalize well at all out of distribution. I'm not aware of any follow-up to this, but I do think it's an interesting area of research nonetheless.
robrenaud
15 hours ago
There is a four-choice question. Your best guess is that the answer is B, with about a 35% chance of being right. If you are graded on the fraction of questions answered correctly, the optimization pressure is simply to answer B.
If you could get half credit for answering "I don't know", we'd have a lot more models saying that when they are not confident.
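The arithmetic is easy to sketch (a toy calculation using only the 35% confidence from above, not any real benchmark's scoring rule):

    # Toy expected-score calculation for a 4-choice question where the model
    # believes its best guess (B) is correct with probability 0.35.
    p_correct = 0.35

    # Scheme 1: 1 point for a correct answer, 0 otherwise.
    # Guessing beats abstaining no matter how low the confidence is.
    guess_scheme1 = p_correct * 1.0    # 0.35
    abstain_scheme1 = 0.0              # 0.00

    # Scheme 2: same, but "I don't know" earns a flat 0.5 points.
    # Abstaining is now the better move whenever confidence < 0.5.
    guess_scheme2 = p_correct * 1.0    # 0.35
    abstain_scheme2 = 0.5              # 0.50

    print(guess_scheme1, abstain_scheme1)  # 0.35 0.0 -> always guess
    print(guess_scheme2, abstain_scheme2)  # 0.35 0.5 -> say "I don't know"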
anonym29
16 hours ago
Agreed. Humans do not perform rote memorization for all possibilities of rules-based classifications like "kosher or not kosher".
>This is information that should be retrieved from RAG or whatever.
Firm disagreement here. An intelligent model should either know the criteria for whether an animal is kosher (general model) or RAG-retrieve them (non-general model), do the same for the properties of the animal in question, and then infer whether the animal matches the criteria.
>If a model responded with "I don't know the answer to that", then that would be far more useful.
Again, firm disagreement here. "I don't know" is not a useful answer to a question that can be easily answered by cross-referencing easily-verifiable animal properties against the classification rules. At the very least, an intelligent model should explain which piece of information it is missing (properties of the animal in question OR the details of the classification rules), rather than returning a zero-value response.
To wit: if you were conducting an interview for a developer candidate, and you asked them whether Python supports functions, methods, both, or neither, would "I don't know" ever be an appropriate answer, even if the candidate genuinely didn't know off the top of their head? Of course not - you'd desire a candidate who didn't know to say something more along the lines of "I don't know, but here's what I would do to figure out the answer for you".
A plain and simple "I don't know" adds zero value to the conversation. While it doesn't necessarily add negative value to the conversation the way a confidently incorrect answer does, the goal for intelligent models should never be to produce zero value, it should be to produce nonzero positive value, even when it lacks required information.
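To make that concrete, here's a rough Python sketch (purely illustrative, with made-up function and property names) of what cross-referencing easily-verifiable animal properties against the classification rules could look like, including naming the missing piece of information instead of returning a bare "I don't know":

    # Illustrative only: land animals are kosher if they both chew cud and
    # have split hooves. When a property is unknown, say which one is
    # missing rather than answering "I don't know".
    def classify_land_animal(properties: dict) -> str:
        criteria = ("chews_cud", "has_split_hooves")
        missing = [c for c in criteria if c not in properties]
        if missing:
            needed = " and ".join(c.replace("_", " ") for c in missing)
            return f"Unknown: I need to know whether the animal {needed}."
        return "kosher" if all(properties[c] for c in criteria) else "not kosher"

    print(classify_land_animal({"chews_cud": True, "has_split_hooves": True}))   # cow -> kosher
    print(classify_land_animal({"chews_cud": False, "has_split_hooves": True}))  # pig -> not kosher
    print(classify_land_animal({"chews_cud": True}))  # names the missing property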