Training an LLM on a ton of multiple-choice questions doesn't "infect" it the way you're thinking. The tokens capture the fact that it's a multiple-choice question, and the LLM eventually learns that textual entailment is a common form of multiple-choice question.
In a more natural conversational setting, you'd get a different answer:
https://chat.openai.com/share/00fed9d6-e3de-4319-9c76-ae1800...
https://chat.openai.com/share/dc9a796c-870c-44ee-b421-31c24b...