Do Theory of Mind Benchmarks Need Explicit Human-like Reasoning in Language Models?