LLM-based task classifiers systematically misroute prompts that look simple on the surface but require deeper processing, a failure we call a Type II error in classification. We tested whether prepending a single "Step-0" question before the classification decision reduces this failure mode, and ran a mec