Self-Corrected Multimodal Large Language Model for End-to-End Robot Manipulation