Exploiting Hybrid Policy in Reinforcement Learning for Interpretable Temporal Logic Manipulation