Background Modeling via Visual Language Modeling