Chain-of-Spot: Interactive Reasoning Improves Large Vision-language Models


Zuyan Liu*   Yuhao Dong*   Yongming Rao   Jie Zhou   Jiwen Lu   

 Tsinghua University    Tencent   

* Equal Contribution

[Paper (arXiv)]      [Code (GitHub)]