My works

Works on video captioning

Multi-Modal Attention-based Transformer for Video Captioning

Multi-modal Hierarchical Attention-based Dense Video Captioning