Zakir Hossain

Deep Learning for Generating Images from Texts


Description

Recent advances in image generation and natural language processing make it possible to improve text-to-image generation. For example, DM-GAN [1] achieves high R-precision, but its Fréchet Inception Distance (FID) and Inception Score (IS) are still questionable compared with a conventional generative adversarial network (GAN) trained on the same dataset. Such text-to-image models also lack flexible control over object layouts. In this context, the project aims to generate high-resolution, photo-realistic images from text descriptions using GANs, working primarily with the COCO 2014 [2] and CUB-200-2011 [3] datasets. An initial model will be developed at a smaller scale with a lower target resolution and then extended to larger datasets.
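
For reference, the two image-quality metrics named above are commonly defined as shown below. The notation is not taken from the project description and is introduced here only for illustration: the subscripts r and g denote real and generated images, \mu and \Sigma are the mean and covariance of Inception-network features for each set, p(y | x) is the Inception classifier's label distribution for an image x, and p_g is the generator's image distribution. A lower FID and a higher IS are generally read as better image quality.

```latex
% Frechet Inception Distance: distance between two Gaussians fitted to
% Inception features of real (r) and generated (g) images.
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)

% Inception Score: exponentiated expected KL divergence between the
% per-image label distribution and the marginal label distribution.
\mathrm{IS} = \exp\!\left( \mathbb{E}_{x \sim p_g}\,
  D_{\mathrm{KL}}\!\left( p(y \mid x) \,\Vert\, p(y) \right) \right),
\qquad p(y) = \mathbb{E}_{x \sim p_g}\, p(y \mid x)
```

Neither metric looks at the input caption, which is why a text-to-image model can score well on R-precision (text-image agreement) while its FID and IS remain weak.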

  

Background Literature

[1] Zhu et al., DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-to-Image Synthesis, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5802-5810, 2019.

[2] Lin et al., Microsoft COCO: Common Objects in Context, arXiv, 2015. Available: https://arxiv.org/pdf/1405.0312.pdf

[3] Caltech-UCSD Birds-200-2011. Available: http://www.vision.caltech.edu/visipedia/CUB-200-20...
