Course Information
Course Number : CSCE 753
Course Title : Computer Vision and Robot Perception
Section : 600
Time : TR 3:55 pm - 5:10 pm
Location : HRBB 113
Credit Hours : 3
Instructor Details
Instructor : Zhengzhong Tu
Office : PETR 220
Phone : 979-845-5904
E-Mail : tzz@tamu.edu
Office Hours : T 5:30 pm - 6:30 pm
TA : Chan-Wei Hu
E-Mail : huchanwei123@tamu.edu
Office Hours : Th 10:00 am - 11:00 am (at EABC)
Course Description
A graduate course in computer vision and robot perception. Topics include computer vision fundamentals, techniques, and advances; deep learning basics; convolutional neural networks; advanced architectures such as vision transformers; variational autoencoders (VAEs) and generative adversarial networks (GANs); computer vision tasks such as detection and segmentation; low-level vision and robust vision; generative models: fundamentals and applications; creative AI; video generative models; 3D vision and spatial intelligence; autonomous driving techniques; and large language models (LLMs) and vision-language models (VLMs).
Course Learning Outcomes
By the end of the course, students will be able to:
List and describe standard methods of computer vision models and techniques.
Articulate the need for computer vision models for real-world applications.
Apply existing computer vision modeling approaches to tackle real-world applications.
Conduct literature reviews, summarize recent research findings, and develop new research ideas in the field of computer vision.
Textbook and/or Resource Materials
This course does not require a textbook. The lecture slides, videos, and other materials provided by the instructor serve as the primary reference and will be sufficient. In addition, students are encouraged to consult the following textbooks:
Foundations of Computer Vision, Antonio Torralba, Phillip Isola, William T. Freeman (2024).
Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville (2016).
Computer Vision: Algorithms and Applications, Richard Szeliski (2010).
Dive into Deep Learning, Aston Zhang, Zack Lipton, Mu Li, Alex Smola (2019).
The Little Book of Deep Learning, François Fleuret (2024).
Probabilistic Machine Learning: An Introduction, Kevin Murphy (2022).
Important Dates (Dates are specific to Spring 2025)
Assignments
Assignment 1 : 2/18
Assignment 2 : 3/18
Assignment 3 : 4/15
Quizzes
Quiz 1 : 2/18 2/25
Quiz 2 : 3/18
Quiz 3 : 4/15
Final team project
Teaming deadline : end of Week 4 (2/9 Sunday)
Proposal Report (2-pager) : end of Week 6 (2/23 Sunday)
Midterm Report (4-pager) : end of Week 12 (3/30 Sunday)
Final Report (8-page full report) : 5/2
Project code (Zipped) : 5/2
Grading Policy
Assignments - 30%
Assignment 1 - 10%
Assignment 2 - 10%
Assignment 3 - 10%
Quizzes - 15%
Quiz 1 - 5%
Quiz 2 - 5%
Quiz 3 - 5%
Class Participation - 5%
Group Course Project - 50%
Proposal Report - 10%
Midterm Report - 10%
Final Presentation - 15%
Final Report and Code Review - 15%
Bonus - up to +2% for projects that excel in novelty, e.g., in multidisciplinary areas (AI for social good, ethics, policy, quantum computing, climate, etc.), as judged by the instructor.
There will be no final exam.
The grading scale will be:
A = 90-100
B = 80-89
C = 70-79
D = 60-69
F = <60
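As an illustration only, the weighting and letter-grade scale above can be sketched in Python. The function names, the 0-100 score convention for each component, the cap at 100, and the absence of rounding at letter boundaries are all assumptions made for this sketch, not stated course policy.

```python
# Component weights in percent, taken from the Grading Policy above.
WEIGHTS = {
    "assignments": 30,
    "quizzes": 15,
    "participation": 5,
    "project": 50,
}


def final_score(scores, bonus=0.0):
    """Weighted course score on a 0-100 scale.

    `scores` maps each component name to a 0-100 score (assumed convention).
    `bonus` is any bonus percentage awarded; adding it to the total and
    capping the result at 100 are assumptions of this sketch.
    """
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100
    return min(100.0, total + bonus)


def letter_grade(score):
    """Map a 0-100 score to a letter, assuming no rounding at boundaries."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"


example = {"assignments": 90, "quizzes": 80, "participation": 100, "project": 85}
print(final_score(example), letter_grade(final_score(example)))
```

How scores between two bands (for example 89.5) are treated is not specified in the scale above; the sketch simply uses the lower bound of each band as a threshold.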
Any academic misconduct in this course will result in a grade of F.
Description of Graded Work
Students should practice and learn how to identify a research gap in a specific topic in computer vision, conduct literature reviews, and close critical gaps in the final project. To begin, each student should choose one specific topic and complete the following tasks around the chosen topic.
Students are expected to finish 3 homework assignments. Each assignment has written questions and coding tasks.
Students are expected to attend 3 in-class quizzes, which assess their understanding of the concepts and knowledge covered in the lectures.
Students are expected to attend both the lecture and presentation classes. Asking questions after a paper presentation or final project presentation earns bonus points (up to 2%), as judged by the TA and the instructor.
Students will form small groups (1-4 students in each group, formed in the first few weeks) to conduct collaborative research activities until the final project presentation. Students will learn how to conduct research by identifying a research problem, running experiments, coding, and iterating until they write a technical report and give a presentation. The students should learn how to collaborate and communicate in a research group.
The final report should be an 8-page technical report using the CVPR template: https://github.com/cvpr-org/author-kit/releases. Supplementary materials are optional.
The final project code should be submitted as a Zip file or as a Google Drive link (the link must be accessible). A clearly written README file explaining how to run the project should be included.
Late Work Policy
Late submissions incur a penalty that grows with each additional late day, according to the schedule below, with no exceptions.
1 day late: 10% penalty
2 days late: 20% penalty
3 days late: 30% penalty
4 days late: 50% penalty
5 or more days late: 100% penalty
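The schedule above can be expressed as a small lookup, shown here as a hedged sketch. The function names are hypothetical, and two details are assumptions not stated in the policy: lateness is counted in whole days, and the penalty is deducted as a percentage of the score the work would otherwise earn.

```python
# Penalty in percent by number of whole days late, from the schedule above.
LATE_PENALTY_PERCENT = {0: 0, 1: 10, 2: 20, 3: 30, 4: 50}


def late_penalty(days_late):
    """Return the penalty in percent for a submission `days_late` days late."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    # Five or more days late carries a 100% penalty.
    return LATE_PENALTY_PERCENT.get(days_late, 100)


def apply_late_penalty(raw_score, days_late):
    """Score after the penalty, assuming it scales the earned score."""
    return raw_score * (100 - late_penalty(days_late)) / 100


print(apply_late_penalty(80, 4))
```

Under these assumptions, work that would have earned 80 and arrives four days late is recorded as 40. Makeup work for an excused absence is exempt from this schedule.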
Work submitted by a student as makeup work for an excused absence is not considered late work and is exempted from the late work policy (Student Rule 7). The student is responsible for informing the instructor in a timely manner about excused absences. The instructor will then work with the student to catch up on their submissions, including missed group class participation.