Project

This course project offers several topics. Students may work in teams. Each team chooses a topic, completes it, and presents it. The rules and policies of the project are as follows.

Team work

AT MOST 10 teams. Each team has 1 ~ 4 members.

Project topic : choose only 1 topic from the following

OpenCV (6 topics)
Deep learning (2 topics): Human pose estimation, Holistic tracking.
Medical image (3 topics): Fundus image segmentation, OCT layer segmentation.
Miscellaneous topics : COVID, FPGA design, AOI defect detection (1 topic).
If none of these topics suits you, you can propose a new topic, but you need the teacher's permission.

Requirement

Each team has to choose one topic and complete the following:
- Paper reading
- Program code writing
- Experiments on data
- Oral presentation (10~15 minutes)
- Written report (10 ~ 40 pages)

Schedule

Week 11, 5/04 : Project announcement
Week 12, 5/11～17 : Confirmation of team members and project title (Google Sheet)
Week 13, 5/18～24 : Team proposal, 1~5 pages, docx (Template).
Week 17, 6/15 : Project presentation, 10~15 minutes (ppt)
Week 18, 6/20 : Project report, 10~40 pages, docx (Template)

OpenCV

Goal

Complete a complex OpenCV project in C/C++.

Description

The following book provides 6 OpenCV projects:
- Mastering OpenCV 3, 2nd ed., by D.L. Baggio, S. Emami, D.M. Escrivá, K. Ievgen, J. Saragih, R. Shilkrot. Packt Pub., 2017. PDF (restricted access), Source code (GitHub).
Choose at least one of the following chapters as your team project:
- Cartoonifier and Skin Changer for Android (Chapter 1)
- Marker-based Augmented Reality on iPhone or iPad (Chapter 2)
- Number Plate Recognition Using SVM and Neural Networks (Chapter 3)
- Non-rigid Face Tracking (Chapter 4)
- 3D Head Pose Estimation Using AAM and POSIT (Chapter 5)
- Face Recognition using Eigenfaces or Fisherfaces (Chapter 6)

Requirement

You have to read the book chapter that corresponds to the program code.
You have to compile and run the code successfully.
You have to experiment with the code using your own images, data, and different parameters.
You have to extend the code with at least 50 EXTRA USEFUL lines.
You have to extend the code with OpenCV functions that the original program does not use: at least two functions or two algorithms.

Human Pose Estimation

Goal

Complete a human pose estimation project.

Description

Deep learning methods have recently had considerable success at human pose estimation. Two very successful algorithms are OpenPose (by CMU) and BlazePose (by Google).
- OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields, ARXIV v2 2018.
- BlazePose: On-device Real-time Body Pose tracking, CVPR, 2020.
You can use a third-party library to complete the project. Suggested libraries are OpenVINO and MediaPipe.
You can use C/C++ for this project, but Python is recommended.
Suggested readings
- An Overview of Human Pose Estimation with Deep Learning - An introduction to the techniques used in Human Pose Estimation based on Deep Learning. 2019
- A Summary of Six Deep Learning Models and Code for Human Pose Estimation (in Chinese), 2018.
- Guide to OpenPose for Real-time Human Pose Estimation, 2021.
- OpenPose Human Pose Recognition (in Chinese), 2020.
- Google MediaPipe for Pose Estimation, Github, 2021.
- Build a personal AI Trainer, Youtube, 2021.
- Multi-Person Pose Estimation in OpenCV using OpenPose, 2018.

Requirement

You have to complete a human pose estimation project with either OpenPose or BlazePose.
You have to experiment with the code using your own images, videos, and different parameters.
You have to extend the code into an application, for example "Build a personal AI Trainer" from the suggested readings.
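One possible starting point for a trainer-style application is to measure joint angles from the estimated keypoints, for example the elbow angle from the shoulder, elbow, and wrist positions. The sketch below is only an illustration under that assumption: the function name `joint_angle` and the sample coordinates are hypothetical, and it does not depend on OpenPose or BlazePose. Any library that returns 2D keypoints as (x, y) pairs could feed it.

```python
import math


def joint_angle(a, b, c):
    """Return the angle in degrees at point b between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("neighbouring keypoints must differ from the joint")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Hypothetical (x, y) keypoints: shoulder, elbow, wrist.
bent_arm = joint_angle((0.0, 0.0), (0.0, 1.0), (1.0, 1.0))
straight_arm = joint_angle((0.0, 0.0), (0.0, 1.0), (0.0, 2.0))
print(f"bent arm: {bent_arm:.1f} degrees")
print(f"straight arm: {straight_arm:.1f} degrees")
```

Tracking how such an angle changes from frame to frame is one way an application could count repetitions of an exercise. The thresholds that separate "bent" from "straight" would have to be tuned on your own videos.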

Holistic Tracking

Goal

Complete a face/hand/body tracking project. The project may need to be implemented on embedded hardware.

Description

Deep learning methods have recently had considerable success at face, hand, and body tracking. An algorithm usually tracks only one kind of object. Holistic tracking aims to track all three (human face, hands, and body) with a single algorithm.
- Simultaneously detecting face, hand motion, and pose in real-time on mobile devices
You can use a third-party library to complete the project. Suggested libraries are OpenVINO, DepthAI, and MediaPipe.
You can use C/C++ for this project, but Python is recommended.
You can use a PC for this project, but embedded hardware such as Raspberry Pi, NVIDIA Jetson, and Intel Neural Compute Stick is also feasible.
You may need a camera for real-time tracking. You can use a USB camera or an OpenCV AI Kit (OAK).
You need to import MediaPipe deep learning models into DepthAI. Here are some tutorials:
- Hand pose: Learning DepthAI from 0 to 1: Hand Gesture Recognition (in Chinese)
  - Hand tracking with DepthAI : Running Google Mediapipe Hand Tracking models on DepthAI hardware (OAK), GitHub.
- Human pose: Learning DepthAI from 0 to 1: Human Pose Tracking and Detection (in Chinese)
  - Blazepose tracking with DepthAI : Running Google Mediapipe body pose tracking models on DepthAI hardware (OAK), GitHub.

Requirement

You have to complete a holistic tracking project that tracks at least two kinds of objects.
You have to experiment with the code using your own images, videos, and different parameters.
You have to extend the code into an application.

Fundus Image Segmentation

Goal

Complete a medical image segmentation project for fundus images.

Related papers

Choose at least one of the following papers.

(MNet) Huazhu Fu, Jun Cheng, Yanwu Xu, Damon Wing Kee Wong, Jiang Liu, and Xiaochun Cao, "Joint Optic Disc and Cup Segmentation Based on Multi-label Deep Network and Polar Transformation", IEEE Transactions on Medical Imaging, vol. 37, no. 7, pp. 1597-1605, 2018. [Code@GitHub: based on TensorFlow 1.14 + Keras, plus Matlab] [PDF]
(DENet) Huazhu Fu, et al., "Disc-Aware Ensemble Network for Glaucoma Screening From Fundus Image," IEEE Transactions on Medical Imaging, vol. 37, no. 11, pp. 2493-2501, Nov. 2018. [Code@GitHub : Keras/Tensorflow] [PDF]

Data

REFUGE/REFUGE2, Drishti-GS1 database, RIM-ONE r3, RIGA.

Requirement

You have to read the papers, complete the code for your chosen paper, and write a report.

OCT Layer Segmentation

Goal

Complete a medical image segmentation project for OCT (Optical Coherence Tomography) images.

Related papers

Choose at least one of the following papers.

S. Motamedi, et al., "Normative Data and Minimally Detectable Change for Inner Retinal Layer Thicknesses Using a Semi-automated OCT Image Segmentation Pipeline," Frontiers in Neurology, 25 November 2019. URL. SAMIRIX: Matlab @ GitHub, NeuroDIal @ GitHub for OCT analysis. [PDF]
A. Lang, A. Carass, M. Hauser, E. S. Sotirchos, P. A. Calabresi, H. S. Ying, J. L. Prince, "Retinal layer segmentation of macular OCT images using boundary classification." Biomedical Optics Express 4, 1133-1152, 2013. OCTLayerSegmentation by AURA Tools on NITRC [PDF]

Requirement

You have to read the papers, complete the code for your chosen paper, and write a report.

FPGA Design for Computer Vision

Goal

Understand some basic concepts of implementing computer vision in Verilog.

Readings

H. Jeong, Architectures for Computer Vision: From Algorithm to Chip with Verilog. John Wiley & Sons, 2014. PDF (restricted access)

Requirement

You have to read the book (chapters 1~4), write some example code, and write a report.

COVID

Goal

Learn some computer vision techniques applied to COVID-19.

Readings

Diagnosing Pneumonia from X-Ray Images Using Convolutional Neural Network, 2022/04.
Transfer Learning: COVID-19 from Chest X-Rays Classifier, 2021/12.
Use Computer Vision to Capture Real-World Events, 2021/05.
- Leveraging computer vision to assist with COVID vaccine distribution
Mask Face: MaskTheFace — CV based tool to mask face dataset, Medium, 2020/08/26.

Requirement

You have to read the articles, write some example code, and write a report.

AOI Defect Classification (AIdea)

Goal

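Automated optical inspection (AOI) is a high-speed, high-precision optical image inspection system that uses machine vision as its standard inspection technology. It overcomes the drawbacks of traditional inspection, in which human operators use optical instruments. Its applications range from R&D and manufacturing quality control in high-tech industries to national defense, consumer goods, medical care, environmental protection, electric power, and other fields. The Electronic and Optoelectronic System Research Laboratories of ITRI (Industrial Technology Research Institute) has worked on flexible electronic displays for many years and hopes to use AOI technology to improve production quality during pilot production. This dataset is provided by ITRI. Students are asked to classify the defects in the provided AOI images, with the aim of using data science to improve AOI inspection performance.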

Description

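The image data for this topic covers 6 classes (the normal class + 5 defect classes).
The downloaded file aoi_data.zip contains:
- train_images.zip: images for training (PNG format), 2,528 in total.
- train.csv: contains 2 columns, ID and Label.
  - ID: the image file name.
  - Label: the defect class (0 = normal, 1 = void, 2 = horizontal defect, 3 = vertical defect, 4 = edge defect, 5 = particle).
- test_images.zip: images for testing (PNG format), 10,142 in total.
- test.csv: contains 2 columns, ID and Label.
  - ID: the image file name.
  - Label: the defect class (its value must be one of 0, 1, 2, 3, 4, 5).
See the website for further details.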
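Before training a classifier, it can help to check how many images each class has. The following is a minimal sketch that reads a file laid out like train.csv (columns ID and Label) and counts images per class, using the label codes listed above. The function name `count_labels` and the sample rows are hypothetical; the sample stands in for the real file so that the snippet runs on its own.

```python
import csv
import io
from collections import Counter

# Label codes as listed in the data description.
LABEL_NAMES = {
    0: "normal",
    1: "void",
    2: "horizontal defect",
    3: "vertical defect",
    4: "edge defect",
    5: "particle",
}


def count_labels(csv_file):
    """Count images per class in a train.csv-style file (columns: ID, Label)."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        label = int(row["Label"])
        if label not in LABEL_NAMES:
            raise ValueError(f"unexpected label {label} for image {row['ID']}")
        counts[LABEL_NAMES[label]] += 1
    return counts


# Hypothetical rows standing in for train.csv (the file names are made up).
SAMPLE_CSV = (
    "ID,Label\n"
    "img_0001.png,0\n"
    "img_0002.png,5\n"
    "img_0003.png,0\n"
    "img_0004.png,3\n"
)

sample_counts = count_labels(io.StringIO(SAMPLE_CSV))
for name, n in sorted(sample_counts.items()):
    print(f"{name}: {n}")
```

For the real data, the same function could be called as `count_labels(open("train.csv", newline=""))`, assuming train.csv has been extracted from aoi_data.zip into the working directory. A strongly uneven class distribution would be a reason to consider class weighting or data augmentation.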

Requirement

Rules in AIdea
