Spring 2022: CS 6384 Computer Vision

Understand the 3D world from 2D images

Course Information

Term: Spring 2022
Class Level: Graduate
Activity Type: Lecture
Days & Times: Monday & Wednesday 1:00 PM – 2:15 PM
Location: ECSN 2.126

Instructor: Prof. Yu Xiang
Office Location: ECSS 4.702
Office Hours: Monday & Wednesday 3:30 PM – 4:30 PM

Teaching Assistant: Jikai Wang
Office Hours: Tuesday 1:00 PM – 2:00 PM

Syllabus

All the course materials can be found here.

Course Description

Theory and practice of computer vision. Provides an in-depth overview of computer vision, including geometric primitives and transformations, camera models, image features, epipolar geometry and stereo, structure from motion and SLAM, 3D reconstruction, variations of modern neural networks, and various recognition problems such as object detection, semantic segmentation, and human pose estimation.

Textbooks

Richard Szeliski. Computer Vision: Algorithms and Applications. 2011 Edition. Springer.
ISBN-13: 978-1848829343
ISBN-10: 1848829345
Second Edition Draft

David Forsyth, Jean Ponce. Computer Vision: A Modern Approach, 2nd Edition. Pearson, 2011. (Optional)
ISBN: 9789332550117

Richard Hartley, Andrew Zisserman. Multiple View Geometry in Computer Vision, 2nd Edition. Cambridge University Press, 2004. (Optional)
ISBN-13: 978-0521540513
ISBN-10: 0521540518

Grading Policy

  • Homework (50%)
    • Assignment 1 (10%)
    • Assignment 2 (10%)
    • Assignment 3 (10%)
    • Assignment 4 (10%)
    • Assignment 5 (10%)
  • Team Project (45%)
    • Project proposal (5%)
    • Project mid-term report (10%)
    • Project presentation (15%)
    • Project final report (15%)
  • In-class Activity (5%)
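
The weights above add up to 100%. As a rough illustration of how they could combine into a final score, here is a minimal Python sketch. It assumes every component is scored on a 0 to 100 scale and that no curve, late penalty, or rounding rule applies; none of that is stated on this page, and the names `WEIGHTS` and `final_score` are made up for the example.

```python
# Weights copied from the grading policy (percent of the final grade).
WEIGHTS = {
    "assignment_1": 10,
    "assignment_2": 10,
    "assignment_3": 10,
    "assignment_4": 10,
    "assignment_5": 10,
    "project_proposal": 5,
    "project_midterm_report": 10,
    "project_presentation": 15,
    "project_final_report": 15,
    "in_class_activity": 5,
}

# Sanity check: the listed percentages cover the whole grade.
assert sum(WEIGHTS.values()) == 100


def final_score(scores):
    """Weighted average of component scores.

    Assumption (not from the syllabus): each score is on a 0-100 scale.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100.0


if __name__ == "__main__":
    # Hypothetical student: 90 on every assignment, 80 on every
    # project component, full marks for in-class activity.
    example = {name: 90 for name in WEIGHTS if name.startswith("assignment")}
    example.update({name: 80 for name in WEIGHTS if name.startswith("project")})
    example["in_class_activity"] = 100
    print(final_score(example))
```

Under these assumptions the hypothetical student above ends up with 0.5 × 90 + 0.45 × 80 + 0.05 × 100 = 86.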

Project

  • Project proposal description (PDF)
  • Project mid-term report requirement (PDF)
  • Project presentation and final report requirement (PDF)

Homework

Guest Lecturer

Dr. Fei Xia from Google Research talked about Embodied AI on 4/27/2022.

Lectures

Date | Topic
Week 1, 1/17 | Martin Luther King Day
Week 1, 1/19, Lecture 1 | Introduction to Computer Vision (slides)
Week 2, 1/24, Lecture 2 | Image Formulation: Geometric Primitives and Transformations (slides)
Week 2, 1/26, Lecture 3 | Image Formulation: 3D Rotations (slides)
Week 3, 1/31, Lecture 4 | Image Formulation: Camera Models (slides)
Week 3, 2/2, Lecture 5 | Image Formulation: Visual Rendering I (slides)
Week 4, 2/7, Lecture 6 | Image Formulation: Visual Rendering II (slides)
Week 4, 2/9, Lecture 7 | Feature Detection and Matching: Keypoint Features I (slides)
Week 5, 2/14, Lecture 8 | Feature Detection and Matching: Keypoint Features II (slides)
Week 5, 2/16, Lecture 9 | Feature Detection and Matching: Edges, Contours and Lines (slides)
Week 6, 2/21, Lecture 10 | 3D Vision: Camera Calibration and Pose Estimation (slides)
Week 6, 2/23, Lecture 11 | 3D Vision: Epipolar Geometry and Stereo (slides)
Week 7, 2/28, Lecture 12 | 3D Vision: Structure from Motion and SLAM (slides)
Week 7, 3/2, Lecture 13 | 3D Vision: 3D Reconstruction (slides)
Week 8, 3/7, Lecture 14 | Deep Learning: Convolutional Neural Networks I (slides)
Week 8, 3/9, Lecture 15 | Deep Learning: Convolutional Neural Networks II (slides)
Week 9, 3/14 | Spring Break
Week 9, 3/16 | Spring Break
Week 10, 3/21, Lecture 16 | Deep Learning: Recurrent Neural Networks (slides)
Week 10, 3/23, Lecture 17 | Deep Learning: Transformers (slides)
Week 11, 3/28, Lecture 18 | Deep Learning: Generative Neural Networks (slides)
Week 11, 3/30, Lecture 19 | Deep Learning: Neural Networks for 3D Data (slides)
Week 12, 4/4, Lecture 20 | Recognition: Visual Representation Learning (slides)
Week 12, 4/6, Lecture 21 | Recognition: Optical Flow and Correspondences (slides)
Week 13, 4/11, Lecture 22 | Recognition: Object Detection (slides)
Week 13, 4/13, Lecture 23 | Recognition: Semantic Segmentation (slides)
Week 14, 4/18, Lecture 24 | Recognition: Pose Estimation of Objects, Humans and Hands (slides)
Week 14, 4/20, Lecture 25 | Recognition: Images and Languages (slides)
Week 15, 4/25, Lecture 26 | Application: Computer Vision in Robotics (slides)
Week 15, 4/27 | Guest Lecture: Dr. Fei Xia, Hierarchical Learning Approaches for Long Horizon Robotics Tasks (slides)
Week 16, 5/2 | Project Presentation I
  • Group 1: Visual Navigation Using ORB-SLAM3 (slides, demo)
  • Group 2: Teaching Robots to Explore Unseen Environments (slides)
  • Group 3: Interacting with Virtual Environment through Hand Pose Estimation (slides, demo)
  • Group 4: Image Segmentation (slides)
  • Group 6: Pose Based Form Correction Trainer (slides, demo)
  • Group 8: Parking Spot Detection OpenCV (slides)
  • Group 9: Identity Verification using Siamese Neural Networks (slides)
  • Group 11: Few-shot Object Classification in Clutter Scenes (slides)
  • Group 16: Solving Sudoku using Object Character Recognition (slides)
Week 16, 5/4 | Project Presentation II
  • Group 10: Visual Question Answering (slides)
  • Group 12: Scene Description Generation (slides)
  • Group 13: A Study on Artist Attestation (slides)
  • Group 14: Object Detection with DETR (slides)
  • Group 15: Comparative Analysis of Blood Cell Image Classification (slides)
  • Group 17: Referring Expression Comprehension with Audio Query (slides)
  • Group 18: Image Segmentation for Platypuses in Nature (slides)
  • Group 19: Image Grounding using Attention based Transformer (slides)
  • Group 20: Cutting-Edge Techniques for Depth Map Super-Resolution (slides)