Machine Learning for Visual Understanding
(시각적 이해를 위한 기계학습)

Spring 2021

Mon/Wed, 12:30 - 13:45

TA: TBD

Summary

This course covers mathematical modeling and machine learning techniques for analyzing visual (and other multimedia) data. It focuses on fundamental machine learning methods and recent deep learning methods that are widely used in visual data analysis, and discusses how they are applied to problems involving visual data. The course consists of lectures, practice sessions, and a team project. Topics include:

  • Review of machine learning and neural networks
  • Convolutional neural networks (CNNs)
  • Recurrent neural networks (RNNs)
  • Image problems (image classification, object detection, segmentation)
  • Video problems (video classification, action recognition, temporal localization, tracking)
  • Multi-modal data analysis (visual-audio-text)
  • Generative modeling

Logistics

Textbooks

  • “Probabilistic Machine Learning: An Introduction (2nd Ed.)” by Kevin Murphy, 2021, MIT Press.
  • “Deep Learning” by Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016, MIT Press.
  • Additional reading materials and papers will be provided.

Prerequisites

  • Intermediate or better Python programming: you should be able to turn your ideas into working Python code.
  • Machine learning basics: you should have taken an introductory machine learning course or equivalent.
  • Basic calculus, linear algebra, data structures, and algorithms

Grading

  • Assignments 20%
  • Mid-term exam 25%
  • Final exam 25%
  • Team project 30% (proposal 5%, mid-term 10%, final 15%)
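
To illustrate how the weights above combine, here is a minimal sketch that computes a weighted course score. It assumes every component is scored on a 0-100 scale. The function name `course_score` and the dictionary keys are hypothetical and chosen only for this example; the actual grading procedure may differ.

```python
# Weights copied from the grading breakdown above, as fractions of the total.
# The key names are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "assignments": 0.20,
    "midterm_exam": 0.25,
    "final_exam": 0.25,
    "project_proposal": 0.05,
    "project_midterm": 0.10,
    "project_final": 0.15,
}


def course_score(scores):
    """Return the weighted score, assuming each component is on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    # Weighted sum over all graded components.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    example = {
        "assignments": 90,
        "midterm_exam": 80,
        "final_exam": 70,
        "project_proposal": 100,
        "project_midterm": 80,
        "project_final": 90,
    }
    print(round(course_score(example), 2))
```

The three project entries together carry 30% of the score, matching the team project line above.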

Content

1. Course Introduction

2. First Approaches for Image Classification

3. Loss Functions and Optimization

4. Neural Network Basics & Backpropagation

5. Convolutional Neural Networks

6. Training Neural Networks

7. Transfer Learning, CNN Case Studies

8. Object Detection

9. Video Classification (Action Recognition)

10. Recurrent Neural Networks

11. RNN-based Video Models

12. Metric Learning

13. Multimodal Learning

14. Generative Models

15. Self-supervised Learning

16. Style Transfer

17. Scientific Applications