Foundations of Data Science (데이터사이언스 원론)
Joonseok Lee, Jaejin Lee, Seunggeun Lee, Min-hwan Oh, Hyung-Sin Kim, Sang Kyun Cha, Sanghack Lee

Goals

This course builds the theoretical knowledge and hands-on experience that students from various backgrounds need to handle and analyze big data. Students first learn the basics of data-oriented computing, quantitative thinking and reasoning, and exploratory data analysis. On that foundation, they learn key principles and techniques for data-driven problem solving: data analysis methods, big data management systems, problem formulation, data collection and organization, visualization, reasoning, predictive modeling, and decision making.

Content

Statistics & Basic Programming

| Date | Topic | Instructor | Due |
|---|---|---|---|
| | Introduction: Course Logistics; What is Data Science?; Lab: Linux, Python Programming | Joonseok Lee, Sang Kyun Cha, Jaejin Lee | |
| | Statistics for Data Science, Basic Programming: Data sampling, Probability; Lab: Python Programming | Seunggeun Lee, Jaejin Lee | |
| | Statistics for Data Science, Basic Programming: Random variables and Expectation; Lab: Python Programming | Seunggeun Lee, Jaejin Lee | |
| | Statistics for Data Science, Basic Programming: Variance and Asymptotics; Lab: Python Programming | Seunggeun Lee, Jaejin Lee | |
| | Statistics for Data Science, Basic Programming: Estimation, Bias and Mean squared error; Lab: Python Programming | Seunggeun Lee, Jaejin Lee | |

Computing Methodology

| Date | Topic | Instructor | Due |
|---|---|---|---|
| | Algorithmic Thinking, Computational Complexity: Time and Space Complexity; Lab: Peak Finding | Min-hwan Oh | |
| | Searching: Searching (Binary search); Lab: Searching Problems | Min-hwan Oh | |
| | Sorting: Sorting (Insertion sort, Selection sort, Merge sort, Quick sort); Lab: Python Programming | Min-hwan Oh | |
| | Data Structures: Array, Linked list; Lab: Array, Linked list Problems | Hyung-Sin Kim | |
| | Data Structures: Stack, Queue; Lab: Stack, Queue Problems | Hyung-Sin Kim | |
| | Data Structures: Trees (Binary tree, Binary search tree); Lab: Trees Problems | Hyung-Sin Kim | |
| | Data Structures: Graph, Hash table; Lab: Graph, Hash table Problems | Hyung-Sin Kim | |

Database

| Date | Topic | Instructor | Due |
|---|---|---|---|
| | Introduction to Database: Introduction to Database; Lab: SQL | Sang Kyun Cha | |
| | Graph Database: Graph Database; Lab: Neo4j | Sang Kyun Cha | |
| | Mid-term Exam | | Last day to unregister |

Machine Learning

| Date | Topic | Instructor | Due |
|---|---|---|---|
| | Linear Regression: Introduction to ML, Linear Regression; Lab: Linear Regression | Sanghack Lee | |
| | Linear Regression: Logistic Regression; Lab: Logistic Regression | Sanghack Lee | |
| | Decision Trees: Decision Trees; Lab: Decision Trees, Random forests | Sanghack Lee | |
| | Overfitting, Regularization: Overfitting, Regularization; Lab: Regularization Methods | Sanghack Lee | |
| | Nearest Neighbors: Nearest Neighbor Classifiers; Lab: Handwritten Digits Classification using Nearest Neighbors | Joonseok Lee | |
| | Optimization: Gradient descent, SGD, advanced optimization, Cross validation; Lab: Gradient descent | Joonseok Lee | |
| | Neural Networks: Neural networks, Backpropagation; Lab: Neural networks with TensorFlow | Joonseok Lee | |
| | Introduction to Deep Learning: Deep learning, Convolutional neural networks (CNN); Lab: CNN-based Image Classification | Joonseok Lee | |
| | Unsupervised Learning: Clustering, Dimension reduction; Lab: K-means Clustering for Image Compression | Joonseok Lee | |

Advanced Topics

| Date | Topic | Instructor | Due |
|---|---|---|---|
| | Decision Making: Reinforcement learning; Lab: Reinforcement learning | Min-hwan Oh | |
| | Ambient AI: Ambient AI; Lab: Ambient AI with Edge Devices | Hyung-Sin Kim | |
| | Causal Inference: Causal Inference; Lab: Causal Inference | Sanghack Lee | |
| | Final Exam | | |

Grading Policy

  • Assignment 35%, Mid-term 30%, Final exam 30%, Attendance 5%