Advanced Statistics for Data Science
(Advanced Statistical Analysis for Data Science)

Instructor: Seunggeun Lee


This course introduces statistical methods for advanced data analysis, with an emphasis on regression-based methods. Students will learn how to choose a statistical model appropriate to the characteristics of the data and the purpose of the analysis, how to fit the model to the data, and how to interpret the results. Through the course project, students will apply these methods to real data. The course will cover the following topics:

  • Linear model and linear mixed model

  • Generalized linear model

  • Shrinkage method and variable selection

  • Graphical methods and causal inference


01. Course introduction: Review of key distributions and matrix algebra

02. Linear regression 1

03. Linear regression 2

04. Linear regression 3

05. Mixed effect model

06. Shrinkage/Penalized methods

07. GLM Introduction

08. GLM Estimation and Midterm

09. GLM Inference

10. Logistic regression

11. Logistic regression

12. Multinomial regression

13. Poisson regression

14. Graphical Model and Causal Inference

15. Final project presentation


Linear regression and GLM materials are drawn from the following books. PDF versions of the books are available online. The course will mostly follow the course slides, so buying the books is not required.

  • Faraway, J.J. Linear Models with R, 2nd Edition. CRC Press (Chapman & Hall).

  • Dobson, A.J., Barnett, A.G. An Introduction to Generalized Linear Models, 3rd Edition. CRC Press (Chapman & Hall).

Grading Policy

  • Attendance: 5%

  • Task: 40%

  • Midterm: 25%

  • Final: 30%


  • Students are expected to have a background in basic statistics.