Scalable high-performance computing

Fall 2020

Mon/Wed, 11:00 - 12:15

Instructor: Jaejin Lee


This course teaches students how to use high-performance computing (HPC) systems for data science. HPC is not only about achieving high performance; it is about answering the question of how to make computing scale from a single processor to a machine with nearly unlimited computing power. Exploiting parallelism is the basis for the answer. Multicore processors are now used everywhere from mobile devices to supercomputers, and heterogeneous systems that pair general-purpose CPUs with accelerators such as GPUs and FPGAs are attracting a growing number of users. In this course, you will learn the structure of these multicore and heterogeneous systems and the basics of parallel programming, including parallelization and vectorization. You will gain programming experience with commonly used parallel programming models: Pthreads, OpenMP, MPI, OpenCL, and CUDA. You will also learn how to apply them to accelerate big data processing and deep learning.


1. Course introduction

2. Structures of sequential computer systems

3. Conventional hardware acceleration techniques for sequential computer systems

4. Dependences and parallelism

5. Structures of accelerators

6. Structures of parallel computer systems

7. Processes, threads, and virtual memory

8. Thread scheduling

9. Parallelization and vectorization

10. Synchronization

11. Optimizations for memory hierarchies

12. Loop optimizations

13. Pthreads

14. OpenMP

15. MPI

16. OpenCL

17. CUDA

18. Optimizations for I/O

19. Spark and HPC

20. Optimizations for Deep Learning


Prerequisites: Proficiency in Python programming and experience with PyTorch and TensorFlow

Grading Policy

  • Attendance: 10%

  • Assignments: 40%

  • Midterm exam: 20%

  • Final exam: 30%