What you will learn
Distributed Computing
Computer Architecture
OpenMP
Parallel Computing
About this course
This course will introduce you to the multiple forms of parallelism found in modern Intel architecture processors and teach you the programming frameworks for handling this parallelism in applications. You will get access to a cluster of modern manycore processors (Intel Xeon Phi architecture) for experiments with graded programming exercises.
This course applies to a variety of HPC and datacenter workloads and frameworks, including artificial intelligence (AI). You will learn how to handle data parallelism with vector instructions, task parallelism in shared memory with threads, parallelism in distributed memory with message passing, and memory architecture parallelism with optimized data containers. This knowledge will help you accelerate computational applications by orders of magnitude while keeping your code portable and future-proof.
Prerequisites: programming in C/C++ or Fortran in a Linux environment, and Linux shell proficiency (navigation, file copying, editing files in text-based editors, compilation).
Syllabus
Week 1
3 hours to complete
Modern Code
In the Introduction we will learn...
7 videos (41 min total), 1 reading, 3 quizzes
Week 2
3 hours to complete
Vectorization
13 videos (72 min total), 3 readings, 2 quizzes
Week 3
2 hours to complete
Multithreading with OpenMP
10 videos (41 min total), 3 readings, 2 quizzes
Week 4
3 hours to complete
Memory Traffic
14 videos (57 min total), 3 readings, 2 quizzes
Week 5
2 hours to complete
Clusters and MPI
11 videos (48 min total), 3 readings, 2 quizzes