diff --git a/2024/pics/PedroValero-Lara.jpeg b/2024/pics/PedroValero-Lara.jpeg
deleted file mode 100644
index 2c2ae2c..0000000
Binary files a/2024/pics/PedroValero-Lara.jpeg and /dev/null differ
diff --git a/2024/pics/PhilippeTillet.jpeg b/2024/pics/PhilippeTillet.jpeg
new file mode 100644
index 0000000..ece293b
Binary files /dev/null and b/2024/pics/PhilippeTillet.jpeg differ
diff --git a/2024/program.html b/2024/program.html
index f5fc5dd..d7516ad 100644
--- a/2024/program.html
+++ b/2024/program.html
@@ -3,14 +3,77 @@
 title: AsHES Workshop
 ---
-TBD

Opening Remarks
10:30 am - 10:40 am

Session 1: High-Performance Computing
10:40 am - 12:00 pm
Session Chair: Shintaro Iwasaki, Meta

Lunch Break
12:00 pm - 1:00 pm

Keynote
1:00 pm - 2:00 pm

Block-based GPU Programming with Triton
Philippe Tillet, OpenAI

Abstract: Traditional single instruction, multiple threads (SIMT) programming with CUDA, for all its benefits, can be daunting to machine learning researchers in need of fast custom kernels. We'll shed light on alternative programming models capable of improving GPU programmability without much impact on expressivity. Some such models have emerged recently (e.g., Exo, MLIR Affine), but they are rarely applicable beyond dense tensor algebra, which makes them a poor fit for workloads that require, for example, custom data structures. We'll describe the design and implementation of Triton, a mid-level programming language that uses block-based abstractions to simplify kernel development and fusion for researchers without any GPU programming expertise.

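For readers unfamiliar with the block-based model the keynote describes, the sketch below shows what a minimal Triton kernel can look like. It is an illustrative vector-add example written in the general style of Triton's public tutorials, not material from the talk; the function names and block size are arbitrary. Each program instance processes a whole block of elements, rather than one element per thread as in SIMT-style CUDA.

import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one block of BLOCK_SIZE elements.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the last, possibly partial block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)  # one program instance per 1024-element block
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
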
Bio: Philippe Tillet first began working with GPUs in 2011 as a contributor to the ViennaCL library. He received his B.S. from Telecom SudParis (France) in 2012, his M.S. from NCTU (Taiwan) in 2014, and his Ph.D. from Harvard University in 2020 with a dissertation on compilers for blocked algorithms on GPUs. He joined OpenAI full time in 2020 to pursue his work on the Triton compiler, a project he started in 2018 out of frustration with the difficulty of writing auto-tuners for matrix multiplications in CUDA. Since then, he has grown the Triton language into a reference for block-based programming models and wrote all of the training kernels used by GPT-4.

Session 2: Accelerating AI/ML Workloads
2:00 pm - 3:10 pm
Session Chair: Carl Pearson, Sandia National Laboratories

Closing Remarks
3:10 pm - 3:20 pm

Presentation

All presentations will be in person. Presenters should target a 25-minute talk for full papers or a 15-minute talk for short papers, with 5 minutes for questions.