BLIS Retreat 2019
Contributed talks
- Cris Cecka, NVIDIA
Title: Programming GPUs for speed-of-light linear algebra
- Marat Dukhan, Google Research
Title: Indirect GEMM and Indirect Convolution Algorithm
Abstract: Deep learning frameworks commonly implement convolution operators with GEMM-based algorithms. In these algorithms, convolution is implemented on top of a GEMM primitive provided by highly optimized BLAS libraries. Convolutions with 1x1 kernels can be represented directly as a GEMM call, but convolutions with larger kernels require a special memory layout transformation (im2col or im2row) to fit the GEMM interface. The Indirect Convolution algorithm provides the efficiency of the GEMM primitive without the overhead of the im2col transformation. In contrast to GEMM-based algorithms, the Indirect Convolution algorithm does not reshuffle the data to fit the GEMM primitive; instead, it introduces an indirection buffer: a buffer of pointers to the start of each row of image pixels. This broadens the application of our modified GEMM function to convolutions with arbitrary kernel size, padding, stride, and dilation.
Paper on arXiv.
- Albert Cohen, Google
Title: What MLIR has to offer as a toolkit for building and leveraging numerical libraries in ML and HPC
- Thomas Hines, Tennessee Tech
Title: Issues with fat by thin matrix multiplication
- Jianyu Huang, Facebook
Title: FBGEMM: High-Performance Low-Precision Library for Deep Learning Inference
- Tze Meng Low, CMU
Title: Analytical models for MMM-like problems on GPUs
- Devin Matthews, SMU
Title: GEMM-Based Kernels for Tensor Hypercontraction
- John McCalpin, TACC
Title: What You Don’t Know Can Hurt Performance — Snoop Filters in Intel Xeon Scalable Processors
- Christos Psarras, RWTH-Aachen
Title: The Linear Algebra Mapping Problem
- Martin Schatz, Facebook
Title: FLAME in Machine Learning (ML) Applications
- Tyler Smith, ETH-Zurich
Title: I/O Lower Bounds for Small MMM
- Nicholai Tukanov, UT-Austin
Title: Mapping BLIS to the IBM Power9 architecture
- Field Van Zee, UT-Austin
Title: The BLIS Approach to Skinny Matrix Multiplication
- Kiran Varaganti, AMD
Title: BLIS optimizations and results on AMD Rome (tentative title)
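
Illustration for the Indirect Convolution abstract above: the indirection buffer it describes can be sketched in a few lines of plain Python. This is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a single image stored as `image[y][x]`, where each entry is a list of channel values, and it uses Python object references in place of pointers. The function names `build_indirection_buffer` and `indirect_conv`, the shared `zero_pixel` used for padding, and the `weights[tap][channel][out_channel]` layout are all hypothetical choices made for this example.

```python
def build_indirection_buffer(image, kh, kw, stride=1, padding=0, dilation=1):
    """Return (buffer, out_h, out_w) for image[y][x] = list of channel values.

    buffer[p][t] is a reference to the input pixel read by kernel tap t
    when computing output pixel p. No pixel data is copied.
    """
    height, width = len(image), len(image[0])
    channels = len(image[0][0])
    zero_pixel = [0.0] * channels  # shared by every out-of-bounds (padding) tap
    out_h = (height + 2 * padding - dilation * (kh - 1) - 1) // stride + 1
    out_w = (width + 2 * padding - dilation * (kw - 1) - 1) // stride + 1
    buffer = []
    for oy in range(out_h):
        for ox in range(out_w):
            taps = []
            for ky in range(kh):
                for kx in range(kw):
                    iy = oy * stride - padding + ky * dilation
                    ix = ox * stride - padding + kx * dilation
                    inside = 0 <= iy < height and 0 <= ix < width
                    taps.append(image[iy][ix] if inside else zero_pixel)
            buffer.append(taps)
    return buffer, out_h, out_w


def indirect_conv(buffer, weights):
    """GEMM-like accumulation that reads inputs through the indirection buffer.

    weights[tap][c][oc] holds the filter value for kernel tap `tap`,
    input channel `c`, and output channel `oc`.
    """
    out_channels = len(weights[0][0])
    output = []
    for taps in buffer:
        acc = [0.0] * out_channels
        for tap, pixel in enumerate(taps):
            for c, value in enumerate(pixel):
                for oc in range(out_channels):
                    acc[oc] += value * weights[tap][c][oc]
        output.append(acc)
    return output


# Usage: 3x3 single-channel image, 3x3 all-ones filter, padding of 1.
image = [[[float(3 * y + x + 1)] for x in range(3)] for y in range(3)]
weights = [[[1.0]] for _ in range(9)]
buffer, out_h, out_w = build_indirection_buffer(image, 3, 3, padding=1)
output = indirect_conv(buffer, weights)
print(out_h, out_w, output[4])  # center output pixel sums all nine inputs
```

In this sketch, kernel size, stride, padding, and dilation affect only how the buffer is built; the accumulation loop is the same for every configuration, which is the property the abstract attributes to the modified GEMM function.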