BLIS Retreat 2019
Contributed talks
- Cris Cecka, NVIDIA
  Title: Programming GPUs for speed-of-light linear algebra
- Marat Dukhan, Google Research
  Title: Indirect GEMM and Indirect Convolution Algorithm

  Abstract: Deep learning frameworks commonly implement convolution operators with GEMM-based algorithms, in which convolution is built on top of the GEMM primitive provided by highly optimized BLAS libraries. Convolutions with 1x1 kernels can be expressed directly as a GEMM call, but convolutions with larger kernels require a special memory layout transformation - im2col or im2row - to fit the GEMM interface. The Indirect Convolution algorithm provides the efficiency of the GEMM primitive without the overhead of the im2col transformation. In contrast to GEMM-based algorithms, Indirect Convolution does not reshuffle the data to fit the GEMM primitive; instead, it introduces an indirection buffer - a buffer of pointers to the start of each row of image pixels. This broadens the application of the modified GEMM function to convolutions with arbitrary kernel size, padding, stride, and dilation.
  Paper on arXiv.
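As a rough illustration of the indirection-buffer idea described in the abstract, here is a NumPy sketch (the function name, shapes, and variable names are my own, not from the paper or any library): row indices stand in for the pointers of the real implementation, and the gathered rows feed a single GEMM, with no im2col buffer in the chosen layout.

```python
import numpy as np

def indirect_conv2d(x, w, stride=1):
    """Sketch of 2D convolution via an indirection buffer (valid padding).

    x: input of shape (H, W, C); w: weights of shape (KH, KW, C, F).
    Returns output of shape (OH, OW, F).
    """
    H, W, C = x.shape
    KH, KW, _, F = w.shape
    OH = (H - KH) // stride + 1
    OW = (W - KW) // stride + 1

    rows = x.reshape(H * W, C)  # each input pixel is a length-C row

    # Indirection buffer: for every output pixel m and kernel position k,
    # record the index of the input row it reads (a stand-in for a pointer).
    indir = np.empty((OH * OW, KH * KW), dtype=np.intp)
    for oy in range(OH):
        for ox in range(OW):
            m = oy * OW + ox
            for ky in range(KH):
                for kx in range(KW):
                    iy, ix = oy * stride + ky, ox * stride + kx
                    indir[m, ky * KW + kx] = iy * W + ix

    # "GEMM over gathered rows": in the real algorithm the micro-kernel
    # dereferences the pointers directly; NumPy fancy indexing copies here.
    a = rows[indir].reshape(OH * OW, KH * KW * C)
    out = a @ w.reshape(KH * KW * C, F)
    return out.reshape(OH, OW, F)
```

Note that the indirection buffer depends only on the input geometry, so it can be built once and reused across GEMM calls on inputs of the same shape.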
- Albert Cohen, Google
  Title: What MLIR has to offer as a toolkit for building and leveraging numerical libraries in ML and HPC
- Thomas Hines, Tennessee Tech
  Title: Issues with fat by thin matrix multiplication
- Jianyu Huang, Facebook
  Title: FBGEMM: High-Performance Low-Precision Library for Deep Learning Inference
- Tze Meng Low, CMU
  Title: Analytical models for MMM-like problems on GPUs
- Devin Matthews, SMU
  Title: GEMM-Based Kernels for Tensor Hypercontraction
- John McCalpin, TACC
  Title: What You Don’t Know Can Hurt Performance — Snoop Filters in Intel Xeon Scalable Processors
- Christos Psarras, RWTH Aachen
  Title: The Linear Algebra Mapping Problem
- Martin Schatz, Facebook
  Title: FLAME in Machine Learning (ML) Applications
- Tyler Smith, ETH Zurich
  Title: I/O Lower Bounds for Small MMM
- Nicholai Tukanov, UT-Austin
  Title: Mapping BLIS to the IBM Power9 architecture
- Field Van Zee, UT-Austin
  Title: The BLIS Approach to Skinny Matrix Multiplication
- Kiran Varaganti, AMD
  Title: BLIS optimizations and results on AMD Rome (tentative title)
 
