Persyval HPES team: workshop day on dynamic compilation
Wednesday, November 12, 2014, Room 320, UFR IM2AG
Building F - Institut Fourier (tram stop: Gabriel Fauré), 60 rue de la chimie, Saint-Martin-d'Hères
Abstract: OpenCL (Open Computing Language) is designed for portable heterogeneous computing across a wide spectrum of platforms and architectures. OpenCL's portability rests on dynamic compilation: every OpenCL driver must be able to compile portable (C99-based) OpenCL-C code for its device at runtime. By design, such dynamic compilation provides attractive opportunities which, surprisingly, have not yet been fully leveraged. In this talk we will review several recent dynamic optimizations for OpenCL, focusing on how they involve compilation.
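To make the dynamic-compilation model concrete, here is a minimal sketch in Python rather than OpenCL-C (the kernel string and the `build_program` helper are invented for illustration, not part of the OpenCL API): the program carries portable kernel source as text and compiles it only at runtime, much as an OpenCL driver compiles OpenCL-C when the program runs on a particular device.

```python
# Sketch of runtime ("dynamic") compilation: the program ships
# portable source text and compiles it only when it executes.
KERNEL_SRC = """
def vector_add(a, b):
    # element-wise addition, the classic first OpenCL kernel
    return [x + y for x, y in zip(a, b)]
"""

def build_program(source):
    """Compile kernel source at runtime, analogous to an OpenCL
    driver building OpenCL-C for its device."""
    namespace = {}
    code = compile(source, "<kernel>", "exec")  # the runtime compilation step
    exec(code, namespace)
    return namespace["vector_add"]

kernel = build_program(KERNEL_SRC)
print(kernel([1, 2, 3], [10, 20, 30]))  # [11, 22, 33]
```

Because compilation is deferred until the device is known, the compiler is free to exploit runtime information that a static compiler never sees; the talk's optimizations build on exactly this opportunity.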
Short bio: Ayal Zaks joined the Intel OpenCL team in Haifa in late 2011 and became the manager of its compiler for Intel CPUs and Xeon Phi co-processors. Before that, Ayal spent 15 years at the IBM Haifa Research Laboratory, where he managed its Compiler Technologies group and worked on compiler optimizations. Ayal is a member of the HiPEAC network of excellence. He received B.Sc., M.Sc., and Ph.D. degrees in mathematics and operations research from Tel Aviv University, and is a senior adjunct lecturer at the Technion.
Abstract: Compilation is an essential step in creating efficient applications. It allows the use of high-level, target-independent languages while maintaining good performance. However, many obstacles prevent compilers from fully optimizing applications. For static compilers, the major obstacle is poor knowledge of the execution context, in particular of the architecture and the data. This knowledge becomes available progressively over the application's life cycle. Compilers have progressively integrated dynamic code generation techniques to exploit it. However, these techniques usually focus on improving the use of hardware capabilities and do not take data into account. In this thesis, we investigate the use of data in the optimization process of applications on Nvidia GPUs.
We present a method that uses different moments of the application's life cycle to create adaptive libraries that take data size into account. These libraries can therefore provide better-adapted kernels. On the GEMM algorithm, the method delivers gains of up to 100% while avoiding code-size explosion.
The thesis also investigates the gains and costs of runtime code generation in terms of execution speed, memory footprint, and energy consumption. We present and study two lightweight runtime code generation approaches that can specialize code. We show that these two approaches can obtain gains comparable to, and even better than, LLVM, but at a lower cost.
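The idea of specializing code for a known data size can be sketched as follows. This is not the thesis's implementation, just an illustrative Python example (the `specialize_dot` name and the use of a dot product instead of GEMM are assumptions for brevity): once the length of the input is known at runtime, a fully unrolled kernel can be generated and compiled for that exact size.

```python
def specialize_dot(n):
    """Generate and compile a dot product fully unrolled for a
    known vector length n: a runtime code specialization sketch."""
    # Build the unrolled expression, e.g. "a[0] * b[0] + a[1] * b[1] + ..."
    body = " + ".join(f"a[{i}] * b[{i}]" for i in range(n))
    src = f"def dot(a, b):\n    return {body}\n"
    ns = {}
    exec(compile(src, "<specialized>", "exec"), ns)  # compile at runtime
    return ns["dot"]

dot3 = specialize_dot(3)          # specialized for vectors of length 3
print(dot3([1, 2, 3], [4, 5, 6]))  # 32
```

Generating one kernel per observed size trades generation time and code size for loop-free, size-specific code, which is why the abstract stresses avoiding code-size explosion while still adapting to the data.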
Gaël Thomas, Professor at Télécom Sud Paris, Reviewer
François Bodin, Professor at Université de Rennes I, Reviewer
Albert Cohen, Research Director at INRIA, Examiner
Karine Heydemann, Associate Professor at UPMC Paris, Examiner
Jean-François Méhaut, Professor at Université Joseph Fourier, Examiner
Thierry Lepley, Senior Research Engineer at Nvidia, Examiner
Ayal Zaks, Senior Research Engineer at Intel Israel, Examiner
Henri-Pierre Charles, Research Director at CEA LIST, Thesis advisor