Graduate and Postdoctoral Studies
Static Cost Estimation for Data Layout Selection on GPUs
Thursday, April 6, 2017
to 5:00 PM
1049 Duncan Hall
Performance modeling provides mathematical models and quantitative analysis for designing and optimizing computer systems and architectures. For many classes of applications, high-latency memory accesses often dominate execution time. Thus, performance modeling for memory accesses on high performance architectures has become an important research topic.
The data layout of an application refers to the way in which its data is stored and organized in memory. In high-performance computing, the data layout can significantly affect the efficiency of memory access operations. In recent years, the problem of data layout selection has been well studied on various multi-core CPUs and some heterogeneous architectures.
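To make the effect of layout concrete, the following minimal sketch counts how many memory segments one warp touches when every thread reads the same field of its own record, under an array-of-structures (AoS) layout and a structure-of-arrays (SoA) layout. It is an illustration only, not the cost model of the thesis. The 32-thread warp, the 128-byte segment size, the 4-byte elements, and all function names are assumptions chosen for the example.

```python
def transactions_per_warp(addresses, segment_bytes=128):
    """Count the distinct memory segments touched by one warp-wide load.

    Assumption: each distinct aligned segment costs one transaction.
    """
    return len({addr // segment_bytes for addr in addresses})


def aos_addresses(field_index, num_fields, elem_bytes=4, warp_size=32):
    """Byte addresses when thread t reads one field of record t (AoS).

    Records are stored back to back, so consecutive threads are
    num_fields * elem_bytes apart.
    """
    return [(t * num_fields + field_index) * elem_bytes
            for t in range(warp_size)]


def soa_addresses(field_index, num_records, elem_bytes=4, warp_size=32):
    """Byte addresses when thread t reads element t of one field array (SoA).

    Each field is its own contiguous array, so consecutive threads are
    elem_bytes apart.
    """
    base = field_index * num_records * elem_bytes
    return [base + t * elem_bytes for t in range(warp_size)]


# Hypothetical workload: records with 4 fields of 4 bytes each.
aos_cost = transactions_per_warp(aos_addresses(0, num_fields=4))
soa_cost = transactions_per_warp(soa_addresses(0, num_records=1024))
print(aos_cost, soa_cost)  # prints "4 1"
```

Under these assumptions the AoS access spans four segments while the SoA access fits in one, which is the kind of relative difference a layout cost estimate needs to capture.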
GPU memory hierarchies differ from those of multi-core CPUs. Although several existing works have studied data layout selection on GPUs, a mathematical cost model for the problem is still lacking. This motivates us to investigate static cost analysis methods that could better guide future data layout selection work, and perhaps even the design of new SIMT architectures.
In this thesis, we present a comprehensive cost analysis for data layout selection on GPUs. We build our cost function on knowledge of the GPU memory hierarchy and develop an algorithm that allows researchers to perform compile-time cost estimation for a given data layout. Furthermore, we introduce a new vector-based representation of the estimated cost, which can better capture the memory access cost of applications with dynamic-length loops. We apply our cost analysis to selected benchmarks from past publications on data layout selection, and our experimental results show that it can accurately predict the relative costs of different data layouts.