Sean Treichler, member of the Architecture Research Group at NVIDIA
Legion: A Programming Model for Modern Parallel Architectures
Tuesday, March 28, 2017
to 5:00 PM
1070 Duncan Hall
Reception in DH 3092 at 3:30 PM
6100 Main St
Houston, Texas, USA
Modern computing architectures are growing increasingly complicated. Laws of physics have forced processor counts into the thousands or even millions, resulted in the creation of deep distributed memory hierarchies, and encouraged the use of multiple processor and memory types in the same system. To fully utilize such a system, an application must partition the computation and carefully manage data movement and the associated, unavoidable latencies.
Legion addresses these challenges by combining a traditional hierarchical application structure (i.e. tasks/functions calling other tasks/functions) with a hierarchical data model (logical regions, which may be arbitrarily partitioned into subregions). Application code written using tasks and regions forms a machine-agnostic description of a computation from which the Legion runtime can extract task- and data-parallelism and then map the application to the processors and memories of a particular machine.
While Legion is a full programming model with many interesting features, this talk will focus on two. First, we will discuss Realm, the lower-level runtime used by Legion to execute a mapped application. A novel feature of Realm is support for arbitrary composition of all primitive operations through events. A new “generational event” data structure allows Realm to efficiently and scalably handle a very large number of events in a distributed environment.
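To make the generational idea concrete, here is a minimal single-process sketch in Python. It is an illustration under assumptions, not Realm's implementation or API: the names `EventHandle` and `GenEvent` are hypothetical, and the distributed aspects are left out. One slot is reused across many generations, a handle names a (slot, generation) pair, and a handle counts as triggered once the slot's generation counter has reached it, so only one integer per slot needs to be remembered for all past events.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class EventHandle:
    """Hypothetical handle naming one generation of one reusable slot."""
    slot_id: int
    generation: int


@dataclass
class GenEvent:
    """Hypothetical reusable event slot (illustrative only, not Realm's API)."""
    slot_id: int
    last_issued: int = 0    # highest generation handed out so far
    triggered_gen: int = 0  # highest generation that has triggered
    waiters: Dict[int, List[Callable[[], None]]] = field(default_factory=dict)

    def new_handle(self) -> EventHandle:
        # Reuse the same slot; only the generation number changes.
        self.last_issued += 1
        return EventHandle(self.slot_id, self.last_issued)

    def has_triggered(self, handle: EventHandle) -> bool:
        # A single integer comparison answers the query for any past generation.
        return handle.generation <= self.triggered_gen

    def on_trigger(self, handle: EventHandle, callback: Callable[[], None]) -> None:
        # Run immediately if already triggered, otherwise defer until trigger().
        if self.has_triggered(handle):
            callback()
        else:
            self.waiters.setdefault(handle.generation, []).append(callback)

    def trigger(self, handle: EventHandle) -> None:
        # Assumption of this sketch: generations trigger strictly in order.
        if handle.generation != self.triggered_gen + 1:
            raise ValueError("generations must trigger in order")
        self.triggered_gen = handle.generation
        for callback in self.waiters.pop(handle.generation, []):
            callback()
```

Composition then amounts to passing one operation's completion handle to `on_trigger` so that it launches the next operation. In this sketch the bookkeeping per slot stays constant no matter how many generations have already completed.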
Second, we examine the challenge of partitioning complex application data structures. As a general-purpose programming model, Legion relies on the application to specify the desired partitions of data, but it can assist with the computation and analysis of those partitions. Using a new "dependent partitioning" framework, inter-related partitions of multiple regions (e.g. the nodes and edges of a graph data structure) can be described succinctly in application code and implemented in an efficient and scalable way by the Legion runtime. The framework admits static analysis, allowing many assertions regarding the consistency of different partitions to be verified during compilation.
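The graph example can be sketched in a few lines of Python. This is a toy model under assumptions, not Legion's API: the helper names `partition_by_field`, `image`, and `is_disjoint` are hypothetical, and the consistency check runs dynamically here rather than through static analysis. The application chooses only the edge partition (a color per edge); the node partition is derived from it by following each edge's endpoints.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Set, Tuple

Edge = Tuple[int, int]  # (source node, destination node)


def partition_by_field(colors: List[Hashable]) -> Dict[Hashable, Set[int]]:
    """Group element indices by the color stored for each element."""
    parts: Dict[Hashable, Set[int]] = defaultdict(set)
    for index, color in enumerate(colors):
        parts[color].add(index)
    return dict(parts)


def image(edge_parts: Dict[Hashable, Set[int]],
          edges: List[Edge]) -> Dict[Hashable, Set[int]]:
    """Derive a node partition: each subregion holds the nodes its edges touch."""
    return {
        color: {node for e in edge_ids for node in edges[e]}
        for color, edge_ids in edge_parts.items()
    }


def is_disjoint(parts: Dict[Hashable, Set[int]]) -> bool:
    """Check that no element appears in more than one subregion."""
    seen: Set[int] = set()
    for members in parts.values():
        if seen & members:
            return False
        seen |= members
    return True


# A path graph 0-1-2-3 whose edges are split into two pieces.
edges = [(0, 1), (1, 2), (2, 3)]
edge_parts = partition_by_field(["a", "a", "b"])
node_parts = image(edge_parts, edges)
```

Here `edge_parts` is disjoint by construction, while `node_parts` is not: node 2 is touched by edges of both colors, so it lands in both subregions. That kind of property (disjointness, completeness) is what the abstract refers to when it mentions assertions about partition consistency.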
Biography of Sean Treichler:
Sean Treichler is a member of the Architecture Research Group at NVIDIA, exploring novel architectures for parallel computing in conjunction with programming models, compilers, and runtime systems that make the performance of such architectures accessible to application programmers. Having recently received his PhD from the Computer Science Department at Stanford, advised by Prof. Alex Aiken, he now spends most of his free time (ha!) trying to keep up with his two daughters as they explore and develop their own interests.