Graduate and Postdoctoral Studies
Computational and Applied Mathematics
A Parallel-In-Time Gradient-Type Method For Optimal Control Problems
Friday, April 7, 2017
to 4:00 PM
2014 Duncan Hall
This thesis proposes and analyzes a new parallel-in-time gradient-type method for time-dependent optimal control problems. When the classical gradient method is applied to time-dependent optimal control problems, each iteration requires the forward solution of the state equations followed by the backward solution of the adjoint equations before the gradient can be computed and the controls can be updated. The solution of the state equations and the adjoint equations are typically computationally expensive and are carried out sequentially consuming most of the computation time. The proposed new parallel-in-time gradient-type method introduces parallelism by splitting the time domain into N time subdomains and executes the forward and backward computation in each time subdomain in parallel using state and adjoint variables at time subdomain boundaries from last optimization iteration as initial values. In the time used by a single classical gradient method iteration, roughly N parallel-in-time gradient-type iterations can be executed. In addition, I generalize the proposed parallel-in-time gradient-type method so that it allows for different time domain partitions for forward and backward computations and overlapping time subdomains. Due to the time domain decomposition, the state and the adjoint equations are not satisfied, since states and adjoints exhibit jump discontinuities at the time subdomain boundaries. Therefore the control update direction in the parallel gradient-type method is not a negative gradient direction and existing convergence theories for the gradient method do not apply.
I provide convergence analyses for the new parallel-in-time gradient-type method applied to discrete-time optimal control problems. For linear-quadratic discrete-time optimal control problems, I show that the new parallel-in-time gradient-type method can be interpreted as a multiple-part splitting iteration scheme where control update in one iteration is determined by control variable iterates from multiple previous iterations, and I prove convergence of the new method for sufficiently small fixed step size by showing that the spectral radius of a corresponding implicitly constructed iteration matrix is less than one. For general non-linear discrete-time optimal control problems the parallel-in-time gradient-type method is combined
with metric projection onto a convex set to handle simple control constraints. Convergence is proved for sufficiently small step sizes. Convergence theorems are given with different assumptions of the problem, e.g. convex objective function, compact control constraints, etc.
For linear-quadratic optimal control problems, I also interpret the parallel-in-time gradient-type method using a multiple shooting reformulation of the optimization problem. The new method can be seen as using a gradient-type iteration to solve the optimality saddle point system in the multiple shooting formulation. An alternative convergence proof is given for linear-quadratic problems by the multiple shooting point of view.
The new parallel-in-time gradient-type method is applied to linear-quadratic optimal control problems governed by linear advection-diffusion partial differential equations (PDEs), and to a well-rate optimization problem governed by a system of highly non-linear PDEs that models two-phase immiscible incompressible subsurface flow. In the linear-quadratic optimal control problem, each parallel gradient-type iteration with N time subdomains takes roughly 1/N of the classical gradient iteration time. The gradient-type control update in the parallel gradient-type iteration differs from the exact gradient update in the classical gradient method. However, the parallel-in-time gradient-type method converges to the optimal solution with a similar number of iterations as the classical gradient method, which leads to the strong scaling with up to around 50 cores in one of the examples. In the well-rate optimization problem, apart from the parallelism in the time dimension introduced by the parallel-in-time gradient-type method, in each time subdomain, the existing parallelism in the apace dimension, i.e. parallel linear solvers/preconditioners, is also applied, which constitutes a two-level parallelism that enjoys the speed-ups from time and space dimension multiplied. In the model problem, four times speed-ups is achieved by the parallelism only in the time dimension.