We propose a new
method, called Adaptive Convex Enveloping, for solving convex stochastic
dynamic programs (DP) with multidimensional continuous state and decision
variables. The new method approximates the value functions with supporting
hyperplanes, which is convexity-preserving. As a result, the method
is able to handle large numbers of decision variables and constraints
with the speed and reliability of convex optimizations. To construct
a supporting hyperplane approximation, we use a recursive partitioning
algorithm, which “learns” the shape of the underlying function
and adds supporting hyperplanes adaptively. The approximation so constructed
uses supporting hyperplanes economically, and guarantees that the error
is less than a tolerance for all points in a continuous state space.
Error bounds of
the solution can be developed using convexity. Both the optimal value
function and the value function of the obtained policy are bounded below
by the solution of Adaptive Convex Enveloping, and bounded above by
the solution plus the total tolerance. These bounds are directly available
without having to do any simulation, and they are simple numbers, rather
than analysis terms, thus they provide confidence. The performance of
the obtained policy can also be simulated and used as a tighter upper
bound of the optimal value function.
An exciting feature
of Adaptive Convex Enveloping is that it is very much standardviized.
It saves the user the time spent on trial and error for designing the
approximation functions and tuning the parameters. For the optimization,
our method calls existing mathematical optimization libraries. So to
solve a DP, the user only needs to write models for the libraries to
read, which is a similar experience as doing mathematical optimizations.
In all, the method is convenient for modeling and solving real world
problems. We apply Adaptive Convex Enveloping to management of battery
exchange stations to find the optimal policy for charging a large number
of electric vehicle batteries. We solve two variants of the problem:
one with known but changing electricity prices and stochastic customer
demands, the other with an additional option for the station to discharge
the batteries in the inventory and sell electricity back to the grid.
<<
Back