These notes were originally written by T. Bretl and were transcribed by S. Bout.
The following thing is called an optimization problem:

$$\underset{u}{\text{minimize}} \quad (u - 2)^2 + 3$$

The solution to this problem is the value of $u$ that makes the cost $(u-2)^2 + 3$ as small as possible. We know that we are supposed to choose a value of $u$ because $u$ appears underneath the word "minimize": it is the decision variable. We know that we are supposed to minimize $(u-2)^2 + 3$ because that expression comes after the word "minimize": it is the cost function. In particular, the solution to this problem is $u = 2$. There are at least two ways of finding this solution:
- We could plot the cost function. It is clear from the plot that the minimum is at $u = 2$.
- We could apply the first derivative test. We compute the first derivative:

$$\frac{d}{du}\left( (u-2)^2 + 3 \right) = 2(u-2).$$

Then, we set the first derivative equal to zero and solve for $u$:

$$2(u-2) = 0 \quad\Longrightarrow\quad u = 2.$$

Values of $u$ that satisfy this equation are only candidates for optimality; a further test (e.g., the second-derivative test) would verify that $u = 2$ is actually a minimizer.
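As a quick numerical check of this solution, here is a minimal sketch (not from the original notes) that minimizes the same cost with scipy:

from scipy.optimize import minimize_scalar

# Cost function from the example above
def f(u):
    return (u - 2)**2 + 3

result = minimize_scalar(f)
print(result.x)  # the minimizer, approximately 2.0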
In general, we write optimization problems like this:

$$\underset{u}{\text{minimize}} \quad f(u)$$

Again, the solution to this problem is the value of the decision variable $u$ that makes the cost function $f(u)$ as small as possible.
Here is another example:

$$\underset{u_1, u_2}{\text{minimize}} \quad u_1^2 - 2u_1u_2 + 2u_2^2$$

The solution to this problem is the value of both $u_1$ and $u_2$ that makes the cost function as small as possible. There are two differences between this optimization problem and the previous one. First, there is a different cost function:

$$f(u_1, u_2) = u_1^2 - 2u_1u_2 + 2u_2^2.$$

Second, there are two decision variables instead of one. But again, there are at least two ways of finding the solution to this problem:
- We could plot the cost function. The plot is now 3D — the “x” and “y” axes are $u_1$ and $u_2$, and the “z” axis is the cost. It is clear from such a plot that the minimum is at $u_1 = 0$ and $u_2 = 0$.
- We could apply the first derivative test. We compute the partial derivative of the cost function with respect to each decision variable:

$$\frac{\partial f}{\partial u_1} = 2u_1 - 2u_2 \qquad\qquad \frac{\partial f}{\partial u_2} = -2u_1 + 4u_2.$$

Then, we set both partial derivatives equal to zero and solve for $u_1$ and $u_2$: the only solution is $u_1 = 0$ and $u_2 = 0$. As before, we would have to apply a further test in order to verify that this choice of $u_1$ and $u_2$ is actually a minimizer and not, for example, a maximizer or a saddle point.
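The same kind of numerical check works with two decision variables. A minimal sketch (not from the original notes):

import numpy as np
from scipy.optimize import minimize

# Cost function from the example above, with u = [u1, u2]
def f(u):
    u1, u2 = u
    return u1**2 - 2 * u1 * u2 + 2 * u2**2

result = minimize(f, x0=np.array([1.0, 1.0]))
print(result.x)  # the minimizer, approximately [0, 0]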
An equivalent way of stating this same optimization problem would have been as follows:

$$\underset{u_1, u_2}{\text{minimize}} \quad \begin{bmatrix} u_1 & u_2 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$$
You can check that the cost function shown above is the same as the cost function we saw before (e.g., by multiplying it out). We could have gone farther and stated the problem as follows:

$$\begin{aligned} \underset{x, u}{\text{minimize}} \quad & x^2 - 2xu + 2u^2 \\ \text{subject to} \quad & x = u \end{aligned}$$

Although two variables still appear under “minimize,” we have effectively returned to having just one decision variable, $u$. This example is exactly the same as the previous example, except that the two decision variables (now renamed $x$ and $u$) must satisfy a constraint. We are no longer free to choose $x$: once $u$ is chosen, the constraint determines $x$ completely. In particular, we can solve the constraint for $x$:

$$x = u.$$
Then, we plug this result into the cost function:

$$x^2 - 2xu + 2u^2 = u^2 - 2u \cdot u + 2u^2 = u^2.$$

By doing so, we have shown that solving the constrained optimization problem

$$\begin{aligned} \underset{x, u}{\text{minimize}} \quad & x^2 - 2xu + 2u^2 \\ \text{subject to} \quad & x = u \end{aligned}$$

is equivalent to solving the unconstrained optimization problem

$$\underset{u}{\text{minimize}} \quad u^2$$

and then taking $x = u$.
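Here is a minimal sketch (not from the original notes) that confirms this equivalence numerically, solving the constrained problem directly and then the unconstrained problem after elimination:

import numpy as np
from scipy.optimize import minimize

# Constrained problem, with decision vector z = [x, u]
def cost(z):
    x, u = z
    return x**2 - 2 * x * u + 2 * u**2

constraint = {"type": "eq", "fun": lambda z: z[0] - z[1]}  # x - u = 0
constrained = minimize(cost, x0=np.array([1.0, -1.0]), constraints=[constraint])

# Unconstrained problem after plugging in the constraint: minimize u^2
unconstrained = minimize(lambda u: u[0]**2, x0=np.array([1.0]))

print(constrained.x)    # approximately [0, 0]
print(unconstrained.x)  # approximately [0]; then take x = u = 0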
The point here was not to show how to solve constrained optimization problems in general, but rather to identify the different parts of a problem of this type. As a quick note, you will sometimes see the example optimization problem we’ve been considering written as

$$\begin{aligned} \underset{u}{\text{minimize}} \quad & x^2 - 2xu + 2u^2 \\ \text{subject to} \quad & x = u \end{aligned}$$

The meaning is exactly the same, but $x$ is no longer listed as a decision variable, even though it still appears in both the cost function and the constraint. You would read this as: “Among all choices of $u$ for which there exists an $x$ satisfying $x = u$, find the one that minimizes $x^2 - 2xu + 2u^2$.”
We have seen three example problems. In each case, we were looking for the minimizer, i.e., the choice of decision variable that made the cost function as small as possible:
- The solution to

$$\underset{u}{\text{minimize}} \quad (u-2)^2 + 3$$

was $u = 2$.

- The solution to

$$\underset{u_1, u_2}{\text{minimize}} \quad u_1^2 - 2u_1u_2 + 2u_2^2$$

was $u_1 = 0$ and $u_2 = 0$.

- The solution to

$$\begin{aligned} \underset{x, u}{\text{minimize}} \quad & x^2 - 2xu + 2u^2 \\ \text{subject to} \quad & x = u \end{aligned}$$

was $x = 0$ and $u = 0$.
It is sometimes useful to focus on the minimum instead of on the minimizer, i.e., what the “smallest value” was that we were able to achieve. When focusing on the minimum, we often use the following “set notation” instead:
- The problem $\underset{u}{\text{minimize}}\; (u-2)^2 + 3$ is rewritten

$$\min_{u} \left\{ (u-2)^2 + 3 \right\}.$$

The meaning is to “find the minimum value of $(u-2)^2 + 3$ over all choices of $u$.” That minimum is $3$, achieved at the minimizer $u = 2$.

- The problem $\underset{u_1, u_2}{\text{minimize}}\; u_1^2 - 2u_1u_2 + 2u_2^2$ is rewritten

$$\min_{u_1, u_2} \left\{ u_1^2 - 2u_1u_2 + 2u_2^2 \right\}.$$

Again, the meaning is to “find the minimum value of $u_1^2 - 2u_1u_2 + 2u_2^2$ over all choices of $u_1$ and $u_2$.” That minimum is $0$.

- The problem $\underset{x, u}{\text{minimize}}\; x^2 - 2xu + 2u^2$ subject to $x = u$ is rewritten

$$\min_{u} \left\{ x^2 - 2xu + 2u^2 \;:\; x = u \right\}.$$

And again, the meaning is to “find the minimum value of $x^2 - 2xu + 2u^2$ over all choices of $u$ for which there exists an $x$ satisfying $x = u$.” That minimum is also $0$.
The important thing here is to understand the notation and to understand the difference between a “minimum” and a “minimizer.”
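In code, this distinction is easy to see: a numerical solver returns both. A minimal sketch (not from the original notes) using scipy:

from scipy.optimize import minimize_scalar

result = minimize_scalar(lambda u: (u - 2)**2 + 3)
print(result.x)    # minimizer: approximately 2
print(result.fun)  # minimum:   approximately 3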
The following thing is called an optimal control problem:

$$\begin{aligned} \underset{u_{[t_0, t_1]}}{\text{minimize}} \quad & \int_{t_0}^{t_1} g\big(x(t), u(t)\big)\,dt + h\big(x(t_1)\big) \\ \text{subject to} \quad & \dot{x}(t) = f\big(x(t), u(t)\big), \quad x(t_0) = x_0 \end{aligned} \tag{1}$$

Let’s try to understand what it means.
The statement

$$\underset{u_{[t_0, t_1]}}{\text{minimize}}$$

says that we are being asked to choose an input trajectory $u_{[t_0, t_1]}$, i.e., a value of the input $u(t)$ at every time $t \in [t_0, t_1]$. The input trajectory is the decision variable.
The statement

$$\dot{x}(t) = f\big(x(t), u(t)\big), \quad x(t_0) = x_0$$

is a constraint. It implies that we are restricted to choices of input trajectory $u_{[t_0, t_1]}$ for which there exists a state trajectory $x_{[t_0, t_1]}$ starting at $x(t_0) = x_0$ and satisfying the ordinary differential equation

$$\dot{x}(t) = f\big(x(t), u(t)\big).$$

One example of an ordinary differential equation that looks like this is our usual description of a system in state-space form:

$$\dot{x} = Ax + Bu.$$
The statement

$$\int_{t_0}^{t_1} g\big(x(t), u(t)\big)\,dt + h\big(x(t_1)\big)$$

says what we are trying to minimize — it is the cost function in this problem. Notice that the cost function depends on both the input trajectory and the state trajectory that results from it.
As usual, there are a variety of ways to solve an optimal control problem. One way is by application of what is called the Hamilton-Jacobi-Bellman Equation, or “HJB.” This equation is to optimal control what the first-derivative test is to optimization. To derive it, we will first rewrite the optimal control problem in “minimum form” (see “Minimum vs Minimizer”):

$$\min_{u_{[t_0, t_1]}} \left\{ \int_{t_0}^{t_1} g\big(x(t), u(t)\big)\,dt + h\big(x(t_1)\big) \;:\; \dot{x}(t) = f\big(x(t), u(t)\big),\; x(t_0) = x_0 \right\}$$

Nothing has changed here; we’re just asking for the minimum and not the minimizer. Next, rather than solve this problem outright, we will first state a slightly different problem:

$$\min_{u_{[t, t_1]}} \left\{ \int_{t}^{t_1} g\big(x(\tau), u(\tau)\big)\,d\tau + h\big(x(t_1)\big) \;:\; \dot{x}(\tau) = f\big(x(\tau), u(\tau)\big),\; x(t) = x \right\} \tag{2}$$
The two changes that I made to go from the original problem to this one are:

- Make the initial time arbitrary (calling it $t$ instead of $t_0$).
- Make the initial state arbitrary (calling it $x$ instead of $x_0$).

I also made three changes in notation. First, I switched the variable of integration from $t$ to $\tau$, since $t$ now denotes the initial time. Second, I wrote the state and input inside the braces as $x(\tau)$ and $u(\tau)$. Third, I wrote the decision variable as $u_{[t, t_1]}$ instead of $u_{[t_0, t_1]}$.
You should think of the problem (2) as a function itself. In goes an initial time $t$ and an initial state $x$; out comes the minimum value of the cost that can be achieved from that starting point. We give this function a name:

$$v(t, x) = \min_{u_{[t, t_1]}} \left\{ \int_{t}^{t_1} g\big(x(\tau), u(\tau)\big)\,d\tau + h\big(x(t_1)\big) \;:\; \dot{x}(\tau) = f\big(x(\tau), u(\tau)\big),\; x(t) = x \right\} \tag{3}$$

We call $v(t, x)$ the value function. It can be shown that the value function satisfies the following recursion:

$$v(t, x) = \min_{u_{[t, t+\Delta t]}} \left\{ \int_{t}^{t+\Delta t} g\big(x(\tau), u(\tau)\big)\,d\tau + v\big(t+\Delta t,\, x(t+\Delta t)\big) \right\} \tag{4}$$
The reason this equation is called a “recursion” is that it expresses the function $v$ at time $t$ in terms of the function $v$ at time $t + \Delta t$. The total cost is the cost accumulated from $t$ to $t + \Delta t$, which comes from the first part, and the minimum cost accumulated from $t + \Delta t$ onward, which comes from the second part (where “$x(t+\Delta t)$” means the state that is reached at time $t + \Delta t$ by starting at $x(t) = x$ and applying the input trajectory $u_{[t, t+\Delta t]}$).
We now proceed to approximate the terms in (4) by first-order series expansions. In particular, we have

$$\int_{t}^{t+\Delta t} g\big(x(\tau), u(\tau)\big)\,d\tau \approx g(x, u)\,\Delta t$$

and we also have

$$\begin{aligned} v\big(t+\Delta t,\, x(t+\Delta t)\big) &\approx v(t, x) + \frac{\partial v}{\partial t}\,\Delta t + \frac{\partial v}{\partial x}\big(x(t+\Delta t) - x\big) \\ &\approx v(t, x) + \frac{\partial v}{\partial t}\,\Delta t + \frac{\partial v}{\partial x}\,f(x, u)\,\Delta t. \end{aligned}$$

If we plug both of these into (4), we find

$$v(t, x) = \min_{u_{[t, t+\Delta t]}} \left\{ g(x, u)\,\Delta t + v(t, x) + \frac{\partial v}{\partial t}\,\Delta t + \frac{\partial v}{\partial x}\,f(x, u)\,\Delta t \right\}.$$
Notice that nothing inside the minimum depends on anything other than the value of the input at the single time $t$, so the minimum over an input trajectory $u_{[t, t+\Delta t]}$ can be replaced by a minimum over a single input value $u$. Also, notice that $v(t, x)$ does not depend on $u$ at all, nor does $\partial v / \partial t$, so both can be brought outside the minimum:

$$v(t, x) = v(t, x) + \frac{\partial v}{\partial t}\,\Delta t + \min_{u} \left\{ g(x, u)\,\Delta t + \frac{\partial v}{\partial x}\,f(x, u)\,\Delta t \right\}.$$

To simplify further, we can subtract $v(t, x)$ from both sides and divide by $\Delta t$:

$$0 = \frac{\partial v}{\partial t} + \min_{u} \left\{ g(x, u) + \frac{\partial v}{\partial x}\,f(x, u) \right\} \tag{5}$$
This equation is called the Hamilton-Jacobi-Bellman Equation, or simply the HJB equation. As you can see, it is a partial differential equation, so it needs a boundary condition. This is easy to obtain. In particular, going all the way back to the definition (3), we find that

$$v(t_1, x) = h(x) \tag{6}$$

because at time $t_1$ there is no time left to accumulate running cost, and all that remains is the terminal cost.
The importance of HJB is that if you can find a solution to (5)-(6) — that is, if you can find a function $v(t, x)$ that satisfies both the HJB equation and its boundary condition — then the minimum of the original problem (1) is $v(t_0, x_0)$, and a minimizer is obtained by choosing, at each time $t$ and state $x$, the input $u$ that achieves the minimum in (5).
Here is the linear quadratic regulator (LQR) problem:

$$\begin{aligned} \underset{u_{[t_0, t_1]}}{\text{minimize}} \quad & \int_{t_0}^{t_1} \Big( x(t)^T Q\, x(t) + u(t)^T R\, u(t) \Big)\, dt + x(t_1)^T M\, x(t_1) \\ \text{subject to} \quad & \dot{x}(t) = A\,x(t) + B\,u(t), \quad x(t_0) = x_0 \end{aligned}$$
It is an optimal control problem — if you define

$$f(x, u) = Ax + Bu \qquad g(x, u) = x^TQx + u^TRu \qquad h(x) = x^TMx,$$

then you see that this problem has the same form as (1). It is called “linear” because the dynamics are those of a linear (state space) system. It is called “quadratic” because the cost is quadratic (i.e., polynomial of order at most two) in $x(t)$ and $u(t)$.
The matrices $Q$, $R$, and $M$ are parameters that we get to choose. They must be symmetric, and in addition we require

$$Q \succeq 0 \qquad R \succ 0 \qquad M \succeq 0.$$

What this notation means is that $Q$ and $M$ are positive semidefinite ($x^TQx \geq 0$ for all $x$) and that $R$ is positive definite ($u^TRu > 0$ for all $u \neq 0$).
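For a symmetric matrix, these conditions are equivalent to conditions on eigenvalues, which gives one way to check them numerically. A minimal sketch (not from the original notes):

import numpy as np

def is_positive_semidefinite(Q, tol=1e-12):
    # Symmetric, with all eigenvalues nonnegative
    return np.allclose(Q, Q.T) and np.all(np.linalg.eigvalsh(Q) >= -tol)

def is_positive_definite(R, tol=1e-12):
    # Symmetric, with all eigenvalues strictly positive
    return np.allclose(R, R.T) and np.all(np.linalg.eigvalsh(R) > tol)

print(is_positive_semidefinite(np.diag([1.0, 0.0])))  # True
print(is_positive_definite(np.diag([1.0, 0.0])))      # False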
Suppose someone hands you a matrix $Q$ that is not symmetric. First, notice that $x^TQx$ is a scalar, so it is equal to its own transpose:

$$x^TQx = \left( x^TQx \right)^T = x^TQ^Tx.$$

In fact, this means that

$$x^TQx = \tfrac{1}{2}\left( x^TQx + x^TQ^Tx \right) = x^T \left( \frac{Q + Q^T}{2} \right) x.$$

Second, notice that we can replace $Q$ with the symmetric matrix $(Q + Q^T)/2$ without changing the cost at all. So, it is not that a non-symmetric $Q$ is forbidden; rather, nothing is lost by assuming that $Q$ (and likewise $R$ and $M$) is symmetric.
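A short numerical check of this fact (not from the original notes):

import numpy as np

rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 3))  # not symmetric
x = rng.standard_normal(3)

# Replacing Q by its symmetric part does not change the cost
print(x @ Q @ x)
print(x @ ((Q + Q.T) / 2) @ x)  # same value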
Our “Solution approach” tells us to solve the LQR problem in two steps. First, we find a function $v(t, x)$ that satisfies the HJB equation (5) and the boundary condition (6). What function $v(t, x)$ should we guess? Since the cost is quadratic, it is reasonable to guess that the value function is quadratic as well. This function has the form

$$v(t, x) = x^T P(t)\, x$$

for some symmetric matrix $P(t)$ that may vary in time. To apply HJB, we need the partial derivatives of $v$:

$$\frac{\partial v}{\partial t} = x^T \dot{P} x \qquad\qquad \frac{\partial v}{\partial x} = 2x^TP.$$
This is matrix calculus (e.g., see wikipedia or Chapter A.4.1 of Convex Optimization (Boyd and Vandenberghe)). The result on the left should surprise no one. The result on the right is the matrix equivalent of $\frac{d}{du}\left( pu^2 \right) = 2pu$. You could check that this result is correct by considering an example, such as the numerical sketch below.
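For instance, here is a small sketch (the matrices are arbitrary examples, not from the original notes) that compares a finite-difference gradient of $v(x) = x^TPx$ against $2x^TP$:

import numpy as np

rng = np.random.default_rng(1)
P = rng.standard_normal((3, 3))
P = (P + P.T) / 2  # make P symmetric
x = rng.standard_normal(3)

def v(x):
    return x @ P @ x

# Finite-difference approximation of the gradient of v at x
eps = 1e-6
grad_fd = np.array([(v(x + eps * e) - v(x - eps * e)) / (2 * eps) for e in np.eye(3)])

print(grad_fd)   # approximately 2 P x
print(2 * P @ x)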
Plugging these partial derivatives into HJB, we have

$$0 = x^T\dot{P}x + \min_{u} \left\{ x^TQx + u^TRu + 2x^TP\big(Ax + Bu\big) \right\}. \tag{7}$$
To evaluate the minimum, we apply the first-derivative test (more matrix calculus!). Taking the partial derivative of the expression inside the braces with respect to $u$ and setting it equal to zero, we find

$$0 = 2u^TR + 2x^TPB.$$

This equation is easily solved:

$$u = -R^{-1}B^TPx.$$
Plugging this back into (7), we have

$$0 = x^T \left( \dot{P} + Q + PA + A^TP - PBR^{-1}B^TP \right) x$$

(here we used the fact that $2x^TPAx = x^T\left( PA + A^TP \right)x$). In order for this equation to be true for any $x$, the symmetric matrix in parentheses must be zero. Equivalently,

$$\dot{P} = -\left( Q + PA + A^TP - PBR^{-1}B^TP \right).$$
In summary, we have found that

$$v(t, x) = x^TP(t)x$$

solves the HJB equation, where $P(t)$ is found by integrating the matrix differential equation (called a Riccati equation)

$$\dot{P} = -\left( Q + PA + A^TP - PBR^{-1}B^TP \right)$$

backward in time, starting from the boundary condition

$$P(t_1) = M,$$

which comes from matching $v(t_1, x) = x^TMx$ in (6).
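As an illustration, here is a minimal sketch of integrating this Riccati equation backward in time with scipy (the system and cost matrices are hypothetical examples, not from the original notes):

import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical example: double integrator with quadratic cost
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)
M = np.zeros((2, 2))
t0, t1 = 0.0, 10.0

def riccati(t, p):
    P = p.reshape(2, 2)
    dP = -(Q + P @ A + A.T @ P - P @ B @ np.linalg.inv(R) @ B.T @ P)
    return dP.flatten()

# Integrate backward from t1 to t0, starting at P(t1) = M
sol = solve_ivp(riccati, (t1, t0), M.flatten())
P_t0 = sol.y[:, -1].reshape(2, 2)
print(P_t0)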
Now that we know $v(t, x)$, we can find the solution to the original problem. The optimal input is the one that achieves the minimum in (7):

$$u(t) = -R^{-1}B^TP(t)\,x(t) = -K(t)\,x(t), \qquad \text{where} \qquad K(t) = R^{-1}B^TP(t),$$

for all $t \in [t_0, t_1]$. In the special case of an infinite horizon ($t_1 \to \infty$, with $M = 0$), $P(t)$ approaches a constant matrix $P$ under appropriate conditions (e.g., controllability). Then $\dot{P} = 0$, so the Riccati differential equation becomes the continuous-time algebraic Riccati equation

$$0 = Q + PA + A^TP - PBR^{-1}B^TP,$$

and the gain matrix $K = R^{-1}B^TP$ is constant as well.
Here is how to find the solution to an LQR problem with Python:
import numpy as np
from scipy import linalg

def lqr(A, B, Q, R):
    # Find the cost matrix P by solving the algebraic Riccati equation
    P = linalg.solve_continuous_are(A, B, Q, R)
    # Find the gain matrix K
    K = linalg.inv(R) @ B.T @ P
    return K
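For example, a usage sketch (the matrices here are hypothetical, for a double integrator):

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)  # state cost
R = np.eye(1)  # input cost

K = lqr(A, B, Q, R)
print(K)

# The closed-loop system dx/dt = (A - B K) x should be stable, i.e.,
# all eigenvalues of A - B K should have negative real part
print(np.linalg.eigvals(A - B @ K))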