Optimization Tips & Tricks

This script contains the examples shown in the webinar titled Optimization Tips and Tricks: Getting Started Using Optimization with MATLAB, presented live on 21 August 2008. To view the webinar, go to the MathWorks web site and click on recorded webinars.

I've tried to make the demos self-explanatory in this file, but if something is broken or unclear, please let me know.

To run the demos in this file, you need Optimization Toolbox and Genetic Algorithm and Direct Search Toolbox, which I will refer to as GADS. These files work with Release R2008a. You will also need Parallel Computing Toolbox to run the parallel examples. Most demos should work with older releases, but some, in particular the parallel optimization examples, require R2008a products. You can obtain a demo copy from http://www.mathworks.com/products/optimization/tryit.html or http://www.mathworks.com/products/gads/tryit.html

Contents

Introduction to Optimization - Mt. Washington Demo

This first example shows how optimization works in general. Optimization solvers are domain-searching algorithms that differ in the types of problems (or domains) they can solve (search). This example shows how the pattern search algorithm can be used to find the peak of the White Mountain Range.

Note: this demo is available in the GADS Toolbox.

% Load data for white mountains
load mtWashington;
% Load psproblem, which has all required settings for pattern search
load psproblem;
% Run the solver and show displays
patternsearch(psproblem);

The next command loads the example into Optimization Tool, a graphical user interface for setting up and running optimization problems. You can access Optimization Tool from the Start menu (Start --> Toolboxes --> Optimization --> Optimization Tool (optimtool)) or by typing optimtool at the command prompt.

% Remove the % symbol if you'd like to run this part of the code.
% mtwashingtondemo

You should see two plots when running the Mt. Washington demo. One is a surface plot; the other is a topology map. On the topology map you can see the starting point, the iterates (filled circles), and the tested points that were not selected (+ symbols). There is also a slider bar on the right that you can use to speed up or slow down the process.

Notice how the pattern search solver expands and contracts the search radius as it explores the domain for the maximum value. This is an example of a patterned search, as the name implies, and is only one of many search patterns the pattern search solver supports.
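The expand/contract behavior can be sketched with a toy compass search. This is an illustrative simplification, not the Toolbox algorithm, run on a hypothetical quadratic objective:

```matlab
% Toy compass search sketch (not the Toolbox implementation):
% poll four compass points around the incumbent; expand the mesh
% on a successful poll, contract it on failure.
f = @(x) x(1)^2 + x(2)^2;          % hypothetical objective to minimize
x = [2 2];  delta = 1;             % start point and initial mesh size
dirs = [1 0; -1 0; 0 1; 0 -1];     % compass poll directions
for iter = 1:100
    improved = false;
    for k = 1:4
        trial = x + delta*dirs(k,:);
        if f(trial) < f(x)         % successful poll: move and stop polling
            x = trial; improved = true; break
        end
    end
    if improved
        delta = 2*delta;           % expand mesh after success
    else
        delta = delta/2;           % contract mesh after failure
    end
    if delta < 1e-6, break, end    % stop when mesh is fine enough
end
x                                  % ends near the minimizer [0 0]
```

The real solver supports many poll patterns and options, but the expand-on-success, contract-on-failure logic is the behavior visible in the demo's animation.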

I showed this example to illustrate how effective the pattern search solver can be on a highly rough surface like this one, which gradient-based solvers like the ones in Optimization Toolbox couldn't solve.

This leads to the primary difference between the Optimization Toolbox solvers and the GADS solvers. Optimization Toolbox solvers rely on gradients (well, most of them do anyway; fminsearch does not) to determine the search direction. GADS solvers use gradient-free methods to find search directions and optimal values. This makes GADS solvers more useful on highly nonlinear, discontinuous, ill-defined, or stochastic problems. They are also better at finding global solutions, although they do not guarantee a global solution will be found.

The price you pay is that the GADS solvers are often slower and cannot handle problems as large as those the Optimization Toolbox solvers can.

Getting Started - MATLAB Backslash

Let's start with a simple curve fitting example. We have time (t) and a response (y). Looking at the plot, you can see that the curve follows an exponential decay.

close all, clear all
t = [0 .3 .8 1.1 1.6 2.3]';
y = [.82 .72 .63 .60 .55 .50]';
plot(t,y,'o')

Let's assume that an equation of this form can be used to describe the data: y(t) = c1 + c2*exp(-t).

We can linearize this problem and solve it in MATLAB using the backslash operator.

E = [ones(size(t)) exp(-t)]
c = E\y
E =

    1.0000    1.0000
    1.0000    0.7408
    1.0000    0.4493
    1.0000    0.3329
    1.0000    0.2019
    1.0000    0.1003


c =

    0.4760
    0.3413

Notice that we have an overdetermined system. The neat thing about backslash (\) is that when you try to solve an overdetermined system (more equations than unknowns), it minimizes the error in the estimated solution. It's an optimization solver (it can be used as a least-squares minimizer).
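As a quick check of that claim (my own illustration, not part of the webinar), the backslash solution can be compared with the classical normal-equations solution, which also minimizes the sum of squared residuals:

```matlab
% Sketch: backslash on an overdetermined system returns the
% least-squares solution, the same one the normal equations give.
% Assumes E and y from above are in the workspace.
c_backslash = E\y;
c_normal    = (E'*E)\(E'*y);   % normal-equations least-squares solution
norm(c_backslash - c_normal)   % difference is tiny (rounding error only)
```

In practice backslash is preferred over forming E'*E explicitly, since it avoids squaring the condition number of the system.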

Let's plot the results. Not a bad fit.

yhat = @(c,t) c(1) + c(2)*exp(-t); % I'll describe function handles in a bit
ezplot(@(t)yhat(c,t),[0,2.5])
hold on
plot(t,y,'o')
hold off

Now Solve using lsqnonneg optimization solver

Let's try solving this with an optimization solver. Say lsqnonneg.

c = lsqnonneg(E,y)
c =

    0.4760
    0.3413

We get the same result. Note that lsqnonneg solves a linear least-squares problem subject to the constraint that the solution be nonnegative; our unconstrained solution happened to be nonnegative already, which is why the results match. If you haven't already, take a look at the optimization documentation for lsqnonneg. And if you have R2008a or later, take a look at the new documentation (Optimization Toolbox --> Optimization Overview --> Choosing a Solver). This page gives a nice summary of the optimization solvers, their problem types, and a decision matrix/table at the bottom of the page. Look at the least-squares solvers and note the different formulations.

That wasn't too hard, but let's now add a constraint to our curve fitting problem and make it a bit more challenging (i.e., something that Curve Fitting Toolbox doesn't currently support).

Constrain the y-intercept

Let's say that we know the data value at time 0 precisely. That is, we can set the starting condition precisely for this exponentially decaying system. This is a constraint, and we can add it to our least squares minimization problem that we've developed so far.

This constraint is a linear constraint on c(1) and c(2). At t = 0, c(1) + c(2)*1 = 0.82, which in the form Aeq*x = beq becomes [1 1]*[c1 c2]' = 0.82. I've written the constraint in the format accepted by the Optimization Toolbox solvers. And if we look at the documentation, we'd find that the solver that matches our problem is lsqlin.

Aeq = [1 1];
beq = 0.82;
cc = lsqlin(E,y,[],[],Aeq,beq)
Warning: Large-scale method can handle bound constraints only;
    using medium-scale method instead.
Optimization terminated.

cc =

    0.4749
    0.3451

Now let's plot it against our data and the unconstrained fit earlier. We can see that it has a slightly different shape, and if you were to zoom into the y-intercept, you'd see that it passes through our data point where the unconstrained fit did not.

ezplot(@(t)yhat(c,t),[0,2.5])
hold on
tt = 0:0.1:2.5;
plot(tt,yhat(cc,tt),'g')
plot(t,y,'o')
hold off
legend('Unconstrained Fit', 'Constrained Fit', 'Data')

Working with function handles

The solvers I showed earlier accepted MATLAB data types (arrays) to define the problem and constraints. However, most of the optimization solvers require that a function be passed in as the objective function (the same is true for nonlinear constraints). The mechanism that allows you to pass a function into another function is known as a function handle.

A function handle keeps reference to your function and allows you to pass a function into another function.

Let's redefine our equation as an anonymous function (an anonymous function is simply a function that exists only in memory, not in an M-file).

yhat = @(c,t) c(1) + c(2)*exp(-t)
yhat(cc,t) % pass in the previous solution and the time vector
yhat = 

    @(c,t)c(1)+c(2)*exp(-t)


ans =

    0.8200
    0.7305
    0.6299
    0.5897
    0.5445
    0.5095

Note the @ symbol. This declares a function handle, and you can see that it behaves like a function call. Note that I can even redefine my function to take in only one input argument (this will be important later). Let's redefine yhat to be only a function of c.

yhat2 = @(c) yhat(c,t)
yhat2(cc)
yhat2 = 

    @(c)yhat(c,t)


ans =

    0.8200
    0.7305
    0.6299
    0.5897
    0.5445
    0.5095

Notice that when I redefined yhat as yhat2, the time vector t is not passed. How does MATLAB know how to use t in this case? When I redefined the function to accept only c, MATLAB stored the time vector in the function handle object, so it can use it.

To prove this point, notice what happens if I redefine t: yhat2 does not use the new value, since it still references the t that was defined at the time I declared yhat2. So t became a constant vector, saved with the function handle definition of yhat2.

This is often a point of confusion when first starting to use function handles, so it's good to be aware of how parameters get saved when creating a function handle.

t = 0:0.1:2.5; % has more points than before.
y2 = yhat2(cc)
y1 = yhat(cc,t')
y2 =

    0.8200
    0.7305
    0.6299
    0.5897
    0.5445
    0.5095


y1 =

    0.8200
    0.7872
    0.7574
    0.7305
    0.7062
    0.6842
    0.6643
    0.6462
    0.6299
    0.6152
    0.6018
    0.5897
    0.5788
    0.5689
    0.5600
    0.5519
    0.5445
    0.5379
    0.5319
    0.5265
    0.5216
    0.5171
    0.5131
    0.5095
    0.5062
    0.5032

Solve using lsqcurvefit

Ok, so let's get back to the reason I introduced function handles. You need them for most of the nonlinear optimization solvers, as they expect you to pass in a function as the objective or nonlinear constraint.

lsqcurvefit expects a function to be passed in. Here I pass it yhat (note that yhat is already a function handle in the workspace; if I wanted to pass in an M-file function called yhat, I would need to pass it as @yhat). Also notice that lsqcurvefit allows you to pass in the data to fit to, in this case t and y. You can see we get the same result as in the unconstrained case (this solver does not allow linear constraints, which is why we're using the unconstrained case).

t = [0 .3 .8 1.1 1.6 2.3]';     % define again since we changed it earlier
clear yhat2;                    % no longer needed
c = lsqcurvefit(yhat,[1 1],t,y)
Optimization terminated: first-order optimality less than OPTIONS.TolFun,
 and no negative/zero curvature detected in trust region model.

c =

    0.4760    0.3413

Solve using lsqnonlin

lsqcurvefit is one of the few solvers that let you pass in an objective function with more than one input argument. Most of our solvers only allow you to pass in one input, the decision vector; lsqnonlin is an example. Notice in this case that I redefine yhat to be only a function of the decision variables, c. It also differs from the yhat2 I defined earlier in that it subtracts out the y data, i.e., it returns only the residuals of the fit. So y and t are passed in as constant parameters to this optimization problem.

This is an example of how you can use function handles to pass in data that your objective function may need to perform a calculation but is considered constant for the optimization problem.

These results agree with prior ones.

c = lsqnonlin(@(c)yhat(c,t)-y,[1 1])
Optimization terminated: first-order optimality less than OPTIONS.TolFun,
 and no negative/zero curvature detected in trust region model.

c =

    0.4760    0.3413

Passing data using function handles with an M-file objective function

This example shows how to pass in the constant data t and y when using an M-file function called myCurveFcn (it's the same as yhat, but in an M-file).

c = lsqnonlin( @(c)myCurveFcn(c,t,y), [1 1])
Optimization terminated: first-order optimality less than OPTIONS.TolFun,
 and no negative/zero curvature detected in trust region model.

c =

    0.4760    0.3413

Nested functions

While function handles are very useful for passing constant parameters, there is another method that works well. This example shows how to use a nested function to pass data around in functions. Note that t and y are not passed to the objective function in the calling syntax. The nested function has access to its parent function's workspace; this is where it gets the values of t and y that it needs in the body of the objective function.

type nestedOptimExample
c = nestedOptimExample()
function c = nestedOptimExample()

% Data
t = [0 .3 .8 1.1 1.6 2.3]';
y = [.82 .72 .63 .60 .55 .50]';

% Curve fitting (Optimization routine)
c = lsqnonlin( @myNestedCurve, [1 1]);

% Objective Function
% Note that t and y are not passed in the function definition
    function resid = myNestedCurve(x)
        yhat = x(1) + x(2)*exp(-t); % where did t come from?
        resid = yhat - y;           % where does y come from?
        
        % t and y are defined in the parent function. Nested functions
        % share the parent's workspace, so t and y do not need to be
        % passed in explicitly.
    end
end
        
Optimization terminated: first-order optimality less than OPTIONS.TolFun,
 and no negative/zero curvature detected in trust region model.

c =

    0.4760    0.3413

Nonlinear Optimization and Topology Considerations

Let's switch gears now and talk about nonlinear optimization. PEAKS is a function that ships with MATLAB and is shown below. It is a highly nonlinear function; note how close together the peaks and valleys are. This is a very challenging surface to minimize using a gradient-based solver, as we'll soon see.

peaks
 
z =  3*(1-x).^2.*exp(-(x.^2) - (y+1).^2) ... 
   - 10*(x/5 - x.^3 - y.^5).*exp(-x.^2-y.^2) ... 
   - 1/3*exp(-(x+1).^2 - y.^2) 
 

Let's define our optimization problem. I want to find the minimum value of the PEAKS function. My objective function is the PEAKS function, which I've redefined in an M-file function called peaksObj. It's been redefined here so that it can accept data in the format that the optimization solvers prefer. I'd also like to add a nonlinear constraint to this problem, so you can see how to add them for your problem. I've defined my nonlinear constraint in a function called peaksCon.

clear all, close all
type peaksObj
type peaksCon
function f = peaksObj(x)
%PEAKSOBJ casts PEAKS function to a form accepted by optimization solvers.
%   PEAKSOBJ(X) calls PEAKS for use as an objective function for an
%   optimization solver.  X must conform to a M x 2 or N x 2 array to be
%   valid input.
%
%   Syntax
%      f = peaksObj(x)
%
%   Example
%      x = [ -3:1:3; -3:1:3]
%      f = peaksObj(x)
%
%   See also peaks

% Check x size to pass data correctly to PEAKS
[m,n] = size(x);

if (m*n) < 2
    error('peaksObj:inputMissing','Not enough inputs');
elseif (m*n) > 2 && (min(m,n) == 1) || (min(m,n) > 2)
    error('peaksObj:inputError','Input must have dimension m x 2');
elseif n ~= 2
    x = x';
end 

% Objective function
f = peaks(x(:,1),x(:,2));


function [c,ceq] = peaksCon(x)
%PEAKSCON Constraint function for optimization with PEAKSOBJ
%   PEAKSCON(X) is the constraint function for use with PEAKSOBJ.  X is of
%   size M x 2 or 2 x N.
%
%   Syntax
%      [c,ceq] = peaksCon(x)
%
%   See also peaksobj, peaks

% Check x size to pass data correctly to constraint definition
[m,n] = size(x);
if (m*n) < 2
    error('peaksObj:inputMissing','Not enough inputs');
elseif (m*n) > 2 && (min(m,n) == 1) || (min(m,n) > 2)
    error('peaksObj:inputError','Input must have dimension m x 2');
elseif n ~= 2
    x = x';
end
% Set plot function to plot constraint boundary
try
    mypref = 'peaksNonlinearPlot';
    if ~ispref(mypref)
        addpref(mypref,'doplot',true);
    else
        setpref(mypref,'doplot',true);
    end
catch
end

% Define nonlinear equality constraint
ceq = [];

% Define nonlinear inequality constraint
% x1^2 + x2^2 <= 3^2
c = x(:,1).^2 + x(:,2).^2 - 9; 
% fmincon's accepted input form is c <= 0



We'll use the constrained nonlinear minimization solver, fmincon, to solve this problem. Let's define the starting point and solve the problem. (Note that I've created an output plotting function so we can see the solver's iteration history, and set it using the solver options.)

x0 = [0 1.5];
options = optimset('Disp','iter','OutputFcn',@peaksOutputFcn);
x = fmincon(@peaksObj,x0,[],[],[],[],[],[],@peaksCon,options)
Warning: Trust-region-reflective method does not currently solve this type of problem,
 using active-set (line search) instead.

                                Max     Line search  Directional  First-order 
 Iter F-count        f(x)   constraint   steplength   derivative   optimality Procedure 
    0      3      7.99662        -6.75                                         
    1      6     -3.73667       -7.574            1        -29.3         10.7   
    2     11     -4.43864       -4.976         0.25         23.2         22.8  Hessian modified twice  
    3     19     -4.44456       -4.667       0.0313           28         19.5   
    4     35     -6.14699       -6.299     0.000122     8.4e+003         3.78   
    5     38     -6.54458       -6.296            1        0.117          0.5   
    6     41     -6.55111       -6.303            1     0.000584       0.0201   
    7     44     -6.55113       -6.307            1     1.2e-005       0.0108   
    8     47     -6.55113       -6.305            1     4.4e-007      0.00105  Hessian modified  
Optimization terminated: magnitude of directional derivative in search
 direction less than 2*options.TolFun and maximum constraint violation
  is less than options.TolCon.
No active inequalities.

x =

    0.2283   -1.6256

The top plot is a surface plot of the peaks function. The lower plot is a contour plot of the upper plot with the constraint boundary shown in the shaded blue region. This is where we violate the constraint.

Notice how fast it converged to a solution. Looks good, right? Let's see what happens when we change the starting point.

x0 = [-3 -3];
x = fmincon(@peaksObj,x0,[],[],[],[],[],[],@peaksCon,options)
Warning: Trust-region-reflective method does not currently solve this type of problem,
 using active-set (line search) instead.

                                Max     Line search  Directional  First-order 
 Iter F-count        f(x)   constraint   steplength   derivative   optimality Procedure 
    0      3 6.67128e-005            9                                         Infeasible start point
    1      6    0.0141123        1.125            1       0.0565        0.532   
    2      9    0.0263901      0.03138            1       0.0157       0.0175   
    3     12    0.0266373    0.0002477            1     0.000238       0.0117   
    4     15    0.0144905       0.1616            1      -0.0153       0.0366  Hessian modified twice  
    5     18  -0.00406119       0.3202            1      -0.0149         0.31  Hessian modified  
    6     24    -0.167801        -1.94        0.125       0.0239        0.673   
    7     31    -0.217389       -2.388       0.0625        0.123         0.79   
    8     34    -0.862566       -4.674            1       -0.709         1.74   
    9     45     -1.19241       -6.091      0.00391         86.2         3.95  Hessian modified  
   10     48     -1.85816       -6.444            1       -0.644         3.71   
   11     56     -2.90929       -7.486       0.0313         22.8         2.12   
   12     59     -3.02511        -7.12            1       0.0704        0.671   
   13     62     -3.04903       -7.148            1     -0.00704        0.116   
   14     65     -3.04985       -7.144            1   -5.92e-006      0.00879   
   15     68     -3.04985       -7.143            1   -4.37e-007     0.000802   
Optimization terminated: magnitude of directional derivative in search
 direction less than 2*options.TolFun and maximum constraint violation
  is less than options.TolCon.
No active inequalities.

x =

   -1.3473    0.2045

Bummer! We didn't get the right solution. We did find a local minimum, but not the global minimum. Is this a problem with fmincon? No; this solver was designed to be fast and efficient at finding local minima. It wasn't designed to be a global solver. Most optimization solvers are only local solvers, because it is very difficult to guarantee a global solution for nonlinear optimization problems like this one.

This problem also illustrates that the choice of starting point is often critical to finding the right solution.

Try random starting points (4 points drawn from a uniform distribution)

For most practical nonlinear problems, you probably will not run into this issue. But if you suspect that your problem may be this nonlinear, we can get around the starting point issue by using multiple starting points (commonly referred to as multistart optimization). Let's do a simple example. I'll run the same optimization problem from four different starting points. How should I choose them? There are a variety of methods to choose from (such as design of experiments, space-filling designs, grid patterns, or any other you can think of). I'm simply going to generate four random starting points and see how well this works. I'll use a uniform random number generator (all points are equally likely to be selected) and bound the starting points to the range of +/- 3 on each axis.

a = -3;
b = 3;
x1 = a + (b-a) * rand(2);
x2 = a + (b-a) * rand(2);
x0r = [x1; x2];

for i = 1:length(x0r)
    xopt(i,:) = fmincon(@peaksObj,x0r(i,:),[],[],[],[],[],[],@peaksCon,options);
end
xopt
Warning: Trust-region-reflective method does not currently solve this type of problem,
 using active-set (line search) instead.

                                Max     Line search  Directional  First-order 
 Iter F-count        f(x)   constraint   steplength   derivative   optimality Procedure 
    0      3   -0.0795646      -0.4252                                         
    1      6    -0.413869       -2.275            1       -0.604         1.45   
    2     11     -2.56952       -6.502         0.25         72.4          7.4  Hessian modified twice  
    3     15     -5.56537       -5.314          0.5         6.81         6.11   
    4     22     -5.84013       -5.668       0.0625        0.418         3.92   
    5     25     -6.44176       -6.558            1        0.798         2.39   
    6     28     -6.54602       -6.248            1       0.0564        0.501   
    7     31     -6.55112       -6.303            1    -0.000436       0.0205   
    8     34     -6.55113       -6.306            1    5.55e-007     0.000984   
Optimization terminated: magnitude of directional derivative in search
 direction less than 2*options.TolFun and maximum constraint violation
  is less than options.TolCon.
No active inequalities.
Warning: Trust-region-reflective method does not currently solve this type of problem,
 using active-set (line search) instead.

                                Max     Line search  Directional  First-order 
 Iter F-count        f(x)   constraint   steplength   derivative   optimality Procedure 
    0      3   0.00611663         3.08                                         Infeasible start point
    1      6    0.0569658       0.1963            1        0.114        0.161   
    2      9    0.0603061     0.006044            1      0.00313        0.053   
    3     12    0.0410363      0.08198            1      -0.0174        0.123  Hessian modified  
    4     19    0.0198556       0.4217       0.0625       -0.282       0.0935   
    5     22    0.0268688      0.09531            1      0.00751       0.0125   
    6     25    0.0279356      0.01503            1      0.00104       0.0253   
    7     28    0.0279799     0.005353            1    9.12e-006     0.000958  Hessian modified  
    8     31    0.0280466     0.001358            1    5.53e-005     0.000208  Hessian modified  
    9     34    0.0280769   2.295e-005            1    3.01e-005    2.39e-005  Hessian modified  
   10     37    0.0280774   8.225e-008            1    5.24e-007    9.95e-007  Hessian modified  
Optimization terminated: first-order optimality measure less
 than options.TolFun and maximum constraint violation is less
 than options.TolCon.
Active inequalities (to within options.TolCon = 1e-006):
  lower      upper     ineqlin   ineqnonlin
                                     1
Warning: Trust-region-reflective method does not currently solve this type of problem,
 using active-set (line search) instead.

                                Max     Line search  Directional  First-order 
 Iter F-count        f(x)   constraint   steplength   derivative   optimality Procedure 
    0      3      -3.4031       -6.603                                         
    1      9     -5.03052       -5.169        0.125         46.6         6.18   
    2     15     -5.22044       -5.843        0.125         3.86         5.42   
    3     18     -5.82527       -6.669            1          3.3         3.81   
    4     21     -6.53704       -6.221            1        0.221        0.735   
    5     24     -6.55099       -6.315            1      0.00183       0.0842   
    6     27     -6.55112       -6.304            1    8.92e-005       0.0195   
    7     30     -6.55113       -6.305            1    4.86e-007     0.000687  Hessian modified  
Optimization terminated: magnitude of directional derivative in search
 direction less than 2*options.TolFun and maximum constraint violation
  is less than options.TolCon.
No active inequalities.
Warning: Trust-region-reflective method does not currently solve this type of problem,
 using active-set (line search) instead.

                                Max     Line search  Directional  First-order 
 Iter F-count        f(x)   constraint   steplength   derivative   optimality Procedure 
    0      3    -0.390414        -3.09                                         
    1      6     -2.29416       -7.845            1         6.76         4.76   
    2     10     -2.71202       -6.502          0.5         2.79         2.44   
    3     13     -3.02988       -6.998            1       -0.123        0.745   
    4     18     -3.03676       -7.104         0.25       0.0651        0.486   
    5     21     -3.04064        -7.21            1       0.0384        0.389   
    6     24     -3.04983       -7.138            1     0.000688       0.0242   
    7     27     -3.04985       -7.143            1    6.69e-007     0.000986   
Optimization terminated: magnitude of directional derivative in search
 direction less than 2*options.TolFun and maximum constraint violation
  is less than options.TolCon.
No active inequalities.

xopt =

    0.2282   -1.6255
    2.7571    1.1824
    0.2282   -1.6256
   -1.3474    0.2046

Now this result will be different each time you run it, but you should see that in this case we found the global minimum at least once (possibly more). So multistart can be an effective method for finding the global minimum. Also note that I find additional local minima when using this method. These may also be useful, depending upon your problem. For example, the global minimum may not be a stable condition when you take uncertainty in X1/X2 into consideration, and you might prefer one of the other minima in that case.

I used only 4 points, but depending upon your problem and the size of the domain you would like to search, you may need many more. So choose your multistart strategy carefully and incorporate any knowledge you have about your problem to guide you in deciding how many points you need.
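A larger multistart run can be sketched by extending the loop above to record the objective value at each local solution and keep the best one. This is my own sketch; nStarts and the 'Display' setting are assumptions, not from the webinar:

```matlab
% Multistart sketch: run fmincon from many uniform random starting
% points in [-3,3]^2 and keep the solution with the lowest objective.
% Assumes peaksObj and peaksCon (defined above) are on the path.
nStarts = 20;                            % assumed number of starts
a = -3; b = 3;
x0r  = a + (b-a)*rand(nStarts,2);        % uniform random starts
xopt = zeros(nStarts,2);
fopt = zeros(nStarts,1);
opts = optimset('Display','off');        % suppress per-run output
for i = 1:nStarts
    [xopt(i,:),fopt(i)] = fmincon(@peaksObj,x0r(i,:),[],[],[],[], ...
                                  [],[],@peaksCon,opts);
end
[fbest,idx] = min(fopt);                 % best local solution found
xbest = xopt(idx,:)
```

Recording fopt also lets you inspect the other local minima found, which, as noted above, can be valuable in their own right.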

Solve using genetic algorithms

One of the benefits of the solvers in the Genetic Algorithm and Direct Search Toolbox is that they are useful for problems that are highly nonlinear, such as this one. They are also useful for problems that are stochastic, contain custom data types such as integer sets, have undefined derivatives or holes in the search domain, or are ill-defined. They are also better at finding global solutions than the gradient-based solvers in Optimization Toolbox.

But they are more expensive and are limited to solving smaller problems.

Let's solve this problem using genetic algorithms (GA).

options = gaoptimset('Disp','iter','OutputFcns',@peaksOutputFcn);
x = ga(@peaksObj,2,[],[],[],[],[],[],@peaksCon,options)
                           Best       max        Stall
Generation  f-count        f(x)     constraint  Generations
    1        1060      -6.54929            0      0
    2        2100      -6.55001            0      0
    3        3140      -6.55044            0      0
    4        4180      -6.55087            0      0
    5        5220      -6.55087            0      1
Optimization terminated: average change in the fitness value less than options.TolFun
 and constraint violation is less than options.TolCon.

x =

    0.2279   -1.6213

The plot shows how GA progresses towards a solution. It uses a stochastic search based upon an initial population of points (green), then selects a subpopulation (black) for creating a new population to evaluate. GA gets its name from its search method, which selectively generates new candidate points to evaluate using a method similar to breeding: two 'parents' combine to generate a 'child'. I won't get into the details here, but if you want to know more about GA solvers, take a look at our documentation or watch the Genetic Algorithms for Financial Applications webinar (it has a nice description of GA algorithms).

You can see that the final population (red) contains the global minimum.

Solve using simulated annealing

Another popular solver is simulated annealing. This solver is a little more cost-effective than GA, but note that it doesn't allow you to solve a problem with nonlinear constraints. For our problem, though, we can approximate the circular nonlinear constraint with a rectangular region, so we'll use lower/upper bounds to add this constraint. Notice how this solver works. It starts out by randomly sampling the domain and gradually reduces the search radius around a minimum it has found. Periodically, the solver will reset and start searching a wider radius. This is the mechanism that allows the solver to jump out of a local minimum and find the global minimum.

lb = [-3 -3];
ub = -lb;
options = saoptimset('Display','iter','OutputFcns',@peaksOutputFcn);
x = simulannealbnd(@peaksObj,x0,lb,ub,options)
                           Best        Current           Mean
Iteration   f-count         f(x)         f(x)         temperature
     0          1   6.67128e-005   6.67128e-005            100
    10         11      -0.100733      -0.100733          56.88
    20         21      -0.599882     -0.0186801        34.0562
    30         31       -4.06499     -0.0658076        20.3907
    40         41       -4.06499      0.0220188        12.2087
    50         51       -4.85205   -1.34497e-005        7.30977
    60         61       -4.85205     -0.0297897        4.37663
    70         71       -4.85205    5.7267e-005        2.62045
    80         81       -4.85205     0.00011612        1.56896
    90         91       -4.85205     0.00905351       0.939395
   100        101       -5.96702       -5.96702        0.56245
   110        111       -5.96702       -5.96702        0.33676
   120        121       -5.96702       -5.96702       0.201631
   130        131       -6.54645       -6.54645       0.120724
   140        141       -6.54645       -6.54645      0.0722817
   150        151       -6.54645       -6.52976      0.0432777
   160        161       -6.54645       -6.54053       0.025912
   170        171       -6.55003       -6.55003      0.0155145
   180        181       -6.55045       -6.54975     0.00928908
   190        191       -6.55045       -6.54471     0.00556171
   200        201       -6.55097       -6.54889        0.00333
   210        211       -6.55097       -6.54825      0.0019938
*  219        222       -6.55104       -6.55104        58.0536
   220        223       -6.55104       -3.08855        55.1509
   230        233       -6.55104     -0.0775709        33.0209
   240        243       -6.55104    0.000777385        19.7708
   250        253       -6.55104       -5.45924        11.8375
   260        263       -6.55104    -0.00416131        7.08756
   270        273       -6.55104   -0.000653794        4.24358
   280        283       -6.55104   -0.000455036        2.54079
   290        293       -6.55104       -1.52425        1.52127
   300        303       -6.55104       -1.04489       0.910838

                           Best        Current           Mean
Iteration   f-count         f(x)         f(x)         temperature
   310        313       -6.55104       -5.46296       0.545352
   320        323       -6.55104       -5.66249       0.326522
   330        333       -6.55104       -5.50983       0.195501
   340        343       -6.55104       -6.54819       0.117054
   350        353       -6.55104       -6.45574      0.0700844
   360        363       -6.55104         -6.469      0.0419621
   370        373       -6.55104       -6.49128      0.0251243
   380        383       -6.55104       -6.49261      0.0150428
   390        393       -6.55104       -6.50319      0.0090067
   400        403       -6.55104       -6.54814     0.00539264
   410        413       -6.55104       -6.54864     0.00322877
   420        423       -6.55104       -6.55045     0.00193319
   430        433       -6.55109       -6.55052     0.00115747
   440        443       -6.55109       -6.55088     0.00069302
   450        453        -6.5511        -6.5511    0.000414937
*  455        460       -6.55113       -6.55113        51.9753
   460        465       -6.55113    0.000468286        40.2175
   470        475       -6.55113   -5.52136e-005        24.0797
   480        485       -6.55113     0.00431116        14.4174
   490        495       -6.55113      -0.097089        8.63223
   500        505       -6.55113      -0.120368        5.16843
   510        515       -6.55113       -5.86936        3.09453
   520        525       -6.55113       -2.91456        1.85281
   530        535       -6.55113     0.00881933        1.10935
   540        545       -6.55113       -1.17515       0.664206
   550        555       -6.55113       -1.53236       0.397685
   560        565       -6.55113       -2.13656       0.238109
   570        575       -6.55113       -2.64956       0.142564
   580        585       -6.55113       -2.53638      0.0853586
   590        595       -6.55113       -3.00086      0.0511073
   600        605       -6.55113       -3.02188      0.0305999

                           Best        Current           Mean
Iteration   f-count         f(x)         f(x)         temperature
   610        615       -6.55113       -3.04576      0.0183213
...

Solve using pattern search

Let's now try pattern search and see how well it does. I explained patternsearch earlier, so I won't repeat the details here. Note, however, that this solver's default settings are tuned to find a local optimum. We'll add a search option (a Latin hypercube search) to make the search more global.

options = psoptimset('Display','iter','OutputFcns',@peaksOutputFcn,...
                    'CompleteSearch','on','SearchMethod',@searchlhs);
x = patternsearch(@peaksObj,x0,[],[],[],[],[],[],@peaksCon,options)
                                  max
  Iter   f-count      f(x)      constraint   MeshSize      Method
    0     1 6.67128e-005            9       0.9901    
    1    70     0.777299            0        0.001   Increase penalty
    2   236     -6.33651            0       1e-005   Increase penalty
    3   438      -6.5511            0       1e-007   Increase penalty
    4   716     -6.55113            0       1e-009   Increase penalty
Optimization terminated: mesh size less than options.TolMesh
 and constraint violation is less than options.TolCon.

x =

    0.2283   -1.6255

Which solver should I choose?

Let's time these approaches and see how they fare.

compareApproaches
Optimization terminated: average change in the fitness value less than options.TolFun
 and constraint violation is less than options.TolCon.


Results from different optimization solvers (default settings):

Solver					Fcn Calls		Time(s)
--------------------------------------------------------------
fmincon (lucky guess)   	47 			0.174108
fmincon (random start)  	189 		0.843512
Genetic Algorithm       	5220 		17.886585
Simulated Annealing     	2406 		1.226381
Pattern Search          	453 		3.991359

From this chart it's clear that using a gradient-based solver, such as fmincon, is the best approach. Notice that even when run from a random start, it is still faster than the other methods and often requires fewer function calls. Among the gradient-free methods, the order of preference in terms of function calls is PS, SA, then GA; in terms of time to solution, it is SA, PS, then GA. (Note that I haven't used the vectorize option in these solvers.)

Notice also that we can't necessarily equate function-evaluation counts with time to solution when comparing different solvers. This becomes even more of a consideration when you use parallel computing, which I'll discuss in a minute. In general, if you have a nonlinear problem like this one, or one that you can't use a gradient-based solver on, I'd start with PS, as it is typically the most efficient. I'd then try simulated annealing, which is a compromise between PS and GA and can also handle customized data types, such as integers.

However, GA is likely your best choice if you have custom data types, or if you have knowledge of the problem that lends itself to modifying GA's population creation, selection, and mutation functions. GA can be efficient if you take the time to set up these functions; I've glossed over this flexibility in the demo, and you can tune GA to perform considerably better.
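As an aside on the vectorize option mentioned above: it lets a solver like GA evaluate an entire population in a single function call, which can cut overhead substantially. The sketch below is not from the webinar files; the vectorized expression is written from the standard MATLAB peaks formula, and the objective is rewritten to take one candidate point per row.

```matlab
% Sketch (assumption: not part of the webinar files) of the 'Vectorized'
% option for ga. The objective accepts a matrix X with one point per row
% and returns a column of values, so ga evaluates the whole population
% in one call instead of once per individual.
vecPeaks = @(X) 3*(1-X(:,1)).^2.*exp(-X(:,1).^2 - (X(:,2)+1).^2) ...
    - 10*(X(:,1)/5 - X(:,1).^3 - X(:,2).^5).*exp(-X(:,1).^2 - X(:,2).^2) ...
    - 1/3*exp(-(X(:,1)+1).^2 - X(:,2).^2);      % vectorized peaks surface
opts = gaoptimset('Vectorized','on');
[x,fval] = ga(vecPeaks,2,[],[],[],[],[],[],[],opts);
```

The trade-off is that your objective must be written with element-wise operators (`.^`, `.*`) over rows, but the per-call overhead savings can be large for population-based solvers.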

Using parallel computing with optimization

Both Optimization Toolbox and Genetic Algorithm and Direct Search Toolbox have built-in support for parallel computing (this requires Parallel Computing Toolbox). The built-in support covers selected constrained nonlinear solvers in Optimization Toolbox and the genetic algorithm and pattern search solvers in GADS Toolbox.

Let's see how it works. First, let's time our objective function.

tic
peaksObj(x0)
toc
ans =

    7.9966

Elapsed time is 0.004151 seconds.

Let's now run our multistart fmincon example and time it.

clear all, close all, clc
a = -3;
b = 3;
x1 = a + (b-a) * rand(2);
x2 = a + (b-a) * rand(2);
x0r = [x1; x2];

options = optimset('Algorithm','active-set','Display','none');
tic
for i = 1:length(x0r)
    xopt(i,:) = fmincon(@peaksObj,x0r(i,:),[],[],[],[],[],[],@peaksCon,options);
end
toc
Elapsed time is 1.835448 seconds.

And let's rerun it using the built-in parallel computing capabilities. Note that to use parallel computing, I need to set an option and open the matlabpool (start the additional labs). I'm running this on a dual-core laptop, so I can use two processing cores (labs) at a time.

options = optimset(options, 'UseParallel','Always');
matlabpool open local 2
tic
for i = 1:length(x0r)
    xopt(i,:) = fmincon(@peaksObj,x0r(i,:),[],[],[],[],[],[],@peaksCon,options);
end
toc
Starting matlabpool using the parallel configuration 'local'.
Waiting for parallel job to start...
Connected to a matlabpool session with 2 labs.
Elapsed time is 2.548319 seconds.

Hey, what gives? This doesn't provide me any benefit!

True. The built-in support only helps when the objective function and/or nonlinear constraint functions are computationally intensive. What constitutes an expensive function depends a lot on your local network and hardware, but a general rule of thumb is that if your objective function takes on the order of a few hundred milliseconds or more to execute, you'll probably benefit from using parallel computing.

Let's tweak the problem so it is expensive. I've added a line of code to peaksObj that simply takes time to compute (if you're interested, it's in the file peaksObjExp).
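The actual line lives in peaksObjExp, which ships with the webinar files and may differ from this; a minimal, hypothetical sketch of the idea is simply to burn time inside the objective:

```matlab
% Hypothetical sketch only -- the real peaksObjExp is in the webinar
% files and may be implemented differently. The idea: wrap the cheap
% objective and add work that takes a fixed amount of time per call.
function f = peaksObjExpSketch(x)
f = peaksObj(x);   % the original, cheap objective
pause(0.75);       % simulate an expensive computation (~0.8 s per call)
```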

Time it again.

tic
peaksObjExp(x0r(1,:));
toc
Elapsed time is 0.799754 seconds.

We need to rerun the base case since we are using a different objective function.

options = optimset(options, 'UseParallel','Never');
tic
for i = 1:length(x0r)
    x = fmincon(@peaksObjExp,x0r(i,:),[],[],[],[],[],[],@peaksCon,options);
end
toc
Elapsed time is 41.643885 seconds.

And let's see if we get an improvement using my two cores.

options = optimset(options, 'UseParallel','Always');
tic
for i = 1:length(x0r)
    x = fmincon(@peaksObjExp,x0r(i,:),[],[],[],[],[],[],@peaksCon,options);
end
toc
Elapsed time is 35.457548 seconds.

You can see that I got roughly a 15% improvement. That's not bad considering I have one machine with two cores sharing the same memory. You'd probably get a better result with more cores, or with multiple machines in a cluster.

Let's now look at how you could do even better by exploiting the structure of this problem: applying parallel computing to the for loop itself.

Do-it-yourself parallel optimization

The built-in parallel computing support in fmincon accelerates the estimation of gradients. While this helps speed up the solution, keep in mind that each iteration depends on the previous iteration's results. In other words, there are limits to how much parallel computing can speed up a gradient-based solver, since it contains serial operations that can't be parallelized.

This problem can benefit from parallel computing in a different way. Notice that the optimization is performed four times (the for loop), and each run is independent of the others, so we can parallelize the loop itself. The Parallel Computing Toolbox makes this easy with the parfor language construct.

options = optimset(options, 'UseParallel','Never');
tic
parfor i = 1:length(x0r) % Notice the for loop is changed to parfor
    x = fmincon(@peaksObjExp,x0r(i,:),[],[],[],[],[],[],@peaksCon,options);
end
toc
Elapsed time is 32.111325 seconds.

You can see that the solution time was reduced by roughly 23% relative to the serial run. I've shown this method to illustrate that in some cases you may benefit from applying parallel computing directly to your problem, rather than relying on the built-in support. If you have an expensive objective function, you might also be able to use parallel computing in the definition of the objective function itself. The expense often lies in the objective evaluation, and if you can speed that up, the benefit carries over to the solver's performance.
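For instance, if the cost of the objective comes from many independent sub-computations, a parfor inside the objective can spread that work over the labs. This is a hypothetical sketch; someCostlyTerm is an invented placeholder, not a function from the webinar files.

```matlab
% Hypothetical sketch of parallelizing inside an objective function.
% someCostlyTerm is a placeholder for an expensive, independent piece
% of the computation (e.g., one scenario in a Monte Carlo sum).
function f = parallelObjSketch(x)
n = 100;                             % number of independent terms
vals = zeros(n,1);
parfor k = 1:n
    vals(k) = someCostlyTerm(x,k);   % each term evaluated on its own lab
end
f = sum(vals);                       % combine into one objective value
```

Note that this approach only pays off when the solver itself runs serially (UseParallel off); nesting it inside the solver's own parallel evaluation would not add further speedup.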

clear all, close all, clc
matlabpool close
Sending a stop signal to all the labs...
Waiting for parallel job to finish...
Performing parallel job cleanup...
Done.

Scaling considerations

I'd like to show you some additional considerations when working with optimization. When solving problems, pay attention to the magnitudes of the decision variables and the objective function: a wide range can affect how the solver proceeds to a solution (numerical issues). Let's look at a case with two decision variables of very different magnitudes. The plot below shows an engine performance map. The axes show engine speed in RPM and the intake pressure ratio, and the color contours are a measure of engine efficiency. Note that the engine speed is nearly 10,000 times the magnitude of the pressure ratio. This can present a problem for an optimization solver.

load VEdata
VEPlot([3500 0.5]);

If you think about this magnitude difference in terms of how a gradient-based solver searches, using derivatives of the objective function with respect to the decision variables, the gradient components end up on very different scales (i.e. df/dx1 ~ 1/1000, df/dx2 ~ 1/0.1). This disparity leads to numerical problems, and as we see in this example, the optimization solver doesn't progress towards the solution.

options = optimset('Disp','Iter','Algorithm','interior-point',...
                   'OutputFcn',@VEPlot);
objFun = @(x)-VEMap(x,RPM,Pratio,VE);
x0 = [3500 0.5];
lb = [1000 0 ];
ub = [6000 1.0];
tic
z = fmincon(objFun,x0,[],[],[],[],lb,ub,[],options)
toc
                                         First-order     Norm of
 Iter F-count         f(x)  Feasibility   optimality        step
    0     5  -8.963084e-001   0.000e+000   1.071e-001
    1     8  -9.156335e-001   0.000e+000   4.956e-002  1.071e-001
    2    12  -9.324939e-001   0.000e+000   6.138e-002  1.128e-001
    3    18  -9.380426e-001   0.000e+000   2.470e-002  3.484e-002
    4    24  -9.382448e-001   0.000e+000   1.548e-001  1.067e-002
    5    27  -9.384511e-001   0.000e+000   6.813e-003  4.124e-003
    6    30  -9.384536e-001   0.000e+000   5.860e-003  2.930e-004
    7    34  -9.384598e-001   0.000e+000   5.846e-003  7.286e-004
    8    44  -9.384614e-001   0.000e+000   1.800e-003  1.821e-004
    9    55  -9.384620e-001   0.000e+000   1.799e-003  7.960e-005
   10    69  -9.384621e-001   0.000e+000   1.799e-003  4.352e-006
   11    80  -9.384621e-001   0.000e+000   1.799e-003  1.904e-006
   12    92  -9.384621e-001   0.000e+000   1.799e-003  4.164e-007
   13   106  -9.384621e-001   0.000e+000   6.895e-002  2.277e-008
Optimization terminated: current point satisfies constraints and relative 
change in x is less than options.TolX.

z =

  1.0e+003 *

    3.5000    0.0008

Elapsed time is 17.487496 seconds.

Redefine RPM to have same scale as Pratio

It is good practice to formulate your optimization problems so that the objective/constraint functions and the decision variables are all of similar magnitude. In this example, I rescaled the decision variables to comparable magnitudes and changed the objective function definition to accept the scaled variables.

Notice that when the problem is scaled correctly, we get the right solution.

close(gcf)
options = optimset('Disp','Iter','Algorithm','interior-point',...
                   'OutputFcn',@(x,o,s)VEPlot([x(1)*10000 x(2)],o,s));
objFun = @(x)-VEMap([x(1)*10000 x(2)],RPM,Pratio,VE);
x0 = [3500/10000 0.5];
lb = [1000/10000 0 ];
ub = [6000/10000 1.0];
tic
z = fmincon(objFun,x0,[],[],[],[],lb,ub,[],options)
toc
                                         First-order     Norm of
 Iter F-count         f(x)  Feasibility   optimality        step
    0     5  -8.963084e-001   0.000e+000   4.366e-001
    1     9  -9.553156e-001   0.000e+000   4.404e-001  9.015e-002
    2    13  -9.631107e-001   0.000e+000   3.441e-001  7.976e-002
    3    16  -9.639381e-001   0.000e+000   1.002e-001  2.053e-002
    4    20  -9.661179e-001   0.000e+000   2.729e-001  3.795e-002
    5    23  -9.702423e-001   0.000e+000   4.237e-001  2.313e-002
    6    27  -9.723160e-001   0.000e+000   6.661e-001  8.627e-003
    7    31  -9.731487e-001   0.000e+000   7.639e-002  5.586e-003
    8    35  -9.732677e-001   0.000e+000   3.384e-001  2.739e-003
    9    38  -9.732614e-001   0.000e+000   1.608e-001  1.757e-003
   10    41  -9.732861e-001   0.000e+000   1.515e-001  2.618e-004
   11    47  -9.733193e-001   0.000e+000   1.572e-001  2.427e-004
   12    61  -9.733199e-001   0.000e+000   1.051e-001  4.199e-006
   13    72  -9.733201e-001   0.000e+000   1.145e-001  1.837e-006
   14    75  -9.733570e-001   0.000e+000   2.016e-001  5.554e-004
   15    80  -9.734154e-001   0.000e+000   2.691e-001  8.766e-004
   16    92  -9.734183e-001   0.000e+000   1.988e-001  5.462e-005
   17   107  -9.734184e-001   0.000e+000   5.276e-002  3.508e-007
   18   110  -9.734161e-001   0.000e+000   2.696e-001  4.706e-005
   19   113  -9.734165e-001   0.000e+000   3.431e-001  6.377e-006
   20   116  -9.734182e-001   0.000e+000   3.420e-001  2.521e-005
   21   126  -9.734186e-001   0.000e+000   9.745e-002  6.302e-006
   22   143  -9.734186e-001   0.000e+000   1.983e-001  8.029e-008
   23   147  -9.734186e-001   0.000e+000   1.983e-001  4.308e-008
   24   154  -9.734187e-001   0.000e+000   1.983e-001  9.129e-008
   25   158  -9.734187e-001   0.000e+000   1.983e-001  6.031e-007
   26   172  -9.734187e-001   0.000e+000   9.729e-002  7.010e-009
Optimization terminated: current point satisfies constraints and relative 
change in x is less than options.TolX.

z =

    0.4244    0.6689

Elapsed time is 25.217529 seconds.

Tolerances (TolX, TolFun, etc)

Since we're on the topic of magnitudes, let's return to the Mt. Washington example. Notice the scale of the axes: thousands of feet. The peak we are seeking is also measured in thousands of feet. If we run this problem with default settings, we'll get the right solution, but it will take 68 iterations to do so.

clear all, close all
% Load data for white mountains
load mtWashington;
% Load psproblem, which has all required settings for pattern search
load psproblem;
% show iteration history
psproblem.options.Display = 'iter';
% Run the solver and show displays
patternsearch(psproblem);

Iter     f-count          f(x)      MeshSize     Method
    0        1        -2392.5            10      
    1        5       -2396.17            20     Successful Poll
    2        9        -2401.5            40     Successful Poll
    3       13       -2409.58            80     Successful Poll
    4       17          -2428           160     Successful Poll
    5       21       -2462.75           320     Successful Poll
    6       25       -2593.67           640     Successful Poll
    7       29       -2958.83          1280     Successful Poll
    8       33       -4574.17          2560     Successful Poll
    9       37       -4574.17          1280     Refine Mesh
   10       41       -4956.11          2560     Successful Poll
   11       45       -4956.11          1280     Refine Mesh
   12       49       -5138.08          2560     Successful Poll
   13       53       -5194.89          5120     Successful Poll
   14       57       -5194.89          2560     Refine Mesh
   15       61       -5194.89          1280     Refine Mesh
   16       65          -5687          2560     Successful Poll
   17       69          -5687          1280     Refine Mesh
   18       73          -5687           640     Refine Mesh
   19       77       -5767.33          1280     Successful Poll
   20       81       -5767.33           640     Refine Mesh
   21       85       -6052.19          1280     Successful Poll
   22       89       -6052.19           640     Refine Mesh
   23       93       -6052.19           320     Refine Mesh
   24       97       -6172.67           640     Successful Poll
   25      101       -6172.67           320     Refine Mesh
   26      105       -6172.67           160     Refine Mesh
   27      109       -6261.25           320     Successful Poll
   28      113       -6261.25           160     Refine Mesh
   29      117       -6261.25            80     Refine Mesh
   30      121        -6271.5           160     Successful Poll

Iter     f-count        f(x)       MeshSize      Method
   31      125        -6271.5            80     Refine Mesh
   32      129        -6271.5            40     Refine Mesh
   33      133       -6275.53            80     Successful Poll
   34      137       -6275.53            40     Refine Mesh
   35      141       -6276.06            80     Successful Poll
   36      145       -6276.06            40     Refine Mesh
   37      149       -6276.06            20     Refine Mesh
   38      153       -6278.75            40     Successful Poll
   39      157       -6278.75            20     Refine Mesh
   40      161       -6278.75            10     Refine Mesh
   41      165       -6278.81            20     Successful Poll
   42      169       -6278.81            10     Refine Mesh
   43      173       -6278.81             5     Refine Mesh
   44      177          -6280            10     Successful Poll
   45      181          -6280             5     Refine Mesh
   46      185          -6280           2.5     Refine Mesh
   47      189          -6280          1.25     Refine Mesh
   48      193          -6280         0.625     Refine Mesh
   49      197          -6280        0.3125     Refine Mesh
   50      201          -6280        0.1563     Refine Mesh
   51      205          -6280       0.07813     Refine Mesh
   52      209          -6280       0.03906     Refine Mesh
   53      213          -6280       0.01953     Refine Mesh
   54      217          -6280      0.009766     Refine Mesh
   55      221          -6280      0.004883     Refine Mesh
   56      225          -6280      0.002441     Refine Mesh
   57      229          -6280      0.001221     Refine Mesh
   58      233          -6280     0.0006104     Refine Mesh
   59      237          -6280     0.0003052     Refine Mesh
   60      241          -6280     0.0001526     Refine Mesh

Iter     f-count        f(x)       MeshSize      Method
   61      245          -6280    7.629e-005     Refine Mesh
   62      249          -6280    3.815e-005     Refine Mesh
   63      253          -6280    1.907e-005     Refine Mesh
   64      257          -6280    9.537e-006     Refine Mesh
   65      261          -6280    4.768e-006     Refine Mesh
   66      265          -6280    2.384e-006     Refine Mesh
   67      269          -6280    1.192e-006     Refine Mesh
   68      273          -6280     5.96e-007     Refine Mesh
Optimization terminated: mesh size less than options.TolMesh.

If you look at the default settings for the tolerances on the objective function (TolFun) and on the decision variables (TolX), you'll note that both are 1e-6. I don't need tolerances anywhere near this tight, since my problem is measured in thousands of feet; I doubt my elevation data is even that accurate. Such tight tolerances are unnecessary, and you can significantly speed up the solver by loosening them to match the accuracy your problem requires. In this case, I'll set TolFun to 10 (feet) and rerun the solver. We can see that the iteration count drops to 43, nearly a 40% reduction.

psproblem.options.TolFun
psproblem.options.TolX

% Adjust the tolerance
psproblem.options.TolFun = 10;

% Run the solver and show displays
patternsearch(psproblem);
ans =

  1.0000e-006


ans =

  1.0000e-006



Iter     f-count          f(x)      MeshSize     Method
    0        1        -2392.5            10      
    1        5       -2396.17            20     Successful Poll
    2        9        -2401.5            40     Successful Poll
    3       13       -2409.58            80     Successful Poll
    4       17          -2428           160     Successful Poll
    5       21       -2462.75           320     Successful Poll
    6       25       -2593.67           640     Successful Poll
    7       29       -2958.83          1280     Successful Poll
    8       33       -4574.17          2560     Successful Poll
    9       37       -4574.17          1280     Refine Mesh
   10       41       -4956.11          2560     Successful Poll
   11       45       -4956.11          1280     Refine Mesh
   12       49       -5138.08          2560     Successful Poll
   13       53       -5194.89          5120     Successful Poll
   14       57       -5194.89          2560     Refine Mesh
   15       61       -5194.89          1280     Refine Mesh
   16       65          -5687          2560     Successful Poll
   17       69          -5687          1280     Refine Mesh
   18       73          -5687           640     Refine Mesh
   19       77       -5767.33          1280     Successful Poll
   20       81       -5767.33           640     Refine Mesh
   21       85       -6052.19          1280     Successful Poll
   22       89       -6052.19           640     Refine Mesh
   23       93       -6052.19           320     Refine Mesh
   24       97       -6172.67           640     Successful Poll
   25      101       -6172.67           320     Refine Mesh
   26      105       -6172.67           160     Refine Mesh
   27      109       -6261.25           320     Successful Poll
   28      113       -6261.25           160     Refine Mesh
   29      117       -6261.25            80     Refine Mesh
   30      121        -6271.5           160     Successful Poll

Iter     f-count        f(x)       MeshSize      Method
   31      125        -6271.5            80     Refine Mesh
   32      129        -6271.5            40     Refine Mesh
   33      133       -6275.53            80     Successful Poll
   34      137       -6275.53            40     Refine Mesh
   35      141       -6276.06            80     Successful Poll
   36      145       -6276.06            40     Refine Mesh
   37      149       -6276.06            20     Refine Mesh
   38      153       -6278.75            40     Successful Poll
   39      157       -6278.75            20     Refine Mesh
   40      161       -6278.75            10     Refine Mesh
   41      165       -6278.81            20     Successful Poll
   42      169       -6278.81            10     Refine Mesh
   43      173       -6278.81             5     Refine Mesh
Optimization terminated: change in the function value less than options.TolFun.

End game

This concludes the examples shown in the webinar (plus some additional ones). I hope you enjoyed them and learned something new along the way.