2. POUNDERS
2.1. General Description
The Practical Optimization Using No Derivatives and Exploiting Recognized Structure method, better known as POUNDERS, solves the optimization problem
subject to
where \(\Ffun\) is a vector-valued, user-provided blackbox (“zeroth-order”) function from \(\R^{\np}\) to \(\R^{\nd}\),
\(\hfun\) is a smooth function mapping from \(\R^{\nd}\) to \(\R\),
\(Low\) is a user-provided boundary constraint that permits values of \(-\infty\) to specify that the problem is unconstrained for the associated parameter, and
\(Upp\) is a user-provided boundary constraint that permits values of \(+\infty\) to specify that the problem is unconstrained for the associated parameter.
Originally, POUNDERS was designed to minimize only “sum of squares” mappings of blackbox functions, that is, POUNDERS solved
subject to
As such, the current implementation of POUNDERS generalizes the class of functions to which POUNDERS can be applied.
POUNDERS will never attempt to evaluate \(\Ffun\) outside of the provided bounds, but it is
possible to take advantage of function values at infeasible \(\psp\) if
these are passed initially through the (X_init,F_init) entries of Prior. In each iteration,
POUNDERS forms a set of quadratic models interpolating each of the \(\np\) functions in
\(\Ffun\) and minimizes a derived scalar-valued model of \(f\) within an
infinity-norm trust region.
If a user wishes to employ an outer function \(\hfun\) other than a sum-of-squares, then the user must specify a custom outer-function \(\hfun\) that maps vectors in \(\R^{\nd}\) to a scalar value. In that case, users must also provide a “combine models” function that POUNDERS uses to map the linear and quadratic terms from the quadratic models of \(\Ffun\) into a single quadratic model.
For more detailed information please refer to Wild [6]. A brief description can also be found in Kortelainen et al. [2].
We provide two implementations of POUNDERS, namely, pounders and pounders_concurrent.
2.2. Programmatic Interface
2.2.1. Status Code
Both POUNDERS implementations return a termination criteria flag. The interpretation of the value of the flag is identical across both implementations and is defined by:
0 - normal termination because norm of \(\gradf(\psp)\) at final \(\psp\) satisfied user-provided gradient tolerance,
> 0 - the budget specified in nf_max was reached; the value of the flag is the 2-norm of \(\gradf\) at final \(\psp\)
-1 - input was fatally incorrect (error message shown)
-2 - a valid model produced
X[nf] == X[xk_in]or(mdec == 0, hF[nf] == hF[xk_in])(this indicates that the TRSP solver failed to find decrease in the model)-3 - a
NaNwas encountered-4 - error in trust-region subproblem solver
-5 - an attempt at model improvement failed
-6 - the trust-region radius delta has shrunk to`delta_min`, and the model is valid
The programmatic interface is generally maintained identical between both implementations. Nevertheless, we provide the interface for each implementation to provide language-specific descriptions.
2.2.2. Python
- ibcdfo.run_pounders(Ffun, X_0, n, nf_max, g_tol, delta_0, m, Low, Upp, Prior=None, Options=None, Model=None)
Run POUNDERS on the optimization problem specified by the given arguments.
- Parameters:
Ffun – Function that returns \(\Ffun(\psp)\) as \(\nd\) element NumPy array for given \(\psp\)
X_0 – \(\np\) element NumPy array that specifies the initial point
n – Dimension (number of continuous, real-valued input variables)
nf_max – Maximum number of function evaluations (\(> \np+1\))
g_tol – Tolerance for the 2-norm of the model gradient
delta_0 – Positive initial trust region radius
m – Dimension of output of
Ffun(number of component functions)Low – \(\np\) element NumPy array of lower bounds
Upp – \(\np\) element NumPy array of upper bounds
Prior –
dictdescribing past evaluations ofFfun. Set toNoneto run optimization assuming no past evaluations. A nonempty Prior must contain entries:nfs - Number of past function evaluations
X_init - \(\mathrm{nfs} \times \np\) NumPy array of points \(\psp_k\)
F_init - \(\mathrm{nfs} \times \nd\) NumPy array of values \(\Ffun(\psp_k)\) computed with
Ffunxk_in - Zero-based index into
X_initandF_initthat corresponds to the point and value to use as initial point for optimization. Note that if Prior is nonempty, this will override the previously specified X_0.
Options –
dictof method options. Set toNoneto use default values.printf (default is 0)
0 - No printing to screen
1 - Debugging level of output to screen
2 - More verbose screen output
spsolver - Trust-region subproblem solver flag (default is 2, not recommended to change)
hfun - Outer function \(\hfun\) that maps given \(\Ffun(\psp)\) to scalars for minimization (default is sum-of-squares that yields \(f\))
combinemodels - Function that maps the linear and quadratic terms from the models of \(\Ffun\) into a single quadratic model (default is ordinary least squares)
Model –
dictof model building options. Set toNoneto use default values.np_max - Maximum number of interpolation points (\(>\np+1\)) (default is \(2\np+1\))
Par - Five element
listforformquad(default \([\sqrt{n}, \max\{10,\sqrt{n}\}, 10^{-3}, 10^{-3}, 0]\))
- Returns:
X - \(\mathrm{nf\_max+nfs}\times \np\) NumPy array containing locations of evaluated points in the order in which they were evaluated
F - \(\mathrm{nf\_max+nfs}\times \nd\) NumPy array containing the function values at
Xwith matching orderinghF - \(\mathrm{nf\_max+nfs}\times 1\) Composed values
hfun(Ffun(x))for evaluated pointsxinXflag - Termination criteria flag (See general POUNDERS documentation)
xk_in - Zero-based index of point in
Xrepresenting incumbent at termination (approximate local minimizer if flag=0)
- ibcdfo.run_pounders_concurrent(Ffun, X_0, n, nf_max, g_tol, delta_0, m, Low, Upp, Prior=None, Options=None, Model=None)
This version of POUNDERS calls
Ffunwith batches of model-building points. In particular, when new geometry points are needed, POUNDERS constructs the corresponding rows ofXand callsFfun(X[idx_new])
rather than calling
Ffunseparately for each row ofX.Any parallelism/concurrency must be implemented inside
Ffunrather than inside POUNDERS. Users who want concurrent evaluation of model-building points should provide anFfunthat accepts a batch of pointsX[idx_new]with shape(batch_size, n), where each row is one point to evaluate. The user-providedFfunshould evaluate thesebatch_sizepoints with whatever degree of concurrency is available and return an array with shape(batch_size, m), whose rows contain the correspondingFfunoutputs in the same order as the input points.Otherwise, this implementation and its interface are identical to those of the standard POUNDERS implementation. Please refer to
ibcdfo.run_pounders()for more information.
2.2.3. MATLAB
- pounders(Ffun, X_0, n, nf_max, g_tol, delta_0, m, Low, Upp, Prior, Options, Model)
Run POUNDERS on the optimization problem specified by the given arguments.
- Parameters:
Ffun – Handle to function that returns \(\Ffun(\psp)\) as \(1 \times \nd\) vector for given \(\psp\)
X_0 – [dbl] \(1 \times \np\) vector that specifies the initial point
n – [int] Dimension (number of continuous, real-valued input variables)
nf_max – [int] Maximum number of function evaluations (\(> \np+1\))
g_tol – [dbl] Tolerance for the 2-norm of the model gradient
delta_0 – [dbl] Positive initial trust region radius
m – [int] Dimension of output of
Ffun(number of component functions)Low – [dbl] \(1 \times \np\) vector of lower bounds
Upp – [dbl] \(1 \times \np\) vector of upper bounds
Prior –
structof past evaluations ofFfun. If no past evaluations, then either do not set Prior, or else provide an empty struct. Otherwise, arguments must be provided for all dictionary entries:nfs - Number of past function evaluations
X_init - \(\mathrm{nfs} \times \np\) matrix of points \(\psp_k\)
F_init - \(\mathrm{nfs} \times \nd\) matrox of values \(\Ffun(\psp_k)\) obtained with
Ffunxk_in - One-based index into
X_initandF_initthat corresponds to the point and value to use as initial point for optimization. Note that if nonempty Prior is specified, then X_0 from previous argument will be ignored.
Options –
structof method options. To use default values, either do not provide or else set an empty struct.printf (default is 0)
0 - No printing to screen
1 - Debugging level of output to screen
2 - More verbose screen output
spsolver - Trust-region subproblem solver flag (default is 2, not recommended to change)
hfun - Outer function \(\hfun\) that maps given \(\Ffun(\psp)\) to scalars for minimization (default is sum-of-squares that yields \(f\).)
combinemodels - Handle to function that maps the linear and quadratic terms from the models of \(\Ffun\) into a single quadratic model (default is ordinary least squares)
Model –
structof model building options. Do not provide or set to an emptystructto use default values.np_max - Maximum number of interpolation points (\(>\np+1\)) (default is \(2\np+1\))
Par - \(1 \times 5\) vector for
formquad(default is \([\sqrt{n},\max\{\sqrt{n},10\},10^{-3},10^{-3},0]\))
- Returns:
X [dbl] \(\mathrm{nf\_max+nfs}\times \np\) matrix containing locations of evaluated points in the order in which they were evaluated
F [dbl] \(\mathrm{nf\_max+nfs}\times \nd\) matrix containing the function values at
Xwith matching orderinghF [dbl] \(\mathrm{nf\_max+nfs}\times 1\) matrix containing composed values
hfun(Ffun(x))for evaluated pointsxinXflag [dbl] Termination criteria flag (See general POUNDERS documentation)
xk_in [int] One-based index of point in
Xrepresenting final incumbent
2.2.4. General \(\hfun\) Functions
The following \(\hfun\) functions are available for immediate use with both
the Python and MATLAB implementations of POUNDERS. While they are presented
through their integration into the Python package, the documentation is valid
for the MATLAB version of these functions, which are located in
pounders/m/general_h_funs.
- ibcdfo.pounders.h_leastsquares(F)
\(\hfun\) function for constructing the standard least-squares objective function
\[f(\psp) = \hfun\left(\Ffun(\psp)\right) = \sum_{i = 1}^{\nd} \Ffuncomp{i}(\psp)^2,\]which is the \(\hfun\) function used by default.
The
combine_leastsquaresfunction should also be passed to POUNDERS when using this \(\hfun\) function.
- ibcdfo.pounders.h_neg_leastsquares(F)
\(\hfun\) function for constructing the negative least-squares objective function
\[f(\psp) = \hfun\left(\Ffun(\psp)\right) = -\sum_{i = 1}^{\nd} \Ffuncomp{i}(\psp)^2.\]The
combine_neg_leastsquaresfunction should also be passed to POUNDERS when using this \(\hfun\) function.
- ibcdfo.pounders.h_identity(F)
Identity \(\hfun\) function for using POUNDERS when the objective is not composite; that is, when \(\Ffun: \R^{\np} \to \R\) is scalar-valued and
\[f(\psp) = \hfun\left(\Ffun(\psp)\right) = \Ffun(\psp).\]When using this \(\hfun\) function, the
combine_identityfunction should also be passed to POUNDERS.
- ibcdfo.pounders.h_emittance(F)
\(\hfun\) function for constructing the emittance objective function
\[f(\psp) = \hfun\left(\Ffun(\psp)\right) = \Ffuncomp{1}(\psp)\Ffuncomp{2}(\psp) - \Ffuncomp{3}(\psp)^2\]limited to the special case of \(\Ffun : \R^{\np} \to \R^3\).
The
combine_emittancefunction should also be passed to POUNDERS when using this \(\hfun\) function.
2.2.5. Parameterized \(\hfun\) Functions
POUNDERS permits parameterized \(\hfun\) functions, which it can use
only once users have chosen a single set of parameter values for formulating a
specific \(\hfun\) function and, therefore, a single related problem. As an example, the
following routine can be used to create a single hfun and combinemodels
matched pair for a single set of desired parameter values. While this routine
is presented through its integration into the Python package, the
documentation is valid for the MATLAB version of these routines, which is
located in pounders/m/general_h_funs.
- ibcdfo.pounders.create_squared_diff_from_mean_functions(alpha)
Create an \(\hfun\) function and its associated combinemodel function, both using the given parameter value \(\alpha\), for constructing and using the POUNDERS objective function
\[f(\psp; \alpha) = \hfun\left(\Ffun(\psp); \alpha\right) = \sum_{i=1}^{\nd} \left(\Ffuncomp{i}(\psp) - \overline{\Ffun}(\psp)\right)^2 - \alpha \overline{\Ffun}(\psp)^2\]where
\[\overline{\Ffun}(\psp) = \frac{1}{\nd}\sum_{i=1}^{\nd} \Ffuncomp{i}(\psp)\]is the average value of all components in \(\Ffun(\psp)\). This objective, therefore, prefers vectors close to their average.
- Param:
\(\alpha\) is a problem-specific parameter that specifies how much the objective function should penalize small (large) averages for \(\alpha\) positive (negative).
- Returns:
(hfun, combinemodels) constructed with the given \(\alpha\)