3. Manifold Sampling
3.1. General Description
Manifold Sampling is a derivative-free trust-region method for solving composite nonsmooth optimization problems of the form
where:
\(\Ffun:\R^{\np} \rightarrow \R^{\nd}\) is a black-box vector-valued function (e.g., a simulation output or a vector of residuals), and
\(\hfun:\R^{\nd} \rightarrow \R\) is a nonsmooth function, expressible as a continuous selection. Examples of continuous selections include piecewise linear functions, piecewise quadratic functions or functions formed from max, min or absolute value operations.
In the programmatic interface, the input to hfun is often denoted
\(\zvec\), representing the intermediate vector
\(\zvec = \Ffun(\psp)\).
Analyses assume that \(\Ffun\) is continuous and locally sufficiently smooth so that accurate trust-region surrogate models can be constructed. However, in practice, these assumptions do not need to be explicitly checked and derivatives of \(\Ffun\) certainly need not be available.
The composite structure \(\hfun(\Ffun(\psp))\) arises naturally in robust regression, minimax design, simulation-based calibration, and optimization with embedded nonsmooth merit functions.
Manifold Sampling builds local surrogate models for components of \(\Ffun\) and exploits the piecewise structure of \(\hfun\) by identifying locally active manifolds, that is, smooth pieces \(\hfun\) that are active near the current incumbent. At each iteration, a trust-region subproblem is solved on a model derived in part from the currently active manifolds, producing a step that balances local improvement and model accuracy.
Unlike generic nonsmooth methods, manifold sampling deliberately leverages the composite structure \(\hfun(\Ffun(\psp))\), which can significantly improve practical performance when evaluations of \(\Ffun\) are expensive.
For algorithmic details, see Larson et al. [4], Khan et al. [1], Larson et al. [5], and Larson and Menickelly [3].
3.2. Programmatic Interface
3.2.1. Status Code
All Manifold Sampling implementations return a termination criteria flag. The interpretation of the value of the flag is identical across implementations and possible values are
> 0 - successful termination; the value of the flag is the final stationarity measure (\(\chi_k\))
0 - the budget of
nf_maxfunction evaluations was exhausted.-1 - model construction failed (an empty or invalid local model was constructed)
-2 - trust-region subproblem failed, likely due to an unbounded or poorly scaled affine-envelope subproblem
The programmatic interface is generally maintained identically across all implementations. Nevertheless, we provide the interface for each implementation to provide language-specific descriptions.
3.2.2. Python
- ibcdfo.run_MSP(hfun, Ffun, x0, L, U, nf_max, subprob_switch)
Run manifold sampling to solve the composite nonsmooth optimization problem.
- Parameters:
hfun –
Function implementing \(\hfun\). Supports two calling modes.
Mode 1:
hval, grads, hashes = hfun(z)
where:
zis a length-\(\nd\) arrayhvalis the scalar value \(\hfun(\zvec)\)gradsis a \((\nd, l)\) array whose columns are gradients of the \(l\) active selection functions at \(\zvec\)hashesis a list of identifiers for those \(l\) active manifolds
Mode 2:
vals, grads = hfun(z, hashes)
where the list
hashesspecifies which manifolds to evaluate,vals[i]is the value of theith corresponding selection, and columniofgradsis the gradient of theith selection at \(\zvec\).Ffun – Function returning \(\Ffun(\psp)\) as a length-\(\nd\) array for a given length-\(\np\) array
x.x0 – Initial point (length \(\np\) array).
L – Lower bounds for
x(length \(\np\) array).U – Upper bounds for
x(length \(\np\) array).nf_max – Maximum number of evaluations of
Ffun.subprob_switch – Selects the trust-region subproblem solver/variant used internally.
- Returns:
Tuple
(X, F, h, xkin, flag)whereX– array of evaluated points in evaluation orderF– array with rows \(\Ffun(X[i])\)h– array with \(h[i] = \hfun(\Ffun(X[i]))\)xkin– zero-based index inXof the final trust-region centerflag– termination code (See general documentation)
3.2.3. MATLAB
- manifold_sampling_primal(hfun, Ffun, x0, L, U, nf_max, subprob_switch)
Run manifold sampling to solve the composite nonsmooth optimization problem.
- Parameters:
hfun –
Handle to function that, given a \(\np \times 1\) point \(\zvec\), returns
the value \(h(\zvec)\)
\(\np \times l\) matrix that contains gradients for all \(l\) active selections at \(\zvec\)
\(1 \times l\) vector of hashes for each manifold active at \(\zvec\)
Given point \(\zvec\) and \(l\) hashes \(H\), returns
An \(1 \times l\) vector that contains the values \(h_i(\zvec)\) for each hash in \(H\)
A \(p \times l\) matrix that contains the gradients of \(h_i(\zvec)\) for each hash in \(H\)
Ffun – Function that returns the value \(\Ffun(\psp)\) of the blackbox simulation as an \(1 \times \nd\) vector for given \(\psp\)
x0 – \(1 \times \np\) vector that specifies the starting point
L – \(1 \times \np\) vector of lower bounds
U – \(1 \times \np\) vector of upper bounds
nf_max – [int] Maximum number of function evaluations
subprob_switch – Selects the trust-region subproblem solver used internally.
- Returns:
X - \(\mathrm{nf\_max} \times \np\) matrix containing the points evaluated in the order in which they were evaluated
F - \(\mathrm{nf\_max} \times \nd\) matrix containing the blackbox function values \(\Ffun(\psp)\) for all points in
Xwith matching orderingh - \(\mathrm{nf\_max} \times 1\) matrix containing the values \(\hfun(\Ffun(\psp))\) for all points in
Xwith matching ordering.xkin - One-based index of current trust region center.
flag - Termination criteria flag (See general documentation)
3.2.4. General \(\hfun\) Functions
The following \(\hfun\) functions are available for immediate use with the
Python implementation of Manifold Sampling. With the potential exception of
application-specific functions, these same functions are available for immediate
use with the MATLAB implementation in manifold_sampling/m/general_nonsmooth_h_funs.
While they are presented here through their integration into the Python package,
the documentation is generally valid for the MATLAB version of these functions
as well. In MATLAB implementations, the H0 argument is also optional.
- ibcdfo.manifold_sampling.h_one_norm(z, H0=None)
\(\hfun\) function for constructing the 1-norm objective function
\[\hfun\left(\zvec\right) = \sum_{i = 1}^{\nd} \abs{z_i}.\]
- ibcdfo.manifold_sampling.h_pw_minimum(z, H0=None)
\(\hfun\) function for constructing the pointwise minimum objective function
\[ \hfun\left(\zvec\right) = \min \set{z_1, \cdots, z_{\nd}}\]
- ibcdfo.manifold_sampling.h_pw_minimum_squared(z, H0=None)
\(\hfun\) function for constructing the pointwise minimum objective function
\[ \hfun\left(\zvec\right) = \min \set{z_1^2, \cdots, z_{\nd}^2}\]
- ibcdfo.manifold_sampling.h_pw_maximum(z, H0=None)
\(\hfun\) function for constructing the pointwise maximum objective function
\[ \hfun\left(\zvec\right) = \max \set{z_1, \cdots, z_{\nd}}\]
- ibcdfo.manifold_sampling.h_pw_maximum_squared(z, H0=None)
\(\hfun\) function for constructing the pointwise maximum objective function
\[ \hfun\left(\zvec\right) = \max \set{z_1^2, \cdots, z_{\nd}^2}\]
- ibcdfo.manifold_sampling.h_quantile(z, H0=None)
\(\hfun\) function whose value is determined by returning at each \(z\) the \(q^{\mathrm{th}}\) quantile of
\[\set{z_1^2, \cdots, z_{\nd}^2}.\]Note that this example has hard-coded the value \(q=1\), which corresponds to returning the minimum value of the above set. You must alter this function for other quantiles by changing the value of \(q\) to an integer between 1 and \(\nd\), inclusive. Note this functionality is the same as provided by h_pw_minimum_squared.
- ibcdfo.manifold_sampling.h_max_gamma_over_KY(z, H0=None)
\(\hfun\) function for constructing the objective function
\[\hfun\left(\zvec; KY_1, \cdots, KY_{11}\right) = \max \left\{\frac{z_1}{KY_1}, \cdots, \frac{z_{11}}{KY_{11}}\right\},\]where the components
\[z_j = \gamma(\kappa, \Delta, \zeta, KY_j)\]are computed from the application-specific model function \(\gamma\) based on application-specific parameters \(\psp = (\kappa, \Delta, \zeta)\). Presently, the \(KY\) parameters are hardcoded to the uniform grid
\[(KY_1, KY_2, \cdots, KY_{11}) = (0.1, 0.15, \cdots, 0.6).\]
- ibcdfo.manifold_sampling.h_max_plus_quadratic_violation_penalty(z, H0=None)
\(\hfun\) function for constructing the objective function
\[\hfun\left(\zvec\right) = \max \set{z_1, \cdots, z_{p1}} + \alpha\sum_{i=p1+1}^{\nd} \max\set{z_i, 0}^2\]where \(\zvec = (z_1, \cdots, z_p)\) is an application-specific residual or feature vector. Presently, \(p1 = p-1\) and \(\alpha = 0\) are hardcoded, so in the current implementation reduces to
\[\max \left\{ z_1, \cdots, z_{p-1} \right\}.\]The final component \(z_p\) is kept to reflect the original max-plus-violation-penalty structure, even though its contribution is currently zeroed by the hardcoded choice \(\alpha = 0\).
3.2.5. Parameterized \(\hfun\) Functions
Manifold Sampling permits parameterized \(\hfun\) functions, which it can
use only once users have chosen a single set of parameter values for formulating
a specific \(\hfun\) function and, therefore, a single related problem. As an example, the
following routines can be used to create a single hfun for a single set of
desired parameter values. While these routines are presented through their
integration into the Python package, the documentation is valid for the MATLAB
version of these routines, which are located in
manifold_sampling/m/general_nonsmooth_h_funs.
- ibcdfo.manifold_sampling.create_censored_L1_loss_hfun(C, D)
A generalized version of Womersley’s censored \(\ell_1\) loss function Womersley [7].
Given observed outputs \(\zvec\in\R^m\), a per-component censoring floor \(\cvec\in\R^m\), and target data \(\dvec\in\R^m\), this objective is
\[\hfun(\zvec;\cvec,\dvec) = \sum_{i=1}^{m} \left|\, d_i - \max(z_i, c_i)\,\right|.\]This produces a one-sided (censored) discrepancy: components with \(z_i < c_i\) are treated as if the observation were \(c_i\), so the loss does not continue to decrease by driving \(z_i\) below the censoring floor. This reduces sensitivity to outliers, preventing any single component from dominating the measure of misfit.
- Parameters:
C – 1D NumPy array that provides the censoring values \(c_i\).
D – 1D NumPy array that provides the target data \(d_i\).
- Returns:
hfunconstructed with the given \(c_i, d_i\) that is compatible only with \(\zvec\) arguments of the same length asCandD.
- ibcdfo.manifold_sampling.create_piecewise_quadratic_hfun(Qs, zs, cs)
Create an \(\hfun\) function using the given \(Q_j, \zvec_j, c_j\) parameter values for constructing and using the manifold sampling piecewise quadratic objective function
\[\hfun\left(\zvec; Q_1, \cdots, Q_l, \zvec_1, \cdots, \zvec_l, c_1, \cdots, c_l\right) = \max_{j\in\set{1, \cdots, l}}\set{\norm{\zvec - \zvec_j}^2_{Q_j} + c_j}.\]This family of \(h\) functions is included in the package as an example and might not be useful for constructing practical optimization problems.
- Parameters:
Qs – \(m \times m \times l\) NumPy array that contains the \(Q_j \in \R^{m \times m}\) parameter values. Note that at least one \(Q_j\) should be symmetric positive definite in order for the associated optimization problem to be well-defined. There is no error checking to confirm that given
Qsarguments satisfy this requirement.zs – \(m \times l\) NumPy array that contains the \(\zvec_j \in \R^m\) parameter values
cs – 1D NumPy array of length \(l\) that specifies the \(c_j\) parameter values
- Returns:
hfunconstructed with the given \(Q_j, \zvec_j, c_j\) that is compatible only with \(\zvec\) arguments whose lengths are compatible with the shapes ofQsandzs.