3. Manifold Sampling

3.1. General Description

Manifold Sampling is a derivative-free trust-region method for solving composite nonsmooth optimization problems of the form

\[\min_{\psp \in \R^{\np}} \hfun(\Ffun(\psp))\]

where:

\(\Ffun:\R^{\np} \rightarrow \R^{\nd}\) is a black-box vector-valued function (e.g., a simulation output or a vector of residuals), and
\(\hfun:\R^{\nd} \rightarrow \R\) is a nonsmooth function, expressible as a continuous selection. Examples of continuous selections include piecewise linear functions, piecewise quadratic functions or functions formed from max, min or absolute value operations.

In the programmatic interface, the input to hfun is often denoted \(\zvec\), representing the intermediate vector \(\zvec = \Ffun(\psp)\).

Analyses assume that \(\Ffun\) is continuous and locally sufficiently smooth so that accurate trust-region surrogate models can be constructed. However, in practice, these assumptions do not need to be explicitly checked and derivatives of \(\Ffun\) certainly need not be available.

The composite structure \(\hfun(\Ffun(\psp))\) arises naturally in robust regression, minimax design, simulation-based calibration, and optimization with embedded nonsmooth merit functions.

Manifold Sampling builds local surrogate models for components of \(\Ffun\) and exploits the piecewise structure of \(\hfun\) by identifying locally active manifolds, that is, smooth pieces \(\hfun\) that are active near the current incumbent. At each iteration, a trust-region subproblem is solved on a model derived in part from the currently active manifolds, producing a step that balances local improvement and model accuracy.

Unlike generic nonsmooth methods, manifold sampling deliberately leverages the composite structure \(\hfun(\Ffun(\psp))\), which can significantly improve practical performance when evaluations of \(\Ffun\) are expensive.

For algorithmic details, see Larson et al. [4], Khan et al. [1], Larson et al. [5], and Larson and Menickelly [3].

3.2. Programmatic Interface

3.2.1. Status Code

All Manifold Sampling implementations return a termination criteria flag. The interpretation of the value of the flag is identical across implementations and possible values are

> 0 - successful termination; the value of the flag is the final stationarity measure (\(\chi_k\))
0 - the budget of nf_max function evaluations was exhausted.
-1 - model construction failed (an empty or invalid local model was constructed)
-2 - trust-region subproblem failed, likely due to an unbounded or poorly scaled affine-envelope subproblem

The programmatic interface is generally maintained identically across all implementations. Nevertheless, we provide the interface for each implementation to provide language-specific descriptions.

3.2.2. Python

ibcdfo.run_MSP(hfun, Ffun, x0, L, U, nf_max, subprob_switch)

Run manifold sampling to solve the composite nonsmooth optimization problem.

Parameters:

hfun –
Function implementing \(\hfun\). Supports two calling modes.

Mode 1:
```
hval, grads, hashes = hfun(z)
```
where:
- z is a length-\(\nd\) array
- hval is the scalar value \(\hfun(\zvec)\)
- grads is a \((\nd, l)\) array whose columns are gradients of the \(l\) active selection functions at \(\zvec\)
- hashes is a list of identifiers for those \(l\) active manifolds
Mode 2:
```
vals, grads = hfun(z, hashes)
```
where the list hashes specifies which manifolds to evaluate, vals[i] is the value of the i th corresponding selection, and column i of grads is the gradient of the i th selection at \(\zvec\).
Ffun – Function returning \(\Ffun(\psp)\) as a length-\(\nd\) array for a given length-\(\np\) array x.
x0 – Initial point (length \(\np\) array).
L – Lower bounds for x (length \(\np\) array).
U – Upper bounds for x (length \(\np\) array).
nf_max – Maximum number of evaluations of Ffun.
subprob_switch – Selects the trust-region subproblem solver/variant used internally.

Returns:

Tuple (X, F, h, xkin, flag) where

X – array of evaluated points in evaluation order
F – array with rows \(\Ffun(X[i])\)
h – array with \(h[i] = \hfun(\Ffun(X[i]))\)
xkin – zero-based index in X of the final trust-region center
flag – termination code (See general documentation)

3.2.3. MATLAB

manifold_sampling_primal(hfun, Ffun, x0, L, U, nf_max, subprob_switch)

Run manifold sampling to solve the composite nonsmooth optimization problem.

Parameters:

hfun –
Handle to function that, given a \(\np \times 1\) point \(\zvec\), returns
- the value \(h(\zvec)\)
- \(\np \times l\) matrix that contains gradients for all \(l\) active selections at \(\zvec\)
- \(1 \times l\) vector of hashes for each manifold active at \(\zvec\)
Given point \(\zvec\) and \(l\) hashes \(H\), returns
- An \(1 \times l\) vector that contains the values \(h_i(\zvec)\) for each hash in \(H\)
- A \(p \times l\) matrix that contains the gradients of \(h_i(\zvec)\) for each hash in \(H\)
Ffun – Function that returns the value \(\Ffun(\psp)\) of the blackbox simulation as an \(1 \times \nd\) vector for given \(\psp\)
x0 – \(1 \times \np\) vector that specifies the starting point
L – \(1 \times \np\) vector of lower bounds
U – \(1 \times \np\) vector of upper bounds
nf_max – [int] Maximum number of function evaluations
subprob_switch – Selects the trust-region subproblem solver used internally.

Returns:

X - \(\mathrm{nf\_max} \times \np\) matrix containing the points evaluated in the order in which they were evaluated
F - \(\mathrm{nf\_max} \times \nd\) matrix containing the blackbox function values \(\Ffun(\psp)\) for all points in X with matching ordering
h - \(\mathrm{nf\_max} \times 1\) matrix containing the values \(\hfun(\Ffun(\psp))\) for all points in X with matching ordering.
xkin - One-based index of current trust region center.
flag - Termination criteria flag (See general documentation)

3.2.4. General \(\hfun\) Functions

The following \(\hfun\) functions are available for immediate use with the Python implementation of Manifold Sampling. With the potential exception of application-specific functions, these same functions are available for immediate use with the MATLAB implementation in manifold_sampling/m/general_nonsmooth_h_funs. While they are presented here through their integration into the Python package, the documentation is generally valid for the MATLAB version of these functions as well. In MATLAB implementations, the H0 argument is also optional.

ibcdfo.manifold_sampling.h_one_norm(z, H0=None): \(\hfun\) function for constructing the 1-norm objective function

\[\hfun\left(\zvec\right) = \sum_{i = 1}^{\nd} \abs{z_i}.\]

ibcdfo.manifold_sampling.h_pw_minimum(z, H0=None): \(\hfun\) function for constructing the pointwise minimum objective function

\[ \hfun\left(\zvec\right) = \min \set{z_1, \cdots, z_{\nd}}\]

ibcdfo.manifold_sampling.h_pw_minimum_squared(z, H0=None): \(\hfun\) function for constructing the pointwise minimum objective function

\[ \hfun\left(\zvec\right) = \min \set{z_1^2, \cdots, z_{\nd}^2}\]

ibcdfo.manifold_sampling.h_pw_maximum(z, H0=None): \(\hfun\) function for constructing the pointwise maximum objective function

\[ \hfun\left(\zvec\right) = \max \set{z_1, \cdots, z_{\nd}}\]

ibcdfo.manifold_sampling.h_pw_maximum_squared(z, H0=None): \(\hfun\) function for constructing the pointwise maximum objective function

\[ \hfun\left(\zvec\right) = \max \set{z_1^2, \cdots, z_{\nd}^2}\]

ibcdfo.manifold_sampling.h_quantile(z, H0=None)

\(\hfun\) function whose value is determined by returning at each \(z\) the \(q^{\mathrm{th}}\) quantile of

\[\set{z_1^2, \cdots, z_{\nd}^2}.\]

Note that this example has hard-coded the value \(q=1\), which corresponds to returning the minimum value of the above set. You must alter this function for other quantiles by changing the value of \(q\) to an integer between 1 and \(\nd\), inclusive. Note this functionality is the same as provided by h_pw_minimum_squared.

ibcdfo.manifold_sampling.h_max_gamma_over_KY(z, H0=None)

\(\hfun\) function for constructing the objective function

\[\hfun\left(\zvec; KY_1, \cdots, KY_{11}\right) = \max \left\{\frac{z_1}{KY_1}, \cdots, \frac{z_{11}}{KY_{11}}\right\},\]

where the components

\[z_j = \gamma(\kappa, \Delta, \zeta, KY_j)\]

are computed from the application-specific model function \(\gamma\) based on application-specific parameters \(\psp = (\kappa, \Delta, \zeta)\). Presently, the \(KY\) parameters are hardcoded to the uniform grid

\[(KY_1, KY_2, \cdots, KY_{11}) = (0.1, 0.15, \cdots, 0.6).\]

ibcdfo.manifold_sampling.h_max_plus_quadratic_violation_penalty(z, H0=None)

\(\hfun\) function for constructing the objective function

\[\hfun\left(\zvec\right) = \max \set{z_1, \cdots, z_{p1}} + \alpha\sum_{i=p1+1}^{\nd} \max\set{z_i, 0}^2\]

where \(\zvec = (z_1, \cdots, z_p)\) is an application-specific residual or feature vector. Presently, \(p1 = p-1\) and \(\alpha = 0\) are hardcoded, so in the current implementation reduces to

\[\max \left\{ z_1, \cdots, z_{p-1} \right\}.\]

The final component \(z_p\) is kept to reflect the original max-plus-violation-penalty structure, even though its contribution is currently zeroed by the hardcoded choice \(\alpha = 0\).

3.2.5. Parameterized \(\hfun\) Functions

Manifold Sampling permits parameterized \(\hfun\) functions, which it can use only once users have chosen a single set of parameter values for formulating a specific \(\hfun\) function and, therefore, a single related problem. As an example, the following routines can be used to create a single hfun for a single set of desired parameter values. While these routines are presented through their integration into the Python package, the documentation is valid for the MATLAB version of these routines, which are located in manifold_sampling/m/general_nonsmooth_h_funs.

ibcdfo.manifold_sampling.create_censored_L1_loss_hfun(C, D)

A generalized version of Womersley’s censored \(\ell_1\) loss function Womersley [7].

Given observed outputs \(\zvec\in\R^m\), a per-component censoring floor \(\cvec\in\R^m\), and target data \(\dvec\in\R^m\), this objective is

\[\hfun(\zvec;\cvec,\dvec) = \sum_{i=1}^{m} \left|\, d_i - \max(z_i, c_i)\,\right|.\]

This produces a one-sided (censored) discrepancy: components with \(z_i < c_i\) are treated as if the observation were \(c_i\), so the loss does not continue to decrease by driving \(z_i\) below the censoring floor. This reduces sensitivity to outliers, preventing any single component from dominating the measure of misfit.

Parameters:

C – 1D NumPy array that provides the censoring values \(c_i\).
D – 1D NumPy array that provides the target data \(d_i\).

Returns:

hfun constructed with the given \(c_i, d_i\) that is compatible only with \(\zvec\) arguments of the same length as C and D.

ibcdfo.manifold_sampling.create_piecewise_quadratic_hfun(Qs, zs, cs)

Create an \(\hfun\) function using the given \(Q_j, \zvec_j, c_j\) parameter values for constructing and using the manifold sampling piecewise quadratic objective function

\[\hfun\left(\zvec; Q_1, \cdots, Q_l, \zvec_1, \cdots, \zvec_l, c_1, \cdots, c_l\right) = \max_{j\in\set{1, \cdots, l}}\set{\norm{\zvec - \zvec_j}^2_{Q_j} + c_j}.\]

This family of \(h\) functions is included in the package as an example and might not be useful for constructing practical optimization problems.

Parameters:

Qs – \(m \times m \times l\) NumPy array that contains the \(Q_j \in \R^{m \times m}\) parameter values. Note that at least one \(Q_j\) should be symmetric positive definite in order for the associated optimization problem to be well-defined. There is no error checking to confirm that given Qs arguments satisfy this requirement.
zs – \(m \times l\) NumPy array that contains the \(\zvec_j \in \R^m\) parameter values
cs – 1D NumPy array of length \(l\) that specifies the \(c_j\) parameter values

Returns:

hfun constructed with the given \(Q_j, \zvec_j, c_j\) that is compatible only with \(\zvec\) arguments whose lengths are compatible with the shapes of Qs and zs.