cholupdates.rank_1.downdate_seeger

cholupdates.rank_1.downdate_seeger(L, v, check_diag=True, overwrite_L=False, overwrite_v=False, impl=None)

Update a Cholesky factorization after subtraction of a positive semidefinite symmetric rank-1 matrix using the algorithm from section 3 of [Seeger, 2008].

In other words, given \(A = L L^T \in \mathbb{R}^{N \times N}\), and \(v \in \mathbb{R}^N\), compute \(L^-\) such that

\[A^- := A - v v^T = L^- (L^-)^T.\]

Note that the Cholesky factorization of the downdated matrix may not exist, since a downdate might result in a matrix which is not positive definite.

This function computes the Cholesky factorization of \(A^-\) from \(L\) in \(O(N^2)\) time, which is faster than the \(O(N^3)\) time complexity of naively applying a Cholesky algorithm to \(A^-\) directly.

Parameters

L ((N, N) numpy.ndarray, dtype=numpy.single or numpy.double) – Lower-triangular Cholesky factor of \(A\). Must have a non-zero diagonal. Must have the same dtype as v. Must be a contiguous array (either Fortran- or C-contiguous). The algorithm is most efficient if this array is given in column-major layout, a.k.a. Fortran-contiguous or f-contiguous memory order. Hint: Lower-triangular Cholesky factors in column-major memory layout can be obtained efficiently (i.e. without requiring an additional copy) from scipy.linalg.cho_factor(). The entries in the strict upper-triangular part of L can contain arbitrary values, since the algorithm neither reads from nor writes to this part of the matrix. This behavior is useful when using the Cholesky factors returned by scipy.linalg.cho_factor() which contain arbitrary values on the irrelevant triangular part of the matrix.
v ((N,) numpy.ndarray, dtype=numpy.single or numpy.double) – The vector \(v\) which defines the symmetric rank-1 downdate matrix \(v v^T\). Must have the same dtype as L. Must be a contiguous array (either Fortran- or C-contiguous).
check_diag (bool) – If set to True, the function will check whether the diagonal of the given Cholesky factor L is non-zero and raise a ValueError if this is not the case. Setting check_diag to False can be used to speed up computations if it is clear that the Cholesky factor can not have zeros on its diagonal. Caution: If this argument is set to False and the Cholesky factor does contain zeros on its diagonal, the behavior of the function will be undefined.
overwrite_L (bool) – If set to True, the function will overwrite the array L with the upper Cholesky factor \(L^-\) of \(A^-\), i.e. the result is computed in-place. Passing False here ensures that the array L is not modified.
overwrite_v (bool) – If set to True, the function will reuse the array v as an internal computation buffer, which will modify v. Passing False here ensures that the array v is not modified. In this case, an additional array of shape (N,) and dtype numpy.double must be allocated.
impl (Optional[str]) –
Defines which implementation of the algorithm to use. Must be one of
- None
  Choose the Cython implementation if it is available, otherwise use the Python implementation.
- ”cython”
  Use the Cython implementation. Throws a ValueError if the Cython implementation is not available.
- ”python”
  Use the Python implementation.
Defaults to None.

Returns

Lower-triangular Cholesky factor \(L^-\) of \(A - v v^T\) with shape (N, N). The diagonal entries of this matrix are guaranteed to be positive. The strict upper-triangular part of this matrix will contain the values from the upper-triangular part of L. The matrix will inherit dtype and memory order from L.

Return type

(N, N) numpy.ndarray, dtype=L.dtype

Raises

ValueError – If L does not have shape (N, N) for some N.
numpy.linalg.LinAlgError – If the diagonal of L contains zeros and check_diag is set to true.
ValueError – If v does not have shape (N,), while L has shape (N, N).
TypeError – If L does not have dtype numpy.single or numpy.double.
TypeError – If v does not have dtype numpy.single or numpy.double.
TypeError – If L and v don’t have the same dtype.
ValueError – If L or v is not contiguous in memory.
ValueError – If impl was set to "cython", but the Cython implementation is not available.
numpy.linalg.LinAlgError – If the downdate results in a matrix \(L^-\), which is not positive definite.

See also

cholupdates.rank_1.downdate: Interface function for all rank-1 downdate methods. Can be used to call this function conveniently.

Notes

This function implements the algorithm from 1, section 3. In the following, we will elaborate on the theory behind the algorithm. Let \(A \in \mathbb{R}^{n \times n}\) be symmetric and positive-definite, and \(v \in \mathbb{R}^n\). The vector \(v\) defines a symmetric rank-1 downdate \(-v v^T\) to \(A\), i.e.

\[A^- := A - v v^T.\]

Note that \(A^-\) need not be positive definite. In the following, we will derive an efficient criterion to check whether \(A^-\) is positive definite. For the derivation of the algorithm, we assume that \(A^-\) is positive definite.

Assume that the Cholesky factorization of \(A\), i.e. a lower-triangular matrix \(L \in \mathbb{R}^{n \times n}\) with \(A = L L^T\), is given. We want to find a fast and stable method to update the Cholesky factorization \(L L^T\) of \(A\) to the Cholesky factorization \(A^- = L^- (L^-)^T\) of the updated matrix \(A^-\), where, again, \(L^- \in \mathbb{R}^{n \times n}\) is lower triangular.

If we can construct an orthogonal transformation \(Q \in \mathbb{R}^{(n + 1) \times (n + 1)}\) such that

(1)\[\begin{pmatrix} L & 0 \end{pmatrix} Q^T = \begin{pmatrix} L^- & v \end{pmatrix},\]

for some lower-triangular matrix \(L^-\), then

\[\begin{split}L^- (L^-)^T + v v^T = \begin{pmatrix} L^- & v \end{pmatrix} \begin{pmatrix} L^- & v \end{pmatrix}^T = \begin{pmatrix} L & 0 \end{pmatrix} \underbrace{Q^T Q}_{= I} \begin{pmatrix} L^T \\ 0 \end{pmatrix} = L L^T,\end{split}\]

which is equivalent to \(L^- (L^-)^T = L L^T - v v^T\). This means that we can retrieve the updated Cholesky factor as \(L^- = L \cdot Q_{1:n, 1:n}^T\).

In the remainder of this section, we will derive another constraint on \(Q\) which will directly give rise to an algorithm to compute \(Q\).

If we multiply equation (1) with the standard basis vector \(e_{n + 1} \in \mathbb{R}^{n + 1}\) from the right-hand side, we arrive at

\[\begin{pmatrix} L & 0 \end{pmatrix} \underbrace{Q^T e_{n + 1}}_{=: q} = v.\]

With \(p := q_{1:n} \in \mathbb{R}^n\), this implies \(L p = v\), i.e. \(p = L^{-1} v\), since \(L\) is invertible.

The vector \(p\) can be used to cheaply check whether the downdate results in a positive definite matrix, since \(p^T p < 1\) if and only if \(A^-\) is positive definite. To see this, note that \(L\) is invertible and that \(A^-\) is hence positive definite if and only if

\[L^{-1} A^- L^{-T} = L^{-1} (L L^T - v v^T) L^{-T} = I - (L^{-1} v) (v^T L^{-T}) = I - p p^T\]

is positive definite. One can show that the eigenvectors of \(I - p p^T\) are given by any \(u\) in the orthogonal complement of \(p\) with eigenvalue \(1 > 0\) and \(p\) with eigenvalue \(1 - p^T p\). Hence, \(I - p p^T\) is positive definite if and only if \(p^T p < 1\).

In order to find \(q_{n + 1}\), we note that

\[\lVert q \rVert_2^2 = e_{n + 1}^T \underbrace{Q Q^T}_{= I} e_{n + 1} = e_{n + 1}^T e_{n + 1} = 1,\]

which implies

\[1 =\lVert q \rVert_2^2 = p^T p + q_{n + 1}^2 \quad \Leftrightarrow \quad q_{n + 1}^2 = 1 - p^T p.\]

Note that this is well-defined, since \(A^-\) is assumed to be positive definite, which means that \(p^T p < 1\), i.e. \(1 - p^T p > 0\). All in all, our additional constraint on \(Q\) can be formulated as

(2)\[\begin{split}Q q = e_{n + 1} \quad \text{with} \quad q = \begin{pmatrix} p \\ \rho \end{pmatrix}, \quad L p = v, \quad \text{and} \quad \rho^2 := 1 - p^T p.\end{split}\]

From this constraint, we can now generate the desired orthogonal matrix \(Q\) in a principled way. Obviously, there are multiple different orthogonal matrices which fulfill (2). However, our particular choice must also fulfill (1). One particular way to construct \(Q\) is as a sequence of Givens rotations \(Q = Q_1 \dotsb Q_n\), where \(Q_i\) eliminates the \(i\)-th entry of

\[q^{(i + 1)} = \left( \prod_{j = i + 1}^n Q_j \right) q\]

with \(q^{(i + 1)}_{n + 1}\), i.e. \(Q_i\) transforms \(q^{(i + 1)}\) to \(q^{(i)}\) with \(q^{(i)}_i = 0\) and

\[q^{(i)}_{n + 1} = \sqrt{(q^{(i + 1)}_i)^2 + (q^{(i + 1)}_{n + 1})^2},\]

while preserving all other entries.

Obviously, this choice of \(Q\) fulfills (2) and we can show by induction that it also fulfills (1). Denote

\[\begin{split}\begin{pmatrix} L^{(i)} & v^{(i)} \end{pmatrix} & := \begin{pmatrix} L & 0 \end{pmatrix} \left( Q_i \dotsb Q_n \right)^T \\ & = \begin{pmatrix} L & 0 \end{pmatrix} Q_n^T \dotsb Q_i^T \\ & = \begin{pmatrix} L^{(i + 1)} & v^{(i + 1)} \end{pmatrix} Q_i^T.\end{split}\]

For \(i = n + 1\), we have \(L^{(i)} = L\), which is lower-triangular, and \(v^{(i)} = 0\), i.e. the first \(i - 1\) components of \(v^{(i)}\) contain zeros. Now assume that \(L^{(i + 1)}\) is lower triangular and that the first \((i + 1) - 1\) components of \(v^{(i + 1)}\) are zeros.

\[\begin{split}& \begin{pmatrix} L^{(i)} & v^{(i)} \end{pmatrix} \\ = \ & \begin{pmatrix} L^{(i + 1)} & v^{(i + 1)} \end{pmatrix} Q_i^T \\ = \ & \begin{pmatrix} L_{1:(i - 1), 1:(i - 1)}^{(i + 1)} & 0 & 0 & 0 \\ L_{i, 1:(i - 1)}^{(i + 1)} & L_{ii}^{(i + 1)} & 0 & 0 \\ L_{(i + 1):n, 1:i}^{(i + 1)} & L_{(i + 1):n, i}^{(i + 1)} & L_{(i + 1):n, (i + 1):n}^{(i + 1)} & v_{(i + 1):n}^{(i + 1)} \\ \end{pmatrix} \cdot \begin{pmatrix} I & 0 & 0 & 0 \\ 0 & c_i & 0 & s_i \\ 0 & 0 & I & 0 \\ 0 & -s_i & 0 & c_i \\ \end{pmatrix} \\ = \ & \begin{pmatrix} L_{1:(i - 1), 1:(i - 1)}^{(i + 1)} & 0 & 0 & 0 \\ L_{i, 1:(i - 1)}^{(i + 1)} & L_{ii}^{(i)} & 0 & v_i^{(i)} \\ L_{(i + 1):n, 1:i}^{(i + 1)} & L_{(i + 1):n, i}^{(i)} & L_{(i + 1):n, (i + 1):n}^{(i + 1)} & v_{(i + 1):n}^{(i)} \\ \end{pmatrix},\end{split}\]

where the \(0\) and \(I\) blocks are chosen such that the matrix product makes sense. One can now read off that \(L^{(i)}\) is lower-triangular and that the first \(i - 1\) entries of \(v^{(i)}\) are zeros. By induction, we can thus conclude that \(L^{(1)} = LQ^T_{1:n, 1:n}\) is lower-triangular. Moreover, since \(Q\) also fulfills (2), we know that \(v^{(1)} = v\). It follows that

\[\begin{pmatrix} L & 0 \end{pmatrix} Q^T = \begin{pmatrix} L^- & v \end{pmatrix},\]

i.e. \(L^- = L Q_{1:n, 1:n}^T\).

Note that this algorithm is a minor modification of the LINPACK 2 routine dchdd. The exact modifications are described in 1.

References

1(1,2)

Seeger, “Low Rank Updates for the Cholesky Decomposition”, 2008.

2

J. Dongarra, C. Moler, J. Bunch, and G. Steward. “LINPACK User’s Guide”. Society for Industrial and Applied Mathematics, 1979.