cholupdates.rank_1.downdate_seeger
- cholupdates.rank_1.downdate_seeger(L, v, check_diag=True, overwrite_L=False, overwrite_v=False, impl=None)
Update a Cholesky factorization after subtraction of a positive semidefinite symmetric rank-1 matrix using the algorithm from section 3 of [Seeger, 2008].
In other words, given \(A = L L^T \in \mathbb{R}^{N \times N}\), and \(v \in \mathbb{R}^N\), compute \(L^-\) such that
\[A^- := A - v v^T = L^- (L^-)^T.\]Note that the Cholesky factorization of the downdated matrix may not exist, since a downdate might result in a matrix which is not positive definite.
This function computes the Cholesky factorization of \(A^-\) from \(L\) in \(O(N^2)\) time, which is faster than the \(O(N^3)\) time complexity of naively applying a Cholesky algorithm to \(A^-\) directly.
- Parameters
L ((N, N) numpy.ndarray, dtype=numpy.single or numpy.double) – Lower-triangular Cholesky factor of \(A\). Must have a non-zero diagonal. Must have the same dtype as
v. Must be a contiguous array (either Fortran- or C-contiguous). The algorithm is most efficient if this array is given in column-major layout, a.k.a. Fortran-contiguous or f-contiguous memory order. Hint: Lower-triangular Cholesky factors in column-major memory layout can be obtained efficiently (i.e. without requiring an additional copy) fromscipy.linalg.cho_factor(). The entries in the strict upper-triangular part ofLcan contain arbitrary values, since the algorithm neither reads from nor writes to this part of the matrix. This behavior is useful when using the Cholesky factors returned byscipy.linalg.cho_factor()which contain arbitrary values on the irrelevant triangular part of the matrix.v ((N,) numpy.ndarray, dtype=numpy.single or numpy.double) – The vector \(v\) which defines the symmetric rank-1 downdate matrix \(v v^T\). Must have the same dtype as
L. Must be a contiguous array (either Fortran- or C-contiguous).check_diag (bool) – If set to
True, the function will check whether the diagonal of the given Cholesky factorLis non-zero and raise aValueErrorif this is not the case. Settingcheck_diagtoFalsecan be used to speed up computations if it is clear that the Cholesky factor can not have zeros on its diagonal. Caution: If this argument is set toFalseand the Cholesky factor does contain zeros on its diagonal, the behavior of the function will be undefined.overwrite_L (bool) – If set to
True, the function will overwrite the arrayLwith the upper Cholesky factor \(L^-\) of \(A^-\), i.e. the result is computed in-place. PassingFalsehere ensures that the arrayLis not modified.overwrite_v (bool) – If set to
True, the function will reuse the arrayvas an internal computation buffer, which will modifyv. PassingFalsehere ensures that the arrayvis not modified. In this case, an additional array of shape(N,)and dtypenumpy.doublemust be allocated.impl (Optional[str]) –
Defines which implementation of the algorithm to use. Must be one of
NoneChoose the Cython implementation if it is available, otherwise use the Python implementation.
- ”cython”
Use the Cython implementation. Throws a
ValueErrorif the Cython implementation is not available.
- ”python”
Use the Python implementation.
Defaults to None.
- Returns
Lower-triangular Cholesky factor \(L^-\) of \(A - v v^T\) with shape
(N, N). The diagonal entries of this matrix are guaranteed to be positive. The strict upper-triangular part of this matrix will contain the values from the upper-triangular part ofL. The matrix will inherit dtype and memory order fromL.- Return type
(N, N) numpy.ndarray, dtype=L.dtype
- Raises
ValueError – If
Ldoes not have shape(N, N)for someN.numpy.linalg.LinAlgError – If the diagonal of
Lcontains zeros andcheck_diagis set to true.ValueError – If
vdoes not have shape(N,), whileLhas shape(N, N).TypeError – If
Ldoes not have dtypenumpy.singleornumpy.double.TypeError – If
vdoes not have dtypenumpy.singleornumpy.double.TypeError – If
Landvdon’t have the same dtype.ValueError – If
Lorvis not contiguous in memory.ValueError – If
implwas set to"cython", but the Cython implementation is not available.numpy.linalg.LinAlgError – If the downdate results in a matrix \(L^-\), which is not positive definite.
See also
cholupdates.rank_1.downdateInterface function for all rank-1 downdate methods. Can be used to call this function conveniently.
Notes
This function implements the algorithm from 1, section 3. In the following, we will elaborate on the theory behind the algorithm. Let \(A \in \mathbb{R}^{n \times n}\) be symmetric and positive-definite, and \(v \in \mathbb{R}^n\). The vector \(v\) defines a symmetric rank-1 downdate \(-v v^T\) to \(A\), i.e.
\[A^- := A - v v^T.\]Note that \(A^-\) need not be positive definite. In the following, we will derive an efficient criterion to check whether \(A^-\) is positive definite. For the derivation of the algorithm, we assume that \(A^-\) is positive definite.
Assume that the Cholesky factorization of \(A\), i.e. a lower-triangular matrix \(L \in \mathbb{R}^{n \times n}\) with \(A = L L^T\), is given. We want to find a fast and stable method to update the Cholesky factorization \(L L^T\) of \(A\) to the Cholesky factorization \(A^- = L^- (L^-)^T\) of the updated matrix \(A^-\), where, again, \(L^- \in \mathbb{R}^{n \times n}\) is lower triangular.
If we can construct an orthogonal transformation \(Q \in \mathbb{R}^{(n + 1) \times (n + 1)}\) such that
(1)\[\begin{pmatrix} L & 0 \end{pmatrix} Q^T = \begin{pmatrix} L^- & v \end{pmatrix},\]for some lower-triangular matrix \(L^-\), then
\[\begin{split}L^- (L^-)^T + v v^T = \begin{pmatrix} L^- & v \end{pmatrix} \begin{pmatrix} L^- & v \end{pmatrix}^T = \begin{pmatrix} L & 0 \end{pmatrix} \underbrace{Q^T Q}_{= I} \begin{pmatrix} L^T \\ 0 \end{pmatrix} = L L^T,\end{split}\]which is equivalent to \(L^- (L^-)^T = L L^T - v v^T\). This means that we can retrieve the updated Cholesky factor as \(L^- = L \cdot Q_{1:n, 1:n}^T\).
In the remainder of this section, we will derive another constraint on \(Q\) which will directly give rise to an algorithm to compute \(Q\).
If we multiply equation (1) with the standard basis vector \(e_{n + 1} \in \mathbb{R}^{n + 1}\) from the right-hand side, we arrive at
\[\begin{pmatrix} L & 0 \end{pmatrix} \underbrace{Q^T e_{n + 1}}_{=: q} = v.\]With \(p := q_{1:n} \in \mathbb{R}^n\), this implies \(L p = v\), i.e. \(p = L^{-1} v\), since \(L\) is invertible.
The vector \(p\) can be used to cheaply check whether the downdate results in a positive definite matrix, since \(p^T p < 1\) if and only if \(A^-\) is positive definite. To see this, note that \(L\) is invertible and that \(A^-\) is hence positive definite if and only if
\[L^{-1} A^- L^{-T} = L^{-1} (L L^T - v v^T) L^{-T} = I - (L^{-1} v) (v^T L^{-T}) = I - p p^T\]is positive definite. One can show that the eigenvectors of \(I - p p^T\) are given by any \(u\) in the orthogonal complement of \(p\) with eigenvalue \(1 > 0\) and \(p\) with eigenvalue \(1 - p^T p\). Hence, \(I - p p^T\) is positive definite if and only if \(p^T p < 1\).
In order to find \(q_{n + 1}\), we note that
\[\lVert q \rVert_2^2 = e_{n + 1}^T \underbrace{Q Q^T}_{= I} e_{n + 1} = e_{n + 1}^T e_{n + 1} = 1,\]which implies
\[1 =\lVert q \rVert_2^2 = p^T p + q_{n + 1}^2 \quad \Leftrightarrow \quad q_{n + 1}^2 = 1 - p^T p.\]Note that this is well-defined, since \(A^-\) is assumed to be positive definite, which means that \(p^T p < 1\), i.e. \(1 - p^T p > 0\). All in all, our additional constraint on \(Q\) can be formulated as
(2)\[\begin{split}Q q = e_{n + 1} \quad \text{with} \quad q = \begin{pmatrix} p \\ \rho \end{pmatrix}, \quad L p = v, \quad \text{and} \quad \rho^2 := 1 - p^T p.\end{split}\]From this constraint, we can now generate the desired orthogonal matrix \(Q\) in a principled way. Obviously, there are multiple different orthogonal matrices which fulfill (2). However, our particular choice must also fulfill (1). One particular way to construct \(Q\) is as a sequence of Givens rotations \(Q = Q_1 \dotsb Q_n\), where \(Q_i\) eliminates the \(i\)-th entry of
\[q^{(i + 1)} = \left( \prod_{j = i + 1}^n Q_j \right) q\]with \(q^{(i + 1)}_{n + 1}\), i.e. \(Q_i\) transforms \(q^{(i + 1)}\) to \(q^{(i)}\) with \(q^{(i)}_i = 0\) and
\[q^{(i)}_{n + 1} = \sqrt{(q^{(i + 1)}_i)^2 + (q^{(i + 1)}_{n + 1})^2},\]while preserving all other entries.
Obviously, this choice of \(Q\) fulfills (2) and we can show by induction that it also fulfills (1). Denote
\[\begin{split}\begin{pmatrix} L^{(i)} & v^{(i)} \end{pmatrix} & := \begin{pmatrix} L & 0 \end{pmatrix} \left( Q_i \dotsb Q_n \right)^T \\ & = \begin{pmatrix} L & 0 \end{pmatrix} Q_n^T \dotsb Q_i^T \\ & = \begin{pmatrix} L^{(i + 1)} & v^{(i + 1)} \end{pmatrix} Q_i^T.\end{split}\]For \(i = n + 1\), we have \(L^{(i)} = L\), which is lower-triangular, and \(v^{(i)} = 0\), i.e. the first \(i - 1\) components of \(v^{(i)}\) contain zeros. Now assume that \(L^{(i + 1)}\) is lower triangular and that the first \((i + 1) - 1\) components of \(v^{(i + 1)}\) are zeros.
\[\begin{split}& \begin{pmatrix} L^{(i)} & v^{(i)} \end{pmatrix} \\ = \ & \begin{pmatrix} L^{(i + 1)} & v^{(i + 1)} \end{pmatrix} Q_i^T \\ = \ & \begin{pmatrix} L_{1:(i - 1), 1:(i - 1)}^{(i + 1)} & 0 & 0 & 0 \\ L_{i, 1:(i - 1)}^{(i + 1)} & L_{ii}^{(i + 1)} & 0 & 0 \\ L_{(i + 1):n, 1:i}^{(i + 1)} & L_{(i + 1):n, i}^{(i + 1)} & L_{(i + 1):n, (i + 1):n}^{(i + 1)} & v_{(i + 1):n}^{(i + 1)} \\ \end{pmatrix} \cdot \begin{pmatrix} I & 0 & 0 & 0 \\ 0 & c_i & 0 & s_i \\ 0 & 0 & I & 0 \\ 0 & -s_i & 0 & c_i \\ \end{pmatrix} \\ = \ & \begin{pmatrix} L_{1:(i - 1), 1:(i - 1)}^{(i + 1)} & 0 & 0 & 0 \\ L_{i, 1:(i - 1)}^{(i + 1)} & L_{ii}^{(i)} & 0 & v_i^{(i)} \\ L_{(i + 1):n, 1:i}^{(i + 1)} & L_{(i + 1):n, i}^{(i)} & L_{(i + 1):n, (i + 1):n}^{(i + 1)} & v_{(i + 1):n}^{(i)} \\ \end{pmatrix},\end{split}\]where the \(0\) and \(I\) blocks are chosen such that the matrix product makes sense. One can now read off that \(L^{(i)}\) is lower-triangular and that the first \(i - 1\) entries of \(v^{(i)}\) are zeros. By induction, we can thus conclude that \(L^{(1)} = LQ^T_{1:n, 1:n}\) is lower-triangular. Moreover, since \(Q\) also fulfills (2), we know that \(v^{(1)} = v\). It follows that
\[\begin{pmatrix} L & 0 \end{pmatrix} Q^T = \begin{pmatrix} L^- & v \end{pmatrix},\]i.e. \(L^- = L Q_{1:n, 1:n}^T\).
Note that this algorithm is a minor modification of the LINPACK 2 routine
dchdd. The exact modifications are described in 1.References