Monotone Operators¶
Learning notes for the paper Monotone operators and the proximal point algorithm , by R. Tyrrell Rockafellar. and On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators , by Jonathan. Eckstein,Dimitri P. Bertsekas.
This paper describe the relationship of several algorithms that Douglas-Rachford splitting method is a special case of the proximal point algorithm, and Alternating diections method is an application of Douglas-Rachford splitting method. As a result, the properties of proximal point algorithm could be used to ADMM, result in a generalized ADMM algorithm.
This paper was lined as follows:
- Definition of monntone operator \(T(z)\).
- Extend the basis case of variational inequalities (extend the domain to the whole space).
- Solve the problem \(0 \in T(z)\).
- Stop criteria, and convergence analysis.
1. Some Backgrounds¶
Semi-continuity
(See more from the wiki page) Semi-continuity is a property of extended real-valued functions (it is weaker than continuity) .
Definition : Suppose X a topogical space, \(x_{0}\) is a point in X. And function f : \(X \to \mathbb{R} \cup \{ -\infty , \infty \}\) is a entended real-valued function. f is called upper semi-continuous at \(x_{0}\), if for every \(y > f(x_{0})\) there exists a neighborhood U of \(x_{0}\) such that \(f(x)<y\), for all \(x\in U\).
- Metric space : \(\lim_{x\to x_{0}}\sup f(x) \le f(x_{0})\).
- Floor function is an upper semi-continuous function, ceilling function is a lower semi-continuous function.
This article proposed a method for solving a lower semi-continuous proper convex function on a Hilbert space, by iterative taking :
More precisly :
With \(c_{k} > 0\).
Inner product \(<\cdot, \cdot>\)
- Conjugate symmetric \(\bar{<x, y>} = <y,x>\)。
- Linear in its first argument: \(<ax_{1} + bx_{2}, y> = a<x_{1}, y> + b<x_{2}, y>\).
- <x,x> is positive definit . i.e. <x,x> >0 if \(x\ne 0\), <x,x>=0 if x =0.
As a result, we have:
Notations:
- Operator : T.
- Overload the notataion of T and its graph : \(T = \{ (x,y) \mid y \in T(x) \}\)
- \(dom T = \{ x\in H \mid \exists y\in H : (x,y) \in T \}\)
- Range or Image of T : \(im T = \{ y\in H \mid \exists x\in H:(x,y)\in T \}\)
- \(A+B = \{ (x, y+z) \mid (x,y)\in A, (x,z) \in B \}\)
- Identity operator I : \(\{ (x,x) \mid x in H \}\)
2. Monotone operator¶
Definition Monotone : Let H be a real Hilbert spave with inner product \(<\cdot, \cdot>\). A multifunction T : H \(\to\) H is said to be a monotone operator if:
It is said to be maximal monotone if, in addtion, the graph
is not properly contained in the graph of any other monotone operator T’: H \(\to\) H.
- An operator is (maximal) monotone if and only if its inverse is (maximal) monotone.
- The best known monotone operator is the subgradient mapping \(\partial f\) of a closed proper convex function.
- If T is a subdiffferential \(\partial f\) of a lower semi-continuous convex function f : \(H \to (-\infty , +\infty], f \ne +\infty\), then T is maximal monotone, and the relation \(0\in T(z)\) means that f(z) = min f.
Strongly Monotone with modulus \(\alpha > 0\), i.e, one have:
- Which means that \(T' = T - \alpha I\) is monotone.
Theorem 1. A monotone operator T on H is maximal if and only if im(I+T) = H.
Proof: Use Zorn’s Lemma (or, equivalently the axiom of choice): that a partially ordered set containing upper bounds for every chain (that is, every totally ordered subset) necessarily contains at least one maximal element.
Nonexpansive: An operator C on H is said to be nonexpansive if :
- Nonexpansive operators are necessarily single-valued and Lipschitz continuous.
Firmly Nonexpansive: An operator J on H is said to be firmly nonexpansive if :
Lemma 1. (Firmly Nonexpansive) :
- All firmly nonexpansive operators are nonexpansive. (observiously)
- An operator J is firmly nonexpansive if and only if 2j-I is nonexpansive. (straightforward)
- An operator is firmly nonexpansive if and only if it is of the form \((1/2)(C+I)\), where C is nonexpansive. (the inverse of the upper one)
- An operator J is firmly nonexpansive if and only if I-J is firmly nonexpansive.
Theorem 2. Let c be any positive scalar. An operator T on H is monotone if and only if its resolvent \(J_{cT} = (I+ cT)^{-1}\) is firmly nonexpansive. Furthermore, T is maximal monotone if and only if \(J_{cT}\) is firmly nonexpansive and \(dom (J_{cT}) = H\).
- The purpose here is to stress the complete symmetry that exists between (maximal) monotone operators and (full-domained) firmly nonexpansive operators over any Hilbert space.
Proof :
Clearly, T is maximal if and only if cT is maximal. So, by Theroem 1, T is maximal if and only if im(T+cI) = H. This is in turn true if and only if \((I+cT)^{-1}\) has domain H, establishing the seconf statement. \(\square\)
Corollary 2.1. An operator K is firmly nonexpansive if and only if \(K^{-1} - I\) is monotone. K is firmly nonexpansive with full domain if and only if \(K^{-1} - I\) is maximal monotone.
Corollary 2.2. For any c >0, the resolvent \(J_{cT}\) of a monotone operator T is single-valued. If T is also maximal, then \(J_{cT}\) has full domain.
Corollary 2.3. (The Representation Lemma). Let c >0 and let T be monotone on H. Then every element z of H can be written in at most one way as x+cy, where \(y\in Tx\). If T is maximal, then every z of H can be writeen in exactly one way as x + cy, where \(y\in Tx\).
Corollary 2.4. The functional taking each operator T to \((I+T)^{-1}\) is a bijection between the collection of maximal monotone operators on H and the collection of firmly nonexpansive operators on H.
Lemma 2. Given any maximal monotone operator T, real number c > 0, and \(x\in H\), we have \(0\in Tx\), if and only if \(J_{cT}(x) = x\).
Proof: By direction calculation, \(J_{cT} = \{ (x+cy,x)\mid (x,y)\in T \}\), hence , \(0\in Tx \Leftrightarrow (x,0)\in T \Leftrightarrow (x,x) \in J_{cT}\). Since \(J_{cT}\) is single-valued, the proof is complete. \(\square\)
3. Variational Inequalities¶
The variational inequalities expression is:
Where D is a nonempty closed convex subset of H, and \(T_{0} : D \to H\) is single-valued, monotone and hemicontinuous (i.e. continuous along each linear segment in H with respect to the weak topology), and \(N_{D}(z)\) is the normal cone to D at z :
We can prove that this T is maximal monotone.
The problem \(0 \in T(z)\) reduce to \(-T_{0}(z) \in N_{D}(z)\), or :
If D is a cone, this condition will be the complementary problem:
4. Lagrangian¶
Another example corresponding to saddle point optimization. Let H be the product of two Hilbert spaces, \(H = H_{1}\times H_{2}\), and let \(L: H \to [-\infty , +\infty]\) be such that L(x,y) is convex in \(x\in H_{1}\), and concave in \(y\in H_{2}\). Which is exactly the case for a normal lagrangian function for a constrained convex optimization problem, where x is the primal variable, and y is the dual variable. Solve the problem is the find the saddle point the lagrangian function.
We build another operator \(T_{L}(z)\) the be the set of all w = (v,u) such that:
Solving the problem \(0 \in T_{L}(z)\), will obtain z=(x,y) such that:
Which is exactly the solution of the saddle point of L(x,y).
5. Algorithm¶
Fact: \(\forall z \in H, \ c > 0, \exists ! \ u \in H. \ s.t \ z-u\in cT(u)\). i.e. \(z\in (I + cT)(u)\)
Proof: Suppose there exists another u’ not equal to u, which satisfies the same conditions, i.e. \(z\in (I + cT)(u')\)
Done proof
From this fact (\(z\in (I + cT)(u)\)), we have :
is a single-valued form H to H. and we can also prove that it is non-expansive.
As we have P(z) =z, if and only if \(0\in T(z)\):
Algorithm: \(z^{k+1} \approx P_{k}(z^{k}) = (I+c_{k}T)^{-1}(z^{k})\)
Case 1 : If we take T = \(\partial f\), we have:
Case 2 : For T corresponding to a convex-concave function L , it becomes :
6. Stop Criteria¶
A :
B :
7. Applications¶
- \(T = \partial f\), f is the essential objective function in the problem.
- \(T = - \partial g\), f is the concave objective function in the dual problem.
- \(T_{L}\) corresponding to the convex-concave Largrangian function.
8. Convergence¶
- See more in this
- paper On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators.
- The strong convergence is affirmative if \(T = \partial f\) with f quadratic.
- The strong convergence is assured if \(c_{k}\) is bounded away from zero and T is strongly monotone. In which case \(P_{k}' = (I + c_{k}'T')^{-1}\) is nonexpansive for any \(c_{k} >0\) (left to prove).