\documentclass{slides}
\pagestyle{plain}
\newcommand{\F}{\cal F}
\newcommand{\p}{{\cal P}}
\newcommand{\Kprod}{\otimes}
\newcommand{\Diag}{\rm Diag\,}
\newcommand{\Sn}{{\cal S}^n}
\newcommand{\tran}{^T}
\newcommand{\diag}{\rm diag\,}
\newcommand{\tr}{\rm trace\,}
\newcommand{\trace}{\rm trace\,}
\newcommand{\beq}{\begin{equation}}
\def\QED{~\rule[-1pt] {8pt}{8pt}\par\medskip ~~}
\begin{document}
%=======================
%=======================
\begin{slide}{}
\begin{center}
{\bf
Semidefinite Relaxations for Hard Combinatorial Problems
}
\end{center}


~~\\
~~\\
~~\\
~~\\
~~\\
~~\\
Henry Wolkowicz \\
University of Waterloo\\

~\\
~\\

\end{slide}
\begin{slide}{}
\begin{center}
OUTLINE
\end{center}

Introduction to SDP (duality, algorithms)

Max-Cut Problem instance - via Lagrangian duality then G-W approach
and also Nesterov (Ye?) results; PRW equivalences; valid
inequalities; algorithms large scale results

Other instances e.g. QAP, based on QQP model and Lagr relax recipe
(paradigm), algorithm based on dual problem.


\end{slide}
\begin{slide}{}
\begin{center}
INTRODUCTION
\end{center}
Semidefinite programming (denoted SDP) is an 
extension of linear programming (LP),
e.g. an (linear)  SDP is:
\[ 
(P)\quad  \begin{array}{rc}
   \mu^*:=  \min        &  C \bullet X  \\
     \mbox{s.t.} & {\cal A}X = b \\
                 & X \succeq 0,
    \end{array}
\]
$C,X \in \Sn$, 
space of symmetric $n \times n$ matrices;

inner product $C \bullet X = \tr CX;$

$A\succeq B$ if $A-B \succeq 0$ positive semidefinite
\[ 
{\cal A} :\Sn \rightarrow \Re^m
\]
is a linear operator, i.e.  $({\cal A}X)_i =\tr (A_iX),$
for given $A_i \in \Sn,~i=1, \ldots m.$


\end{slide}
\begin{slide}{}

SDP is a special case of the cone programming problem
\[ 
 \begin{array}{cc}
     \min        &  f(x)  \\
     \mbox{s.t.} & g(x) \succeq_K 0,
    \end{array}
\]
$f,g$ are appropriate functions
$K$ is a convex cone 
$g(x) \succeq_K 0$ is cone partial order, $g(x) \in K.$ 

very general mathematical program 

e.g.  standard equality and
inequality constraints when
 $K= \Re^n_+ \otimes \Sn_+ \otimes \{0\}.$


\end{slide}
\begin{slide}{}


\begin{center}
{GEOMETRY} 
\end{center}
Much of the elegant geometry of polyhedral sets
developed for LP can be extended to SDP.
(e.g.  Bohnenblust 1948, Barkey and Carlson 1975 and more recently 
Lewis and also Pataki)

similarities to LP:\\
{\em self-polar}
\[
\p = \p^+:= \left\{Y: X \cdot Y \geq 0, \forall X \in  \p \right\};
\]
{\em homogeneous}\\
for any $X,Y \in \mbox{int}(\p)$,
there exists an invertible linear operator $\cal A$ that leaves $\p$
invariant and ${\cal A}(X)=Y.$


\end{slide}
\begin{slide}{}

{\em faces}
\[ {\cal F} = \left\{Y: {\cal N}(Y) \supset {\cal N}(X) 
                \right\} \quad X \in \mbox{ relint } {\cal F}
\]

faces are {\em exposed}\\
${\cal F} =  \p \cap \phi^{\perp}$, where $\phi \in
 \p \cap {\cal F}^{\perp}$ {\em conjugate face},
(Here $\cdot{\perp}$ denote orthogonal complement.)

But\\
$\cal F$, $\p + {\cal F}^{\perp}$ is always closed

 $\p + \mbox{span}({\cal F})$ is never closed.


\end{slide}
\begin{slide}{}


\begin{center}
{Duality Theory and Optimality Conditions} 
\end{center}
Extenstions from LP to SDP, e.g.
Bellman and Fan 1963, but
strong duality theorems require
a Slater-type constraint qualification (strict feasibility). 

Modified optimality conditions without CQ???


\end{slide}
\begin{slide}{}

weak duality (using hidden constraints)
\begin{eqnarray*}
 \mu^* &=& \min\limits_{X\succeq 0} \max_y C \cdot X + y^T(b-{\cal A}X)\\
       &\leq& \max_y \min\limits_{X\succeq 0}  y^Tb +\left(C -
                                  {\cal A}^*y\right) \cdot X\\
       &=& \nu^*,
\end{eqnarray*}
${\cal A}^*$ adjoint operator of ${\cal A}$\\
dual program
\[ 
(D)\qquad \begin{array}{rc}
   \nu^*:=  \max     
     \mbox{s.t.} & {\cal A}^*y \preceq  C \\
    \end{array}
\]


\end{slide}
\begin{slide}{}


If a strictly feasible solution exists for
the primal (dual, respectively) then there is a zero duality gap,
$\mu^*=\nu^*$, and the dual (primal, repectively) is attained.

If the Slater constraint qualification fails, then one can work with
the minimal cones and
obtain characterizations of optimality and strong duality theorems, see
\cite{bw2}.  
(For (PSDP) this is the minimal face of $\p$ containing the feasible set.
For (DSDP) this is the minimal face of $\p$ which contains $C-{\cal
A}^*{\cal F},$ where $cal F$ denotes the feasible set.)
One can also obtain a strong dual 
program whose size is polynomial in the data of the original problem
\cite{Ram:95}. The relation between these duals is given in
\cite{RaTuWo:95}. 

The duality theory gives rise to the
characterization of optimality
\begin{equation}
\label{eq:optcond}
\begin{array}{cccc}
           {\cal A}^{*}y +Z - C  &=&  0 & \mbox{dual feasibility} \\
          b - {\cal A}(X) &=&  0 & \mbox{primal feasibility} \\
          ZX &=&  0 & \mbox{complementary slackness} \\
       Z,X \succeq 0.
\end{array}
\end{equation}
We call $X,(y,Z)$ a primal-dual optimal pair. $Z$ is the (dual) slack
variable.

First and second order optimality
conditions for general cone programming are given in e.g. Shapiro
\cite{Shap:94b,Shapiro:99}.

{Degeneracy}
Nondegeneracy and strict complementarity do not directly
follow through from LP to SDP.

$x$ is nondegenerate implies dual optimal face is a singleton (dual
solution is unique)
refs Shapiro ...

A theorem of Goldman and Tucker shows that
every (finite optimal valued) LP has an optimal primal-dual pair that
satisfies strict complementarity. 
For SDP this translates to $Z+X \succ 0.$
However, this result fail to hold for SDP.
Furthermore, the formulation of nondegeneracy conditions
requires study of the geometry of the semidefinite cone $\p$.  
The boundary of $\p$, like that of $\Re^n_+$, is composed
of smooth pieces, but the pieces are nonlinear.  This topic is studied 
in \cite{AlHaOv:95}; generic results for nondegeneracy are given in
\cite{PatTun:97}.


{Complexity}
SDP are convex programs and fall into the class of problems that can be
approximately solved in polynomial time.
The complexity is based on self-concordant barriers for the cone $\p.$
The smallest (best) barrier parameter is given in \cite{GulerTuncel:98}.

The complexity of determining the (exact) feasibility of a system of linear
matrix inequalities is discussed in \cite{PorkKhach:95,Ram:93}.


\end{slide}
\begin{slide}{}


\begin{center}
{ALGORITHMS}
\end{center}
No successful simplex type methods are available for SDP.


{Interior-Point Algorithms}
The SDP research is highlighted by the elegant primal-dual
interior-point algorithms. 
As mentioned in the preface, the
p-d i-p approach has
revolutionized the way we think about optimization problems. 
Interior-point methods provide polynomial time algorithms for linear
programming. In addition, the p-d i-p
algorithms have proven to be enormously successful in solving 
large scale LP models that arise from real life problems.
And, we can safely say that it is because of this change that
we can now regularly solve LPs with millions of variables,
e.g. \cite{int:Lustig:94}.
 
The work in \cite{int:Nesterov5} shows
that $\log \det (X)$ is a self-concordant barrier function
for SDP. Thus SDP can be solved in polynomial time using a sequence of
barrier subproblems.
Alizadeh \cite{Al:94} presented a transparent approach
to extend potential function methods from LP to SDP.
Simultaneously, strong
numerical results for SDP and the special case of the
min-max eigenvalue problem appeared in e.g. \cite{HeReVaWo:93}
and  \cite{Jarr:93}.
This started a flood of results in SDP from
researchers in LP.

The most popular methods are the p-d i-p methods. These are based
on applying Newton's method to solving the optimality conditions from
the barrier problems, i.e. the perturbed optimality conditions
\begin{equation}
\label{eq:optcondpert}
\begin{array}{ccc}
          Z + C - {\cal A}^{*}y  &=&  0, \label{dfeas} \\
          b - {\cal A}(X) &=&  0, \label{pfeas} \\
          X - \mu Z^{-1} &=&  0. \label{compslack}
\end{array}
\end{equation}
As in LP, the last equation can be written in many ways, e.g.
\begin{equation} \label{eq:zx}
ZX=0.
\end{equation}
Different choices give rise to different search directions.
However, unlike LP,
choices like  (\ref{eq:zx}) result in 
(\ref{eq:optcondpert}) being an overdetermined system
of nonlinear equations. There are many ways to overcome this problem,
e.g. solve this using Gauss-Newton approach not yet efficient or use
various symmetrization techniques (\cite{Monteir???} e.g. AHO
direction.

As in LP, the methods can be classified as potential reduction, p-d
i-p, predictor-corrector etc...
path following.

This is still an active area of research in particular for the
large sparse problems.

{Bundle Trust Algorithms}
However, there are bundle trust methods ...????
{Large Sparse Case}


\end{slide}
\begin{slide}{}

{SDP Relaxation for Max-Cut Problem}
The success and popularity of SDP is exemplified by the results of the
relaxation for the max-cut problem.
Let $G=(V,E)$ be an undirected graph with
edge set $V = \{v_i\}_{i=1}^n$ and weights $w_{ij}$
on the edges $(v_i,v_j) \in E$.
We want to find the index set ${\cal I} \subset \{1,2, \ldots n\},$
to maximize the weight of the edges
with one end point with index in ${\cal I}$ and the other in the
complement.
This is equivalent to
\[
(MC)~~~  \begin{array}{c}
    mc^*:=\max ~ \frac 12 \sum_{i<j} w_{ij}(1-x_ix_j),~~~x \in \F,
\end{array}
\]
where $\F:= \{\pm 1\}^n,$ and
$x_i=1$ if $i \in \cal I$ and -1 otherwise. The objective
function is a (homogeneous) quadratic form, $x^TQx.$ Therefore, we get
the equivalent quadratic model
\[
(QMC)~~~  \begin{array}{c}
    \max ~ x^TQx,~~~x_i^2 = 1, \quad \forall i.
\end{array}
\]

The LP relaxation is 
\[
(LPMC)~~~  \begin{array}{c}
    \max ~ x^TQx, \quad -1 \leq x_i \leq 1, \quad \forall i,
\end{array}
\]
which yields very little useful information, e.g. the maximum is 
$x=0$ when $Q \preceq 0$ and is intractable otherwise.

The SDP relaxation can be obtained easily from (QMC) by noting
$x^TQx  = \trace x^TQx = \trace Qxx^T.$ We then substitute $X=xx^T$ and
replace/relax the rank one condition on $X$ to $X \succeq 0.$ This yields 
\[
(SDPMC)~~~  \begin{array}{c}
  pmc^*:=  \max ~ \trace QX, \quad \diag(X) = e, \quad X\succeq 0,
\end{array}
\]
where $\diag (X)$ denotes the vector formed from the diagonal of $X$
and $e$ is the vector of ones.

There are several other equivalent forms. Another approach is to use
the Lagrangian relaxation of the quadratic model (QMC). 
\[
 \begin{array}{rcl}
    mc^*\leq dmc^*:& =& \min\limits_{\lambda} \max\limits_x x^TQx -  
             \sum_{i=1}^n \lambda_i (x^2_i-1) \\
           & =&  \min\limits_{Q-\Diag(\lambda) \succeq 0}  e^T \lambda,
\end{array}
\]
where the semidefinite constraint arises from 
assuming boundedness of the inner unconstrained
quadratic maximization problem,
which then has a maximum attained at $x=0.$
Since the Slater constraint qualification holds, $Q-\Diag(\lambda)
\succ 0,$ we conclude that strong duality holds with (SDPMC), i.e.
$mc^*\leq dmc^* = pmc^*.$

An alternative approach to this relaxation uses the fact that, for any
fixed normalized vector $u \in \Re^n,$ we can reformulate (MC) as
\begin{equation} \label{eq:normmc}
(MC)~~~  \begin{array}{c}
    mc^*=\max ~ \frac 12 \sum_{i<j} w_{ij}(1-v_i^Tv_j), \quad v_i \in
\{ -u,u\}, \quad \forall i.
\end{array}
\end{equation}
This leads to the relaxation used in \cite{GoeWil95} which is again
equivalent to the SDP relaxation. (Surprisingly, see \cite{PoWo:93,PoReWo:94},
this relaxation is equivalent to several others such as the LP
relaxation (box) with perturbations on the objective function.)
Goemans and Williams in \cite{GoeWil95} use 
(\ref{eq:normmc}) and a randomized
algorithm which provides an approximate solution to (MC) whose
objective value is within 1.14 of the optimal value. Previous results
were within 1.5 of the optimal value.
In practice, much better results (e.g. 1.06 on average in \cite{HeReVaWo:93})
are obtained empirically.


In summary, the SDP relaxation provides an efficient way of solving the
Lagrangian relaxation which has been shown to be very strong both
theoretically \cite{GoeWil95} and in practice
\cite{HRVW:96,HelmbergRendl:97,BensYeZhang:97,MontBurer:99}.

\end{slide}
\begin{slide}{}

Many SDP problems that arise are very large. Sparsity must be exploited
in order to solve these. Some headway in exploiting structure has been
made recently - Kojima et al, Ye...benson


Thus SDP includes LP as a special case;
and so it has all the elegant duality and geometric
results that LP does. In addition, it is a nonlinear problem and so has
extra subtleties that create many interesting difficulties; 
the geometry of the cone $\p$ is particular elegant and well studied.
Linear SDPs are convex programs that can be solved very efficiently
using interior-point methods.
And, SDP also has many very important applications. It arises as the
dual of Lagrangian relaxations for quadratic models of numerically hard
problems. Thus, we have an efficient tool for finding bounds and
estimates for many hard problems in a variety of different areas, e.g.
in Engineering, Combinatorics, Statistics.


{Historical notes}
It is only recently that SDP has become a
mathematical discipline in its own right.
This is due to the many important applications
that have arisen, as well as to the development of
efficient primal-dual interior-point (denoted p-d i-p) 
algorithms that can solve SDPs.

Up until this time, LP models were, and
still are, the principal tool used in approximations
of numerically hard problems.
In \cite{Dant:90}, Dantzig discusses his well known tale
about his confrontation with Hotelling at his first presentation of
the simplex method for LP, i.e. after his presentation
Hotelling stated: ``But we all know
that the world is nonlinear'' and that Von Neumann replied for him:
``The speaker titled his talk `linear programming' and carefully
stated his axioms. If you have an application that satisfies the
axioms, well use it. If it does not, then don't.'' And, this
provides a very good reason for the success of linear programming,
i.e. there are many applications and the software that has been
developed over the years provides us with very efficient solution
techniques, e.g. \cite{NemKanTodd:89}.  
The main algorithms were originally based on Dantzig's simplex method.
Then, begun by the work of
Karmarkar 1985 \cite{kar:84}, a revolution started 
in the way we solve LPs and led to the p-d i-p methods which has
resulted in a much greater increase in
size and speed, e.g. \cite{int:Lustig:94}.

However, in the article  \cite{Dant:90}, Dantzig himself states: 
``In the final analysis, of course,
Hotelling was right. The world is highly nonlinear. Fortunately
systems of linear inequalities (as opposed to equalities) permit us
to approximate most of the kinds of nonlinear relations encountered
in practical planning.'' Thus we can see the reason for the great
success of LP: good approximations of many problems
and efficient algorithms. However, what if we could efficiently
solve quadratic approximations? That is where SDP has stepped in,
i.e. it can efficiently solve approximations of quadratic
models of many applied problems and, thus, obtain improved
results compared to LP models.

As mentioned above, SDP is a special case of cone programming 
which is not a new area; nor is SDP itself.
Much of the theory of duality extends directly 
from LP to SDP.  Indeed, this was recognized for cone
programs as early as 1963 by 
Bellman and Fan \cite{FanBell:63}. Since that time many researchers in
diverse areas from engineering, matrix theory, statistics, optimization
have contributed to results on SDP.
More historical notes are included below.


{APPLICATIONS}
What has made this area so {\em alive} is that the theory and algorithms
are not derived in a vacuum, i.e. there are so many applications that it
has been impossible to include all of them here. Many applications are
old e.g. in statistics \cite{Flet:81c,Flet:85,Puk:93};
new ones occur in combinatorial problems, e.g. \cite{Rendl:97,Goem:97}
engineering (mentioned above) and matrix completion problems
\cite{infieldsLaur:97}.

Many applications of SDP arose in the literature
in surprisingly diverse areas. Many of these applications are direct
applications of SDP models while others use SDP to get approximate
solutions. 
We now list a few of the application areas.

{Combinatorial Optimization}
%%%% combinatorial optimization
Semidefinite programming plays an important role in combinatorial 
optimization where it is used to solve nonlinear convex relaxations 
of NP-hard problems.  The interest in SDP techniques for combinatorial 
optimization was motivated by remarkable theoretical results that 
characterize the quality of the bounds obtained via semidefinite 
programming.

A result that created great excitement in this area was the complexity
result of Goemans-Williamson on the complexity and strength of 
the SDP relaxations \cite{GoeWil95,GoeWil94}.
This showed that the SDP relaxation leads to
an approximate solution of the max-cut
problem whose objective value is within 1.14 of the optimal value.
Previous results were within 1.5 of the optimal value.
In practice, the empirical evidence is much better and on average the
one gets within 1.06 of the optimal value, see e.g. \cite{HeReVaWo:93}.
As seen in our 


SDP Rendl paper -   survey article on PC Pardalos et al ...


they often lead to subproblems that are too hard to solve. 
SDP provides a means of efficiently
solving relaxations of quadratic approximations.
This idea has been successfully applied to
the stable set problem and the max-cut problem, 
see e.g.\  \cite{GLS-88,LoSc:91}.
Indeed, this volume is dedicated to Svata Poljak, one of the important
early contributors to this subject. 
Our fifth paper, by Laurent, Poljak and Rendl, which was
completed shortly before Poljak's untimely death, is 
``Connections between semidefinite relaxations of the max-cut and stable
set problems.'' Besides exploring the links between these relaxations,
it obtains tighter relaxations than previously available
by introducing additional polyhedral constraints.


{Engineering} 
e.g. Engineering  \cite{BoGhFeBa:93,MR34:8830,MR32:5405}.


%%%%% control & engineering
We also find many applications of semidefinite programming in 
engineering, in particular in control theory, where it has emerged 
as an important practical tool for the analysis of nonlinear and 
time-varying systems, and for controller synthesis.  
Efficient algorithms for semidefinite programming make it
possible to numerically solve a wide variety of control problems
for which no analytical solution is known.  Semidefinite programming
is therefore of great interest to computer-aided control system
design.
SDPs also arise in other fields of engineering and mathematics,
such as structural optimization, circuit design, signal processing, 
and statistics, see e.g. \cite{BoGhFeBa:93,VaB1:97,VanBoy:94b}.

e.g. engineering Ricatti
robust control.  For many such applications, see \cite{BoGhFeBa:94,VanBoy:94}.
As noted above, in this arena SDP is usually known as LMI: 

Systems and Control - numerical search for Lyapunov functions for
stability; performance of control system models, e..g

  LMI for control is 
   \[ P \succ 0, A^TP +PA \prec 0  \]

is feasible iff every solution of
.... 
see Bala... in handbook or BGFB book.


{Matrix Completion Problems} 
The

{Nonlinear Programming} 
nonlinear programming  TRS and SQQP
The trust region subproblem is a basic tool in unconstrained nonlinear
programming, providing a local quadratic model of the objective function
which is to be ``trusted" only in a certain region. In this case,
the SDP relaxation of the quadratic problem provides an exact
applications to large scale minimization'', by Rendl and Wolkowicz,
introduces an SDP framework that allows one to efficiently exploit 
sparsity when solving unconstrained nonlinear programming problems.

{Statistics and Probability} 

equations, ..., statistics (Fletcher, ...), QAP ....


--------------------

In parallel with the applications to combinatorial optimization, many
applications of SDP arose in engineering, especially in the field of
linear matrix inequality.
An important but much harder problem in this field is the nonconvex
bilinear matrix inequality.  This is studied in our sixth paper,
``A cone programming approach to the bilinear matrix inequality problem
and its geometry,'' by Mesbahi and Papavassilopoulos.


So far, we have restricted the discussion to SDP with linear equality
constraints, i.e. the only nonlinearity in the problem is the boundary
of the cone $\p.$   Equivalently, an affine matrix-valued
function appears in the dual program, again with nonlinearity present only in
the boundary of $\p$.  However, one can consider SDP with nonlinear 
matrix-valued functions.  First and second order local optimality conditions 
and sensitivity analysis for such nonlinear
SDP's are studied in our eighth and last paper,
``First and second order analysis of nonlinear semidefinite programs,''
by Shapiro.  


A historical note: an influential volume in
the early history of {\em Math Programming Study}, the predecessor
of {\em Math Programming} Series B, was that on nondifferentiable 
optimization (Volume 3, 1975, M.L.\ Balinski and P.\ Wolfe, eds).  
That volume included one of the early papers on matrix eigenvalue optimization 
\cite{CuDoWo:75}.  SDP is an eigenvalue optimization problem in the
sense that the semidefinite constraint is equivalent to nonnegative
bounds on eigenvalues.  Solutions to SDP generally have
multiple zero eigenvalues (just as solutions to LP have many nonbasic, i.e.\
zero, variables).  However, it is not possible to formulate separate 
differentiable bounds on individual eigenvalues.
For more on eigenvalue optimization, see \cite{LewOve96}.

Although the topic is not discussed in this volume, we mention that
it has become appreciated recently that one reason for the special nature
of SDP is that the cone $\p$ is both homogeneous and self-dual.
Such cones have been completely classified and include only
the Lorentz cone (used for convex quadratic programming with convex quadratic
constraints), the semidefinite cone $\p$ and its 
extensions (of which the most important is the cone of positive 
semidefinite complex Hermitian matrices), and products of these 
(including $\Re^n_+$, a product of $n$ trivial semidefinite cones).  
For more details, see \cite{GulerTuncel:98}.

The World Wide Web is a tremendous resource for information on SDP. 
This includes research papers, public domain software, and bibliography 
files.  Several relevant web sites are listed below.  These contain further
pointers to other web sites with more information.
\begin{enumerate}
\item
http://www.zib.de/helmberg/semidef.html\\
contains a special home page for semidefinite programming
organized by C.\ Helmberg,
at Berlin Center for Scientific Computing,
Konrad Zuse Zentrum fur Informationstechnik, Berlin.
\item
ftp://orion.uwaterloo.ca/pub/henry/reports/psd.bib.gz\\
contains a bib file with papers related to SDP.
\item 
http://www.mcs.anl.gov:80/home/otc/InteriorPoint\\
is an archive of technical reports and papers on interior-point methods,
maintained by S.J. Wright at Argonne National Laboratory.  Many of the
recent reports discuss primal-dual interior-point methods for SDP,
a subject not discussed in this volume.
\end{enumerate}

SDP is an exciting subject that remains very active, and will no doubt
be so for some time to come.  Here is an area where nonlinear 
convex programming has shown its strength, beauty, and versatility.  
One finds elegant theoretical results, 
provably efficient algorithms and a wide variety of applications.

specific examples: TRS? QAP? SQQP?

{CONCLUSION}
Currently very hot area.

Future - improved estimates - close gap for NP-P polytime.


\end{slide}
\begin{slide}{}
{Problem Definition}
\label{sect:probdef}
Suppose that $G=(V,E)$ is an undirected graph with
edge set $V = \{v_i\}_{i=1}^n$ and weights $w_{ij}$
on the edges $(v_i,v_j) \in E$.
The {\em max-cut problem} consists in finding
the index set ${\cal I} \subset \{1,2, \ldots n\},$
to maximize the weight of the edges
with one end point with index in ${\cal I}$ and the other in the
complement.
This is equivalent to the following discrete optimization problem with a
quadratic objective.
\[
(MC)~~~  \begin{array}{c}
    \max ~ \frac 12 \sum_{i<j} w_{ij}(1-x_ix_j),~~~x \in  \{ \pm 1 \}^n.
\end{array}
\]
We equate $x_i=1$ with $i \in \cal I$ and -1 otherwise.  Define the
quadratic objective
\[ q(x) := x^tQx-2c^tx, \]
where $Q$ is an $n \times n$ symmetric matrix.
Then, the boolean quadratic programming problem is
 equivalent to the
{\em $(\pm 1)$-quadratic programming problem}
\beq
\label{eq:homogquad}
\mu^*:=\max ~ q(x),~~~x \in \{ \pm 1 \}^n.
\end{equation}
The max-cut problem is obtained by setting $c=0$ in the above objective
function.


{Polyhedral Approach}
\label{sect:polappr}
A linear relaxation is obtained using ...???
give details of linear relaxation here ?????

see e.g. the vast
literature on roof duality for boolean quadratic programming
\cite{HaHaSi:84,HaRu:70,AdDe:94} and other linear based approximations,
e.g. \cite{AdJo:94}.

For 0-1 quadratic programming there have been many different linear
approximations derived and tested. In \cite{BoCrHa:90}, the authors
show that three seemingly different linear relaxations are all equal.
These linear relaxations are referred to as roof duality. Then in
\cite{AdDe:94} it is shown that roof duality is in fact equal to
Lagrangian duality, i.e. the roof relaxation is equivalent to the
Lagrangian relaxation. Thus, many attempts in improving linear
relaxations have all led to the Lagrangian relaxation for a linear
version of the problem.

Another approach is to start by relaxing the cut polytope, the convex
hull of all cut characteristic vectors, into the unit cube. Valid
inequalities, preferably facet-defining inequalities, for the cut polytope 
are added successively as needed to tighten the relaxed set. Various classes of
facet-defining inequalities are known for the cut polytope \cite{PoTu:95}.
One such class of facet-defining inequalities is the class of triangle 
inequalities
  \[ x_{ij} +x_{ik} +x_{jk} \leq 2.\]
For a computational study of the max cut problem using such cutting plane
algorithm see \cite{DeRi:92}. 


{SDP Relaxation}
\label{sect:sdprelax}
Can we find a better approximation?
We can start by replacing the $(\pm 1)$ 
constraints with the quadratic constraints
\[  x_i^2=1,~i=1,\ldots,n. \]
This of course is an equivalent representation of the $\pm 1$
constraints. Thus this equivalent problem is just as hard to solve.
However, we can use Lagrangian relaxation to get an upper bound.
We delete the quadratic constraints and add them to the objective
function with Lagrange multipliers $\lambda_i.$
We then solve an unconstrained problem to get an upper bound
for the original constrained problem.
\begin{eqnarray*}
\mu^* \leq 
 \max_{x} x^tQx + \sum_i \lambda_i (1-x^2_i).
\end{eqnarray*}
Since the above is an upper bound for any choice of $\lambda_i$, we get
the following Lagrangian relaxation bound.
\begin{eqnarray*}
\mu^* \leq 
B_{mc} &:=& \min_{\lambda} \max_{x} x^tQx + \sum_i \lambda_i (1-x^2_i) \\
       & =& \min_{\lambda} \max_{x} x^t\left(Q-\Diag (\lambda) \right)x 
           + \lambda^te.
\end{eqnarray*}
First, we remark that the above min-max problem reduces to a tractable
convex minimization problem, since the {\em dual functional}
\[ 
  h(\lambda) := \max_{x} x^tQx + \sum_i \lambda_i (1-x^2_i) 
\]
is the maximum of linear functions and so is convex. In fact, the
Lagrangian dual is always a nice tractable problem.
We now note that the inner maximization problem is infinite valued
unless the Hessian of the Lagrangian is negative semidefinite, i.e. we
have the {\em hidden semidefinite constraint}
\[ Q - \Diag (\lambda) \preceq 0.  \]
Moreover, once we add this semidefinite constraint to the outer
minimization problem, the inner maximization is attained at $x=0.$ We
have eliminated the $x$ variable and the maximization part of the
problem.

Therefore the upper bound for MC can be obtained by solving the following SDP.
\beq
\label{eq:mcD}
\begin{array}{cc}
\min & \left<\lambda ,e\right> \\
\mbox{subject to} & Q-\Diag (\lambda) \preceq 0,
\end{array}
\end{equation}
where $\Diag ( \lambda)$ is the diagonal matrix formed using the vector
$\lambda.$
We can then treat this program as if it were an LP and note that the
adjoint operator $\Diag^*(Y)=\diag(Y),$ the vector of the matrix $Y.$
Moreover, the semidefinite cone is
self polar. We then get the following dual SDP to the above Lagrangian
dual.
\beq
\label{eq:mcP}
\begin{array}{cc}
\max & \tr QX \\
\mbox{subject to} & \diag(X) = e\\
      & X \succeq 0.
\end{array}
\end{equation}
The above is the SDP relaxation of MC. This can be obtained directly by
noting that $x_i^2=1$ is equivalent to
\[  \diag(xx^t) = e  \]
and $x^tQx= \tr Qxx^t.$ Therefore the relaxation is obtained by
replacing $xx^t$ by the symmetric positive semidefinite matrix $X$, i.e.
we relax the rank-1 requirement when $X=xx^t.$


{SDP Algorithm}
We can derive very efficient primal-dual
interior-point (p-d i-p) algorithms. For example, we start
with the dual problem and formulate the log-barrier problem with fixed
parameter $\mu >0$
\beq
\label{eq:duallogbar}
\begin{array}{cc}
\min & \left<\lambda ,e\right> + \tr Y\left(Q-\Diag (\lambda) + Z \right) 
             - \mu \log \det Z\\
\mbox{subject to} & Z \succ 0.
\end{array}
\end{equation}
We now solve this unconstrained problem by differentiation.

We get the dual and primal feasibility and complementary slackness.

change to get more linearity - apply  Newton step. Different
directions???

{Theoretical Error Bounds}
\label{sect:approxsol}
(Franz working on this)
We now discuss Goemans-Williamson and the approximate solution - start
with Poljak conjecture.
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}

Quadratically constrained quadratic programs, QQP.

Semidefinite Programming, SDP, yields bounds.

SDP bound is equivalent to the Lagrangian relaxation.

Theme (or conjecture):\\
{\em Lagrangian duality is best}\\
i.e.  tractable bounds are equivalent to
Lagrangian relaxation bound of an appropriate problem.

In some surprising (nonconvex cases) the Lagrangian Relaxation is exact
(zero duality gap).

\end{slide}
\begin{slide}{}


\begin{center}
{\bf QQP}
\end{center}

\[
         q_i(x) : =  x^TQ_ix+2g_i^Tx +\alpha_i~~Q_i=Q_i^T
\]
\[
(\mbox{QQP}_x)
   \begin{array}{rcl}
        q^*:= &  \min  & q_0(x)\\
&{\rm subject~to} & q_k(x) \leq 0\\
  && k \in {\cal I} :=\{1, \ldots , m\}\\
&& x \in \Re^n
            \end{array}
\]
Lagrangian of QQP$_x$
\[
          L(x,\lambda):= q_0(x)
 + \sum_{k \in {\cal I} } \lambda_k q_k(x),
\]
$\lambda=(\lambda_k) \geq 0$ are nonnegative Lagrange
multipliers.

\end{slide}
\begin{slide}{}

{\bf Lagrange multipliers can be used in two ways.}

1. Classical: if a CQ holds at optimum
$\bar{x}$, then the KKT necessary conditions for optimality
hold
\begin{eqnarray*}
    \bar{\lambda} \geq 0, \nabla L(\bar{x},\bar{\lambda})=0\\
     \bar{x} \mbox{ feasible }\\
     \bar{\lambda}_k q_k(\bar{x})=0, \forall k \in {\cal I}
\end{eqnarray*}
If the Lagrangian at $\bar{\lambda}$ is also convex
in $x$, then KKT is sufficient.

2. Lagrangian Relaxation (duality):
\[
       q^* \geq d^*:=  \max\limits_{\lambda \geq 0}
          \min_x  q_0(x)  
 + \sum_{k \in {\cal I} } \lambda_k q_k(x).
\]


\end{slide}
\begin{slide}{}

Zero duality gap holds if $q^*=d^*.$ 

Strong duality holds if $q^*=d^*$ and also $d^*$ is attained. 

Moreover, $d^*$ can be efficiently evaluated using SDP.

Questions:\\
$\bullet$Can one do better than the Lagrangian relaxation?\\
$\bullet$When does a zero duality gap hold?\\
$\bullet$What redundant constraints can
be added to close the duality gap?


\end{slide}
\begin{slide}{}

\begin{center}
{\bf Several Different Relaxations of Max-Cut}
\end{center}
\[
(MCQ)~~~  \begin{array}{c}
   \mu^*:= \max\limits_{x \in \F} q_0(x) \quad ( : = x^TQx -2 c^Tx).
\end{array}
\]
where $\F=\{\pm 1\}^n$\\
--------------------------------

perturbing diagonal of $Q$ on $\F:$
\[
\begin{array}{rcl}
q_u(x)& : = &x^T(Q+\Diag(u))x -2 c^Tx - u^Te\\
      &~=&q_0(x),~~\forall x \in \F.
\end{array}
\]

\end{slide}
\begin{slide}{}
{\bf Bound 0:}
the trivial bound from allowing the diagonal perturbations:
\[
  \mu^* \leq f_0 (u) := \max_x q_u(x).
\]
The function $f_0$ can take on the value $+\infty.$ 
Let 
\[S:=\left\{u:u^Te=0, Q+\Diag(u)\preceq 0\right\}.
\]
Then:
\[
\begin{array}{|c|}
\hline\\
  \mu^* \leq B_0 := \min_u  f_0(u) \\
    \left(= \min\limits_{u^Te=0}  f_0(u),
       \mbox{ if } S \neq \emptyset   \right).\\
\hline
\end{array}
\]
Using the hidden semidefinite constraint:
\[
  \mu^* \leq B_0 = \min_{Q+\Diag(u) \preceq 0}  f_0(u).
\]

\end{slide}
\begin{slide}{}
{\bf Bound 1:}
relax the feasible set to
the sphere of radius $\sqrt{n}$ (tractable trust region subproblem, TRS):
\[
\mu^* \leq f_1(u):= \max_{|| x ||^2 =n} ~ q_u(x)
\]
and
\[
\begin{array}{|c|}
\hline\\
  \mu^* \leq B_1 :=\min_u f_1(u).\\
\hline
\end{array}
\]


The inner maximization problem is called a
trust region subproblem and is tractable.
This bound provides the central tool in the proofs of equivalence.

\end{slide}
\begin{slide}{}
{\bf Bound 2:} box constraint:
\[
\mu^* \leq f_2(u):= \max_{| x_i | \leq 1} ~ q_u(x).
\]
add the semidefinite constraint to make bound
tractable.
\[
  \mu^* \leq \min_{u} f_2(u)
\]
and
\[
\begin{array}{|c|}
\hline\\
  \mu^* \leq B_2 :=\min_{Q+\Diag(u) \preceq 0} f_2(u).\\
\hline
\end{array}
\]


\end{slide}
\begin{slide}{}
{\bf Bound $B_1^c$:} lift to eigenvalue bound:

\[
 Q^c := \left[ \begin{array}{cc}
   0 &  - c^T \\
  - c &Q    \end{array}  \right]
\]
\[
 q^c_u(y) :=   y^T( Q^c+\diag(u))y-u^Te
\]
\[
  \mu^* \leq f_1^c(u) := \max_{||y||^2=n+1} q^c_u(y)
\]
where
\[
   \max_{||y||^2=n+1} q^c_u(y)
   = (n+1) \lambda_{\max} (Q^c + \diag (u) ) - u^Te
\]
\[
\begin{array}{|c|}
\hline\\
  \mu^* \leq B_1^c :=  \min_{u} f_1^c(u).\\
\hline
\end{array}
\]

Similarly, we get equivalent bounds $B_0^c$
and homogenized bounds for the other models.


\end{slide}
\begin{slide}{}
{\bf Bound $B_3$:} SDP bound:

After homogenization ( $c=0$), use
\[ x^TQx = \tr x^TQx = \tr Qxx^T\]
and, for $x \in \F,$ $y_{ij} = x_ix_j$ defines a
symmetric, rank one, positive semidefinite matrix $Y$ with diagonal elements 1.
\[
\begin{array}{|ccc|}
\hline\\
       B_3 :=&\max &\tr QY \\
 &  \mbox{~subject to~} &\diag(Y) = e \\
        && Y \succeq 0.\\
\hline
    \end{array}
\]


\end{slide}
\begin{slide}{}
Summary: ref. PW:
(without restrictions e.g. $u^Te=0$)


\[
\begin{array}{|rcl|}
\hline
 B_0 &=&\min_u \max_x q_u(x)\\
 B_1 &=&\min_u \max_{x^Tx=n} q_u(x)\\
 B_2 &=&\min_u \max_{-1 \leq x_i \leq 1} q_u(x)\\
 B_3 &=&\max \{\tr Q^cY : \diag(Y) = e,~ Y \succeq 0. \}\\
 B_1^c &=&\min_u \max_{y^Ty=n+1} q^c_u(y)\\
\hline
\end{array}
\]


\end{slide}
\begin{slide}{}

Now replace $\pm 1$ constraints with $x_i^2=1, \forall i.$
\[
\begin{array}{ccc}
       (P_E)&\max & q_0(x)=x^TQx-2c^Tx \\
 &  \mbox{~subject to~} &
        x_i^2 = 1,~~i=1, \cdots, n.  \\
    \end{array}
\]
$B_L$ denotes Lagrangian relaxation bound.

Following our theme:
\begin{center}
\fbox{
$B_L$ equals all above bounds.
}
\end{center}

 ref PRW.\\
(The proofs come from exploiting strong Lagrangian duality of TRS)


\end{slide}
\begin{slide}{}

{\bf A Strengthened SDP Bound for MC}


(motivated by the strong duality results to follow)

lifting procedure
\[ X=xx^T\]
implies
\[X^2=xx^Txx^T=nX.\]
equivalent quadratic matrix model for CQ
\begin{eqnarray*}
\mu^*:=&\max& \tr QX\\
&\mbox{s.t.}&\diag(X)=e\\
&&X^2-nX=0\\
&&X \succeq 0.
\end{eqnarray*}

\end{slide}
\begin{slide}{}

{\bf Convex Quadratic Program}
\begin{eqnarray*}
\mu^*:=&\min& q_0(x)\\
&\mbox{s.t.}& q_k(x) \leq 0,\quad k=1, \ldots m,
\end{eqnarray*}
where all $q_i(x)$ are convex quadratic functions.
The dual is
\[ 
\mbox{DCQP}\qquad
\nu^*:= \max\limits_{\lambda \geq 0}\ \min\limits_x\ q_0(x)
           + \sum_{k=1}^m \lambda_k q_k(x).
\]

$\bullet$(KKT) conditions are sufficient for global optimality

$\bullet$if the primal value of CQP is bounded then it is
attained and there is no duality gap, ref PE and T

$\bullet$ the dual may not be attained


\end{slide}
\begin{slide}{}
{\bf Nonconvex Cases}

{\bf 1. Rayleigh Quotient}

$A=A\tran \in \Sn$
\[
 \lambda_{\min} = \min \{x\tran Ax : x\tran  x = 1\}.
\]
tractable (nonconvex) problem

no (Lagrangian) duality gap for this nonconvex problem
\[
\begin{array}{rcl}
 \mu^*&:=& \max\limits_{\lambda}\ \min\limits_x\ 
      x\tran Ax -\lambda (x\tran  x - 1)\\
 &=& \max\limits_{A - \lambda I \succeq 0}\ \min\limits_x\ 
               x\tran (A -\lambda I)x  + \lambda\\
 &=& \max\limits_{A - \lambda I \succeq 0}\ \lambda\\
 &=& \lambda_{\min}
\end{array}
\]


\end{slide}
\begin{slide}{}

{\bf 2. Trust Region Subproblem}
\begin{eqnarray*}
 &\mu^*:=\min& q_0(x)\\
&\mbox{s.t.}& x\tran x - \delta^2\leq 0 \mbox{ (or } =0). \\
\end{eqnarray*}
for ``$\le$," the Lagrangian dual is:
\[ 
\mbox{DTRS}\qquad
 \nu^*:=\max\limits_{\lambda \geq 0}\ \min\limits_x\ q_0(x) + \lambda 
(x\tran x - \delta^2).
\]
strong duality holds and equivalent to (ref SW) the
(concave) nonlinear semidefinite program 
\begin{eqnarray*}
\mbox{DTRS}\qquad
&\nu^*:=\max& g_0\tran  (Q+\lambda I)^{\dagger} g_0 - \lambda \delta^2\\
&\mbox{s.t.}& Q+\lambda I \succeq 0\\
&&\lambda \geq  0.
\end{eqnarray*}

theme: tractable problem (ref MS) but strong duality holds.


\end{slide}
\begin{slide}{}
{\bf 3. Two Trust Region Subproblem}
TTRS consists in minimizing a
(possibly nonconvex) quadratic function subject to a norm and a
least squares constraint.

ref in SQP methods by CDT.

TTRS can have a nonzero duality gap, ref PY and Y.

if objective not convex,
then the primal may not be attained, ref LZ.

TRS can have at most one local and nonglobal optimum, ref M.

Still an open problem whether TTRS is an NP-hard or a polynomial time
problem.


\end{slide}
\begin{slide}{}


{\bf Orthogonally Constrained Programs with Zero Duality Gaps}

\[ \{X: X X\tran =I\} \]
Stiefel manifold ref EAS

$A$ and $B$ $n\times n$ symmetric matrices
\begin{equation}
\begin{array}{rcl}
{\rm QQP_O}\qquad  \mu^O:=&\min& \tr AXBX\tran\\
&{\rm s.t.}& XX\tran=I.
\end{array}
\end{equation}

Tractable problem using
classical Hoffman-Wielandt inequality.


But, Lagrangian dual has a duality gap, ref ZKRW.

\end{slide}
\begin{slide}{}


The constraints $XX\tran=I$ and
$X\tran X=I$ are equivalent. Add redundant constraints.
\begin{eqnarray*}
{\rm QQP_{OO}}\qquad  \mu^O:=&\min& \tr AXBX\tran\\
&{\rm s.t.}& XX\tran=I,\ X\tran X=I.
\end{eqnarray*}
dual problem is
\begin{eqnarray*}
{\rm DQQP_{OO}}\\
\mu^O \geq \mu^D:=&\max& \tr S+\tr T\\
&\mbox{s.t.}&
(I\Kprod S)+(T\Kprod I)\preceq (B\Kprod A)\\
&& S=S\tran,\ T=T\tran.
\end{eqnarray*}

{\bf Theorem}
Strong duality holds for $\rm QQP_{OO}$ and $\rm DQQP_{OO},$ 
i.e.  $\mu^D=\mu^O$ and both primal and dual are attained.
\QED

Theme again holds.

\end{slide}
\begin{slide}{}
Other applications:

Weighted Sums of Eigenvalues;

Graph Partitioning Problem;

TRS like constraints $\{X: XX\tran \preceq I\}$
(extension of Hoffman-Wielandt inequality)


\end{slide}
\begin{slide}{}

Concluding Remarks:

In each case of a tractable bound for a nonconvex problem, the structure
allows for redundant constraints to be added to close the Lagrangian
duality gap.

What is the correct question about ``best''? Can we close or reduce the
duality gap in general? (ref KT)
\end{slide}
\begin{slide}{}
\end{slide}
\begin{slide}{}
\end{slide}


\end{document}