Nonconvex Optimization for
Communication Systems
Mung Chiang
Electrical Engineering Department
Princeton University, Princeton, NJ 08544, USA
chiangm@princeton.edu
Summary. Convex optimization has provided both a powerful tool and an intrigu-
ing mentality to the analysis and design of communication systems over the last
few years. A main challenge today is on nonconvex problems in these application.
This paper presents an overview of some of the important nonconvex optimization
problems in point-to-point and networked communication systems. Three typical ap-
plications are covered: Internet congestion control through nonconcave network util-
ity maximization, wireless network power control through geometric and sigmoidal
programming, and DSL spectrum management through distributed nonconvex op-
timization. A variety of nonconvex optimization techniques are showcased: from
standard dual relaxation to sum-of-squares programming through successive SDP
relaxation, signomial programming through successive GP relaxation, and leveraging
the specific structures in problems for efficient and distributed heuristics.
Key words: Nonconvex optimization, Geometric programming, Semidefi-
nite programming, Sum of squares, Duality, Network utility maximization,
TCP/IP, Wireless network, Power control.
1 Introduction
There has been two major “waves” in the history of optimization theory: the
first started with linear programming and simplex method in late 1940s, and
the second with convex optimization and interior point method in late 1980s.
Each has been followed by a transforming period of appreciation-application
cycle: as more people appreciate the use of LP/convex optimization, more
look for their formulations in various applications; then more work on its the-
ory, efficient algorithms and softwares; the more powerful the tools become;
then more people appreciate its usage. Communication systems benefit sig-
nificantly from both waves, including multicommodity flow solutions (e.g.,
Bellman Ford algorithm) from LP, and basic network utility maximization
and robust transceiver design from convex optimization.
2 Chiang: Nonconvex Optimization for Communication Systems
Much of the current research frontier is about the potential of the third
wave, on nonconvex optimization. If one word is used to differentiate be-
tween easy and hard problems, convexity is probably the “watershed”. But if
a longer description length is allowed, much useful conclusions can be drawn
even for nonconvex optimization. Indeed, convexity is a very disturbing wa-
tershed, since it is not a topological invariant under change of variable (e.g.,
see geometric programming) or higher-dimension embedding (e.g., see sum of
squares method). A variety of approaches have been proposed, from nonlinear
transformation to turn an apparently nonconvex problem into a convex prob-
lem, to characterization of attraction regions and systematically jumping out
of a local optimum, from successive convex approximation to dualization, from
leveraging the specific structures of the problems (e.g., Difference of Convex
functions, concave minimization, low rank nonconvexity) to developing more
efficient branch-and-bound procedures.
Researchers in communications and networking have been examining non-
convex optimization using domain-specific structures in important problems
in the areas of wireless networking, Internet engineering, and communication
theory. Perhaps four typical topics best illustrate the variety of challenging
issues arising from nonconvex optimization in communication systems:
• Nonconvex objective to be minimized. An example is congestion control
for inelastic applications.
• Nonconvex constraint set. An example is power control in low SIR regimes.
• Integer constraints. Two important examples are single path routing and
multiuser detection.
• Constraint sets that are convex but require an exponential number of
inequalities to explicitly describe. An example is optimal scheduling.
This chapter overviews the latest results in recent publications about the
first two topics, with a particular focus on showing the connections between
the engineering intuitions about important problems in communication sys-
tems and the state-of-the-art algorithms in nonconvex optimization theory.
2 Internet Congestion Control
2.1 Introduction
Basic network utility maximization
Since the publication of the seminal paper [24] by Kelly, Maulloo, and Tan in
1998, the framework of Network Utility Maximization (NUM) has found many
applications in network rate allocation algorithms and Internet congestion
control protocols (e.g., surveyed in [32, 47]). It has also lead to a systematic
understanding the entire network protocol stack in the unifying framework
(e.g., surveyed in [11, 31]). By allowing nonlinear concave utility objective
Special Volume 3
functions, NUM substantially expands the scope of the classical LP-based
Network Flow Problems.
Consider a communication network with L links, each with a fixed capacity
of cl bps, and S sources (i.e., end users), each transmitting at a source rate
of xs bps. Each source s emits one flow, using a fixed set L(s) of links in its
path, and has a utility function Us(xs). Each link l is shared by a set S(l)
of sources. Network Utility Maximization (NUM), in its basic version, is the
following problem of maximizing the total utility of the network
∑
s Us(xs),
over the source rates x, subject to linear flow constraints
∑
s:l∈L(s) xs ≤ cl for
all links l:
maximize
∑
s Us(xs)
subject to
∑
s∈S(l) xs ≤ cl, ∀l,
x � 0
(1)
where the variables are x ∈ RS .
There are many nice properties of the basic NUM model due to several
simplifying assumptions of the utility functions and flow constraints, which
provide the mathematical tractability of problem (1) but also limit its ap-
plicability. In particular, the utility functions {Us} are often assumed to be
increasing and strictly concave functions.
Assuming that Us(xs) becomes concave for large enough xs is reasonable,
because the law of diminishing marginal utility eventually will be effective.
However, Us may not be concave throughout its domain. In his seminal paper
published a decade ago, Shenker [45] differentiated inelastic network traffic
from elastic traffic. Utility functions for elastic traffic were modeled as strictly
concave functions. While inelastic flows with nonconcave utility functions rep-
resent important applications in practice, they have received little attention
and rate allocation among them have scarcely any mathematical foundation,
except three recent publications [28, 12, 15] (see also earlier work in [54, 29, 30]
related to the approach in [28])
In this section, we investigate the extension of the basic NUM to max-
imization of nonconcave utilities, as in the approach of [15]. We provide a
centralized algorithm for off-line analysis and establishment of a performance
benchmark for nonconcave utility maximization when the utility function is a
polynomial or signomial. Based on the semialgebraic approach to polynomial
optimization, we employ convex sum-of-squares (SOS) relaxations solved by
a sequence of semidefinite programs (SDP), to obtain increasingly tighter up-
per bounds on total achievable utility for polynomial utilities. Surprisingly, in
all our experiments, a very low order and often a minimal order relaxation
yields not just a bound on attainable network utility, but the globally maxi-
mized network utility. When the bound is exact, which can be proved using
a sufficient test, we can also recover a globally optimal rate allocation.
4 Chiang: Nonconvex Optimization for Communication Systems
Canonical distributed algorithm
A reason that the the assumption of utility function’s concavity is upheld in
almost all papers on NUM is that it leads to three highly desirable mathe-
matical properties of the basic NUM:
• It is a convex optimization problem, therefore the global minimum can be
computed (at least in centralized algorithms) in worst-case polynomial-
time complexity [5].
• Strong duality holds for (1) and its Lagrange dual problem. Zero duality
gap enables a dual approach to solve (1).
• Minimization of a separable objective function over linear constraints can
be conducted by distributed algorithms based on the dual approach.
Indeed, the basic NUM (1) is such a “nice” optimization problem that
its theoretical and computational properties have been well studied since the
1960s in the field of monotropic programming, e.g., as summarized in [41].
For network rate allocation problems, a dual-based distributed algorithm has
been widely studied (e.g., in [24, 32]), and is summarized below.
Zero duality gap for (1) states that the solving the Lagrange dual problem
is equivalent to solving the primal problem (1). The Lagrange dual problem
is readily derived. We first form the Lagrangian of (1):
L(x,λ) =
∑
s
Us(xs) +
∑
l
λl
cl − ∑
s∈S(l)
xs
where λl ≥ 0 is the Lagrange multiplier (link congestion price) associated with
the linear flow constraint on link l. Additivity of total utility and linearity
of flow constraints lead to a Lagrangian dual decomposition into individual
source terms:
L(x,λ) =
∑
s
Us(xs)−
∑
l∈L(s)
λl
xs
+∑
l
clλl
=
∑
s
Ls(xs, λ
s) +
∑
l
clλl
where λs =
∑
l∈L(s) λl. For each source s, Ls(xs, λ
s) = Us(xs) − λsxs only
depends on local xs and the link prices λl on those links used by source s.
The Lagrange dual function g(λ) is defined as the maximized L(x,λ) over
x. This “net utility” maximization obviously can be conducted distributively
by the each source, as long as the aggregate link price λs =
∑
l∈L(s) λl is
available to source s, where source s maximizes a strictly concave function
Ls(xs, λ
s) over xs for a given λ
s:
x∗s(λ
s) = argmax [Us(xs)− λ
sxs] , ∀s. (2)
Special Volume 5
The Lagrange dual problem is
minimize g(λ) = L(x∗(λ),λ)
subject to λ � 0
(3)
where the optimization variable is λ. Any algorithms that find a pair of primal-
dual variables (x,λ) that satisfy the KKT optimality condition would solve
(1) and its dual problem (23). One possibility is a distributed, iterative sub-
gradient method, which updates the dual variables λ to solve the dual problem
(23):
λl(t+ 1) =
λl(t)− α(t)
cl − ∑
s∈S(l)
xs(λ
s(t))
+
, ∀l (4)
where t is the iteration number and α(t) > 0 are step sizes. Certain choices of
step sizes, such as α(t) = α0/t, α0 > 0, guarantee that the sequence of dual
variables λ(t) will converge to the dual optimal λ∗ as t → ∞. The primal
variable x(λ(t)) will also converge to the primal optimal variable x∗. For a
primal problem that is a convex optimization, the convergence is towards the
global optimum.
The sequence of the pair of algorithmic steps (2,4) forms a canonical dis-
tributed algorithm that globally solves network utility optimization problem
(1) and the dual (23) and computes the optimal rates x∗ and link prices λ∗.
Nonconcave Network Utility Maximization
It is known that for many multimedia applications, user satisfaction may
assume non-concave shape as a function of the allocated rate. For example, the
utility for streaming applications is better described by a sigmoidal function:
with a convex part at low rate and a concave part at high rate, and a single
inflexion point x0 (with U ′′s (x
0) = 0) separating the two parts. The concavity
assumption on Us is also related to the elasticity assumption on rate demands
by users. When demands for xs are not perfectly elastic, Us(xs) may not be
concave.
Suppose we remove the critical assumption that {Us} are concave func-
tions, and allow them to be any nonlinear functions. The resulting NUM
becomes nonconvex optimization and significantly harder to be analyzed and
solved, even by centralized computational methods. In particular, a local op-
timum may not be a global optimum and the duality gap can be strictly
positive. The standard distributive algorithms that solve the dual problem
may produce infeasible or suboptimal rate allocation.
Despite such difficulties, there have been two very recent publications on
distributed algorithm for nonconcave utility maximization. In [28], a “self-
regulation” heuristic is proposed to avoid the resulting oscillation in rate
allocation and shown to converges to an optimal rate allocation asymptot-
ically when the proportion of nonconcave utility sources vanishes. In [12], a
6 Chiang: Nonconvex Optimization for Communication Systems
0 2 4 6 8 10 12
0
0.5
1
1.5
2
2.5
3
x
U
(x
)
Fig. 1. Some examples of utility functions Us(xs): it can be concave or sigmoidal
as shown in the graph, or any general nonconcave function. If the bottleneck link
capacity used by the source is small enough, i.e., if the dotted vertical line is pushed
to the left, a sigmoidal utility function effectively becomes a convex utility function.
set of sufficient conditions and necessary conditions is presented under which
the canonical distributed algorithm still converges to the globally optimal so-
lution. However, these conditions may not hold in many cases. These two
approaches illustrate the choice between admission control and capacity plan-
ning to deal with nonconvexity (see also the discussion in [23]). But neither
approach provides a theoretically polynomial-time and practically efficient al-
gorithm (distributed or centralized) for nonconcave utility maximization.
In this section, we remove the concavity assumption on utility functions,
thus turning NUM into a nonconvex optimization problem with a strictly pos-
itive duality gap. Such problems in general are NP hard, thus extremely un-
likely to be polynomial-time solvable even by centralized computations. Using
a family of convex semidefinite programming (SDP) relaxations based on the
sum-of-squares (SOS) relaxation and the Positivstellensatz Theorem in real
algebraic geometry, we apply a centralized computational method to bound
the total network utility in polynomial-time. A surprising result is that for all
the examples we have tried, wherever we could verify the result, the tightest
possible bound (i.e., the globally optimal solution) of NUM with nonconcave
utilities is computed with a very low order relaxation. This efficient numer-
ical method for off-line analysis also provides the benchmark for distributed
heuristics.
These three different approaches: proposing distributed but suboptimal
heuristics (for sigmoidal utilities) in [28], determining optimality conditions
for the canonical distributed algorithm to converge globally (for all nonlinear
utilities) in [12], and proposing efficient but centralized method to compute
the global optimum (for a wide class of utilities that can be transformed into
Special Volume 7
polynomial utilities) in [15] and this section, are complementary in the study
of distributed rate allocation by nonconcave NUM.
2.2 Global maximization of nonconcave network utility
Sum-of-squares method
We would like to bound the maximum network utility by γ in polynomial time
and search for a tight bound. Had there been no link capacity constraints,
maximizing a polynomial is already an NP hard problem, but can be relaxed
into a SDP [46]. This is because testing if the following bounding inequality
holds γ ≥ p(x), where p(x) is a polynomial of degree d in n variables, is
equivalent to testing the positivity of γ−p(x), which can be relaxed into testing
if γ − p(x) can be written as a sum of squares (SOS): p(x) =
∑r
i=1 qi(x)
2 for
some polynomials qi, where the degree of qi is less than or equal to d/2. This
is referred to as the SOS relaxation. If a polynomial can be written as a sum
of squares, it must be non-negative, but not vice versa. Conditions under
which this relaxation is tight were studied since Hilbert. Determining if a
sum of squares decomposition exists can be formulated as an SDP feasibility
problem, thus polynomial-time solvable.
Constrained nonconcave NUM can be relaxed by a generalization of the
Lagrange duality theory, which involves nonlinear combinations of the con-
straints instead of linear combinations in the standard duality theory. The key
result is the Positivstellensatz, due to Stengle [48], in real algebraic geometry,
which states that for a system of polynomial inequalities, either there exists
a solution in Rn or there exists a polynomial which is a certificate that no
solution exists. This infeasibility certificate is recently shown to be also com-
putable by an SDP of sufficient size [38, 37], a process that is referred to as
the sum-of-squares method and automated by the software SOSTOOLS [39]
initiated by Parrilo in 2000. For a complete theory and many applications of
SOS methods, see [38] and references therein.
Furthermore, the bound γ itself can become an optimization variable in
the SDP and can be directly minimized. A nested family of SDP relaxations,
each indexed by the degree of the certificate polynomial, is guaranteed to
produce the exact global maximum. Of course, given the problem is NP hard,
it is not surprising that the worst-case degree of certificate (thus the number
of SDP relaxations needed) is exponential in the number of variables. What
is interesting is the observation that in applying SOSTOOLS to nonconcave
utility maximization, a very low order, often the minimum order relaxation
already produces the globally optimal solution.
Application of SOS method to nonconcave NUM
Using sum-of-squares and the Positivstellensatz, we set up the following prob-
lem whose objective value converges to the optimal value of problem (1), where
8 Chiang: Nonconvex Optimization for Communication Systems
{Ui} are now general polynomials, as the degree of the polynomials involved
is increased.
minimize γ
subject to
γ −
∑
s Us(xs)−
∑
l λl(x)(cl −
∑
s∈S(l) xs)
−
∑
j,k λjk(x)(cj −
∑
s∈S(j) xs)(ck −
∑
s∈S(k) xs)−
. . .− λ12...n(x)(c1 −
∑
s∈S(1) xs) . . . (cn −
∑
s∈S(n) xs)
is SOS,
λl(x), λjk(x), . . . , λ12...n(x) are SOS.
(5)
The optimization variables are γ and all of the coefficients in polynomials
λl(x), λjk(x), . . . , λ12...n(x). Note that x is not an optimization variable; the
constraints hold for all x, therefore imposing constraints on the coefficients.
This formulation uses Schmu¨dgen’s representation of positive polynomials
over compact sets [44].1 Two alternative representations are discussed in [15].
Let D be the degree of the expression in the first constraint in (5). We
refer to problem (5) as the SOS relaxation of order D for the constrained
NUM. For a fixed D, the problem can be solved via SDP. As D is increased,
the expression includes more terms, the corresponding SDP becomes larger,
and the relaxation gives tighter bounds. An important property of this nested
family of relaxations is guaranteed convergence of the bound to the global
maximum.
Regarding the choice of degree D for each level of relaxation, clearly a
polynomial of odd degree cannot be SOS, so we need to consider only the
cases where the expression has even degree. Therefore, the degree of the first
non-trivial relaxation is the largest even number greater than or equal to
degree of
∑
s Us(xs), and the degree is increased by 2 for the next level.
A key question now becomes: How do we find out, after solving an SOS
relaxation, if the bound happens to be exact? Fortunately, there is a sufficient
test that can reveal this, using the properties of the SDP and its dual solu-
tion. In [19, 26], a parallel set of relaxations, equivalent to the SOS ones, is
developed in the dual framework. The dual of checking the nonnegativity of
a polynomial over a semi-algebraic set turns out to be finding a sequence of
moments that represent a probability measure with support in that set. To
be a valid set of moments, the sequence should form a positive semidefinite
moment matrix. Then, each level of relaxation fixes the size of this
本文档为【nonconvex】,请使用软件OFFICE或WPS软件打开。作品中的文字与图均可以修改和编辑,
图片更改请在作品中右键图片并更换,文字修改请直接点击文字进行修改,也可以新增和删除文档中的内容。
该文档来自用户分享,如有侵权行为请发邮件ishare@vip.sina.com联系网站客服,我们会及时删除。
[版权声明] 本站所有资料为用户分享产生,若发现您的权利被侵害,请联系客服邮件isharekefu@iask.cn,我们尽快处理。
本作品所展示的图片、画像、字体、音乐的版权可能需版权方额外授权,请谨慎使用。
网站提供的党政主题相关内容(国旗、国徽、党徽..)目的在于配合国家政策宣传,仅限个人学习分享使用,禁止用于任何广告和商用目的。