首页 切锥及其应用

切锥及其应用

举报
开通vip

切锥及其应用 JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 13, No. 4, 1974 On the Cones of Tangents with Applications to Mathematical Programming 1 M. S. BAZARAA, ~ J. J. GOODE, 3 AND NI. Z. NASHED ~ Communicated by M. R. Hestenes Abstract. In this study...

切锥及其应用
JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 13, No. 4, 1974 On the Cones of Tangents with Applications to Mathematical Programming 1 M. S. BAZARAA, ~ J. J. GOODE, 3 AND NI. Z. NASHED ~ Communicated by M. R. Hestenes Abstract. In this study, we present a unifying framework for the cones of tangents to an arbitrary set and some of its applications. We highlight the significance of these cones and their polars both from the point of view of differentiability and subdifferentiability theory and the point of view of mathematical programming. This leads to a generalized definition of a subgradient which extends the well-known definition of a subgradient of a convex function to the nonconvex case. As an application, we develop necessary optimality conditions for a min-max problem and show that these conditions are also sufficient under moderate convexity assumptions. Key Words. Tangent cones, mathematical programming, in- equality constraints, minimax problems, optimization theorems, constraint qualification. 1. Introduction The notion of a tangent cone goes back to the work of Bouligand (Ref. 1), who used the term contingent. Many authors have investigated recently the relationships between differentiability of a given function and the cone of tangents to the graph of the function. The reader may refer to Flett (Refs. 2-3), Roetman (Ref. 4), Hesteness (Ref. 5), and Mashed (Ref. 6). 1 The work of the third author was aponsored by the United States Army, Contract No. DA-31-124-ARO-D-462, while he was at the Mathemat ics Research Center, The University of Wisconsin, Madison, Wisconsin. Associate Professor, School of Industr ial and Systems Engineering, Georgia Inst itute of Technology, Atlanta, Georgia. Associate Professor, School of Mathematics, Georgia Inst i tute of Technology, Atlanta, Georgia. Professor, School of Mathemat ics , Georgia Inst itute of Technology, Atlanta, Georgia. 389 © 1974 Plenum Publishing Corporation, 227 West 17th Street, New York, N.Y. 1(}011. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, microfilming, recording, or otherwise, without written permission of the publisher. 390 JOTA: VOL. 13, NO. 4, 1974 From a different point of view, Abadie (Ref. 7) used the cone of tangents to develop a constraint qualification in a mathematical pro- gramming problem. Since then, these cones have been used by different authors to impose various constraint qualifications and develop optimality conditions. See, for example, Varaiya (Ref. 8), Guignard (Ref. 9), Gould and Tolle (Ref. 10), Bazaraa et al. (Refs. 11-12), Neustadt (Ref. 13), Halkin (Ref. 14), and Dubovitskii and Milyutin (Ref. 15). In this study, we present a unifying framework for the cone of tangents and some of its applications. Along the way, various new results are obtained. We consider the tangent cones to general sets and develop some results involving these cones. Also, the cones of tangents to epigraphs and hypographs of a given function, as well as sets defined by inequality and equality constraints, are considered. We develop the relationships between elements of the polars of these cones and one-sided directional derivatives when the latter exist. In particular, we introduce a generalized notion of subgradients for functions which are not neces- sarily differentiable. This extends the definition of a subgradient intro- duced by Moreau (Ref. 29) and further studied by Rockafellar (Refs. 16-17) and others to nonconvex functions. When the funtion under consideration is differentiable, then the cones of tangents to its epigraph and hypograph are halfspaces that generate the whole space. As an application we consider the problem: minimize f(x), x ~ S. If the set S is defined via equality and/or inequality constraints, then, by using suitable characterization of the cone of tangents of the set S, the Kuhn-Tucker (Fritz John) conditions for optimality follow in an easy and natural fashion. We are also able to partially relax the differentiability of the functionf and develop a suitable necessary condition for optimality where subgradients play the role of the gradient vector. Also, we consider a rain-max problem and develop suitable necessary conditions similar to those of Demyanov (Ref. 18) and Demyanov and Rubinov (Ref. 19). Under moderate convexity assmnptions, it turns out that the conditions are also sufficient for a minimum. 2. Cones of Tangents Consider the following definition of the cone of tangents to a set A at a point x 0 E cl A, where cl A denotes the closure of A. If A is empty, we interpret the cone of tangents as the empty set. Definition 2.1. Let A be a nonempty set in R~ and let x 0 E cl A. JOTA: VOL. 13, NO. 4, t974 39t The cone of tangents to A at Xo, denoted by T(A, Xo), is the set of all vectors x such that x ...... lim )n(x~ -- x0), where ~ > 0, x~ E A for all n and x~ --~ x 0 . I t can be easily verified that each of the following is an equivalent characterization of the cone of tangents: (i) x ~ T(A, Xo) iff there is a sequence of positive numbers ~,~ converging to zero and a function 0 : R -~- R~ + with O(t) -~ 0 as t --~ 0 such that the sequence x~ = x o ÷ ~x + ~ O(a~) is in d . (ii) x ~ T(A, Xo) iff, for each 3 > 0, there is A ~ (0, 3) and a z ~ R~ with I! z II < a such that x 0 + Ax + Az~A. Let A and B be nonempty sets in R~ and let x 0 ~ cl A (~ cl B. The following two facts follow directly from the above definition. (a) I f A C B, then T(A, xo) C T(B, Xo). (b) I f x 0 ~ int B, then T(A c3 B, x0) == T(A, xo). We will give an important characterization of the cone of tangents. This characterization was given by Varaiya (Ref. 8) without a proof. As a corollary to this result, it becomes evident that the cone of tangents is a closed set, which was originally proved by Abadie (Ref. 7). First, we need the following definition of the cone of a set A at some point x 0 . Def in i t ion 2.2. Let A be a nonempty set in R~. The cone of A at Xo, denoted by C(A, xo), is defined to be the set of all points y that can be written in the form y = ;~(x - - x0) for some A > 0 and some x~A. I t is obvious that C(A, Xo) is a cone with vertex zero. I t can be easily shown that C(A, Xo) is the minimal cone with vertex zero that contains A - - x o . Theorem 2.1. Then, Let d be a nonempty set in R~ and x 0 ~ cl A. T(A, xo) ..... 0 ci C(A c~ N, ~0), NeJV" 392 JOTA: VOL. 13, NO. 4, 1974 where ~ is the class of all open balls about x o . P roo f . I f x ~ T(A, xo) , then x = lim A~(x~ - - Xo) , n-~ct~ with x~ ~ A, A~ > 0 and x n --~ x o . Since x~ -+ x o , then, given any open ball N about x o , there exists a posit ive integer nN such that x~ ~ A n N for all n > nN. Therefore, A~(x~ -- x0) E C(A n N, Xo) for all n > nN and, hence, x = IiI~ A,(x, - - x0) ~ ct C(A n N, Xo). But, since this is true for all N ~ J¢', it is true that Conversely, let T(A, Xo) C N cl C(A n N, xo). NsJV" x ~ N cl C(A n .~, xo). N~M/" Given any posit ive integer k, choose an open ball N k about x 0 of radius 1/k. x ~ cl C(A n Nk, Xo) impl ies that x is the l imit of vectors of the fo rm At(x~ - - xo), where A t > 0 and xl ~ A n N k . Now, choose/ i , such that II x - - Xo)[! < 1/k. By varying k, we generate the sequences {A~} and {xt~}. Note that At~ > 0, xt~ e A, and 2%(:%~ -- Xo) -+ x, that is, x e T(A, xo). Therefore, T(A, Xo) 0 cl c(A n N, Xo), and the proof is complete. Coro l la ry 2.1. T(A, xo) is a closed cone. Under some convexity qualif ication of the set A, we will show that C(A, Xo) = C(A n N, Xo) for each N~JV ; hence, by Theorem 2.1 above, it fol lows that T(A, Xo) = cl C(A, Xo). We need the fol lowing definit ion of a star -shaped set at a point. JOTA: VOL. 13, NO. 4, 1974 393 Def in i t ion 2.3 [see, for example, Valentine (Ref. 20)]. Let A be a nonempty set in Rn and x o ~cl A. A is said to be star shaped at x 0 if x e A implies that ~x + (1 -- A) x o~A for each A ~ (0, 1). We will now show that, if A is star shaped at Xo, then e(A, Xo) = c(A n N, Xo) for each N e Jg'. Let N be an open ball of radius r about x0. It is evident that C(A n N, ~o) C C(A, Xo). Conversely, let y E C(A, Xo); then, y = A(x - Xo) with x¢A and h > 0. Without loss of generality, assume that [Ix--Xol I =k~r . It is obvious that y can be represented as y :=/x(z -- xo) , where = 2~lr, z = ~xl~ + (t - - Alx.)xo. By the star-shaped property of A at x o , and since ?~//x e (0, 1), it follows that zEA . Moreover, t [ z - -x o]1 ~ ~}, B - - {(~, y) : y ~< -x~}. It can be easily verified that T(A n B, Xo) = {(0, 0)}, 394 JOTA: VOL. 13, NO. 4, 1974 whereas T(A, xo) n T(B, Xo) -: {(x, y ) : y = 0}. The following theorem gives a sufficient condition for T(A, x0) n T(B, x0) = r(A n B, Xo) to hold. Here, A and B are required to be convex and have at least one point common to their relative interiors (denoted by ri). Theorem 2.2. Let A and B be nonempty convex sets in R n and x o~c lAnc l .B . I f r iAnr iB=~ ~, then T(A, Xo) n T(B, Xo) = T(A n B, ~o). Proof . To prove the theorem, and in view of the discussion following Theorem 2.1, we need to show that cl C(A, xo) n cl C(B, xo) = cl C(A t3 B, Xo). It is routine to check that c(A, ~o) n C(B, ~o) = C(.4 n B, xo); hence, it is obvious that cl C(A n B, Xo) C cl C(A, Xo) n el C(B, Xo). To show the reverse inclusion, let x e ct C(A, xo) nct C(B, Xo), and let y e ri C(A, xe) n ri C(B, xo) , noting that such a y exists, since r iA n r ib :/: d. Now, consider x~ = hy + (1 - A)x for ~ E (0, 1). By convexity of C(A, xo) and C(B, Xo), it follows that x~ ~ C(A, Xo) n C(B, Xo) -= C(A n B, xo) for each ~(0, 1). JOTA: VOL. 13, NO. 4, 1974 395 See, for example, Rockafellar (ReL 2t). Hence, x :=t imx a belongs to clC(Ac~B, xo). a-+0 This completes the proof. In order to prove the next theorem about the cone of tangents, we need the following two lemmas and the definition of a polar cone. The following lemma shows that the cone of tangents to a set A and the cone of tangents to the complement of the set (denoted by A e) generate the whole space. Lemma 2.1. Let A and B be nonempty subsets of R~ such that AwB=R~,and le tx oCctAnc lB .Then , T(A, .0) to T(B, .0) = R,~. Proof . Let x ~ R n and x~ ....... x o + (1/n)x. Obviously, x~ --~ x 0 . If {x,,} has a subsequence in A, then clearly x ~ T(A , Xo) by letting Z~ == n in the definition of the cone of tangents. On the other hand, if no such subsequence exists, then there exists a positive integer N such that XN+i ~ B for alI i > 0. It then follows that x e T (B , Xo) , that is, x e T(A, Xo) u T(B, xo). This completes the proof. Definit ion 2.4. Let A be a nonempty set in R~. A* = {~ : (x, ~} ~ 0 for all x ~ A} is said to be the polar of A. It is obvious that A* is a closed convex cone with vertex zero. Note that A* is nonempty, since it always contains the origin. If A is empty, we will interpret A* as R,~. The following lemma shows that, if two sets whose union is R n each have a nonzero polar, then each is essentially a halfspace. Lemma 2.2. Let A and B be subsets of R~ such that A ~ B = R~. Suppose that there exist a nonzero a e A* and a nonzero b e B*. Then, (i) c lA ={x:~/x ,a} ~<0}andctB = {x: (x ,b} ~0}, (ii) a = ;~b for some negative scalar A, (iii) kera=kerb =c lAc~c lB , wherekera={x: (x ,a} =0}. Proof. Consider the halfspaces H~ -- {x : (x, a) ~< 0}, H~ = {x : (x, b) ~< 0}. Since ACH~ and BCHb, then HavaHv=R~. Let x~kera . If (x, b} > 0, then there is an open ball N about x such that N C Hb e. 396 JOTA: VOL. 13, NO. 4, 1974 But, since x is on the boundary of H , , there is a point x' ~ Ha~ c3 N. Thus, we constructed a point x' in Ha ~ ca Hs ~ which is a contradiction since Ha c ca Hb s -~ 24. If (x, b) < O, then the same argument applies for --x and therefore (x, b) -- O. This implies that ker a C ker b, and by sym- metry they are equal. We conclude that, for some nonzero )t, a = ~b. Since Ha u Hb ~ R~, it is clear that ~ < 0. This shows (ii). It then follows that any point not in cl A must be in Ha c and hence (i). Then, (iii) is obvious. The following theorem shows that, if the polars of the cones of tangents to a set and to its complement are both nonzero, then the cones of tangents are halfspaces and the polars are rays through the origin. The proof follows from Lemmas 2.1 and 2.2. Theorem 2.3. Let A and B be proper subsets of R~ such that AuB~-R~, and let Xo~c lAc~c lB . If T*(A, xo) ~{0} and T*(B, xo) ~ {0}, then there exists a nonzero vector y such that T(A, xo) = {x : (x, y ) <~ 0), T(B, xo) = {x : (x, y) >~ 0). Moreover, T*(A, xo) = {Ay : h >~ 0}, T*(B, Xo) = {•y: h ~< 0}. 3. t> and ~ Gradients In this section, we introduce a generalized definition of a subgradient which extends the definition of a subgradient for convex (concave) functions as well as supportable functions to the nonconvex case. In particular, we introduce >/and ~< gradients and relate them to the cones of tangents and their polar cones. The reader may refer to the work of Moreau (Ref. 29), Rockafeltar (Refs. 16, 17, 21), Van Slyke and Wets (Ref. 22), Brondsted and Rockafellar (Ref. 23), and Bazaraa et al. (Ref. 24) for some aspects of the theory of subgradients. Consider the follo~ing definitions of a >~ gradient 5 and a ~< gradient of a real-valued function f at a point x o . The name ~> gradient is motivated by the following more general notion (see Ref. 25). Le t f : X ~ Y, where X and Y are linear spaces, and let C be a closed convex cone in Y. Let xo E X. A bounded linear operator A(xo ; ") : X -+ Y is said to be a C-gradient o f f at x0 if there exists a map 0 : X --~ Y such that O(t) --+ 0 as t ~ 0 and f (xo + t) - - f (xo) - - A (xo ; t) - - IIt l[ O(t) E C for each t ~ X. In the definition of ~ gradient, C is the nonnegative cone. JOTA: VOL. 13, NO. 4, 1974 397 Def in i t ion 3.1. Let f : Rn-+R and xoeR,~. ~ sR~ is said to be a ) gradient o f f at x 0 if f (x o q- t) -- f(xo) -- (t, ~) - - ;l t if O(t) >/0 for al! t e R,~, where O(t) --* 0 as t -+ 0. ~ ~ R~, is said to be a ~ gradient o f f at x 0 if the above inequality is reversed. The reader may note that, if 0 is identically zero, then the above definitions of >/and ~< gradients reduce to the well-known notion of a subgradient for convex and concave functions. Also, when the inequality above is changed to an equality, one obtains the definition of a gradient vector. From now on, when no confusion arises, the word subgradient may be used in place of ~> and ~< gradients. The following lemma gives a useful equivalent sequential definition of a >/gradient o f f at x 0 . Needless to say, the inequality is reversed in case of a ~< gradient. The lemma below will be used later to characterize subgradients in terms of polars of the cones of tangents. Lemma 3.1. ~ is a >~ gradient o f f at x 0 iff, for each sequence {xr~ } with x,~-+ 0, there exists a subsequence {x~'} and a sequence y~ -+ 0 such that f (x o -~ :%') >~ f(xo) ÷ (x~', ~) + I[ x~,' !iY~. Proof . Let ~ be a >/gradient o f f at Xo, that is, f(x0 + x) >>f(xo) + (x, g> + {{ x it O(x), where O(x) -> 0 as x ~ 0. Clearly, by taking x~' = x,~ and y~ = O(x**), the sequential characterization of the remark follows. Now, suppose that the sequential characterization holds and, by contradiction, assume that is not a ) gradient. In particular, 0 defined by O(x) = rain(0, [f(x o -{-x) - - f (xo) -- (x, ~)]/ti x I]) does not go to zero as x --+ 0. Hence, there is"a 3 ~> 0 and a sequence x,~ --* 0 such that [ f (x o + x,~) -- f(xo) -- (x,~, ~)]/tl xn ]i < --3. Now, consider a sequence x~' and a sequence y~ satisfying the sequential characterization stated above. We get y~ <~ [f(xo - /x , ( ) -- f(xo) - - (xn', g>}/tI x,( Ii < -a , 398 JOTA: VOL. 13, NO. 4, 1974 which violates the fact that y . - - * O. This shows that ~ is indeed a >~ gradient and the proof is complete. From the above definition, one can easily obtain lower and upper bounds for each component of a subgradient. Of course, when the partial derivatives exist, then there is only one subgradient, namely, the vector whose components are the partial derivatives. This result is given below. Theorem 3.1. Let f :R n~R be continuous at x o~R n, and y ~ R~ with 11Y 11 ~ 1. I f ~ is a ~ gradient o f f at Xo, then lim [f(x o + c~y) --f(xo)]/c~ <~ (y, ~) <~ lira [f(x o + o~y) --f(xo)]/o. a+O- c~-~O + On the other hand, if ~ is a ~ gradient o f f at x o , then [ f (x o + c~y) --f(xo)]/o~ ~ (y, ~) <~ lim [f(x o + c~y) --f(xo)]/a. a~0 + cz-~0- Proof . Let ~ be a >/grad ient o f f at x o , that is, f(x) -- f(xo) >/(x -- xo, ~) + [I x -- x o I! O(x - Xo), where O(t)-+ 0 as t--+ O. Let x = x o -!-e~y, where a > O. Therefore, [f(x 0 q- ay) --f(xo)l/o~ • (y, g> + O(o~y) for each ~ > 0. This shows that lim [f(x o + ~y) --f(xo)]/cx >/(y, ~). a->0 + Similarly, by considering x = x o + ~y, where a < 0, it follows that 1-~ [f(x o + off) -- f(xo)]/c~ ~ (y, ~) ~->0 - - and, hence, the result follows. The case when ~ is a ~ gradient can be similarly proved. Coro l la ry 3.1. Let e i ~ R~ be the vector with zero components except one at the ith position. Let ~ = (~1 ,.--, ~n) be a ~ gradient of f at x 0 . Then, [f(x o + ~e~) --f(xo)l/o, ~ g~ ~ lira [f(x o + ~e~) --f(xo)]/a c~->0-- a->0 + for i = 1, 2,..., n. JOTA: VOL. 13, NO. 4, 1974 399 On the other hand, if ~ is a ~< gradient o f f at Xo, then li--~ [f(x o 4- ~e~) --f(xo)]/o~ ~ ~ ~ lira [f(xo 4- ~e~) -- f(xo)]/~ a-~0 + a ->0- for i = 1, 2 ..... n. In particular, if the partial derivatives ~f(Xo)/Ox i exist, then, for any >/ or ~ gradient vector ~, we must have ~ = ?f(Xo)/ax~ , i = 1, 2,..., n. To illustrate, let us consider the following simple example. Let f (x , y) = ] x [ and (Xo, Yo) = (0, 0). We are interested in finding the set of subgradients o f f at (x o , Yo). First of all, we have to decide whether f has ~> or ~< gradient vectors. We will show later that, unless f is dig ferentiable at (Xo, Yo), f cannot have both ~ and ~ gradients at (xo, Y0). The reader can easily convince himself that f has ~> gradients at (x 0 , Yo). So let ~ = (~1, ~2) be a >~ gradient. By the above corollary, it is im- mediate that --1 ~ ~1 ~ 1 and ~2 ----- 0. In a sense, this says that f is differentiable in the y-direction but not in the x-direction. The above theorem gives a necessary condition for a vector ~ to be a ~ or a ~ gradient. Now we show that this condition is, in general, not sufficient. Let us consider the case of a ~> gradient. Suppose that there is a vector ~ such that lim [ f(x o 4- ay) --f(xo)]/~ ~ (y, ~) ~ 1_~_ f [ (x o + ay) --f(Xo)]/o~ o~-->0-- o~0 + for each y e R~ with l] Y II = 1. Since (y, ~) ~< li_~_m [f(x 0 4- o~y) --f(xo)]/% a->0 + then (y, g) ~ [f(x o 4- c~y) --f(Xo)]/e~ + O((e 0 for ~ > O, where Oy'(o 0 is a real-valued function with Ou'(o 0 --+ 0 as a -+ O. Therefore,
本文档为【切锥及其应用】,请使用软件OFFICE或WPS软件打开。作品中的文字与图均可以修改和编辑, 图片更改请在作品中右键图片并更换,文字修改请直接点击文字进行修改,也可以新增和删除文档中的内容。
该文档来自用户分享,如有侵权行为请发邮件ishare@vip.sina.com联系网站客服,我们会及时删除。
[版权声明] 本站所有资料为用户分享产生,若发现您的权利被侵害,请联系客服邮件isharekefu@iask.cn,我们尽快处理。
本作品所展示的图片、画像、字体、音乐的版权可能需版权方额外授权,请谨慎使用。
网站提供的党政主题相关内容(国旗、国徽、党徽..)目的在于配合国家政策宣传,仅限个人学习分享使用,禁止用于任何广告和商用目的。
下载需要: 免费 已有0 人下载
最新资料
资料动态
专题动态
is_648217
暂无简介~
格式:pdf
大小:1MB
软件:PDF阅读器
页数:38
分类:理学
上传时间:2014-01-12
浏览量:32