数值优化homework-4

数值优化homework-4 purdue university · cs 52000 computational methods in optimization HOMEWORK David F. Gleich March 12, 2014 Homework 4 by Ning Zhang: Acknowledge the help of Tao Wu and Zhiwei Zhang. Problem 1: Steepest descent (Nocedal and Wright, Exercise 3.6) Let’s con...

purdue university · cs 52000 computational methods in optimization HOMEWORK David F. Gleich March 12, 2014 Homework 4 by Ning Zhang: Acknowledge the help of Tao Wu and Zhiwei Zhang. Problem 1: Steepest descent (Nocedal and Wright, Exercise 3.6) Let’s conclude with a quick problem to show that steepest descent can converge very rapidly! Consider the steepest descent method with exact line search for the function f(x) = (1/2)xTQx� xTb. Suppose that we know x0 � x⇤ is parallel to an eigenvector of Q. Show that the method will converge in a single iteration. Answer: Since x0 � x⇤ is parallel to an eigenvector of Q, g(x0) = Qx0 � b = Qx0 �Qx⇤ +Qx⇤ � b = Q(x0 � x⇤) + g(x⇤) = �(x0 � x⇤) Then g(x0) Tg(x0) = � 2(x0 � x⇤)T (x0 � x⇤) g(x0) TQg(x0) = � 3(x0 � x⇤)T (x0 � x⇤) And so the equation 3.26 turns to be x1 = x0 � g(x0) Tg(x0) g(x0)TQg(x0) g(x0) = x0 � 1 � �(x0 � x⇤) = x⇤ End proof. Problem 2: Some optimizaiton Non-negative least squares is an important variant. Formally, it is: minimize kAx� bk2 subject to x � 0. We’ll see this problem again when we study constrained optimization. Here, we’ll investigate a log-barrier function to approximate it in an unconstrained manner: minimize kAx� bk2 � µeT log(x), where log is an elementwise function. Use the definition where: log(x) = �1 if x  0. 1. Determine the gradient and Hessian. Answer: g(x) = 2AT (Ax� b)� µ · column vectorn( 1xi ). H(x) = 2ATA+ µ · diag matrixn⇥n( 1x2i ). 2. Modify the gradient_descent_1.m and newtons_method_1.m functions to use a backgracking line search routine from Poblano. Answer: Just replace the x updating sentence with 1 [ x , ˜ , ˜ , ˜ , ˜ , n fev ] = pob l ano l i n e s e a r ch ( fg , x , f , g ,0 ,�g , params ) ; for the gradient descent method and with [ x , ˜ , ˜ , ˜ , ˜ , n fev ] = pob l ano l i n e s e a r ch ( fg , x , f , g ,0 ,�H\g , params ) ; for the Newton’s method. The parameters are set as default with params = ncg ( ’ d e f a u l t s ’ ) ; 3. Suppose that x(0) is strictly positive. Does x(k) stay strictly positive with these two algorithms? Discuss or prove. Answer: Yes, xk stay strictly positive. Since H(x) is positive definite for x > 0 so if there exist x⇤ > 0 st. g(x) = 0, we must have g(x) < 0 for 0 < x < x⇤ and g(x) > 0 for x > x⇤. So if 0 < xk < x⇤, xk+1 > xk > 0 holds. If 0 < x⇤ < xk, xk+1 > 0 will still hold otherwise the su�cient condition will be violated (log(x) = �1 if x  0, so f(xk+1) > f(xk)). Because x0 > 0, xk will stay positive by induction. 4. Show plots of the convergence in terms of function values and of infinity norms of the gradients of the methods for the matrices: A = [ 0.0372 0.2869 0.6861 0.7071 0.6233 0.6245 0.6344 0.6170]; b = [ 0.8587 0.1781 0.0747 0.8405]; mu = 0.1; Answer: The solution given by the two methods is ⇥ 0.2034 0.5663 ⇤ . 0 5 10 15 201 1.5 2 2.5 3 3.5 iteration fun va l Gradient descent gradient descent 0 5 10 15 200 1 2 3 4 iteration gra d i nf no rm Gradient descent gradient descent 0 1 2 3 4 5 61 1.5 2 2.5 3 3.5 iteration fun va l Newtons method newton method 0 1 2 3 4 5 60 1 2 3 4 iteration gra d i nf no rm Newtons method newton method The plots are as above and the script is as follows 2 x0 = [ 1 , 1 ] ’ ; [ x1 , f1 , g1 , h i s t1 , h i s t x1 ] = g r ad i en t d e s c en t 1 (@fgh , x0 ) ; [ x2 , f2 , g2 , h , h i s t2 , h i s t x2 ] = newtons method 1 (@fgh , x0 ) ; f igure ; subplot ( 2 , 2 , 1 ) , plot ( h i s t 1 ( 1 , : ) ) ; xlabel ( ’ i t e r a t i o n ’ ) ; ylabel ( ’ fun va l ’ ) ; legend ( ’ g rad i en t descent ’ ) ; t i t l e ( ’ Gradient descent ’ ) subplot ( 2 , 2 , 2 ) , plot ( h i s t 1 ( 2 , : ) ) ; xlabel ( ’ i t e r a t i o n ’ ) ; ylabel ( ’ grad i n f norm ’ ) ; legend ( ’ g rad i en t descent ’ ) ; t i t l e ( ’ Gradient descent ’ ) subplot ( 2 , 2 , 3 ) , plot ( h i s t 2 ( 1 , : ) ) ; xlabel ( ’ i t e r a t i o n ’ ) ; ylabel ( ’ fun va l ’ ) ; legend ( ’ newton method ’ ) ; t i t l e ( ’ Newtons method ’ ) subplot ( 2 , 2 , 4 ) , plot ( h i s t 2 ( 2 , : ) ) ; xlabel ( ’ i t e r a t i o n ’ ) ; ylabel ( ’ grad i n f norm ’ ) ; legend ( ’ newton method ’ ) ; t i t l e ( ’ Newtons method ’ ) function [ f , g , H] = fgh (x ) %func t i on f o r hw4 , prb2 .4 A = [ 0 . 0 3 7 2 , 0 . 2 8 6 9 ; 0 . 6 861 , 0 . 7 071 ; 0 . 6 233 , 0 . 6 245 ; 0 . 6 3 4 4 , 0 . 6 1 7 0 ] ; b = [ 0 . 8 5 8 7 , 0 . 1 7 8 1 , 0 . 0 7 4 7 , 0 . 8 4 0 5 ] ’ ; mu = 0 . 1 ; i f sum(x<=0) ˜= 0 f = Inf ; [ n , ˜ ] = s ize ( x ) ; g = NaN(n , 1) ; H = NaN(n , n ) ; else l = A⇤x�b ; f = ( l ’ ) ⇤ l � mu⇤sum( log ( x ) ) ; g = 2⇤(A’ ) ⇤ l � mu⇤ ( 1 . / x ) ; H = 2⇤(A’ ) ⇤A + mu⇤diag ( 1 . / ( x . ˆ 2 ) ) ; end return end 5. Suppose we say that a method converges if the infinity norm of the gradient is less than 10�4. Plot the number of function evaluations of your new steepest descent method and the Newton method for the values of µ = {10�1, 10�2, 10�3, 10�4, 10�5, 10�6}. Show your solutions for each of these cases. Compare the solutions to Matlab’s fminunc routine. Answer: The solutions is as follows, 6 columns correspond to di↵erent µs. Basically both of the two methods converge to the solutions of fminunc routine. xgrad = Columns 1 through 3 0.203401385491268 0.031356879716420 0.003302544299047 0.566276021849502 0.668855553977323 0.690319947024598 Columns 4 through 6 0.000331866443660 0.000088542751414 0.000018547582083 0.692642815622834 0.688367917395477 0.663899018138591 xnewtons = Columns 1 through 3 0.203364839208255 0.031345907965919 0.003300555473141 3 0.566308538239243 0.668885696192224 0.690355769488330 Columns 4 through 6 0.000331785428427 0.000033196339712 0.000003319771431 0.692674723090668 0.692908531414492 0.692931798383219 xfmin = Columns 1 through 3 0.203363682216457 0.031340098780883 0.003300040277398 0.566323772686660 0.668891184389050 0.690356255223426 Columns 4 through 6 0.000331024692903 0.000033195856239 0.000027181578248 0.692675440756985 0.692908411592569 0.638452502290494 The plot is as follows 1 1.5 2 2.5 3 3.5 4 4.5 5 5.5 60 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 x 10 4 mu Fu n E va l N um be r Gradient descent Newtons method And the code is as follows x0 = [ 1 , 1 ] ’ ; mus = [1 e�1 1e�2 1e�3 1e�4 1e�5 1e�6] ; global mu; nfev = zeros (2 , length (mus) ) ; xgrad = zeros (2 , length (mus) ) ; xnewtons = zeros (2 , length (mus) ) ; xfmin = zeros (2 , length (mus) ) ; for i = 1 : length (mus) mu = mus( i ) ; param . t o l = 1e�4; [ x1 , f1 , g1 , h i s t1 , h i s tx1 , f eva ln1 ] = g r ad i en t d e s c en t 1 (@fgh , x0 , param) ; [ x2 , f2 , g2 , ˜ , h i s t2 , h i s tx2 , f eva ln2 ] = newtons method 1 (@fgh , x0 , param) ; opt i ons = opt imset ( ’GradObj ’ , ’ on ’ , ’ Hess ian ’ , ’ on ’ ) ; [ x3 ] = fminunc (@fgh , x0 , opt ions ) ; n fev (1 , i ) = f eva ln1 ; nfev (2 , i ) = f eva ln2 ; xgrad ( : , i ) = x1 ; xnewtons ( : , i ) = x2 ; 4 xfmin ( : , i ) = x3 ; end figure ; plot ( 1 : length (mus) , nfev ( 1 , : ) , ’ r�.ˆ ’ ) ; hold on ; plot ( 1 : length (mus) , nfev ( 2 , : ) , ’ b : o ’ ) ; xlabel ( ’mu ’ ) ; ylabel ( ’Fun Eval Number ’ ) ; legend ( ’ Gradient descent ’ , ’ Newtons method ’ ) ; The mu in the objective functions is set as a global variable. And a cumulative counter of the number of function evaluations is added into gradient decent 1.m and newtons mehtod 1.m. Problem 3: Necessary and su�cient conditions Let f(x) = 12x TQx� xT c. 1. Write down the necessary conditions for the problem: minimize f(x) subject to x � 0 . Answer: Add the lagrange multiplier, 1 2 xTQx� xT c� �Tx Then the KKT condition is Qx� c� � = 0 s � 0,x � 0,�Tx = 0 And the KKT condition is the first order necessary condition. 2. Write down the su�cient conditions for the same problem. Answer: The KKT conditions plus Q � 0 are su�cient. 3. Consider the two-dimensional case with Q =  1 2 2 1 � c =  0 �1.5 � . Determine the solution to this problem by any means you can, and justify your work. Answer: I use the fmincon routine and the solution is (0, 0) with function value 0. The code is as follows [ x , fv ] = fmincon (@f , [ 1 , 1 ] ’ , [ ] , [ ] , [ ] , [ ] , [ 0 , 0 ] ’ ) function [ f v ] = f ( x ) %ob j e c t i v e func o f hw4 p3 Q = [ 1 , 2 ; 2 , 1 ] ; c = [ 0 ; �1 .5 ] ; fv = 0 . 5⇤ ( x ’ ) ⇤Q⇤x � (x ’ ) ⇤c ; end The output is as follows 5 x = 0 0 fv = 0 f lag = 1 output = i t e r a t i o n s : 2 funcCount : 6 l s s t e p l e n g t h : 1 s t e p s i z e : 0 a lgor i thm : ’medium�s c a l e : SQP, Quasi�Newton , l i n e�search ’ f i r s t o r d e r o p t : 0 c o n s t r v i o l a t i o n : 0 message : [ 1 x783 char ] lambda . lower : 0 .000000007450581 1.500000007450581 grad = 0.000000007450581 1.500000007450581 According to the output, the first order necessary condition or say the KKT condition is satisfied with � = (0, 1.5). 4. Produce a Matlab or hand illustration of the solution showing the function contours, gradient, the constraint normal. What are the active constraints at the solution? What is the value of � in AT� = g? Answer: The contour is as follows As shown in last subproblem, the gradient at (0, 0) is (0, 1.5). x1 x2 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 50 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 10 20 30 40 50 60 70 80 The active constraints are both x1 > 0, x2 > 0 and the constraint normal is 0. The � for the lower constraints is (0, 1.5). 6 Ning Zhang Ning Zhang Gradient

                    本文档为【数值优化homework-4】，请使用软件OFFICE或WPS软件打开。作品中的文字与图均可以修改和编辑，
                    图片更改请在作品中右键图片并更换，文字修改请直接点击文字进行修改，也可以新增和删除文档中的内容。 
 该文档来自用户分享，如有侵权行为请发邮件ishare@vip.sina.com联系网站客服，我们会及时删除。

                    [版权声明] 本站所有资料为用户分享产生，若发现您的权利被侵害，请联系客服邮件isharekefu@iask.cn，我们尽快处理。

                    本作品所展示的图片、画像、字体、音乐的版权可能需版权方额外授权，请谨慎使用。

                    网站提供的党政主题相关内容(国旗、国徽、党徽..)目的在于配合国家政策宣传，仅限个人学习分享使用，禁止用于任何广告和商用目的。
                

下载需要：免费已有0 人下载

立即下载

数值优化homework-4

你可能还喜欢