非线性最小二乘法Levenberg-Marquardtmethod
Levenberg-Marquardt Method(麦夸尔特法)
Levenberg-Marquardt is a popular alternative to the Gauss-Newton method of finding the minimum of a function that is a sum of squares of nonlinear functions,
Let the Jacobian of be denoted , then the Levenberg-Marquardt method searches in the direction given by the solution to the equations
where are nonnegative scalars and is the identity matrix. The method has the nice property that, for some scalar related to , the vector is the solution of the constrained subproblem of minimizing
subject to (Gill et al. 1981, p. 136).
The method is used by the command FindMinimum[f, x, x0] when given the Method ->
Levenberg Marquardt option.
SEE ALSO: Minimum, Optimization
REFERENCES:
Bates, D. M. and Watts, D. G. Nonlinear Regression and Its Applications. New York: Wiley, 1988.
Gill, P. R.; Murray, W.; and Wright, M. H. "The Levenberg-Marquardt Method." ?4.7.3 in Practical Optimization. London:
Academic Press, pp. 136-137, 1981.
Levenberg, K. "A Method for the Solution of Certain Problems in Least Squares." Quart. Appl. Math. 2, 164-168, 1944.
Marquardt, D. "An Algorithm for Least-Squares Estimation of Nonlinear Parameters." SIAM J. Appl. Math. 11, 431-441, 1963.
Levenberg–Marquardt algorithm
From Wikipedia, the free encyclopedia
Jump to: navigation, search
[1]In mathematics and computing, the Levenberg–Marquardt algorithm (LMA)
provides a numerical solution to the problem of minimizing a function, generally nonlinear, over a space of parameters of the function. These minimization problems arise especially in least squares curve fitting and
nonlinear programming.
The LMA interpolates between the Gauss–Newton algorithm (GNA) and the
method of gradient descent. The LMA is more robust than the GNA, which
means that in many cases it finds a solution even if it starts very far off the final minimum. For well-behaved functions and reasonable starting
the LMA tends to be a bit slower than the GNA. LMA can also parameters,
be viewed as Gauss–Newton using a trust region approach.
The LMA is a very popular curve-fitting algorithm used in many software applications for solving generic curve-fitting problems. However, the LMA finds only a local minimum, not a global minimum.
Contents
[hide]
, 1 Caveat Emptor
, 2 The problem
, 3 The solution
o 3.1 Choice of damping parameter
, 4 Example
, 5 Notes
, 6 See also
, 7 References
, 8 External links
o 8.1 Descriptions
o 8.2 Implementations
[edit] Caveat Emptor
One important limitation that is very often over-looked is that it only optimises for residual errors in the dependant variable (y). It thereby implicitly assumes that any errors in the independent variable are zero or at least ratio of the two is so small as to be negligible. This is not a defect, it is intentional, but it must be taken into account when deciding whether to use this technique to do a fit. While this may be suitable in context of a controlled experiment there are many situations where this assumption cannot be made. In such situations either non-least squares methods should be used or the least-squares fit should be done in proportion to the relative errors in the two variables, not simply the vertical "y" error. Failing to recognise this can lead to a fit which is significantly incorrect and fundamentally wrong. It will usually underestimate the slope. This may or may not be obvious to the eye. MicroSoft Excel's chart offers a trend fit that has this limitation that is undocumented. Users often fall into this trap assuming the fit is correctly calculated for all situations. OpenOffice spreadsheet copied this feature and presents the same problem.
[edit] The problem
The primary application of the Levenberg–Marquardt algorithm is in the
least squares curve fitting problem: given a set of m empirical datum pairs of independent and dependent variables, (x, y), optimize the parameters ii
β of the model curve f(x,β) so that the sum of the squares of the
deviations
becomes minimal.
[edit] The solution
Like other numeric minimization algorithms, the Levenberg–Marquardt
algorithm is an iterative procedure. To start a minimization, the user
has to provide an initial guess for the parameter vector, β. In many Tcases, an uninformed standard guess like β=(1,1,...,1) will work fine;
in other cases, the algorithm converges only if the initial guess is
already somewhat close to the final solution.
In each iteration step, the parameter vector, β, is replaced by a new estimate, β + δ. To determine δ, the functions are approximated by their linearizations
where
is the gradient (row-vector in this case) of f with respect to
β.
At its minimum, the sum of squares, S(β), the gradient of S with
respect to δ will be zero. The above first-order approximation
of gives
.
Or in vector notation,
.
Taking the derivative with respect to δ and setting the
result to zero gives:
thwhere is the Jacobian matrix whose i row equals J, i
thand where and are vectors with i component
and y, respectively. This is a set of linear i
equations which can be solved for δ.
Levenberg's contribution is to replace this equation
by a "damped version",
where I is the identity matrix, giving as the increment, δ, to the estimated parameter vector, β.
The (non-negative) damping factor, λ, is
adjusted at each iteration. If reduction of S is
rapid, a smaller value can be used, bringing the algorithm closer to the Gauss–Newton algorithm,
whereas if an iteration gives insufficient reduction in the residual, λ can be increased,
giving a step closer to the gradient descent direction. Note that the gradient of S with
respect to β equals .
Therefore, for large values of λ, the step will
be taken approximately in the direction of the gradient. If either the length of the calculated step, δ, or the reduction of sum of squares from the latest parameter vector, β + δ, fall below
predefined limits, iteration stops and the last parameter vector, β, is considered to be the
solution.
Levenberg's algorithm has the disadvantage that if the value of damping factor, λ, is large, Tinverting JJ + λI is not used at all. Marquardt provided the insight that we can scale each component of the gradient according to the curvature so that there is larger movement along the directions where the gradient is smaller. This avoids slow convergence in the direction of small gradient. Therefore, Marquardt replaced the identity matrix, I, with the diagonal matrix Tconsisting of the diagonal elements of JJ,
resulting in the Levenberg–Marquardt algorithm:
.
A similar damping factor appears in Tikhonov
regularization, which is used to solve linear ill-posed problems, as well as in ridge
regression, an estimation technique in
statistics.
[edit] Choice of damping parameter
Various more-or-less heuristic arguments have been put forward for the best choice for the damping parameter λ. Theoretical arguments
exist showing why some of these choices guaranteed local convergence of the algorithm; however these choices can make the global convergence of the algorithm suffer from the undesirable properties of steepest-descent,
in particular very slow convergence close to the optimum.
The absolute values of any choice depends on how well-scaled the initial problem is. Marquardt recommended starting with a value λ 0
and a factor ν>1. Initially setting λ=λ and 0
computing the residual sum of squares S(β)
after one step from the starting point with the damping factor of λ=λ and secondly with 0
λ/ν. If both of these are worse than the 0
initial point then the damping is increased by successive multiplication by ν until a better
point is found with a new damping factor of kλν for some k. 0
If use of the damping factor λ/ν results in
a reduction in squared residual then this is taken as the new value of λ (and the new
optimum location is taken as that obtained with this damping factor) and the process continues; if using λ/ν resulted in a worse residual, but using λ resulted in a better residual then λ is left unchanged and the new optimum is taken as the value obtained with λ as damping
factor.
[edit] Example
Poor Fit
Better Fit
Best Fit
In this example we try to fit the function y
= acos(bX) + bsin(aX) using the
Levenberg–Marquardt algorithm implemented in GNU Octave as the leasqr function. The 3 graphs
Fig 1,2,3 show progressively better fitting for the parameters a=100, b=102 used in the
initial curve. Only when the parameters in Fig 3 are chosen closest to the original, are the
curves fitting exactly. This equation is an example of very sensitive initial conditions for the Levenberg–Marquardt algorithm. One reason for this sensitivity is the existence of multiple minima — the function cos(βx) has
minima at parameter value and
[edit] Notes
1. ^ The algorithm was first published by
Kenneth Levenberg, while working at the
Frankford Army Arsenal. It was rediscovered
by Donald Marquardt who worked as a
statistician at DuPont and independently by
Girard, Wynn and Morrison.
[edit] See also
, Trust region
[edit] References
, Kenneth Levenberg (1944). "A Method for the
Solution of Certain Non-Linear Problems in
Least Squares". The Quarterly of Applied
2: 164–168. Mathematics
, A. Girard (1958). Rev. Opt 37: 225, 397.
, C.G. Wynne (1959). "Lens Designing by
Electronic Digital Computer: I". Proc.
73 (5): 777. Phys. Soc. London
doi:10.1088/0370-1328/73/5/310.
, Jorje J. Moré and Daniel C. Sorensen (1983).
"Computing a Trust-Region Step". SIAM J.
(4): 553–572. Sci. Stat. Comput.
, D.D. Morrison (1960). Jet Propulsion
. Laboratory Seminar proceedings, Donald Marquardt (1963). "An Algorithm for
Least-Squares Estimation of Nonlinear
Parameters". SIAM Journal on Applied
Mathematics 11 (2): 431–441.
doi:10.1137/0111030.
, Philip E. Gill and Walter Murray (1978).
"Algorithms for the solution of the
nonlinear least-squares problem". SIAM
15 (5): Journal on Numerical Analysis
977–992. doi:10.1137/0715063.
, Nocedal, Jorge; Wright, Stephen J. (2006).
. Numerical Optimization, 2nd Edition
Springer. ISBN 0-387-30303-0.
[edit] External links
[edit] Descriptions
, Detailed description of the algorithm can be
Numerical Recipes in C, Chapter found in
15.5: Nonlinear models
, C. T. Kelley, Iterative Methods for
, SIAM Frontiers in Applied Optimization
Mathematics, no 18, 1999, ISBN
0-89871-433-8. Online copy
, History of the algorithm in SIAM news
, A tutorial by Ananth Ranganathan
, Methods for Non-Linear Least Squares
Problems by K. Madsen, H.B. Nielsen, O.
Tingleff is a tutorial discussing
non-linear least-squares in general and
the Levenberg-Marquardt method in
particular
, T. Strutz: Data Fitting and Uncertainty (A
practical introduction to weighted least
. Vieweg+Teubner, ISBN squares and beyond)
978-3-8348-1022-9.
[edit] Implementations
, Levenberg-Marquardt is a built-in algorithm
with Mathematica
, Levenberg-Marquardt is a built-in algorithm
with Matlab
, The oldest implementation still in use is lmdif, from MINPACK, in Fortran, in the
public domain. See also:
o lmfit, a translation of lmdif into C/C++
with an easy-to-use wrapper for curve
fitting, public domain.
o The GNU Scientific Library library has
a C interface to MINPACK.
o C/C++ Minpack includes the
Levenberg–Marquardt algorithm.
o Several high-level languages and
mathematical packages have wrappers for
the MINPACK routines, among them:
, Python library scipy, module
scipy.optimize.leastsq,
, IDL, add-on MPFIT.
, R (programming language) has the
minpack.lm package.
, levmar is an implementation in C/C++ with
support for constraints, distributed under the GNU General Public License.
o levmar includes a MEX file interface for
MATLAB
o Perl (PDL), python and Haskell
interfaces to levmar are available: see
PDL::Fit::Levmar, PyLevmar and
HackageDB levmar.
, sparseLM is a C implementation aimed at minimizing functions with large, arbitrarily sparse Jacobians. Includes a MATLAB MEX interface.
, ALGLIB has implementations of improved LMA in C# / C++ / Delphi / Visual Basic. Improved algorithm takes less time to converge and can use either Jacobian or exact Hessian.
, NMath has an implementation for the .NET
Framework.
, gnuplot uses its own implementation gnuplot.info.
, Java programming language implementations:
1) Javanumerics, 2) LMA-package (a small,
user friendly and well documented
implementation with examples and support),
3) Apache Commons Math
, OOoConv implements the L-M algorithm as an
OpenOffice.org Calc spreadsheet.
, SAS, there are multiple ways to access SAS's
implementation of the Levenberg-Marquardt
algorithm: it can be accessed via NLPLM
Call in PROC IML and it can also be accessed
through the LSQ statement in PROC NLP, and
the METHOD=MARQUARDT option in PROC NLIN.
本文档为【非线性最小二乘法Levenberg-Marquardtmethod】,请使用软件OFFICE或WPS软件打开。作品中的文字与图均可以修改和编辑,
图片更改请在作品中右键图片并更换,文字修改请直接点击文字进行修改,也可以新增和删除文档中的内容。
该文档来自用户分享,如有侵权行为请发邮件ishare@vip.sina.com联系网站客服,我们会及时删除。
[版权声明] 本站所有资料为用户分享产生,若发现您的权利被侵害,请联系客服邮件isharekefu@iask.cn,我们尽快处理。
本作品所展示的图片、画像、字体、音乐的版权可能需版权方额外授权,请谨慎使用。
网站提供的党政主题相关内容(国旗、国徽、党徽..)目的在于配合国家政策宣传,仅限个人学习分享使用,禁止用于任何广告和商用目的。