INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING, VOL. 19, 1593-1619 (1983)
A POSTERIORI ERROR ANALYSIS AND ADAPTIVE
PROCESSES IN THE FINITE ELEMENT METHOD:
PART I-ERROR ANALYSIS
D. W. KELLY, J. P. DE S. R. G A G 0 AND 0. C. ZIENKIEWICZ
Department of Civil Engineering, University College of Swansea, U.K.
I . BABUSKA
Institute for Physical Science and Technology, University of Maryland, U.S.A.
SUMMARY
This is a paper presented in two parts dealing respectively with error analysis and adaptive processes
applied to finite element calculations. Part I contains the basic theory and methods of deriving error
estimates for second-order problems. Part I1 of the paper deals with the strategy for adaptive refinement
and concentrates on the p-convergent methods. It is shown that an extremely high rate of convergence
is reached in practical problems using such procedures. Applications to realistic stress analysis and
potential problems are presented.
INTRODUCTION
This paper is concerned with the identification of the discretization error in the finite element
method and the definition of optimal refinement processes. Recently Babuska, Rheinboldt
and their co-workers'-'4 have developed a posteriori error estimates for the bilinear element
which are sufficiently robust to reliably indicate when the error is small. Closely related to
this work has been the development of adaptive schemes which automatically enrich the finite
element mesh on the basis of local error indicators to give convergence to near-optimal meshes
and re-solve until a prespecified accuracy is achieved. These mathematical processes lead to
the question of the local as well as global interpretation of the error and the definition of
optimality in the finite element mesh. We will address these questions in this two-part paper,
aiming in the first part to simplify and extend the derivation of the error estimators and assess
their reliability.
In part I1 of the paper17 we discuss the adaptive processes and the question of optimal
algorithms to achieve a given accuracy. An attempt will be made to identify both the strengths
and weaknesses of the approaches which are currently available, whether the aim is to achieve
a process for assessing the error in a given solution or the development of fully automated
procedures in which the accuracy required is prespecified in the data.
A posteriori error analysis
One of the most difficult aspects of numerical modelling is the validation of the results.
Benchmark problems provide an invaluable step in this process by ensuring that, at least for
0029-598l/83/111593-27$02.70
@ 1983 by John Wiley & Sons, Ltd.
Received June 1982
Revised September 1982
1594 D. W. KELLY ET AL
a certain class of problems, the natural and physical laws represented in the analysis for
reasonable discretization of the domain give acceptable results. The difficulty, however, is to
ensure that a given discretization is adequate for the particular problem at hand, which may
not necessarily be closely related to the benchmark problem.
The classical approaches to error analysis are computationally expensive. It is possible by
re-analysis on a sequence of meshes of increasing refinement to determine the convergence
rate of the method and predict the exact solution by Richardson’s extrapolation, thereby
obtaining a measure against which to judge the accuracy of the solution obtained.
Alternatively, for self-adjoint problems and methods in which the process is essentially one
of energy minimization, complementary or dual analysis procedures allow a theoretical bound
on the energy norm of the exact solution to be obtained (see, for example, the work of Fraeijs
de Veubecke”). This approach requires, however, a second solution with a different computer
program for the complementary formulation, a solution notoriously difficult in implementation.
Both classical approaches, requiring multiple analyses, would therefore appear to be too
expensive for extensive practical implementation.
In recent years, the existence and efficiency of a method of asymptotic a posteriori error
analysis for the finite element method has been studied and proved for linear elements by
Babuska and his co-workers. ‘-14 The error estimates are based on information obtained during
the solution process itself. In this respect, they are different from the theoretical a priori error
estimates which predict the rate of convergence but say little about the error in a particular
solution on a finite mesh. Being a posteriori in nature, they also.offer an inexpensive alternative
to the classical approaches mentioned above.
One of the main features of these a posteriori error estimates is that they involve local,
rather than global, computations. They are given in an asymptotic form which guarantees
accuracy when linked to adaptive refinement algorithms as h + O , where h is a measure of
the size of the elements: but experience has shown that they can provide information about
the errors on a finite grid. The work for linear, self-adjoint boundary value problems is now
well founded and extensions to other classes of problems are being made.
A difficulty, however, is that these error estimates are historically related to adaptive mesh
schemes so that the asymptotic character of the error estimates is accounted for. Also the
mathematical character of the proofs so far existent make it difficult to interpret their
shortcomings, a necessary prerequisite for their adoption by the engineering community. We
therefore attempt an alternative analysis which will derive the error estimates proposed in
References 1-14 for one- and two-dimensional problems, utilizing a minimum of the complex
mathematics on which the previous proofs depended.
The estimates have only been formally derived for linear elements and the h- convergence
process in which the mesh is subdivided with the polynomial order of the elements held
constant. The analysis here will offer a simple extension to suggest an error estimator for
higher order elements and for the p-version of the finite element method in which the order
of the elements is increased on a fixed mesh. We then approach the question of defining
reliable estimates for coarse meshes and estimating the local accuracy of stresses.
The work, however, would not be complete if procedures were not available for indicating
how the error, now identified, can be reduced to an acceptable level. We therefore include
the notion of error indication as well as error estimation, the former identifying how the mesh
can be adjusted in a local and near-optimal manner.
Part I1 of this paper” will then address the question of optimality of the finite element
mesh and suggest procedures for fully automatic mesh adaptation and re-solution (see also
References 16, 18 and 28).
A POSTERIORI ERROR ANALYSIS-PART I 1595
DISCRETIZATION ERROR IN A FINITE ELEMENT SOLUTION
Identification of the error
Consider the approximate solution of a model problem
- Q T a V u + b u + f = O o n R
where
with boundary conditions
where r is the boundary of R. We will assume for simplicity that r D Z 0, so that the solution
is unique.
By HD we denote the set of all functions which have square integrable first derivatives and
are zero on r D .
We assume now that the solution of (1) and (2) belongs to HD. Multiplying (1) by arbitrary
weighting functions u E HD, and integrating by parts, we see that the equation
holds for any v E HD.
Now we take (3) as the definition of (1) and (2) and see that we can now deal with functions
which have discontinuous derivatives (Co continuity), because the order of differentiation has
been reduced by the integration by parts.
The finite element method is as usual based on using the approximate form
We determine the parameters Gi from (3) with v = Ni, j = 1, . . . , M, and Ni, i = 1, . . . , M, being
the finite element basis functions.
This will lead to the well-known finite element approximation equations for the determina-
tion of parameters ii:
= - J n f N i d ~ + J N q d F j = I , . . . , M
r
1596 D. W. KELLY E T A L
The finite element basis functions Ni are assumed smooth inside the elements but have
discontinuous derivatives on the element interfaces.
Further, we note that i is an exact solution of the original problem with perturbed f and
q. We show this by integrating ( 5 ) by parts with Nj = v which gives
Here we denote by [ J ( a au'/an)Ir, the jump of normal derivatives on the element boundaries
not contained in r. Ri is the element domain.
From (6) we immediately see that we have in R
-VTaVii + bii + f + p = r (7)
and on rN
a i
an
a - = q - [
where on f i i
r i = -VTaVii+bii+f
and, on the interfaces, p is the singular function such that on r k
with S k being a Dirac function concentrated on r k .
Finally on rN we get
a i
,$=q-a-
an
(9)
In physical interpretation ri is equivalent to a distributed 'load', p is the 'load' concentrated
Coming back to ( 5 ) we see now that
on the interfaces of the elements and 5 is a boundary 'traction'.
is a measure of the global error self-equilibration. Also, we note by subtracting (7) from (1)
that denoting the error in u, e = u - 6, e satisfies the original boundary value problem with
r + p and 5 instead of f and q.
The error norm
We saw that the error e satisfies the original boundary value problem with appropriate
residuals replacing the originally prescribed 'body' forces f and boundary traction
conditions q.
A POSTERIORI ERROR ANALYSIS-PART I 1597
Now the question arises how to measure the magnitude of the error e. We have various
possibilities, for example to measure
or
The most natural norm for this kind of problem is the energy norm. In this we simply use
the form of equation (3) and substitute u = z, = e. Thus for e c HD
Ilelli= In (a(Vre V e ) + b e z ) dR (15)
In this case we get e = 0 when 11e11; = 0.
can be written in terms of the residuals determined in (9)-( 11):
The energy norm has many useful properties which we need not discuss here. Obviously it
In this paper we will concentrate only on the estimates in this energy norm.
ERROR INDICATION AND ERROR ESTIMATION IN ONE DIMENSION
The model problem and its properties
Consider now the one-dimensional model problem
with boundary conditions
u(0) = u(L) = 0
Assume that a, 6, f, are sufficiently smooth and that
0 < g d a ( x ) G d
I ChPllellE (22)
where x, is the nodal point and h is the maximal element dimension.
This ‘superconvergence’ effect has been identified in the engineering literature,26 where it
is shown that exact nodal results are given when the shape functions are solutions of the
homogeneous part of equation (17). Equation (22) is more general and allows the error at
nodes to be disregarded. The proof is given in Appendix I.
The error indication in one dimension
Let us consider the special case when e ( x i ) = 0. We know that e ( x ) satisfies the second-order
differential equation (17) with a ( x ) = constant and b ( x ) = 0. If we assume e ( x i ) = 0 then the
singular term p loses its significance and we can solve the differential equation for e, element
by element only.
First we elaborate on a family of finite elements which are especially advantageous for
our purpose. It is possible to generate a family which is shown in Figure 1. Only for the linear
element are the nodal variables the nodal values of u. For quadratic and higher nodes the
modes are hierarchical functions whose amplitudes are prescribed by nodeless variables (see
Zienkiewicz et al.*’).
Advantageously we select for N2, N3, etc., the integrals of Legendre polynomials so that
for on the interval [ - 1, 13 we have, for j 3 2,
d 1 d“
dx 2 n ! dx - N j = - - - ( [ -1)
and
Ni( - 1) = Nj(1) = 0 (24)
Figure 1. One-dimensional hierarchical elements for the p-version of the finite element method
A POSTERIORI ERROR ANALYSIS-PART I 1599
p 1 krll -
1,l 1.2 0 . 0 0 0
k['l k['l k['1
0 k[ll k['l
2.1 2.2 2.3 . 0 0 0
3.2 3.3 9 0 0 0
kr'l 0 0 0 . n,n 0 0
0 0 0 . 0 k\:! 0
0 0 0 . 0 0 kL2' 2.2 -
Due to the orthogonality of Legendre polynomials
This choice leads to a stiffness matrix which has a special structure when a = const, b = 0. Due
to orthogonality of the derivatives of N2, etc., the higher terms are diagonal as shown below:
a3
e = 1 ciNj
j = p + l
where ci are unknown parameters and p the order of the hierarchic modes in the finite element
solution. From equation (27) we observe that
noting that
riNj dx = 0 for j = 2, . . . , p I,
because of the particular structure of the stiffness matrix.
term in the series is majorant. We can now take from (30)
This series is a general estimate of error, but when f is smooth we can assume that the first
and get
1600 D. W. KELLY ET AL
The total energy error based on these arguments is
This error indicator was presented in References 19 and 25. We can give to this result the
following interpretation: If we solve the problem twice with polynomials of degree p and p + 1
and denote Gcp’ and G[p+l’ as these two solutions, we get
(35) = ;Ip+11 - G I P l
Let us now consider the general case. We could once more compute together with ClP’ the
as before. But this would be costly. So, we will solve 3[p+11 in an iterative way solution
(of Gauss-Seidel type) and using only one iterative step, starting with Z I P ’ .
Then we get on I,
where
as before. Using here one term only we get exactly the same result as before, namely
Let us now summarize where we did the approximations preventing us from getting a
guaranteed upper estimate.
1. We assumed that the error in the nodal points is zero, i.e. e(xi) = 0, which is not true in
general although a superconvergence effect is present.
2. We assumed that the first term in the expansion is majorant.
3. We assumed that one iterative step in the general case is sufficient.
First let us address assumption 2. The importance of this approximation is immediately
apparent. If the residual ri happens to be orthogonal to Np+l we get qi = 0. This shows that
the estimate could be deceptive.
The effects of the approximations 1 and 3 are not so obvious, but as we will see later these
approximations are not so important. Experiment will show that the qi give excellent indications
of the energy of the error which will be absorbed by the addition of the corresponding hierarchic
mode to the finite element basis. We will therefore call the qi error indicators and make
extensive use of them in the adaptive procedures in Part I1 of the paper.
We can however elaborate on formula (38), by considering properties of the finite element
approximation. Suppose that we are considering linear elements only, i.e. p = 1, and reducing
the size of the elements h + 0. We know that in this case the residual r in each element will
become approximately constant and that all the energy error should be, in the limit correspond-
ing to quadratic modes of the form
N*=x(x - h i ) (39)
A POSTERIORI ERROR ANALYSIS-PART I 1601
on the interval [0, hi ] , hi = xi+l -xi. Formula (38) will then become
We can thus implement an estimator
h?rf
12
-- -
which will be assymptotically correct as hi + 0.
The error estimator in one dimension
We will address now in more detail the problem associated with the identification of an
We know that on l i [ x i , xi+J the error e satisfies the differential equation
error estimator.
with
Le = r
e(xi) = e ( x i + l ) = 0
(43)
(44)
with ri known and with a priori knowledge that
following from the finite element approach.
We would like to estimate [lellE(r,l in the form
Recalling the well-known property of the Rayleigh equation for establishing eigenvalues we
have (see Appendix 11)
where hmi, is the smallest eigenvalue of the problem
implying that
We can now consider the particular case of a (x) = a and b (x) = 0. Then
d' u Lu = - a 7
dx
1602 D. W. KELLY ET AL.
and we have to consider the eigenvalue problem
d2u
dx2 -
-a---Au o n &
In this case it is obvious that the eigenfunction is sin n/hi (x - x i ) with hi = x i + l - x i and
hmin = (.rr2/h’)a
We have thus
(49)
For the general case when a, b are smooth a, b, are nearly constant on I, and it is easy to
(51)
where amin is the minimum of a on 1;.
This estimate is correct for any degree of elements. Of course Ci is pessimistic in general
because we did not use (45).
For j = 2 we get easily that
show that
2 A m i n z r /h?a,i,
and for j = 3
So approximately we can take
h? c. =-
‘ r2ap2 (54)
where p is the degree of polynomial solution used.
In the general case we have to deal with eigenfunctions and eigenvalues for the complete
operator. So far we have analysed the worst possible case and it is easy to see that for the
general operator the same bounds apply (see Appendix 11).
For the linear elements and f smooth, we have seen that since the residual is roughly
constant we can replace r by 12, and the upper bound becomes approximately an equality,
exact in the limit h --* 0.
This approach deals not only with the second but also the third approximation when estimates
for Ci based on the eigenvalue problems are used.
We have to address now, only approximation 1 of the previous section. It has been proved
that
Ie(xi)I ChPllellE ( 5 5 )
and we estimated the error assuming the error at the nodal points is zero. So denoting by the
piecewise linear function which coincides with e ( x i ) , we neglected the error \ I + I I E . It is shown
A POSTERIORI ERROR ANALYSIS-PART I 1603
in Appendix I that
where C is independent of the solution, depending only on a and b.
We see that our approximation is of magnitude ChPllellE decreasing both with the use of
higher order polynomials and finer meshes.
Examples show that this term can be safely neglected in practice when compared with IlellE.
Summarizing, we therefore implement
as our error estimator for one-dimensional problems and low order of p .
Examples in one dimension
The error estimates derived above have been applied to the problem
+bu = x " o n x E[O, 11
d2u _-
dx2
with boundary conditions u (0) = u (1) = 0.
The results given in Tables I and I1 are for b = 0. The finite element solutions are exact at
the nodes, and therefore (in this special case) the hierarchic projections, if taken to the order
of the polynomial of the exact solution and summed, determine exactly the energy of the
error. In the tables,
where E E: is the estimate (57).
The results given in Tables I11 and IV are for b = 1.
ERROR INDICATION AND ESTIMATION IN TWO DIMENSIONS
In two dimensions the hierarchical functions have their support on either one or two elements,
as shown in Figure 2 . Again we can use the hierarchical modes to make a projection as in
Table I. Linear elements
No. of
elements n (z E ')l" Ile IIE B
2 0
2
4
4 0
2
4
8 0
2
4
0.144E 0
0.6468 - 1
0.481E - 1
0.7226 0
0.323E - 1
0.241E - 1
0.361E - 1
0.161E - 1
0.120E - 1
0.144E 0
0.601E - 1
0.385E - 1
0.7228 - 1
0.311E - 1
0.227E -