Super-Resolution
Image Reconstruction:
A Technical Overview
MAY 2003 IEEE SIGNAL PROCESSING MAGAZINE 21
1053-5888/03/$17.00©2003IEEE
In most electronic imaging applications, images with high resolution (HR) are desired and often required. HR means that pixel density within an image is high, and
therefore an HR image can offer more
details that may be critical in various ap-
plications. For example, HR medical images are very
helpful for a doctor to make a correct diagnosis. It may be
easy to distinguish an object from similar ones using HR
satellite images, and the performance of pattern recogni-
tion in computer vision can be improved if an HR image
is provided. Since the 1970s, charge-coupled device
(CCD) and CMOS image sensors have been widely used
to capture digital images. Although these sensors are suitable for most imaging applications, the current resolution level and consumer price will not satisfy future demand. For example, consumers want an inexpensive HR digital camera/camcorder, or at least to see its price gradually fall, and scientists often need a resolution level close to that of analog 35 mm film, which shows no visible artifacts when an image is magnified. Thus, a way to increase the current resolution level is needed.
The most direct solution to increase spatial resolution
is to reduce the pixel size (i.e., increase the number of pix-
els per unit area) by sensor manufacturing techniques. As
the pixel size decreases, however, the amount of light available to each pixel also decreases, and the resulting shot noise severely degrades the image quality. The pixel size therefore cannot be reduced indefinitely without suffering the effects of shot noise; the optimal lower limit is estimated at about 40 µm² for a 0.35 µm CMOS process. Current image sensor technology has almost reached this level.
Another approach for enhancing the spatial resolution
is to increase the chip size, which leads to an increase in ca-
pacitance [1]. Since large capacitance makes it difficult to
speed up a charge transfer rate, this approach is not con-
sidered effective. The high cost for high precision optics
and image sensors is also an important concern in many
commercial applications regarding HR imaging. There-
fore, a new approach toward increasing spatial resolution
is required to overcome these limitations of the sensors
and optics manufacturing technology.
Sung Cheol Park, Min Kyu Park, and Moon Gi Kang
©DIGITAL VISION, LTD.
One promising approach is to use signal processing techniques to obtain an HR image (or sequence) from multiple observed low-resolution (LR) images. Recently, such a resolution enhancement approach has been one of the most active research areas, and it is called super resolution (SR) (or HR) image reconstruction or simply resolution enhancement in the literature [1]-[61]. In this article, we use the term “SR image reconstruction” to refer to a signal processing approach toward resolution enhancement because the term “super” in “super resolution” represents very well the characteristics of the technique overcoming the inherent resolution limitation
of LR imaging systems. The major advantage of the sig-
nal processing approach is that it may cost less and the ex-
isting LR imaging systems can be still utilized. The SR
image reconstruction has proved useful in many practical cases where multiple frames of the same scene can be
obtained, including medical imaging, satellite imaging,
and video applications. One application is to reconstruct
a higher quality digital image from LR images obtained
with an inexpensive LR camera/camcorder for printing
or frame freeze purposes. Typically, with a camcorder, it is
also possible to display enlarged frames successively. Syn-
thetic zooming of region of interest (ROI) is another im-
portant application in surveillance, forensic, scientific,
medical, and satellite imaging. For surveillance or foren-
sic purposes, a digital video recorder (DVR) is currently
replacing the CCTV system, and it is often needed to
magnify objects in the scene such as the face of a criminal
or the licence plate of a car. The SR technique is also use-
ful in medical imaging such as computed tomography
(CT) and magnetic resonance imaging (MRI) since the
acquisition of multiple images is possible while the reso-
lution quality is limited. In satellite imaging applications
such as remote sensing and LANDSAT, several images of
the same area are usually provided, and the SR technique can be considered for improving the resolution of the target. Another application is conversion from an NTSC video signal to an HDTV signal, since there is a clear and present need to display an SDTV signal on an HDTV without visual artifacts.
How can we obtain an HR image from multiple LR
images? The basic premise for increasing the spatial reso-
lution in SR techniques is the availability of multiple LR
images captured from the same scene (see [4, chap. 4] for
details). In SR, typically, the LR images represent differ-
ent “looks” at the same scene. That is, LR images are
subsampled (aliased) as well as shifted with subpixel pre-
cision. If the LR images are shifted by integer units, then
each image contains the same information, and thus there
is no new information that can be used to reconstruct an
HR image. If the LR images have different subpixel shifts
from each other and if aliasing is present, however, then
each image cannot be obtained from the others. In this
case, the new information contained in each LR image
can be exploited to obtain an HR image. To obtain differ-
ent looks at the same scene, some relative scene motions
must exist from frame to frame via multiple scenes or
video sequences. Multiple scenes can be obtained from
one camera with several captures or from multiple cam-
eras located in different positions. These scene motions
can occur due to the controlled motions in imaging sys-
tems, e.g., images acquired from orbiting satellites. The
same is true of uncontrolled motions, e.g., movement of
local objects or vibrating imaging systems. If these scene
motions are known or can be estimated within subpixel
accuracy and if we combine these LR images, SR image
reconstruction is possible as illustrated in Figure 1.
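The role of subpixel shifts can be illustrated with a small one-dimensional sketch. It is an idealized toy case, not a method from the literature reviewed here: the signal, the down-sampling factor `L = 2`, the circular boundary, and the helper `observe` are all assumptions made for illustration, and blur and noise are ignored.

```python
import numpy as np

# Hypothetical 1-D "HR scene": 16 samples on the HR grid.
rng = np.random.default_rng(0)
hr = rng.random(16)

L = 2  # down-sampling factor (assumed)

def observe(hr_signal, shift_hr):
    """Shift by `shift_hr` HR samples (circularly), then keep every L-th sample."""
    return np.roll(hr_signal, -shift_hr)[::L]

ref = observe(hr, 0)             # reference LR frame
integer_shift = observe(hr, L)   # shifted by one full LR pixel
subpixel_shift = observe(hr, 1)  # shifted by half an LR pixel

# An integer LR shift only re-indexes the reference samples: no new information.
same_info = np.array_equal(integer_shift, np.roll(ref, -1))

# A half-pixel shift samples the HR positions that the reference frame skipped,
# so interleaving the two frames recovers the HR signal in this ideal case.
recovered = np.empty_like(hr)
recovered[0::L] = ref
recovered[1::L] = subpixel_shift
```

In this noise-free setting `same_info` is true and `recovered` matches `hr`. Real LR frames are also blurred and noisy, which is why the reconstruction methods reviewed later are needed.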
In the process of recording a digital image, there is a
natural loss of spatial resolution caused by the optical dis-
tortions (out of focus, diffraction limit, etc.), motion blur
due to limited shutter speed, noise that occurs within the
sensor or during transmission, and insufficient sensor
density as shown in Figure 2. Thus, the recorded image
usually suffers from blur, noise, and aliasing effects. Al-
though the main concern of an SR algorithm is to recon-
struct HR images from undersampled LR images, it
covers image restoration techniques that produce high
quality images from noisy, blurred images. Therefore, the
goal of SR techniques is to restore an HR image from sev-
eral degraded and aliased LR images.
A related problem to SR techniques is image restora-
tion, which is a well-established area
in image processing applications
[62]-[63]. The goal of image restora-
tion is to recover a degraded (e.g.,
blurred, noisy) image, but it does not
change the size of image. In fact, res-
toration and SR reconstruction are
closely related theoretically, and SR
reconstruction can be considered as a
second-generation problem of image
restoration.
Another problem related to SR re-
construction is image interpolation
that has been used to increase the size
of a single image. Although this field
has been extensively studied
[64]-[66], the quality of an image
magnified from an aliased LR image
is inherently limited even though the
ideal sinc basis function is employed.
Figure 1. Basic premise for super resolution. (Cameras observe the same scene from several positions, through repeated captures, or as a video sequence; relative to the reference LR image, frames may have integer pixel or subpixel shifts. If there exist subpixel shifts between LR images, SR reconstruction is possible.)
That is, single image interpolation
cannot recover the high-frequency
components lost or degraded during
the LR sampling process. For this
reason, image interpolation methods
are not considered as SR techniques.
To achieve further improvements in
this field, the next step requires the
utilization of multiple data sets in
which additional data constraints
from several observations of the same
scene can be used. The fusion of information from various observations of the same scene makes SR reconstruction of the scene possible.
The goal of this article is to introduce the concept of SR algorithms to readers who are unfamiliar with this area and to provide a review for experts. To this end, we present a technical review of various existing SR methodologies that are often employed. Before presenting the review of existing SR algorithms, we first model the LR image acquisition process.
Observation Model
The first step to comprehensively analyze the SR image
reconstruction problem is to formulate an observation
model that relates the original HR image to the observed
LR images. Several observation models have been pro-
posed in the literature, and they can be broadly divided
into models for still images and for video sequences. To
present a basic concept of SR reconstruction techniques,
we employ the observation model for still images in this
article, since it is rather straightforward to extend the still
image model to the video sequence model.
Consider the desired HR image of size $L_1N_1 \times L_2N_2$ written in lexicographical notation as the vector $\mathbf{x} = [x_1, x_2, \ldots, x_N]^T$, where $N = L_1N_1 \times L_2N_2$. Namely, $\mathbf{x}$ is the ideal undegraded image that is sampled at or above the Nyquist rate from a continuous scene which is assumed to be bandlimited. Now, let the parameters $L_1$ and $L_2$ represent the down-sampling factors in the observation model for the horizontal and vertical directions, respectively. Thus, each observed LR image is of size $N_1 \times N_2$. Let the $k$th LR image be denoted in lexicographic notation as $\mathbf{y}_k = [y_{k,1}, y_{k,2}, \ldots, y_{k,M}]^T$, for $k = 1, 2, \ldots, p$ and $M = N_1 \times N_2$. Now, it is assumed that $\mathbf{x}$
remains constant during the acquisition of the multiple
LR images, except for any motion and degradation al-
lowed by the model. Therefore, the observed LR images
result from warping, blurring, and subsampling opera-
tors performed on the HR image x. Assuming that each
LR image is corrupted by additive noise, we can then rep-
resent the observation model as [30], [48]
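$$\mathbf{y}_k = \mathbf{D}\mathbf{B}_k\mathbf{M}_k\mathbf{x} + \mathbf{n}_k \quad \text{for } 1 \le k \le p \tag{1}$$
where $\mathbf{M}_k$ is a warp matrix of size $L_1N_1L_2N_2 \times L_1N_1L_2N_2$, $\mathbf{B}_k$ represents an $L_1N_1L_2N_2 \times L_1N_1L_2N_2$ blur matrix, $\mathbf{D}$ is an $N_1N_2 \times L_1N_1L_2N_2$ subsampling matrix, and $\mathbf{n}_k$ represents a lexicographically ordered noise vector. A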
block diagram for the observation model is illustrated in
Figure 3.
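The observation model in (1) can be simulated directly on image arrays. The following is a minimal sketch under stated assumptions: the warp is a global translation by whole HR pixels with circular boundaries, the blur is only the LR sensor averaging, and the noise is white Gaussian. The function names (`warp`, `blur_and_subsample`, `observe`) and all parameter values are hypothetical choices for illustration.

```python
import numpy as np

def warp(x, dy, dx):
    """M_k: global translation by whole HR pixels (circular boundary, an assumption)."""
    return np.roll(x, shift=(dy, dx), axis=(0, 1))

def blur_and_subsample(x, L1, L2):
    """D B_k: average each L2 x L1 block of HR pixels into one LR pixel.

    L1 is the horizontal (column) factor and L2 the vertical (row) factor.
    """
    rows, cols = x.shape
    return x.reshape(rows // L2, L2, cols // L1, L1).mean(axis=(1, 3))

def observe(x, dy, dx, L1, L2, noise_std, rng):
    """y_k = D B_k M_k x + n_k for one LR frame."""
    lr = blur_and_subsample(warp(x, dy, dx), L1, L2)
    return lr + rng.normal(0.0, noise_std, size=lr.shape)

rng = np.random.default_rng(0)
x = rng.random((8, 8))                        # synthetic HR image, L1*N1 = L2*N2 = 8
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]     # shifts in HR pixels = half LR pixels
frames = [observe(x, dy, dx, 2, 2, 0.01, rng) for dy, dx in shifts]
```

Each entry of `frames` is a 4 x 4 LR image. Richer warps (rotation, local motion) or optical and motion blur would replace `warp` and `blur_and_subsample` without changing the overall structure.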
Let us consider the system matrix involved in (1). The
motion that occurs during the image acquisition is repre-
sented by warp matrix M k . It may contain global or local
translation, rotation, and so on.
Figure 2. Common image acquisition system. (Original scene → environment, OTA, CCD sensor, preprocessor → blurred, noisy, aliased LR image; degradations shown: optical distortion, aliasing, motion blur, noise.)
Figure 3. Observation model relating LR images to HR images. (Continuous scene → sampling, continuous to discrete without aliasing → desired HR image x → warping: translation, rotation, etc. → kth warped HR image x_k → blur: optical blur, motion blur, sensor PSF, etc. → down sampling by (L1, L2) → additive noise n_k → kth observed LR image y_k.)
Since this information is
generally unknown, we need to estimate the scene mo-
tion for each frame with reference to one particular frame.
The warping process performed on HR image x is actu-
ally defined in terms of LR pixel spacing when we esti-
mate it. Thus, this step requires interpolation when the
fractional unit of motion is not equal to the HR sensor
grid. An example for global translation is shown in Figure 4. Here, the circles represent the original (reference) HR image x, and the triangles and diamonds are globally shifted versions of x. If the down-sampling factor is two, the diamonds have a (0.5, 0.5) subpixel shift in the horizontal and vertical directions, measured in LR pixel units, and the triangles have a shift smaller than (0.5, 0.5). As shown in Figure 4, the diamonds do not need interpolation because they fall on the HR grid, but the triangles must be interpolated from x since they are not located on the HR grid. Although ideal interpolation could be used in theory, in practice simple methods such as zero-order hold or bilinear interpolation have been used in much of the literature.
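The distinction between shifts that land on the HR grid and shifts that require interpolation can be sketched as follows. This is an illustrative implementation of bilinear interpolation for a global translation, assuming circular boundaries; the function name `shift_bilinear` and the specific shift values (0.5 and 0.3 LR pixels) are assumptions chosen to mirror the two cases discussed above.

```python
import numpy as np

def shift_bilinear(x, dy, dx):
    """Return s with s[r, c] = x(r - dy, c - dx), using bilinear weights.

    dy, dx are shifts in HR pixels; boundaries wrap around (an assumption).
    """
    iy, ix = int(np.floor(dy)), int(np.floor(dx))
    fy, fx = dy - iy, dx - ix                          # fractional parts in [0, 1)
    base = np.roll(x, shift=(iy, ix), axis=(0, 1))     # integer part of the shift
    prev_row = np.roll(base, 1, axis=0)                # x one row further back
    prev_col = np.roll(base, 1, axis=1)                # x one column further back
    prev_both = np.roll(prev_row, 1, axis=1)
    return ((1 - fy) * (1 - fx) * base + fy * (1 - fx) * prev_row
            + (1 - fy) * fx * prev_col + fy * fx * prev_both)

L = 2                                                  # down-sampling factor
x = np.arange(36, dtype=float).reshape(6, 6)           # toy HR image, x[r, c] = 6r + c

# (0.5, 0.5) LR pixels = (1, 1) HR pixels: lands on the HR grid, no interpolation.
on_grid = shift_bilinear(x, 0.5 * L, 0.5 * L)

# (0.3, 0.3) LR pixels = (0.6, 0.6) HR pixels: falls between HR grid points.
off_grid = shift_bilinear(x, 0.3 * L, 0.3 * L)
```

For the on-grid case the fractional weights vanish and the result is a pure re-indexing of `x`; for the off-grid case each output pixel is a weighted mix of four HR neighbors. Zero-order hold would instead round the shift to the nearest HR pixel.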
Blurring may be caused by an optical system (e.g., out
of focus, diffraction limit, aberration, etc.), relative motion
between the imaging system and the original scene, and
the point spread function (PSF) of the LR sensor. It can be
modeled as linear space invariant (LSI) or linear space vari-
ant (LSV), and its effects on HR images are represented by
the matrix Bk . In single image restoration applications,
the optical or motion blur is usually considered. In the SR
image reconstruction, however, the finiteness of a physical
dimension in LR sensors is an important factor of blur.
This LR sensor PSF is usually modeled as a spatial averag-
ing operator as shown in Figure 5. In the use of SR recon-
struction methods, the characteristics of the blur are
assumed to be known. However, if it is difficult to obtain
this information, blur identification should be incorpo-
rated into the reconstruction procedure.
The subsampling matrix D generates aliased LR im-
ages from the warped and blurred HR image. Although
the size of LR images is the same here, in more general
cases, we can address the different size of LR images by
using a different subsampling matrix (e.g., Dk ). Al-
though the blurring acts more or less as an anti-aliasing
filter, in SR image reconstruction, it is assumed that
aliasing is always present in LR images.
A slightly different LR image acquisition model can be
derived by discretizing a continuous warped, blurred
scene [24]-[28]. In this case, the observation model must
include the fractional pixels at the border of the blur sup-
port. Although there are some different considerations
between this model and the one in (1), these models can
be unified in a simple matrix-vector form since the LR
pixels are defined as a weighted sum of the related HR
pixels with additive noise [18]. Therefore, we can express
these models without loss of generality as follows:
$$\mathbf{y}_k = \mathbf{W}_k\mathbf{x} + \mathbf{n}_k, \quad \text{for } k = 1, \ldots, p, \tag{2}$$
where matrix $\mathbf{W}_k$ of size $N_1N_2 \times L_1N_1L_2N_2$ represents, via blurring, motion, and subsampling, the contribution of HR pixels in $\mathbf{x}$ to the LR pixels in $\mathbf{y}_k$. Based on the observation model in (2), the aim of the SR image reconstruction is to estimate the HR image $\mathbf{x}$ from the LR images $\mathbf{y}_k$ for $k = 1, \ldots, p$.
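To make the matrix-vector form concrete, the system matrix for one frame can be assembled explicitly for a very small image. The sketch below assumes a noise-free model with a global integer HR-pixel shift, 2 x 2 sensor averaging, circular boundaries, and row-major (lexicographic) ordering; `forward` and `system_matrix` are hypothetical helper names, and building the matrix column by column is only practical at toy sizes.

```python
import numpy as np

def forward(x_img, dy, dx, L):
    """Noise-free warp, sensor averaging, and subsampling (assumed operator choice)."""
    warped = np.roll(x_img, shift=(dy, dx), axis=(0, 1))
    rows, cols = warped.shape
    return warped.reshape(rows // L, L, cols // L, L).mean(axis=(1, 3))

def system_matrix(hr_shape, dy, dx, L):
    """Build W_k column by column by pushing each unit HR image through the model."""
    n = hr_shape[0] * hr_shape[1]
    cols = []
    for j in range(n):
        e = np.zeros(n)
        e[j] = 1.0                                    # j-th lexicographic basis image
        cols.append(forward(e.reshape(hr_shape), dy, dx, L).ravel())
    return np.stack(cols, axis=1)

hr_shape, L = (4, 4), 2
W_k = system_matrix(hr_shape, dy=0, dx=1, L=L)        # 4 LR pixels x 16 HR pixels

rng = np.random.default_rng(1)
x_img = rng.random(hr_shape)
y_k = W_k @ x_img.ravel()                             # noise-free LR frame as a vector
```

Each row of `W_k` holds the weights with which the HR pixels contribute to one LR pixel; here every row contains four entries of 0.25, reflecting the averaging sensor PSF.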
Most of the SR image reconstruc-
tion methods proposed in the litera-
ture consist of the three stages
illustrated in Figure 6: registration,
interpolation, and restoration (i.e.,
inverse procedure). These steps can
be implemented separately or simul-
taneously according to the recon-
struction methods adopted. The
estimation of motion information is
referred to as registration, and it is ex-
tensively studied in various fields of
image processing [67]-[70].
Figure 4. The necessity of interpolation in HR sensor grid. (Legend: original HR grid, original HR pixels, shifted HR pixels.)
Figure 5. Low-resolution sensor PSF. (Each LR pixel is the average of the four HR pixels a0, a1, a2, a3 that it covers: LR pixel = (Σ ai)/4.)
In the
registration stage, the relative shifts between LR images
compared to the reference LR image are estimated with
fractional pixel accuracy. Obviously, accurate subpixel
motion estimation is a very important factor in the suc-
cess of the SR image reconstruction algorithm. Since the
shifts between LR images are arbitrary, the registered HR
image will not always match up to a uniformly spaced
HR grid. Thus, nonuniform interpolation is necessary to
obtain a uniformly spaced HR image from a
nonuniformly spaced composite of LR images. Finally,
image restoration is applied to the upsampled image to
remove blurring and noise.
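As an illustration of the registration stage, a global subpixel translation between two frames can be estimated from a first-order Taylor expansion solved by least squares. This is a generic gradient-based sketch, not the registration method of any particular work cited here; it assumes a small shift, a smooth image, and pure translation. The names `estimate_shift` and `blob` and the synthetic test images are hypothetical.

```python
import numpy as np

def estimate_shift(ref, img):
    """Estimate (dy, dx) such that img(r, c) ~= ref(r - dy, c - dx), for small shifts."""
    gy, gx = np.gradient(ref)                  # derivatives along rows and columns
    A = np.stack([gy.ravel(), gx.ravel()], axis=1)
    b = (ref - img).ravel()                    # first order: ref - img ~= dy*gy + dx*gx
    (dy, dx), *_ = np.linalg.lstsq(A, b, rcond=None)
    return dy, dx

def blob(shape, cy, cx, sigma=8.0):
    """Smooth synthetic test image: a Gaussian blob centered at (cy, cx)."""
    r, c = np.mgrid[0:shape[0], 0:shape[1]]
    return np.exp(-((r - cy) ** 2 + (c - cx) ** 2) / (2 * sigma ** 2))

ref = blob((64, 64), 32.0, 32.0)               # reference frame
img = blob((64, 64), 32.3, 31.8)               # true shift: dy = +0.3, dx = -0.2
dy_hat, dx_hat = estimate_shift(ref, img)
```

The estimate is only accurate when the shift is a fraction of a pixel and the image is smooth relative to that shift; larger motions are usually handled by iterating the estimate or working coarse to fine.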
The proposed works differ in the type of reconstruction method employed, the observation model assumed, the domain (spatial or frequency) in which the algorithm is applied, the methods used to capture the LR images, and so on. The technical report by Borman and Stevenson [2] provides a comprehensive overview of SR image reconstruction algorithms up to around 1998, and brief overviews of SR techniques appear in [3] and [4].
Based on the observation model in (2), existing SR
algorithms are reviewed in the following sections. We
first present a nonuniform interpolation approach that
conveys an intuitive comprehension of the SR image re-
construction. Then, we explain a frequency domain ap-
proach that is helpful to see how to exploit the aliasing
relationship between LR images. Next, we present de-
terministic and stochastic regularization approaches, the
projection onto convex sets (POCS) approach, as well as
other approaches. Finally, we discuss advanced issues to
improve the performance of the SR algorithm.
SR Image Reconstruction Algorithms
Nonuniform Interpolation Approach
This approach is the most intuitive method for SR im-
age reconstruction. The three stages presented in Figure
6 are performed successively in this approach: i) estima-
tion of relative motion, i.e., registration (if the motion
information is not known), ii) nonuniform interpola-
tion to produce an improved resolution image, and iii)
deblurring process (depending on the observation
model). The pictorial example is shown in Figure 7.
With the relative motion information estimated, the HR image on nonuniformly spaced sampling points is obtained. Then, a direct or iterative reconstruction procedure is applied to produce uniformly spaced sampling points [71]-[74].
Once an HR image is obtained by
nonuniform interpolation, we ad-
dress the restoration problem to re-
move blurring and noise. Restoration
can be performed by applying any
deconvolution method that considers the presence of
noise.
The reconstruction results of this approach appear in
Figure 8. In this simulation, four LR images are gener-
ated by a decimation factor of two in both the horizontal
and vertical directions from the $256 \times 256$ HR image. Only sensor blur is considered here, and 20-dB Gaussian noise is added to these LR images. In Figure 8, part (a) shows the image interpolated by the nearest neighbor method from one LR observation, and part (b) is
the image produced by bilinear interpolation; a
nonuniformly interpolated image from four LR images
appears in part (c), and a deblurred image using the
Wiener restoration filter from part (c) is shown in part
(d). As shown in Figure 8, significant improvement is ob-
served in parts (c) and (d) when viewed in comparison
with parts (a) and (b).
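A simplified version of this pipeline can be sketched in a few lines. It is not a reproduction of the simulation behind Figure 8: the HR image is a small synthetic array, the four shifts are assumed known and equal to exactly half an LR pixel, so the nonuniform interpolation step degenerates to interleaving samples onto the HR grid, boundaries are circular, and the Wiener constant `K` and noise level are hand-picked. All function names are hypothetical.

```python
import numpy as np

def lr_frame(x, a, b, L, noise_std, rng):
    """One LR observation: shift by (a, b) HR pixels, average L x L blocks, add noise."""
    shifted = np.roll(x, shift=(-a, -b), axis=(0, 1))   # circular boundary (assumption)
    rows, cols = shifted.shape
    lr = shifted.reshape(rows // L, L, cols // L, L).mean(axis=(1, 3))
    return lr + rng.normal(0.0, noise_std, size=lr.shape)

def interleave(frames, L):
    """Place each LR frame on the HR grid positions selected by its known shift."""
    n_rows, n_cols = next(iter(frames.values())).shape
    z = np.empty((L * n_rows, L * n_cols))
    for (a, b), lr in frames.items():
        z[a::L, b::L] = lr
    return z

def wiener_deblur(z, L, K):
    """Frequency-domain Wiener filter for the L x L averaging blur.

    K is a hand-picked noise-to-signal constant (an assumption).
    """
    psf = np.zeros_like(z)
    for u in range(L):
        for v in range(L):
            psf[-u, -v] = 1.0 / L**2    # kernel of z[r, c] = mean of x[r:r+L, c:c+L]
    Hf = np.fft.fft2(psf)
    Xf = np.conj(Hf) * np.fft.fft2(z) / (np.abs(Hf) ** 2 + K)
    return np.real(np.fft.ifft2(Xf))

rng = np.random.default_rng(0)
L = 2
x = rng.random((32, 32))                                # synthetic stand-in for the HR image
frames = {(a, b): lr_frame(x, a, b, L, 0.01, rng)
          for a in range(L) for b in range(L)}          # four LR frames
z = interleave(frames, L)                               # blurred, noisy HR-grid composite
x_hat = wiener_deblur(z, L, K=1e-2)                     # restoration stage

mse_composite = np.mean((z - x) ** 2)
mse_deblurred = np.mean((x_hat - x) ** 2)
```

Comparing `mse_composite` with `mse_deblurred` shows the effect of the restoration stage on this synthetic input. With arbitrary shifts, the interleaving step would be replaced by a genuine nonuniform interpolation onto the HR grid.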
Ur and Gross [5] performed a nonuniform interpola-
tion of an ensemble of spatially shifted LR images by uti-
lizing the generalized multichannel sampling theorem of
Papoulis [73] and Brown [74]. The interpolation is fol-
lowed by a deblurring process, and the relative shifts are
a