5842 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 59, NO. 12, DECEMBER 2011
Statistical Compressed Sensing of
Gaussian Mixture Models
Guoshen Yu and Guillermo Sapiro
Abstract—A novel framework of compressed sensing, namely
statistical compressed sensing (SCS), that aims at efficiently sam-
pling a collection of signals that follow a statistical distribution,
and achieving accurate reconstruction on average, is introduced.
SCS based on Gaussian models is investigated in depth. For signals
that follow a single Gaussian model, with Gaussian or Bernoulli sensing matrices of O(k) measurements, considerably smaller than the O(k log(N/k)) required by conventional CS based on sparse models, where N is the signal dimension, and with an optimal decoder implemented via linear filtering, significantly faster than the pursuit decoders applied in conventional CS, the error of SCS is shown to be tightly upper bounded by a constant times the best k-term approximation error, with overwhelming probability.
The failure probability is also significantly smaller than that of
conventional sparsity-oriented CS. Stronger yet simpler results
further show that for any sensing matrix, the error of Gaussian
SCS is upper bounded by a constant times the best k-term approximation with probability one, and the bound constant can
be efficiently calculated. For Gaussian mixture models (GMMs),
that assume multiple Gaussian distributions and that each signal
follows one of them with an unknown index, a piecewise linear
estimator is introduced to decode SCS. The accuracy of model
selection, at the heart of the piecewise linear decoder, is analyzed
in terms of the properties of the Gaussian distributions and the
number of sensing measurements. A maximization-maximization
(Max-Max) algorithm that iteratively estimates the Gaussian model parameters, performs the signal model selection, and decodes the signals is presented for GMM-based SCS. In real image sensing
applications, GMM-based SCS is shown to lead to improved
results compared to conventional CS, at a considerably lower
computational cost.
Index Terms—Compressed sensing, Gaussian mixture models,
sampling, statistical compressed sensing.
I. INTRODUCTION
COMPRESSED sensing (CS) aims at achieving accurate signal reconstruction while sampling signals at a low sampling rate, typically far smaller than that of Nyquist/Shannon. Let x ∈ R^N be a signal of interest, Φ ∈ R^{M×N} a nonadaptive sensing matrix (encoder) consisting of M measurements, y = Φx ∈ R^M a measured signal, and Δ a decoder used to reconstruct x from y = Φx. CS develops encoder-decoder pairs (Φ, Δ)
Manuscript received January 30, 2011; revised May 12, 2011, July 24, 2011,
and September 04, 2011; accepted September 05, 2011. Date of publication
September 19, 2011; date of current version November 16, 2011. The associate
editor coordinating the review of this manuscript and approving it for publi-
cation was Prof. Jean-Christophe Pesquet. This work is partially supported by
NSF, ONR, NGA, ARO, DARPA, and NSSEFF.
The authors are with the Department of Electrical and Computer En-
gineering, University of Minnesota, Minneapolis, Minnesota, 55414 USA
(e-mail: yu@cmap.polytechnique.fr).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSP.2011.2168521
such that a small reconstruction error ‖x − Δ(Φx)‖ can be achieved.
Reconstructing x from y is an ill-posed problem whose solution requires some prior information on the signal. Instead of the frequency band-limited signal model assumed in classic Shannon sampling theory, conventional CS adopts a sparse signal model, i.e., there exists a dictionary, typically an orthogonal basis B ∈ R^{N×N}, a linear combination of whose columns generates an accurate approximation of the signal, x ≈ Ba, the coefficients a = B^T x having their amplitudes decay fast after being sorted. For signals following the sparse model, it has been shown that using some random sensing matrices such as Gaussian and Bernoulli matrices with M = O(k log(N/k)) measurements, and an ℓ1-minimization or a greedy matching pursuit decoder promoting sparsity, with high probability CS leads to accurate signal reconstruction: The obtained approximation error is tightly upper bounded by a constant times the best k-term approximation error, the minimum error that one may achieve by keeping the k largest coefficients in a [12], [13], [17], [18]. Redundant and signal
adaptive dictionaries that further improve the CS performance
with respect to orthogonal bases have been investigated [11],
[19], [32]. Let us note that the underlying idea of CS, that of
sampling significantly below the Nyquist rate and the signal di-
mension, does not necessarily rely on a sparse prior as considered
in the original papers. In addition to sparse models, manifold
models have been considered for CS as well [6], [16]. In par-
ticular, [16] studies CS on manifolds using low-rank Gaussian
mixture models, providing a complementary approach and
additional theoretical results to the present work.
The present paper introduces a novel general framework of
CS, namely statistical compressed sensing (SCS). As opposed
to conventional CS that deals with one signal at a time, SCS aims
at efficiently sampling a collection of signals and having accu-
rate reconstruction on average. Instead of restricting to sparse
models, SCS works with general Bayesian models. Assuming
that the signals follow a distribution with probability density
function (pdf) p(x), SCS designs encoder-decoder pairs (Φ, Δ) so that the average error E[‖x − Δ(Φx)‖], where ‖ · ‖ is a norm, is small. As an important example,
SCS with Gaussian models is here shown to have improved
performance (bounds) relative to conventional CS, the signal
reconstruction calculated with an optimal decoder imple-
mented via a fast linear filtering. Moreover, for Gaussian
mixture models (GMMs) that better describe most real signals,
SCS with a piecewise linear decoder is investigated. As further
detailed in Section I-A and across the manuscript, the shown
improved performance of the proposed SCS model when com-
pared with standard CS based on sparse models results from the
stronger prior assumed by SCS. As explained next, this prior
also leads to better results in practice.
The motivation of SCS with Gaussian models is twofold.
First, controlling the average error over a collection of signals is
useful in signal acquisition, not only because one is often inter-
ested in acquiring a collection of signals in real applications, but
also because more effective processing of an individual signal,
an image or a sound for example, is usually achieved by dividing the signal into (often overlapping) local subparts, patches (see Fig. 6) or short-time windows for instance, so a signal can be
regarded as a collection of subpart signals [2], [8], [36], [38]. In
addition, Gaussian mixture models (GMMs), which model sig-
nals or subpart signals with a collection of Gaussians, assuming
each signal drawn from one of them, have been shown effective
in describing real signals, leading to state-of-the-art results in
image inverse problems [38] and missing data estimation [24].
SCS based on a single Gaussian model is first developed in
Section II. Following a mathematical approach similar to the one adopted in conventional CS performance analysis [17], it is
shown that with the same random matrices as in conventional
CS, but with a considerably reduced number of measurements, O(k), and with the optimal decoder implemented via
linear filtering, significantly faster than the decoders applied in
conventional CS, the average error of Gaussian SCS is tightly
upper bounded by a constant times the best k-term approximation error with overwhelming probability, the failure probability being orders of magnitude smaller than that of conventional CS. Moreover, stronger yet simpler results further show
that for any sensing matrix, the average error of Gaussian SCS
is upper bounded by a constant times the best k-term approximation with probability one, and the bound constant can be efficiently calculated.
Section III extends SCS to GMMs. A piecewise linear GMM-
based SCS decoder, which essentially consists of estimating the
signal using each Gaussian model included in the GMM and
then selecting the best model, is introduced. The accuracy of
the model selection, at the heart of the scheme, is analyzed in
detail in terms of the properties of the Gaussian distributions and
the number of sensing measurements, in the simple but important setting of a GMM composed of two Gaussian distributions. Given
a correct model selection, the theoretical bounds analyzed for a
single Gaussian model in Section II apply.
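The piecewise linear decoder just described can be sketched in a few lines: decode the measurements with each Gaussian model via the linear MAP estimator, and keep the model under which the measurements are most likely. The following numpy sketch illustrates this for two hypothetical zero-mean Gaussian models; the covariances, sensing matrix, and eigenvalue decay are illustrative choices, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(3)
N, M = 16, 8  # illustrative signal dimension and number of measurements

# Two hypothetical zero-mean Gaussian models with different PCA bases.
def random_cov(seed):
    q, _ = np.linalg.qr(np.random.default_rng(seed).standard_normal((N, N)))
    return q @ np.diag(2.0 ** -np.arange(N)) @ q.T  # fast eigenvalue decay

Sigmas = [random_cov(10), random_cov(11)]
Phi = rng.standard_normal((M, N)) / np.sqrt(M)  # Gaussian sensing matrix

def decode(y, Sigma):
    """Linear MAP estimate under one Gaussian model (the single-Gaussian decoder)."""
    return Sigma @ Phi.T @ np.linalg.solve(Phi @ Sigma @ Phi.T, y)

def loglik(y, Sigma):
    """Log-likelihood of y under y ~ N(0, Phi Sigma Phi^T), noiseless sensing."""
    C = Phi @ Sigma @ Phi.T
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (y @ np.linalg.solve(C, y) + logdet + M * np.log(2 * np.pi))

# Piecewise linear decoding: estimate with each model, keep the most likely one.
x = rng.multivariate_normal(np.zeros(N), Sigmas[0])  # signal drawn from model 0
y = Phi @ x
j = int(np.argmax([loglik(y, S) for S in Sigmas]))   # model selection
x_hat = decode(y, Sigmas[j])
```

Note that, whichever model is selected, the noiseless MAP estimate is consistent with the measurements (Φx̂ = y), so model selection only decides how the null-space component of x is filled in.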
The parameters of a GMM are typically unknown in real ap-
plications. Following [38], Section IV presents a maximization-
maximization (Max-Max) algorithm that iteratively estimates
the Gaussian models and decodes the signals. GMM-based SCS
calculated with the Max-Max algorithm is applied in real image
sensing, leading to improved results with respect to conven-
tional CS, at a considerably lower computational cost.
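The Max-Max iteration alternates two maximization steps: decode each measured signal under the current Gaussian models (including per-signal model selection), then re-estimate each model's parameters from the signals assigned to it. A toy numpy sketch of this loop, with planted covariances, zero means, and all dimensions chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
N, M, K, n_sig, n_iter = 8, 4, 2, 400, 5  # toy sizes, not the paper's setup

Phi = rng.standard_normal((M, N)) / np.sqrt(M)

# Toy data: signals drawn from two planted zero-mean Gaussians.
def planted_cov(seed):
    q, _ = np.linalg.qr(np.random.default_rng(seed).standard_normal((N, N)))
    return q @ np.diag(2.0 ** -np.arange(N)) @ q.T

X = np.vstack([rng.multivariate_normal(np.zeros(N), planted_cov(s), size=n_sig // 2)
               for s in (20, 21)])
Y = X @ Phi.T  # noiseless measurements y = Phi x

# Initialize the K Gaussian models with arbitrary covariances.
Sigmas = [planted_cov(30 + j) for j in range(K)]
eps = 1e-6  # regularization keeping the estimated covariances invertible

for _ in range(n_iter):
    # Max step 1: per-signal model selection and linear MAP decoding.
    X_hat = np.empty_like(X)
    labels = np.empty(len(Y), dtype=int)
    for i, y in enumerate(Y):
        lls = []
        for S in Sigmas:
            C = Phi @ S @ Phi.T
            _, logdet = np.linalg.slogdet(C)
            lls.append(-0.5 * (y @ np.linalg.solve(C, y) + logdet))
        j = int(np.argmax(lls))
        labels[i] = j
        X_hat[i] = Sigmas[j] @ Phi.T @ np.linalg.solve(Phi @ Sigmas[j] @ Phi.T, y)
    # Max step 2: re-estimate each Gaussian from the signals assigned to it.
    for j in range(K):
        members = X_hat[labels == j]
        if len(members) > 1:
            Sigmas[j] = np.cov(members, rowvar=False) + eps * np.eye(N)
```

Both steps increase the posterior being maximized, which is what motivates the "maximization-maximization" name; this skeleton omits the initialization and stopping details of the actual algorithm in Section IV.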
The main contribution of the paper consists in introducing
SCS, a new compressive sensing framework, in particular when
considering ensembles of signals and the important GMM
models, which are different from the standard sparse one, and
providing some first SCS results on Gaussian models, both
theoretical and experimental. Preliminary theoretical results for
the GMM in the case of two Gaussians, as well as experimental
results complementing those in [38], are presented as well.
Most proof techniques of the theoretical results follow the
fundamental conventional CS works [4], [17], while the SCS
framework and the results are novel. As both the models here
proposed and the sparse models are used as prior for CS, and
are applied for the same applications of image sensing and
restoration, SCS is systematically compared with conventional
CS.
Since we are replacing conventional sparse modeling with
Gaussian models in CS, let us provide before we proceed some
basic comments on the relationship between these two impor-
tant models.
A. Gaussian Models Versus Sparse Models
Sparse models assume that a signal can be accurately rep-
resented in some dictionary with a few nonzero coefficients,
whose support is unknown, i.e., an accurate approximation
of the signal lives in a low-dimensional subspace (each set of
nonzero coefficients defines a subspace). As the support is as-
sumed unknown, the signal approximation is nonlinear, and is
typically calculated with an ℓ1 or a greedy pursuit in conventional
CS, attempting to explore all the possible subspaces [27].
Gaussian models, on the other hand, assume that a collection
of signals follows a Gaussian distribution, whose mean and co-
variance are known (in the proposed SCS, they will be estimated
from the measured signals). In particular, as will be detailed in
the next section, the eigenvalues of the Gaussian covariance give
the average energy of the signals in each dimension of the cor-
responding principal component analysis (PCA) basis. The esti-
mation of the signals following a Gaussian distribution is linear,
and it is calculated with the same linear filter for all the signals.
Let us note that unless the Gaussian is degenerate, the resulting
signal estimate lives in the whole space, and not in a subspace
as in conventional sparse models. Compared with sparse models, which assume for each signal a few nonzero coefficients in an unknown support, Gaussian models can be considered more constrained, as they assume that the average energy of all the signals in each PCA dimension, i.e., the eigenvalues of the Gaussian covariance, is known. Such extra prior information leads the
proposed SCS with Gaussian models to achieve better perfor-
mance bounds than conventional CS based on sparse models, as
will be described in the next section. The theoretical improve-
ment is manifested in practice as well.
The nonlinear signal estimation based on conventional sparse
models enjoys the full freedom of selecting a subspace among
any possible combination of dimensions, whose number is
huge. For inverse problems such as CS, such full degree of
freedom may lead in practice to unstable and thus inaccurate
estimation [5], [28]. In part to overcome this problem, struc-
tured sparse models are introduced, i.e., sparse models are a
priori constrained by predefining some subspaces, in which the
nonlinear estimation is limited [5], [21], [28], [37]. A particular
case of this is one-block sparsity, where the dictionary atoms are
pre-grouped in (nonoverlapping) blocks and a single block can
be used at a time. The optimization is based on extending the ℓ1 norm to take into account such grouping structure. Compared
with conventional sparse models, improved CS performance
bounds have been shown with structured sparse models [5],
[21].
A single Gaussian model does not provide an accurate enough description of most real signals, and the resulting linear estimation may therefore be inaccurate. To overcome this problem, for
solving inverse problems, Gaussian mixture models (GMMs)
are introduced to enrich single Gaussian models, i.e., a number
of Gaussian distributions are assumed and, out of the whole
collection of signals, each one is assumed generated by one of
them. This results in piecewise linear estimation, i.e., a linear
estimation is calculated with each Gaussian model and the
best model is then selected [38]. While both structured sparse
models and GMMs predefine some hierarchical structure of
the signals, the distinction between the two models should be
highlighted. The former is a special case of conventional sparse
models which constrains the nonlinear subspace selection
from a fully free subspace search to a more limited search in
some predefined subspaces, whereas the latter extends single
Gaussian models turning linear signal estimation to piecewise
linear estimation, and the resulting signal estimate lives in
general in the whole space and not in a subspace as opposed
to the former. Although structured sparse models constrain
standard sparse models whereas GMMs extend single Gaussian
models, GMMs remain more constrained than structured sparse
models, because the piecewise linear estimation typically has a number of degrees of freedom much smaller than that of a nonlinear estimation, which is equal to the number of possible subspaces,
even if this number is reduced by the imposed structure with
respect to a full subspace search.
The more constrained Gaussian models lead as expected to
better theoretical performance bounds than sparse models for
CS, as will be shown in the next section. For sensing real sig-
nals, remarkably and as will be shown in Section IV, the simple
piecewise linear estimation derived from GMMs leads to comparable or even better results than nonlinear estimations based on more elaborate sparse models, at a considerably lower computational complexity.
II. PERFORMANCE BOUNDS FOR A SINGLE GAUSSIAN MODEL
This section analyzes the performance bounds of SCS based
on a single Gaussian model. Perfect reconstruction of degen-
erate Gaussian signals is briefly discussed. After reviewing basic
properties of linear approximation for Gaussian signals, the rest
of the section shows that for Gaussian signals with fast eigenvalue decay, the average error of SCS using O(k) measurements and decoded by a linear estimator is tightly upper bounded by that of the best k-term approximation.
Signals x ∈ R^N are assumed to follow a Gaussian distribution N(μ, Σ) in this section. Principal Component Analysis (PCA) calculates a basis change a = B^T(x − μ) of the data, with B the orthonormal PCA basis that diagonalizes the data covariance matrix

Σ = B S B^T,   (1)

where S = diag(λ1, …, λN) is a diagonal matrix whose diagonal elements λ1 ≥ λ2 ≥ … ≥ λN ≥ 0 are the sorted eigenvalues, and a the PCA coefficient vector [27]. In this section, for most of the time we will assume without loss of generality that B = Id by looking at the signals in the PCA domain. For
Gaussian and Bernoulli matrices that are known to be universal,
analyzing CS in canonical basis or PCA basis is equivalent [4].
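As a concrete illustration of the basis change (1), the following numpy sketch estimates the PCA basis and sorted eigenvalues from synthetic Gaussian samples; the dimension and eigenvalue profile are arbitrary choices made only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n_samples = 16, 10000  # illustrative dimension and sample count

# Synthetic Gaussian data with fast eigenvalue decay (illustrative choice).
true_eigs = 2.0 ** -np.arange(N)          # lambda_1 >= ... >= lambda_N > 0
A = rng.standard_normal((N, N))
Q, _ = np.linalg.qr(A)                    # random orthonormal basis
Sigma = Q @ np.diag(true_eigs) @ Q.T
X = rng.multivariate_normal(np.zeros(N), Sigma, size=n_samples)  # rows are signals

# Empirical covariance and its PCA diagonalization, Sigma = B S B^T as in (1).
Sigma_hat = np.cov(X, rowvar=False)
eigvals, B = np.linalg.eigh(Sigma_hat)
order = np.argsort(eigvals)[::-1]         # sort eigenvalues in decreasing order
S, B = eigvals[order], B[:, order]

# PCA coefficients a = B^T x; their per-dimension variance equals the eigenvalues,
# i.e., the eigenvalues give the average signal energy in each PCA dimension.
a = X @ B
print(np.allclose(a.var(axis=0, ddof=1), S))  # True
```

The last check is exactly the property used in the text: the covariance eigenvalues are the average energies of the signals along the PCA dimensions.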
A. Degenerate Gaussians
Conventional CS is able to perfectly reconstruct k-sparse signals, i.e., signals with at most k nonzero entries (typically k ≪ N), with 2k measurements [17]. Degenerate Gaussian distributions N(μ, Σ), where Σ = diag(λ1, …, λk, 0, …, 0) has at most k nonzero eigenvalues, give the counterpart of k-sparsity for the Gaussian signal models considered in this paper. Such signals belong to the linear subspace V_k = {x ∈ R^N : x_i = 0, ∀ i > k}. While the signal support is unknown in the nonlinear sparse signal models, in the linear Gaussian models it is assumed known. This extra piece of information allows SCS to achieve perfect signal reconstruction with fewer measurements than CS, as will be implied by the next lemma. The simple lemma, following a proof similar to that of [17, Lemma 3.1], gives a condition for perfect reconstruction of signals in V_k.
Lemma 1: If Φ is any M × N matrix and k ≤ M is a positive integer, then there is a decoder Δ such that Δ(Φx) = x, for all x ∈ V_k, if and only if N(Φ) ∩ V_k = {0}, where N(Φ) denotes the null space of Φ.
Proof: Suppose there is a decoder Δ such that Δ(Φx) = x, for all x ∈ V_k. Let η ∈ N(Φ) ∩ V_k. We can write η = η1 − η2, where both η1, η2 ∈ V_k. Since Φη = 0, Φη1 = Φη2. Plugging η1 and η2 into the decoder Δ, we obtain η1 = Δ(Φη1) = Δ(Φη2) = η2, and then η = η1 − η2 = 0.
Conversely, suppose N(Φ) ∩ V_k = {0}. If Φx1 = Φx2 with x1, x2 ∈ V_k, then x1 − x2 ∈ N(Φ) ∩ V_k = {0}, so x1 = x2. Φ is thus a one-to-one map on V_k. Therefore, there must exist a decoder Δ such that Δ(Φx) = x.
It is possible to construct matrices Φ of size M × N with M = k which satisfy the requirement of the lemma. A trivial example is Φ = [Id_k, 0], where Id_k is the identity matrix of size k × k and 0 is a zero matrix of size k × (N − k). Compared with conventional compressed sensing, which requires 2k measurements for exact reconstruction of k-sparse signals, with only k measurements signals in V_k can be exactly reconstructed. The decrease in the number of measurements from conventional CS to SCS is due to the extra prior assumed in the Gaussian models. Indeed, in contrast to the conventional k-sparse signals, where the positions of the nonzero entries are unknown, as described in Section I-A, with the degenerate Gaussian model N(μ, Σ) the positions of the nonzero coefficients are known a priori to be the first k ones. k measurements thus suffice for perfect reconstruction.
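The k-measurement exact reconstruction described above can be verified directly with the trivial sensing matrix Φ = [Id_k 0]; a minimal numpy sketch, with dimensions chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(1)
N, k = 12, 3  # illustrative dimensions

# Degenerate Gaussian in the PCA domain: only the first k eigenvalues are nonzero,
# so every signal lies in the known subspace V_k = {x : x_i = 0 for i > k}.
x = np.zeros(N)
x[:k] = rng.standard_normal(k)

# Trivial k x N sensing matrix Phi = [Id_k  0] from the text.
Phi = np.hstack([np.eye(k), np.zeros((k, N - k))])
y = Phi @ x                                # only k measurements

# Decoder: since the support is known a priori, invert Phi on V_k.
x_hat = np.zeros(N)
x_hat[:k] = y                              # Phi restricted to V_k is the identity
print(np.allclose(x_hat, x))               # True: exact reconstruction
```

Because N(Φ) ∩ V_k = {0} for this Φ, the lemma guarantees the decoder above recovers every signal in V_k exactly.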
In the following, we will concentrate on the more general
case of nondegenerate Gaussian signals (i.e., Gaussians with full-rank covariance matrices Σ) with fast eigenvalue decay, in analogy to compressible signals for conventional CS. As mentioned before, and as the experiments in this paper will further show, such simple models not only lead to improved theoretical
bounds, but also provide state-of-the-art image reconstruction
results.
B. Optimal Decoder
To simplify the notation, we assume without loss of generality that the Gaussian has zero mean, μ = 0, as one can always center the signal with respect to the mean.
It is well known that the optimal decoders for Gaussian sig-
nals are calculated with linear filtering.
Theorem 1 [23]: Let x ∈ R^N be a random vector with prior pdf N(0, Σ), and let Φ ∈ R^{M×N}, M ≤ N, be a sensing matrix. From the measured signal y = Φx, the optimal decoder Δ that minimizes the mean-square error (MSE) E[‖x − Δ(Φx)‖²], as well as the mean absolute error (MAE) E[‖x − Δ(Φx)‖₁], where ‖ · ‖₁ denotes the ℓ1 norm, is obtained with a linear maximum a posteriori (MAP) estimator,

Δ(y) = ΣΦ^T(ΦΣΦ^T)^{−1}y,   (2)

and the resulting error x − Δ(Φx) is Gaussian with mean zero and with covariance matrix

Σ_e = Σ − ΣΦ^T(ΦΣΦ^T)^{−1}ΦΣ,   (3)

whose trace yields the MSE of SCS.
In contrast to conventional CS, for which the ℓ1-minimization or greedy matching pursuit decoders, calculated with iterative procedures, have been shown optimal under certain conditions on Φ and the signal sparsity [8], [12], Gaussian SCS enjoys the advantage of having an optimal decoder (2) calculated fast via a closed-form linear filtering for any Φ.
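A numerical sanity check of Theorem 1: the sketch below implements the linear MAP decoder (2) for a synthetic zero-mean Gaussian and compares the empirical MSE over many signals with the trace of the error covariance (3). The prior covariance, sensing matrix, and sample count are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, n_sig = 16, 6, 20000  # illustrative dimensions and sample count

# Zero-mean Gaussian prior with decaying eigenvalues (illustrative).
eigs = 2.0 ** -np.arange(N)
Sigma = np.diag(eigs)

Phi = rng.standard_normal((M, N)) / np.sqrt(M)    # Gaussian sensing matrix
X = rng.multivariate_normal(np.zeros(N), Sigma, size=n_sig)
Y = X @ Phi.T                                     # y = Phi x, noiseless

# Linear MAP decoder (2): x_hat = Sigma Phi^T (Phi Sigma Phi^T)^{-1} y
W = Sigma @ Phi.T @ np.linalg.inv(Phi @ Sigma @ Phi.T)
X_hat = Y @ W.T

# Error covariance (3): Sigma_e = Sigma - Sigma Phi^T (Phi Sigma Phi^T)^{-1} Phi Sigma
Sigma_e = Sigma - W @ Phi @ Sigma
mse_theory = np.trace(Sigma_e)                    # MSE predicted by the trace of (3)
mse_empirical = ((X - X_hat) ** 2).sum(axis=1).mean()
print(mse_empirical, mse_theory)                  # the two should nearly agree
```

Note that the decoding of all signals is a single matrix product with the precomputed filter W, which is the closed-form linear filtering advantage stated above.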
Corollary 1: If a random matrix Φ ∈ R^{M×N} is drawn independently to sense each x, with