DOI: 10.1016/S1566-2535(02)00091-X


Information Fusion 3 (2002) 289–297

www.elsevier.com/locate/inffus

Combining parametric and non-parametric algorithms for a partially unsupervised classification of multitemporal remote-sensing images

Lorenzo Bruzzone, Roberto Cossu, Gianni Vernazza

Department of Information and Communication Technologies, University of Trento, Via Sommarive, 14, I-38050, Povo, Trento, Italy

Received 26 January 2002; received in revised form 3 June 2002; accepted 12 August 2002

Abstract

In this paper, we propose a classification system based on a multiple-classifier architecture, which is aimed at updating land-cover maps by using multisensor and/or multisource remote-sensing images. The proposed system is composed of an ensemble of classifiers that, once trained in a supervised way on a specific image of a given area, can be retrained in an unsupervised way to classify a new image of the considered site. In this context, two techniques are presented for the unsupervised updating of the parameters of a maximum-likelihood classifier and a radial basis function neural-network classifier, on the basis of the distribution of the new image to be classified. Experimental results carried out on a multitemporal and multisource remote-sensing data set confirm the effectiveness of the proposed system.

Keywords: Multiple-classifier system; Unsupervised retraining algorithm; Maximum-likelihood classifier; Radial basis function neural network; Expectation-maximization algorithm

Corresponding author. Tel.: +39-0461-882623/882056; fax: +39-0461-882671/881696.

E-mail address: lorenzo.bruzzone@ing.unitn.it (L. Bruzzone).

PII: S1566-2535(02)00091-X

L. Bruzzone et al. / Information Fusion 3 (2002) 289–297

1. Introduction

The increasing availability of remote-sensing images, acquired periodically by satellite sensors on the same geographical area, makes it extremely interesting to develop monitoring systems capable of automatically producing and regularly updating land-cover maps of the considered site. The monitoring task can be accomplished by supervised classification techniques, which have proven to be effective categorisation tools [1–5]. Unfortunately, these techniques require the availability of a suitable training set (and hence of ground-truth information) for each new image of the considered area to be classified. However, in real applications, it is not possible to rely on suitable ground-truth information for each of the available images of the analysed site. Consequently, not all the remote-sensing images acquired on the investigated area at different times can be used for updating the related land-cover maps. In this context, it would be important to develop classification methods capable of analysing the images of the considered site for which no training data are available, thus increasing the effectiveness of monitoring systems based on the use of remote-sensing images.

Recently, the authors faced this problem by proposing an unsupervised retraining technique for maximum-likelihood (ML) classifiers capable of producing accurate land-cover maps even for images for which ground-truth information is not available [6]. This technique allows the unsupervised updating of the parameters of an already trained classifier on the basis of the distribution of the new image to be classified. However, given the complexity inherent in the task of unsupervised retraining, the resulting classifier may be intrinsically less reliable and less accurate than the corresponding supervised one, especially for complex data sets.

In this paper, in order to define a robust classification system for an unsupervised updating of land-cover maps, we propose: (i) to extend the unsupervised retraining technique proposed in [6] to radial basis function (RBF) neural network classifiers; (ii) to integrate the resulting unsupervised retraining classifiers in the framework of multiple-classifier systems. In greater

detail, the proposed system is based on two different unsupervised retraining classification algorithms: a parametric ML classifier and a non-parametric RBF neural-network classifier. Both techniques allow the existing “knowledge” of the classifiers (i.e., the parameters of the classifiers obtained by supervised learning on a first image, for which a training set is assumed available) to be updated in an unsupervised way, on the basis of the distribution of the new image to be categorised. The combination of the above-mentioned classification algorithms is used as a tool for increasing the accuracy and the reliability of the classification maps obtained by each single classifier. Classical approaches to classifier combination are adopted. As compared to previous works [6], the main novelty of this paper consists in the original retraining technique proposed for the RBF classifier and in the multiple-classifier architecture used in the context of partially unsupervised classification.

The paper is organized into seven sections. In Section 2, the considered problem is formulated. The architecture of the proposed system is described in Section 3. The unsupervised retraining classifiers are described in Section 4. Section 5 presents the strategies adopted for the combination of the ensemble of unsupervised retraining classifiers considered. Experimental results are given in Section 6. Finally, in Section 7, discussion is provided and conclusions are drawn.

2. Formulation of the problem

Let $\mathbf{X}_1 = \{x_1^1, x_2^1, \ldots, x_B^1\}$ and $\mathbf{X}_2 = \{x_1^2, x_2^2, \ldots, x_B^2\}$ denote two multispectral images composed of $B$ pixels and acquired in the area under analysis at the times $t_1$ and $t_2$, respectively. Let $x_j^i$ be the $1 \times d$ feature vector associated with the jth pixel of the image $\mathbf{X}_i$ (where $d$ is the dimensionality of the input space). Let $X_i$ be a multivariate random variable that represents the pixel values (i.e., the feature vector values) in $\mathbf{X}_i$. Let us assume that the same set $\Omega = \{\omega_1, \omega_2, \ldots, \omega_C\}$ of $C$ land-cover classes characterizes the considered geographical area at both $t_1$ and $t_2$. This means that in our system only the spatial and spectral distributions of such land-cover classes are supposed to vary (i.e., the set of land-cover classes that characterize the considered site is fixed over time). This assumption is quite realistic in several real applications of classification of remote-sensing data [7–]. Finally, let us assume that a reliable training set $Y_1$ is available at $t_1$, whereas a training set is not available at $t_2$. This prevents the generation of the $t_2$ land-cover map, as the training of the classifier on the image $\mathbf{X}_2$ cannot be performed. At the same time, it is not possible to apply the classifier trained on the image $\mathbf{X}_1$ to the image $\mathbf{X}_2$ because, in general, the estimates of the statistical parameters of the classes at $t_1$ do not provide accurate approximations for the same terms at $t_2$. This depends on several factors (e.g., differences in the atmospheric and light conditions at the image-acquisition dates, sensor non-linearities, different levels of soil moisture, etc.) that alter the spectral signatures of land-cover classes in different images and consequently the distributions of such classes in the feature space.

It is worth noting that the proposed approach is based on a separate analysis of the two images $\mathbf{X}_1$ and $\mathbf{X}_2$. Consequently, it does not require that the images be accurately co-registered.

3. Description of the architecture of the proposed classification system

The proposed classification system is based on a multiple-classifier architecture. The choice of this architecture mainly depends on the intrinsic complexity of the unsupervised retraining procedures, which may result in less reliable and less accurate classifiers than the corresponding supervised ones, especially for complex data sets. In this context, the use of a multiple-classifier approach allows one to integrate the complementary information provided by an ensemble of different classifiers, thus yielding a more robust and reliable classification system.

The classifiers composing the ensemble are developed within the framework of the Bayes decision theory. Consequently, the decision rule adopted to classify a generic pixel $x_j^2$ of the image $\mathbf{X}_2$ can be expressed as [10]:

$$x_j^2 \in \omega_k \quad \text{if} \quad \omega_k = \arg\max_{\omega_i \in \Omega} \left\{ \hat{P}_2(\omega_i / x_j^2) \right\} \tag{1}$$

where $\hat{P}_2(\omega_i / x_j^2)$ is the estimate of the posterior probability of the class $\omega_i$ at $t_2$, given the pixel $x_j^2$. According to (1), the classification of the image $\mathbf{X}_2$ requires the estimation of the posterior probabilities $P_2(\omega_i / X_2)$ for all classes $\omega_i \in \Omega$. These estimates involve the computation of a parameter vector $\vartheta_2$, which represents the “knowledge” of the classifier concerning the distributions of the classes in the feature space (i.e., the status of the classifier at $t_2$). The number and nature of the vector components will be different depending on the specific classifier used. In our system, we propose to consider two different unsupervised retraining approaches: the former is a parametric approach, which is based on the ML classifier; the latter consists of a non-parametric technique, which is based on RBF neural networks. Both techniques allow the parameter vectors $\vartheta_1^P$ (corresponding to the parametric approach) and $\vartheta_1^n$ (corresponding to the non-parametric approach), which are obtained by supervised learning on the first image $\mathbf{X}_1$, to be updated in an unsupervised way.

In the proposed multiple-classifier approach, $N$ different classifiers are trained at the time $t_1$ by using the information contained in the available training set $Y_1$. In particular, a classical parametric ML classifier [10] and


$N - 1$ different architectures of non-parametric RBF neural networks [5] are used. As a result, a parameter vector $\vartheta_1^P$ corresponding to the parametric approach, and the $N - 1$ parameter vectors $\vartheta_1^{n,r}$ ($r = 1, \ldots, N - 1$) corresponding to the non-parametric RBF neural approach, are derived. Then, at time $t_2$, the classifiers are retrained in an unsupervised way by using the information contained in the distribution $p(X_2)$ of the new image $\mathbf{X}_2$. At the end of the retraining phase, a new parameter vector $\vartheta_2^P$ is obtained for the ML classifier and $N - 1$ new parameter vectors $\vartheta_2^{n,r}$ ($r = 1, \ldots, N - 1$) are obtained for the $N - 1$ RBF neural-network architectures considered. Finally, the results provided by the different unsupervised retraining classifiers are combined by using a classical multiple-classifier approach.

4. The proposed unsupervised retraining classifiers

The main idea of the proposed unsupervised retraining approach is that rough estimates of the parameter values that characterize the classes considered at the time $t_2$ can be obtained by exploiting the parameters of the classifiers estimated at the time $t_1$ by supervised learning. Such estimates are then updated in an unsupervised way by using the information contained in the distribution $p(X_2)$ of the new image $\mathbf{X}_2$. In the following, a detailed description of the proposed unsupervised retraining technique for the RBF neural-network classifiers is given. Concerning the retraining technique for the ML classifier, we provide only a brief description since it was already proposed in [6].

4.1. The proposed retraining technique for the ML classifier

In the case of a parametric ML classifier, the vector of parameters that should be estimated for classifying the new image $\mathbf{X}_2$ is given by:

$$\vartheta_2^P = \left[ \theta_{2,1}, P_2(\omega_1), \theta_{2,2}, P_2(\omega_2), \ldots, \theta_{2,C}, P_2(\omega_C) \right] \tag{2}$$

where $\theta_{2,i}$ is the vector of the parameters that characterize the conditional density function $p_2(X_2/\omega_i)$ of the class $\omega_i$ (e.g., the mean vector $\mu_{2,i}$ and the covariance matrix $\Sigma_{2,i}$ in the Gaussian case). For each class $\omega_i \in \Omega$, the initial values of both the prior probability $P_2(\omega_i)$ and the conditional density function $p_2(X_2/\omega_i)$ can be approximated by the values computed in the supervised training phase at $t_1$. Then, such estimates can be improved by exploiting the information associated with the distribution $p_2(X_2)$ of the new image $\mathbf{X}_2$. In particular, the proposed method is based on the observation that the statistical distribution of the pixel values in $\mathbf{X}_2$ can be described by the following mixed-density distribution:

$$p_2(X_2) = \sum_{i=1}^{C} P_2(\omega_i)\, p_2(X_2/\omega_i) \tag{3}$$

where the mixing parameters and the component densities are the a priori probabilities and the conditional density functions of the classes, respectively. In this context, the retraining of the ML classifier at the time $t_2$ becomes a mixture density estimation problem, which can be solved by exploiting the iterative expectation-maximization (EM) algorithm [11–14]. The iterative equations to be used are the following:

$$P_2^{t+1}(\omega_k) = \frac{1}{B} \sum_{x_j^2 \in \mathbf{X}_2} P_2^{t}(\omega_k / x_j^2) \tag{4}$$

$$\mu_{2,k}^{t+1} = \frac{\sum_{x_j^2 \in \mathbf{X}_2} P_2^{t}(\omega_k / x_j^2)\, x_j^2}{\sum_{x_j^2 \in \mathbf{X}_2} P_2^{t}(\omega_k / x_j^2)} \tag{5}$$

$$\Sigma_{2,k}^{t+1} = \frac{\sum_{x_j^2 \in \mathbf{X}_2} P_2^{t}(\omega_k / x_j^2)\, (x_j^2 - \mu_{2,k}^{t+1})^{T} (x_j^2 - \mu_{2,k}^{t+1})}{\sum_{x_j^2 \in \mathbf{X}_2} P_2^{t}(\omega_k / x_j^2)} \tag{6}$$

where the superscripts $t$ and $t + 1$ refer to the values of the parameters at the current and next iterations, respectively, the superscript $T$ refers to the vector transpose operation, and the estimated posterior probability $P_2^{t}(\omega_k / x_j^2)$ is equal to:

$$P_2^{t}(\omega_k / x_j^2) = \frac{p_2^{t}(x_j^2/\omega_k)\, P_2^{t}(\omega_k)}{\sum_{i=1}^{C} p_2^{t}(x_j^2/\omega_i)\, P_2^{t}(\omega_i)} \tag{7}$$

where the density function $p_2^{t}(x_j^2/\omega_i)$ is computed by using the estimates of the terms $\mu_{2,i}^{t}$ and $\Sigma_{2,i}^{t}$ obtained at the current iteration.

For each class $\omega_i \in \Omega$, the estimates obtained at convergence of the EM algorithm are the new parameters of the ML classifier at the time $t_2$. Since the unsupervised retraining approach for the ML classifier is not the novel aspect of this paper, we refer the reader to [6] for greater details on this method.

4.2. The proposed unsupervised retraining technique for RBF neural-network classifiers

The proposed non-parametric classifier is based on Gaussian RBF neural networks, which consist of three layers: an input layer, a hidden layer, and an output layer (see Fig. 1). The input layer relies on as many neurons as the input features. The input neurons just propagate the input features to the next layer. Each one of the $Q$ neurons in the hidden layer is associated with a Gaussian kernel function. The output layer is made up of as many neurons as the classes to be recognised. Each output neuron computes a simple weighted summation over the responses of the hidden units for a given input pattern (we refer the reader to [5] for more details on RBF neural-network classifiers).

In the context of RBF neural classifiers, the conditional densities of Eq. (3) can be written as a sum of contributions due to the $Q$ kernel functions $\varphi_q$ of the neural architecture [14]:


$$p_2(X_2) = \sum_{q=1}^{Q} P_2(\varphi_q)\, p_2(X_2/\varphi_q) \tag{8}$$

where the mixing parameters and the component densities are the a priori probabilities and the conditional density functions of the kernels. Eq. (8) can be rewritten as:

$$p_2(X_2) = \sum_{i=1}^{C} \sum_{q=1}^{Q} P_2(\omega_i/\varphi_q)\, P_2(\varphi_q)\, p_2(X_2/\varphi_q) \tag{9}$$

where the mixing parameter $P_2(\omega_i/\varphi_q)$ is the conditional probability that the kernel $\varphi_q$ belongs to class $\omega_i$. In this formulation, kernels are not deterministically owned by classes; so the formulation can be considered as a generalization of a standard mixture model [14]. The value of the weight $w_{qi}$ that connects the qth hidden unit to the ith output node can be computed as [14]:

$$w_{qi} = P(\omega_i/\varphi_q)\, P(\varphi_q) \tag{10}$$

Fig. 1. Standard architecture of a supervised RBF neural-network classifier.

By analysing Eq. (9), it can be noticed that, as for the ML classifier, the retraining of the RBF classifier at time $t_2$ becomes a parameter estimation problem. In particular, the parameter vector to be estimated is given by:

$$\vartheta_2^{n} = \left[ \phi_{2,1}, P_2(\varphi_1), P_2(\omega_1/\varphi_1), \ldots, P_2(\omega_C/\varphi_1), \ldots, \phi_{2,Q}, P_2(\varphi_Q), P_2(\omega_1/\varphi_Q), \ldots, P_2(\omega_C/\varphi_Q) \right] \tag{11}$$

where $\phi_{2,q}$ is the vector of parameters that characterises the density function $p_2(X_2/\varphi_q)$ (e.g., if Gaussian kernel functions are considered, $\phi_{2,q}$ is composed of the mean $\pi_{2,q}$ and the width $\sigma_{2,q}$ characterizing the qth kernel). However, the parameter vector $\vartheta_2^{n}$ is more complex to estimate than the parameter vector $\vartheta_2^{P}$ related to the ML classifier. In particular, the presence of the mixing terms $P_2(\omega_i/\varphi_q)$ does not allow the new estimates to be accomplished in a fully unsupervised way. Hence, additional information should be available in order to compute such statistical terms. In the following, we will assume that the values of the mixing parameters $P_2(\omega_i/\varphi_q)$ are known; we refer the reader to Appendices A and B for the description of a technique that exploits the architecture of the proposed system (and, in particular, some of the results provided by the ML classifier) for estimating such parameters. For simplicity, let us assume that all the $Q$ kernel functions are characterized by the same width $\sigma_2$. Under the above-mentioned assumptions, it is possible to prove that the following equations (derived by exploiting the EM algorithm) can be applied iteratively to update the RBF neural-network classifier parameters:

$$P_2^{t+1}(\varphi_q) = \frac{1}{B} \sum_{x_j^2 \in \mathbf{X}_2} P_2^{t}(\varphi_q / x_j^2) \tag{12}$$

$$\pi_{2,q}^{t+1} = \frac{\sum_{x_j^2 \in \mathbf{X}_2} P_2^{t}(\varphi_q / x_j^2)\, x_j^2}{\sum_{x_j^2 \in \mathbf{X}_2} P_2^{t}(\varphi_q / x_j^2)} \tag{13}$$

$$\left(\sigma_2^{t+1}\right)^2 = \frac{1}{dB} \sum_{q=1}^{Q} \sum_{x_j^2 \in \mathbf{X}_2} P_2^{t}(\varphi_q / x_j^2)\, \left\| x_j^2 - \pi_{2,q}^{t+1} \right\|^2 \tag{14}$$

where the superscripts $t$ and $t + 1$ refer to the values of the parameters at the current and next iterations, respectively, and the estimated posterior probability $P_2^{t}(\varphi_q / x_j^2)$ is given by:

$$P_2^{t}(\varphi_q / x_j^2) = \frac{p_2^{t}(x_j^2/\varphi_q)\, P_2^{t}(\varphi_q)}{\sum_{i=1}^{Q} p_2^{t}(x_j^2/\varphi_i)\, P_2^{t}(\varphi_i)} \tag{15}$$

where the density function $p_2^{t}(x_j^2/\varphi_i)$ is computed by using the estimates of the terms $\pi_{2,i}^{t}$ and $\sigma_2^{t}$ obtained at the current iteration.

All the components of $\vartheta_2^{n}$ are initialized according to the values obtained in a supervised way on the $t_1$ image. It is possible to prove that at each iteration, the log-likelihood function of the estimates increases until a maximum is reached. Although the EM algorithm may converge to a local maximum, its convergence is guaranteed [11–14]. The values of the parameters obtained at convergence for each RBF neural classifier are used to analyse the new image to be classified.

5. Multiple-classifier strategies

We propose the use of different combination strategies to integrate the complementary information provided by the ensemble of unsupervised retraining parametric and non-parametric classifiers described in the previous section. The use of these strategies for combining the decisions provided by each single classifier results in a more robust behaviour in terms of accuracy and reliability of the final classification system.

As stated in Section 3, let us assume that a set of $N$ classifiers (an unsupervised retraining ML classifier and $N - 1$ unsupervised retraining RBF neural classifiers with different architectures) are retrained on the $\mathbf{X}_2$ image in order to update the corresponding parameters


by using the procedures described in Section 4. In this context, several strategies for combining the decisions of the different classifiers can be adopted [15,16]. We will focus on three widely used combination strategies: the majority voting [15], the combination by Bayesian average [16], and the maximum posterior probability strategies. It is worth noting that, in our case, the use of these unsupervised combination strategies is mandatory because a training set is not available at $t_2$, and therefore more complex supervised approaches cannot be adopted.

The majority voting principle faces the combination problem by considering the results of each single classifier in terms of the class labels assigned to the patterns. Hence, a given input pattern receives $N$ classification labels from the multiple-classifier system, each label corresponding to one of the $C$ classes considered. The combination method is based on the interpretation of the classification label resulting from each classifier as a “vote” for one of the $C$ land-cover classes. The data class that receives the largest number of votes is taken as the class of the input pattern.

The second method considered, the combination by Bayesian average strategy, is based on the observation that for a given pixel $x_j^2$ in the image $\mathbf{X}_2$ the $N$ classifiers considered provide an approximation of the posterior probability $P_2(\omega_i / x_j^2)$ for each class $\omega_i \in \Omega$. Therefore, a possible strategy for combining these classifiers consists in the computation of the average posterior probabilities, i.e.,

$$\hat{P}_2^{ave}(\omega_i / x_j^2) = \frac{1}{N} \sum_{n=1}^{N} \hat{P}_2^{n}(\omega_i / x_j^2) \tag{16}$$

where $\hat{P}_2^{n}(\omega_i / x_j^2)$ is the approximation of the posterior probability $P_2(\omega_i / x_j^2)$ provided by the nth classifier. The classification is then carried out according to the Bayes rule by selecting the land-cover class associated with the maximum average posterior probability.

The third method considered (i.e., the maximum posterior probability strategy) is based on the same observation as the previous one. However, in this case, the strategy for combining classifiers consists in a winner-takes-all approach: the land-cover class that has the largest posterior probability among all classifiers is taken as the class of the input pattern.

6. Experimental results

In order to assess the effectiveness of the proposed approach, different experiments were carried out on a data set made up of two multispectral images acquired by the Thematic Mapper (TM) multispectral sensor of the Landsat 5 satellite. The selected test site was a section (412 × 382 pixels) of a scene including Lake Mulargias on the Island of Sardinia, Italy. The two images used in the experiments were acquired in September 1995 ($t_1$) and July 1996 ($t_2$). Fig. 2 shows channel 5 of both images.

Fig. 2. Channel 5 of the Landsat-5 TM images utilized for the experiments: (a) image acquired in September 1995; (b) image acquired in July 1996.

The available ground truth was used to derive a training set and a test set for each image. Five land-cover classes (i.e., urban area, forest, pasture, water body, and vineyard), which characterize the test site at the above-mentioned dates, were considered. A detailed description of the training and test sets of both images is given in Table 1. To carry out the experiments, we assumed that only the training set associated with the image acquired in September 1995 was available. It is worth noting that the images considered were acquired in different periods of the year. Therefore, in this case, the unsupervised retraining problem turned out to be rather complex.

An ML and two RBF classifiers (one with 60 hidden neurons, i.e., RBF-1, the other with 80 hidden neurons, i.e., RBF-2) were trained in a supervised way on the September 1995 image to estimate the parameters that characterize the density functions of the classes at the time $t_1$. For the ML classifier, the assumption of Gaussian distributions was made for the density


Table 1

Number of patterns in the training and test sets of both the September

995 and July 1996 images

Table 3

Classiﬁcation accuracies exhibited by the considered classiﬁers on the

July 1996 test set after the unsupervised retraining

Land-cover class

Number of patterns

Land-cover class Classiﬁcation accuracy (%) (July 1996 test set)

Training set

Test set

RBF-1

RBF-2

Pasture

Forest

554

304

408

804

179

589

274

418

551

117

Pasture

Forest

94.06

87.22

93.06

99.83

98.54

98.56

100.00

98.90

98.56

Urban area

Water body

Vineyard

Urban area

Water body

Vineyard

100.00

64.10

100.00

31.6231.62

100.00

Overall

2249

1949

Overall

92.76

95.34

95.44

functions of the classes (this was a reasonable assump-

tion, as we considered TM images). In order to exploit

the non-parametric characteristic of the two RBF neural

classiﬁers, they were trained using not only the 6 avail-

able spectral channels, but also 5 texture features based

on the gray-level co-occurrence matrix (i.e., sum vari-

ance, sum average, correlation, entropy and diﬀerence

variance) [17]. These features were computed by using

a window size equal to 7 Â 7 and an interpixel dis-

accuracies provided by the considered unsupervised re-

training classiﬁers for the July 1996 test set are sharply

higher than the ones exhibited by the single classiﬁers

trained on the September 1995 image (i.e., 92.76% vs.

50.43%, 95.34% vs. 71.27%, 95.44% vs. 69.78% for the

ML, the RBF-1, the RBF-2classiﬁers, respectively). In

greater detail, the retrained classiﬁers exhibited high

accuracies on all land-cover classes, with exception of

the vineyard class, which is a minority one.

At this point, the three classiﬁers were combined ac-

cording to the strategies described in Section 5. In order

to evaluate the accuracy of the resulting classiﬁcation

system, it was applied to the July 1996 test set. The

overall and class-by-class accuracies yielded are given in

Table 4. As one can see, the overall accuracies provided

by all the considered combination strategies (i.e.,

95.58%, 95.39%, and 95.75% for the majority voting, the

Bayesian average, and the maximum posterior proba-

bility strategies, respectively) are similar to the one

yielded by the best-performing classiﬁer composing the

ensemble (i.e., 95.44% obtained by the RBF-2classiﬁer).

It is worth stressing that the objective of the multiple-

classiﬁer architecture is not only to increase the accuracy

of the classiﬁcation system but also to increase its ro-

bustness. In particular, the combination strategy should

allow one to recover the possible failure of a single un-

supervised retraining classiﬁer of the ensemble by ex-

ploiting the results provided by the other considered

classiﬁers. In order to assess this last issue, an experi-

ment was carried out in which the failure of the re-

tance equal to 1. After the supervised training on the X

image, the eﬀectiveness of the classiﬁers was evaluated

on the test sets related to both images (see Table 2). On

the one hand, as expected, the classiﬁers provided high

overall classiﬁcation accuracies for the test set related to

the September 1995 image (i.e., 90.97%, 81.79% and

1.74% for the ML, the RBF-1, and the RBF-2classi-

ﬁers, respectively). On the other hand, they exhibited

very poor performances on the July 1996 test set. In

particular, the overall classiﬁcation accuracy provided

by the ML classiﬁer for the July test set was equal to

0.43%, which is not an acceptable result. Also the ac-

curacies exhibited by the two RBF neural classiﬁers

considered are not suﬃciently high (i.e., 69.78% and

1.27%).

At this point, the considered classiﬁers were retrained

on the t image (July 1996) by using the proposed un-

supervised retraining techniques. The ML and RBF re-

training processes converged in 11 and 15 iterations,

respectively, taking few minutes of processing on a Sun

Ultra80 workstation. The overall and class-by-class ac-

curacies exhibited by the diﬀerent classiﬁers after the

retraining phase are given in Table 3. By a comparisons

of Tables 2and 3, one can see that the classiﬁcation

Table 4

Classiﬁcation accuracies exhibited by the proposed multiple-classiﬁer

system on the July 1996 test set

Land-cover

class

Classiﬁcation accuracy (%) (July 1996 test set)

Table 2

Overall classiﬁcation accuracies exhibited by the considered classiﬁers

Majority

voting

Bayesian

average

Maximum posterior

probability

(trained in a supervised way on the September 1995 image) before the

unsupervised retraining

Pasture

Forest

100.00

98.90

98.56

100.00

34.18

99.83

98.90

98.56

99.32

98.54

98.08

Classiﬁcation

technique

Overall classiﬁcation accuracy (%)

Test set (September 1995) Test set (July 1996)

Urban area

Water body

Vineyard

100.00

.7 33 1.6242

100.00

RBF-1

RBF-281.74

90.97

81.79

50.43

71.27

69.78

Overall

95.58

95.39

95.75

L. Bruzzone et al. / Information Fusion 3 (2002) 289–297

295

Table 5

intrinsically less reliable and less accurate than the cor-

responding supervised approaches, especially for com-

plex data sets. Therefore, the use of methodologies for

the combination of classiﬁers has been proposed in

order to increase the reliability and the accuracy of

single unsupervised retraining classiﬁers.

Classiﬁcation accuracies exhibited by the proposed multiple-classiﬁer

system on the July 1996 test set when the failure of the unsupervised

retraining of RBF-3 was simulated

Land-cover

class

Classiﬁcation accuracy

(%) (July 1996 test set)

Majority

voting

Bayesian

average

Maximum posterior

probability

Although extensive experiments on other data sets

are necessary for a ﬁnal validation of the method, the

results we obtained on the considered data set are very

interesting. In particular, they pointed out that the

proposed system is a promising tool for attaining high

classiﬁcation accuracies also for images of a given area

for which an updated training set is not available.

The presented method is based on the assumption

that the estimates of the classiﬁer parameters derived

from a supervised training on a previous image of the

considered area can represent rough estimates of the

class distributions in the new image to be categorised.

Then the EM algorithm is applied in order to iteratively

improve such estimates on the basis of the global density

function of the new image.

It is worth noting that the initial estimates usually cannot be used directly to classify the new image to be analyzed. In fact, in practical situations, depending on the differences in the atmospheric or light conditions existing between the two acquisition dates, such initial estimates may be significantly different from the true ones. The proposed method copes with this situation: the EM algorithm is able to improve the initial estimates so that the new image can be classified accurately. However, in order to minimize the possibility that the retraining does not converge to accurate estimates, we recommend applying, where possible, a pre-processing phase aimed at reducing the differences between images due to the above-mentioned factors (simple correction algorithms can be adopted).

At present, the authors are addressing the problem of defining criteria suitable for identifying the cases in which the initial estimates of the class distributions are so different from the true ones that the retraining process may fail.

Classification accuracies (%) for each land-cover class and overall (the three column headings are missing from the source text):

Land-cover class    (1)      (2)      (3)
Pasture             98.47    96.43    90.83
Forest              98.90    98.90    99.27
Urban area          98.56    97.84    98.08
Water body          100      100      100
Vineyard            58.11    52.13    58.11
Overall             96.56    95.43    94.20

training process of one of the RBF classifiers (i.e., RBF-1) was simulated. To this end, the RBF classifier with 60 hidden neurons, after being trained on the X₁ image, was not retrained on the X₂ image (let us indicate this classifier as RBF-3). In this condition, the classification accuracy exhibited by the RBF-3 classifier on the July 1996 test set is equal to the one yielded by the RBF-1 classifier on the same test set before the unsupervised retraining phase (see Table 2). As already observed, this overall accuracy (i.e., 71.27%) is not acceptable. At this point, the ML classifier and the RBF-2 and RBF-3 neural classifiers were combined according to the strategies described in Section 5. The accuracies exhibited by the resulting multiple-classifier system are reported in Table 5. As one can see, even though RBF-3 provided low accuracy on the July 1996 test set, all the combination strategies resulted in high classification accuracies, thus recovering from the simulated failure of the unsupervised retraining process. In greater detail, the obtained accuracies are comparable to the ones achieved by combining the three "well-retrained" classifiers (i.e., ML, RBF-1, and RBF-2).
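To illustrate why an ensemble can absorb the failure of a single member, as in the simulated-failure experiment with ML, RBF-2 and RBF-3, the following minimal sketch applies simple majority voting to three label vectors. Majority voting is assumed here purely for illustration (the strategies actually used are those of Section 5), and the labels, the error patterns and the names `majority_vote`, `clf_a`, `clf_b` and `clf_failed` are invented for the example; they are not taken from the experiments.

```python
import numpy as np

def majority_vote(labels):
    """Combine per-classifier labels (n_classifiers x n_pixels) by majority vote.

    Ties are resolved in favour of the smallest class index (an arbitrary choice).
    """
    labels = np.asarray(labels)
    n_classes = labels.max() + 1
    # votes[c, j] = number of classifiers assigning pixel j to class c
    votes = np.stack([(labels == c).sum(axis=0) for c in range(n_classes)])
    return votes.argmax(axis=0)

# Hypothetical reference labels for 8 pixels and 3 classes (illustrative only).
truth = np.array([0, 0, 1, 1, 2, 2, 0, 1])

# Two reasonably accurate classifiers and one "failed" classifier.
clf_a = np.array([0, 0, 1, 1, 2, 2, 0, 2])       # wrong on 1 pixel
clf_b = np.array([0, 1, 1, 1, 2, 2, 0, 1])       # wrong on 1 pixel
clf_failed = np.array([1, 0, 0, 2, 2, 0, 1, 1])  # wrong on 5 pixels

combined = majority_vote([clf_a, clf_b, clf_failed])

def accuracy(pred):
    """Fraction of pixels whose predicted label matches the reference."""
    return float((pred == truth).mean())

for name, pred in [("clf_a", clf_a), ("clf_b", clf_b),
                   ("clf_failed", clf_failed), ("combined", combined)]:
    print(f"{name}: {accuracy(pred):.3f}")
```

In this toy case the combined labels agree with the reference on all eight pixels even though the failed classifier is wrong on five of them, because its errors never coincide with an error of one of the other two classifiers. With correlated errors the recovery would be weaker.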

. Discussion and conclusions

In this paper, the problem of unsupervised retraining of classifiers for the updating of land-cover maps has been addressed in the framework of a multiple-classifier system. The proposed system produces accurate land-cover maps of a specific study area even from images for which a reliable ground truth (and hence a suitable training set) is not available. This is made possible by an unsupervised updating of the parameters of an ensemble of parametric and non-parametric classifiers on the basis of the new image to be classified. In particular, an ML parametric classifier and RBF neural-network non-parametric classifiers have been considered. However, given the complexity inherent in the task of unsupervised retraining, the resulting classifiers are

Acknowledgement

This research was supported by the Italian Space

Agency (ASI).

Appendix A. Estimation of the mixing parameters $P_2(\omega_i/\varphi_q)$ for the retraining of RBF neural-network classifiers

In this appendix, we propose a method for estimating the values of the mixing parameters $P_2(\omega_i/\varphi_q)$ of the RBF neural classifiers (see Section 4.2). These parameters can be estimated by exploiting the multiple-classifier architecture of the proposed system. In particular, they can be derived by using the updated parameter vector of the ML classifier. The strategy adopted is the following.

Let $L_2$ be the set of pixels $x_j^2$ that are most likely correctly classified by the ML classifier. This set can be identified by analysing the estimates of the posterior probability $P_2(\omega_i/x_j^2)$ provided by the ML classification algorithm. Let us consider the jth pixel $x_j^2$ of the image $X_2$ and let us assume that $x_j^2$ is classified by the ML classifier as belonging to the class $\omega_k$ (i.e., $\omega_k = \arg\max_{\omega_i \in \Omega} \{P_2(\omega_i/x_j^2)\}$). The pixel $x_j^2$ is likely to be correctly classified by the ML classifier (and thus is assigned to the set $L_2$ and labelled as belonging to the class $\omega_k$) if its estimated posterior probability is above a given threshold (i.e., $P_2(\omega_k/x_j^2) \geq \alpha$, where $0.5 < \alpha < 1$ is a real number usually close to 1). The set $L_2$ is then used to estimate the mixing parameters $P_2(\omega_i/\varphi_q)$ according to the following iterative equation:

$$P_2^{t+1}(\omega_i/\varphi_q) = \frac{\sum_{x_j^2 \in L_i} P_2^{t}(\varphi_q/x_j^2)}{\sum_{x_j^2 \in L_2} P_2^{t}(\varphi_q/x_j^2)} \tag{17}$$

where $L_i$ is the subset of $L_2$ containing the pixels $x_j^2$ labelled as belonging to the class $\omega_i$. At each step of the EM algorithm used for the unsupervised estimation of the other RBF neural-network parameters (see Eqs. (12)–(14)), Eq. (17) is also iterated in order to increase the accuracy of the estimation of the mixing parameters.

Appendix B. Derivation of the equations for estimating the parameters of RBF neural-network classifiers

Eqs. (12)–(14) and (17) can be derived by maximizing the following log-likelihood function:

$$W(X_2/\vartheta_2) = \sum_{x_j^2 \notin L_2} \log \sum_{q} \left[ p_2(x_j^2/\varphi_q)\, P_2(\varphi_q) \right] + \sum_{i} \sum_{x_j^2 \in L_i} \log \sum_{q} \left[ p_2(x_j^2/\varphi_q)\, P_2(\varphi_q)\, P_2(\omega_i/\varphi_q) \right] \tag{18}$$

which is equivalent to minimizing the error function $E(X_2/\vartheta_2)$:

$$E(X_2/\vartheta_2) = -W(X_2/\vartheta_2) \tag{19}$$

This task can be achieved by means of the technique described in [18]. In particular, let us consider the change $\Delta E$ in the function (19) when replacing the parameter values of the current iteration with the ones of the next iteration:

$$\begin{aligned}
\Delta E &= E^{t+1}(X_2/\vartheta_2) - E^{t}(X_2/\vartheta_2) \\
&= -\sum_{x_j^2 \notin L_2} \log \left\{ \sum_{q} \frac{p_2^{t+1}(x_j^2/\varphi_q)\, P_2^{t+1}(\varphi_q)}{\sum_{r} \left[ p_2^{t}(x_j^2/\varphi_r)\, P_2^{t}(\varphi_r) \right]} \, \frac{P_2^{t}(\varphi_q/x_j^2)}{P_2^{t}(\varphi_q/x_j^2)} \right\} \\
&\quad -\sum_{i} \sum_{x_j^2 \in L_i} \log \left\{ \sum_{q} \frac{p_2^{t+1}(x_j^2/\varphi_q)\, P_2^{t+1}(\varphi_q)\, P_2^{t+1}(\omega_i/\varphi_q)}{\sum_{r} \left[ p_2^{t}(x_j^2/\varphi_r)\, P_2^{t}(\varphi_r)\, P_2^{t}(\omega_i/\varphi_r) \right]} \, \frac{P_2^{t}(\varphi_q/x_j^2)}{P_2^{t}(\varphi_q/x_j^2)} \right\}
\end{aligned} \tag{20}$$

where $E^{t}(X_2/\vartheta_2)$ and $E^{t+1}(X_2/\vartheta_2)$ are the error functions computed with the parameters estimated at the current and next iterations, respectively. The terms $P_2^{t}(\varphi_q/x_j^2)$ are introduced in order to apply Jensen's inequality. Thanks to this inequality, the following upper bound can be obtained:

$$\begin{aligned}
\Delta E \leq\; & -\sum_{x_j^2 \notin L_2} \sum_{q} P_2^{t}(\varphi_q/x_j^2) \log \frac{p_2^{t+1}(x_j^2/\varphi_q)\, P_2^{t+1}(\varphi_q)}{\sum_{r} \left[ p_2^{t}(x_j^2/\varphi_r)\, P_2^{t}(\varphi_r) \right] P_2^{t}(\varphi_q/x_j^2)} \\
& -\sum_{i} \sum_{x_j^2 \in L_i} \sum_{q} P_2^{t}(\varphi_q/x_j^2) \log \frac{p_2^{t+1}(x_j^2/\varphi_q)\, P_2^{t+1}(\varphi_q)\, P_2^{t+1}(\omega_i/\varphi_q)}{\sum_{r} \left[ p_2^{t}(x_j^2/\varphi_r)\, P_2^{t}(\varphi_r)\, P_2^{t}(\omega_i/\varphi_r) \right] P_2^{t}(\varphi_q/x_j^2)}
\end{aligned} \tag{21}$$

We aim at minimizing this bound with respect to the values of the parameters computed at the next iteration. Dropping the terms which depend only on the "old" parameters, the right-hand side of (21) can be rewritten as:

$$H = -\left\{ \sum_{x_j^2 \notin L_2} \sum_{q} P_2^{t}(\varphi_q/x_j^2) \log \left[ p_2^{t+1}(x_j^2/\varphi_q)\, P_2^{t+1}(\varphi_q) \right] + \sum_{i} \sum_{x_j^2 \in L_i} \sum_{q} P_2^{t}(\varphi_q/x_j^2) \log \left[ p_2^{t+1}(x_j^2/\varphi_q)\, P_2^{t+1}(\varphi_q)\, P_2^{t+1}(\omega_i/\varphi_q) \right] \right\} \tag{22}$$

and for the Gaussian case:

$$\begin{aligned}
H = -\Bigg\{ & \sum_{x_j^2 \notin L_2} \sum_{q} P_2^{t}(\varphi_q/x_j^2) \left[ \log P_2^{t+1}(\varphi_q) - d \log \sigma_{2,q}^{t+1} - \frac{\lVert x_j^2 - \pi_{2,q}^{t+1} \rVert^2}{2 (\sigma_{2,q}^{t+1})^2} \right] \\
& + \sum_{i} \sum_{x_j^2 \in L_i} \sum_{q} P_2^{t}(\varphi_q/x_j^2) \left[ \log P_2^{t+1}(\varphi_q) + \log P_2^{t+1}(\omega_i/\varphi_q) - d \log \sigma_{2,q}^{t+1} - \frac{\lVert x_j^2 - \pi_{2,q}^{t+1} \rVert^2}{2 (\sigma_{2,q}^{t+1})^2} \right] \Bigg\}
\end{aligned} \tag{23}$$

At this point it is possible to minimize $H$ (and hence the error function $E^{t+1}(X_2/\vartheta_2)$) with respect to the "new" parameters. Concerning the parameters $\sigma_{2,q}^{t+1}$ and $\pi_{2,q}^{t+1}$, the minimization is straightforward and leads to Eqs. (13) and (14). Concerning the parameters $P_2^{t+1}(\varphi_q)$ and $P_2^{t+1}(\omega_i/\varphi_q)$, the following constraints should be considered:

$$\sum_{q} P_2^{t+1}(\varphi_q) = 1 \tag{24}$$

$$\sum_{i} P_2^{t+1}(\omega_i/\varphi_q) = 1 \tag{25}$$

This can be easily done by introducing two Lagrange multipliers. Accordingly, Eqs. (12) and (17) can be obtained.

References

[1] J.A. Richards, Remote Sensing Digital Image Analysis, second ed., Springer-Verlag, New York, 1993.

[2] J.A. Benediktsson, P.H. Swain, O.K. Ersoy, Neural network approaches versus statistical methods in classification of multisource remote sensing data, IEEE Transactions on Geoscience and Remote Sensing 28 (1990) 540–552.

[3] J.A. Benediktsson, P.H. Swain, Consensus theoretic classification methods, IEEE Transactions on Systems, Man and Cybernetics 22 (1992) 688–704.

[4] L. Bruzzone, D. Fernández Prieto, S.B. Serpico, A neural statistical approach to multitemporal and multisource remote-sensing image classification, IEEE Transactions on Geoscience and Remote Sensing 37 (1999) 1350–1359.

[5] L. Bruzzone, D. Fernández Prieto, A technique for the selection of kernel-function parameters in RBF neural networks for classification of remote-sensing images, IEEE Transactions on Geoscience and Remote Sensing 37 (1999) 1179–1184.

[6] L. Bruzzone, D. Fernández Prieto, Unsupervised retraining of a maximum-likelihood classifier for the analysis of multitemporal remote-sensing images, IEEE Transactions on Geoscience and Remote Sensing 39 (2001) 456–460.

[7] F. Maselli, M.A. Gilabert, C. Conese, Integration of high and low resolution NDVI data for monitoring vegetation in Mediterranean environments, Remote Sensing of Environment 63 (1998) 208–218.

[8] A. Grignetti, R. Salvatori, R. Casacchia, F. Manes, Mediterranean vegetation analysis by multi-temporal satellite sensor data, International Journal of Remote Sensing 18 (1997) 1307–1318.

[9] M.A. Friedl, C.E. Brodley, A.H. Strahler, Maximizing land cover accuracies produced by decision trees at continental to global scales, IEEE Transactions on Geoscience and Remote Sensing 37 (1999) 969–977.

[10] J.T. Tou, R.C. Gonzalez, Pattern Recognition Principles, Addison-Wesley, Reading, MA, 1974.

[11] A.P. Dempster, N.M. Laird, D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society 39 (1977) 1–38.

[12] B.M. Shahshahani, D. Landgrebe, The effect of unlabeled samples in reducing the small sample size problem and mitigating the Hughes phenomenon, IEEE Transactions on Geoscience and Remote Sensing 32 (1994) 1087–1095.

[13] T.K. Moon, The expectation-maximization algorithm, Signal Processing Magazine 13 (1996) 47–60.

[14] D.J. Miller, H.S. Uyar, Combined learning and use for a mixture model equivalent to the RBF classifier, Neural Computation 10 (1998) 281–293.

[15] L. Lam, C.Y. Suen, Application of majority voting to pattern recognition: An analysis of its behavior and performance, IEEE Transactions on Systems, Man and Cybernetics 27 (1997) 553–568.

[16] J. Kittler, M. Hatef, R.P.W. Duin, J. Matas, On combining classifiers, IEEE Transactions on Pattern Analysis and Machine Intelligence 20 (1998) 226–239.

[17] R.M. Haralick, K. Shanmugam, I. Dinstein, Textural features for image classification, IEEE Transactions on Systems, Man and Cybernetics 3 (1973) 610–621.

[18] C. Bishop, Neural Networks for Pattern Recognition, Clarendon Press, Oxford, 1995.
