Proof: The proof proceeds in two steps: (a) $L_I(\hat{s}_m) \xrightarrow{p} L(s^*_m)$; and (b) convergence of
$L_I(\hat{s}_m)$ implies convergence of $\hat{s}$. In the interest of brevity, we assume that all
measurability conditions are satisfied. (See also the discussion of NM’s Theorem 2.1.)
To get step (a), note that by condition (i) and NM’s Theorem 2.1, $\hat{s}_m \xrightarrow{p} s^*_m$. Then it
follows from condition (iv) and NM’s Lemma 4.3 that $L_I(\hat{s}_m) \xrightarrow{p} L(s^*_m)$.
It remains to show convergence of $\hat{s}$. It follows from consistency of $\hat{s}_m$ and condition (ii)
that with probability approaching 1 (w.p.a.1) $\hat{s}_m$ is in the interior of $S$ and thus a local
maximizer, so that w.p.a.1 $\hat{s}_m$ belongs to the set of local maximizers, $\hat{s}_m \in \hat{S}$. We now
proceed as in NM’s Theorem 2.1. For any $\varepsilon > 0$, we have w.p.a.1:
(1) $L(\hat{s}) > L_I(\hat{s}) - \varepsilon/3$ (from condition (iii));
(2) $L_I(\hat{s}) > L_I(\hat{s}_m) - \varepsilon/3$ (by condition (vi) and $\hat{s}_m \in \hat{S}$);
(3) $L_I(\hat{s}_m) > L(s^*_m) - \varepsilon/3$ (from step (a)).
Together these conditions imply that w.p.a.1 $L(\hat{s}) > L(s^*_m) - \varepsilon$. But by condition (v)
$s^*_m = s^*$. It then follows from condition (iii) and arguments in NM’s Theorem 2.1 that
$\hat{s} \xrightarrow{p} s^*$.
Q.E.D.
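The way the three bounds combine can be written out explicitly. The display below is only a sketch of that chaining step: it uses nothing beyond inequalities (1) to (3) of the proof, each holding w.p.a.1 for an arbitrary $\varepsilon > 0$, and subtracts $\varepsilon/3$ at each link.

```latex
\begin{align*}
L(\hat{s}) &> L_I(\hat{s}) - \varepsilon/3        && \text{by (1)} \\
           &> L_I(\hat{s}_m) - 2\varepsilon/3     && \text{by (2)} \\
           &> L(s^*_m) - \varepsilon              && \text{by (3)}
\end{align*}
```

Applying condition (v) to the last line replaces $L(s^*_m)$ by $L(s^*)$, which is the form used in the final consistency argument.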
Unless one of the local GMM maximizers is also the maximum likelihood estimator, it is essential
to include local as well as global maximizers in the set $\hat{S}$. This can be illustrated with a simple
example. Suppose that $S$ can be partitioned into two disjoint compact subsets, $S_1$ and $S_2$, and that
in addition to satisfying the conditions for Theorem 1, the maximizers of each subset, $s^*_1$ and $s^*_2$, are
global maximizers of $Q(s)$ over $S$. Suppose further that $s^*_1$ is also the MLE maximizer $s^*$. Finally,
suppose that over $S_1$, $Q_I(s) = Q(s) - 1/I$, while over $S_2$, $Q_I(s) = Q(s) + 1/I$. It immediately follows
that $\hat{s}_1 = s^*_1$ and $\hat{s}_2 = s^*_2$, and the proof of Theorem 1 goes through. But
$Q_I(\hat{s}_1) < Q_I(\hat{s}_2)$, so that a search over global maxima would exclude $\hat{s} = s^*$.
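The example can be checked numerically. The sketch below is illustrative only and rests on assumptions that are not in the text: a hypothetical objective $Q(s) = -(s^2 - 1)^2$ with global maximizers at $-1$ and $1$, subsets $S_1 = [-2, -0.5]$ and $S_2 = [0.5, 2]$, a hypothetical stand-in likelihood `L_I` maximized at $-1$, a sample size of $I = 100$, and a simple grid search in place of a proper optimizer.

```python
# Hypothetical population objective with two global maximizers, s = -1 and s = 1.
def Q(s):
    return -(s * s - 1.0) ** 2

# Sample objective from the example: Q - 1/I on S_1 (s < 0), Q + 1/I on S_2 (s > 0).
def Q_I(s, I):
    return Q(s) - 1.0 / I if s < 0 else Q(s) + 1.0 / I

# Hypothetical stand-in for the sample likelihood, maximized at s = -1.
def L_I(s):
    return -(s + 1.0) ** 2

def grid(lo, hi, n):
    step = (hi - lo) / (n - 1)
    return [lo + k * step for k in range(n)]

I = 100
# Maximizer of the sample objective within each compact subset.
s1_hat = max(grid(-2.0, -0.5, 151), key=lambda s: Q_I(s, I))
s2_hat = max(grid(0.5, 2.0, 151), key=lambda s: Q_I(s, I))
candidates = [s1_hat, s2_hat]

# A search over global maxima of Q_I keeps only the candidate in S_2 ...
global_pick = max(candidates, key=lambda s: Q_I(s, I))
# ... while ranking all local maximizers by likelihood recovers the one in S_1.
likelihood_pick = max(candidates, key=L_I)
```

Under these assumptions `global_pick` lands in $S_2$ and `likelihood_pick` lands in $S_1$, which mirrors the point that restricting the candidate set to global maximizers can discard the estimate that coincides with the MLE.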
The conditions for the proof apply naturally to the sequential estimator developed in the main text.
$Q(s)$ is the negative of the inner product of the expectation vector in equation (11) with itself, and
$Q_I$ is its sample analog. Condition (v) (cross-identification) follows from the construction of
equation (11). Since $Q(s)$ is nonpositive and any solution to equation (11) is a zero of $Q(s)$, the
sequential estimator is a local maximizer of $Q(s)$. One potential difficulty is that, as noted by
Wu (1983), some mixture problems lack a compact parameter space.
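To make the structure of the objective concrete, the following minimal sketch builds a sample objective as the negative inner product of a vector of sample moments with itself. Equation (11) is not reproduced here, so the moment functions are purely hypothetical (a mean and a variance condition), as are the names `sample_moments`, `Q_I`, and the toy data.

```python
# Hypothetical moment conditions, standing in for the expectation vector of
# equation (11): E[x - mu] = 0 and E[(x - mu)^2 - v] = 0.
def sample_moments(data, s):
    mu, v = s
    n = len(data)
    g1 = sum(x - mu for x in data) / n
    g2 = sum((x - mu) ** 2 - v for x in data) / n
    return (g1, g2)

# Sample objective: negative inner product of the moment vector with itself,
# so it is never positive and equals zero exactly where the moments vanish.
def Q_I(data, s):
    g = sample_moments(data, s)
    return -sum(gk * gk for gk in g)

data = [1.0, 2.0, 3.0, 6.0]   # toy data, assumed for illustration
root = (3.0, 3.5)             # solves both sample moment conditions
value_at_root = Q_I(data, root)
value_elsewhere = Q_I(data, (2.5, 3.5))
```

Because the objective is bounded above by zero, any parameter value that solves the moment conditions attains the bound and is therefore a maximizer, which is the property the paragraph above relies on.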
REFERENCES
Amemiya, T. (1978): “On a Two-step Estimation of a Multivariate Logit Model,” Journal of
Econometrics, 8, 13–21.
Arcidiacono, P. (2002): “Affirmative Action in Higher Education: How Do Admissions and
Financial Aid Rules Affect Future Earnings?” Manuscript, Duke University.
Cameron, S., and J. Heckman (1998): “Life Cycle Schooling and Dynamic Selection Bias: Models
and Evidence for Five Cohorts of American Males,” Journal of Political Economy, 106, 262–333.
Cameron, S., and J. Heckman (2001): “The Dynamics of Educational Attainment for Black, Hispanic,
and White Males,” Journal of Political Economy, 109, 455–499.
Cox, D. R. (1975): “Partial Likelihood,” Biometrika, 62, 269–275.
Dempster, A. P., N. M. Laird, and D. B. Rubin (1977): “Maximum Likelihood from Incomplete
Data via the EM Algorithm,” Journal of the Royal Statistical Society, B, 39, 1–38.
Eckstein, Z., and K. Wolpin (1999): “Why Youths Drop Out of High School: The Impact of
Preferences, Opportunities and Abilities,” Econometrica, 67, 1295–1339.
Efron, B. (1982): “Maximum Likelihood and Decision Theory,” Annals of Statistics, 10, 323–339.
Everitt, B. S., and D. J. Hand (1981): Finite Mixture Distributions. London: Chapman and Hall.
Follmann, D. A., and D. Lambert (1989): “Generalized Logistic Regression by Nonparametric
Mixing,” Journal of the American Statistical Association, 84, 295–300.
Hamilton, J. D. (1989): “A New Approach to the Economic Analysis of Nonstationary Time Series
and the Business Cycle,” Econometrica, 57, 357–385.
Hamilton, J. D. (1990): “Analysis of Time Series Subject to Changes in Regime,” Journal of
Econometrics, 45, 39–70.
Hansen, L. (1982): “Large Sample Properties of Generalized Method of Moments Estimators,”
Econometrica, 50, 1029–1054.