Fit indices for structural equation modeling 

Overview

In structural equation modeling, the fit indices establish whether, overall, the model is acceptable. If the model is acceptable, researchers then establish whether specific paths are significant.

Acceptable fit indices do not imply the relationships are strong. Indeed, high fit indices are often easier to obtain when the relationships between variables are low rather than high, because the power to detect discrepancies from predictions is amplified when the relationships are strong.

Many of the fit indices are derived from the chi-square value. Conceptually, the chi-square value, in this context, represents the difference between the observed covariance matrix and the predicted or model covariance matrix.

The fit indices can be classified into several classes. These classes include discrepancy functions, indices that compare the target and null models, and information theory goodness of fit measures.
Many researchers, such as Marsh, Balla, and Hau (1996), recommend that individuals utilize a range of fit indices. Indeed, Jaccard and Wan (1996) recommend using indices from different classes as well; this strategy overcomes the limitations of each index.

Summary of criteria that researchers often use

A model is regarded as acceptable if:
These criteria are merely guidelines. To illustrate, in a field in which previous models generate CFI values of only .70, a CFI value of .85 represents progress and thus should be acceptable (Bollen, 1989).

Discrepancy functions

Chi-square

The chi-square for the model is also called the discrepancy function, likelihood ratio chi-square, or chi-square goodness of fit. In AMOS, the chi-square value is called CMIN.

If the chi-square is not significant, the model is regarded as acceptable. In that case, the observed covariance matrix is similar to the predicted covariance matrix, that is, the matrix predicted by the model.

If the chi-square is significant, the model is regarded, at least sometimes, as unacceptable. However, many researchers disregard this index if the sample size exceeds 200 or so and other indices indicate the model is acceptable. In particular, this approach arises because the chi-square index presents several problems:
Relative chi-square

The relative chi-square is also called the normed chi-square. This value equals the chi-square index divided by the degrees of freedom. This index might be less sensitive to sample size. The criterion for acceptance varies across researchers, ranging from less than 2 (Ullman, 2001) to less than 5 (Schumacker & Lomax, 2004).

Root mean square residual

The RMS, also called the RMR or RMSE, represents the square root of the average or mean of the squared covariance residuals, that is, the differences between corresponding elements of the observed and predicted covariance matrices. Zero represents a perfect fit, but the maximum is unlimited.

Because the maximum is unbounded, the RMS is difficult to interpret, and consensus has not been reached on the levels that represent acceptable models. Some researchers utilize the standardized version of the RMS instead to overcome this problem.

According to some researchers, the RMS should be less than .08 (Browne & Cudeck, 1993) and ideally less than .05 (Steiger, 1990). Alternatively, the upper bound of the confidence interval of the RMS should not exceed .08 (Hu & Bentler, 1998).

Indices that compare the target and null models

Comparative fit index (CFI)

The comparative fit index, like the IFI, NFI, BBI, TLI, and RFI, compares the model of interest with some alternative, such as the null or independence model. The CFI is also known as the Bentler Comparative Fit Index.

Specifically, the CFI compares the fit of a target model to the fit of an independence model, a model in which the variables are assumed to be uncorrelated. In this context, fit refers to the difference between the observed and predicted covariance matrices, as represented by the chi-square index.

In short, the CFI is based on the ratio of the discrepancy of the target model to the discrepancy of the independence model: the smaller this ratio, the closer the CFI is to 1. Roughly, the CFI thus represents the extent to which the model of interest is better than the independence model. Values that approach 1 indicate acceptable fit.
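The relative chi-square and the CFI described above can be illustrated with a short Python sketch. The sketch assumes the widely used noncentrality form of the CFI, in which the discrepancy of each model is its chi-square minus its degrees of freedom, truncated at zero. The function names and the numeric values are hypothetical and chosen only for illustration; they do not come from any particular software package or study.

```python
def relative_chi_square(chi2: float, df: int) -> float:
    """Normed chi-square: the chi-square divided by its degrees of freedom."""
    if df <= 0:
        raise ValueError("df must be positive")
    return chi2 / df


def cfi(chi2_target: float, df_target: int,
        chi2_indep: float, df_indep: int) -> float:
    """Comparative fit index (assumed noncentrality form).

    The discrepancy of each model is chi-square minus df, floored at 0.
    CFI = 1 - discrepancy(target) / max(discrepancy(target),
                                        discrepancy(independence)).
    """
    d_target = max(chi2_target - df_target, 0.0)
    d_indep = max(chi2_indep - df_indep, 0.0)
    denominator = max(d_target, d_indep)
    if denominator == 0.0:
        # Neither model shows any discrepancy beyond its df.
        return 1.0
    return 1.0 - d_target / denominator


if __name__ == "__main__":
    # Hypothetical values: target model chi-square = 85 on 40 df,
    # independence model chi-square = 900 on 55 df.
    print(round(relative_chi_square(85.0, 40), 3))   # 2.125
    print(round(cfi(85.0, 40, 900.0, 55), 3))        # 0.947
```

With these hypothetical values, the relative chi-square falls between the stricter and more lenient cutoffs cited above, and the CFI approaches 1.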
CFI is not too sensitive to sample size (Fan, Thompson, & Wang, 1999). However, CFI is not effective if most of the correlations between variables approach 0, because there is, therefore, less covariance to explain. Furthermore, Raykov (2000, 2005) argues that CFI is a biased measure, based on noncentrality.

Incremental fit index (IFI)

The incremental fit index, also known as Bollen's IFI, is also relatively insensitive to sample size. Values that exceed .90 are regarded as acceptable, although this index can exceed 1.

To compute the IFI, first the difference between the chi-square of the independence model, in which variables are uncorrelated, and the chi-square of the target model is calculated. Next, the difference between the chi-square of the independence model and the df for the target model is calculated. The ratio of these values represents the IFI.

Normed fit index (NFI)

The NFI is also known as the Bentler-Bonett normed fit index. The fit index varies from 0 to 1, where 1 is ideal. The NFI equals the difference between the chi-square of the null model and the chi-square of the target model, divided by the chi-square of the null model. In other words, an NFI of .90, for example, indicates the model of interest improves the fit by 90% relative to the null or independence model.

When the samples are small, the fit is often underestimated (Ullman, 2001). Furthermore, in contrast to the TLI, the fit can be overestimated if the number of parameters is increased; the NNFI overcomes this problem.

Tucker-Lewis index (TLI) or Non-normed fit index (NNFI)

The TLI, sometimes called the NNFI, is similar to the NFI. However, the index is lower, and hence the model is regarded as less acceptable, if the model is complex. To compute the TLI:
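A minimal Python sketch of the three incremental indices discussed in this section follows. It assumes the commonly stated definitions: the NFI as the proportional reduction in chi-square relative to the null model, the IFI with the null-model chi-square minus the target-model df in the denominator, and the TLI as a comparison of the chi-square to df ratios of the two models. The function names and numeric values are hypothetical, for illustration only.

```python
def nfi(chi2_target: float, chi2_null: float) -> float:
    """Normed fit index: (chi2_null - chi2_target) / chi2_null."""
    return (chi2_null - chi2_target) / chi2_null


def ifi(chi2_target: float, df_target: int, chi2_null: float) -> float:
    """Incremental fit index (assumed form):
    (chi2_null - chi2_target) / (chi2_null - df_target)."""
    return (chi2_null - chi2_target) / (chi2_null - df_target)


def tli(chi2_target: float, df_target: int,
        chi2_null: float, df_null: int) -> float:
    """Tucker-Lewis index (assumed form), based on chi-square / df ratios.

    Dividing by df is what penalizes complex models: adding parameters
    lowers df_target and so raises the target model's ratio unless the
    chi-square drops enough to compensate.
    """
    ratio_null = chi2_null / df_null
    ratio_target = chi2_target / df_target
    return (ratio_null - ratio_target) / (ratio_null - 1.0)


if __name__ == "__main__":
    # Hypothetical values: target model chi-square = 85 on 40 df,
    # null (independence) model chi-square = 900 on 55 df.
    print(round(nfi(85.0, 900.0), 3))           # 0.906
    print(round(ifi(85.0, 40, 900.0), 3))       # 0.948
    print(round(tli(85.0, 40, 900.0, 55), 3))   # 0.927
```

In this hypothetical case, all three indices exceed .90, and the TLI is lower than the IFI because it adjusts for the degrees of freedom of the target model.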
According to Marsh, Balla, and McDonald (1988), the TLI is relatively independent of sample size. The TLI is usually lower than the GFI, but values over .90 or over .95 are considered acceptable (e.g., Hu & Bentler, 1999).

Information theory goodness of fit measures

Akaike Information Criterion

The AIC, like the BIC, BCC, and CAIC, is regarded as an information theory goodness of fit measure, applicable when maximum likelihood estimation is used (Burnham & Anderson, 1998). These indices are used to compare different models. The models that generate the lowest values are optimal. The absolute AIC value is irrelevant, although values closer to 0 are ideal; only the AIC value of one model relative to the AIC value of another model is meaningful.

Like the chi-square index, the AIC also reflects the extent to which the observed and predicted covariance matrices differ from each other. However, unlike the chi-square index, the AIC penalizes models that are too complex. In particular:

AIC = chi-square / n + 2k / (n - 1)

In this formula, k = .5v(v + 1) - df, where v is the number of variables and n is the sample size.

Browne-Cudeck criterion (BCC) and Consistent AIC (CAIC)

The BCC is similar to the AIC. That is, the BCC and AIC both represent the extent to which the observed covariance matrix differs from the predicted covariance matrix, like the chi-square statistic, but include a penalty if the model is complex, with many parameters. The BCC bestows an even harsher penalty than does the AIC. In particular:

BCC = chi-square / n + 2k / (n - v - 2)

In this formula, k = .5v(v + 1) - df, where v is the number of variables and n is the sample size.

The CAIC is similar to the AIC as well. However, the CAIC also confers a penalty if the sample size is small.

Bayesian Information Criterion (BIC)

The Bayesian Information Criterion is also known as Akaike's Bayesian Information Criterion (ABIC) and the Schwarz Bayesian Criterion (SBC).
This index is similar to the AIC, but the penalty against complex models is especially pronounced, even more pronounced than in the BCC and CAIC indices. Furthermore, like the CAIC, a penalty against small samples is included. BIC was derived by Raftery (1995). Roughly, the BIC is the log of a Bayes factor of the target model compared to the saturated model.

Determinants of which indices to use

Many other indices have also been developed. These indices include the GFI, AGFI, FMIN, noncentrality parameter, and centrality index. The GFI and, to a lesser extent, the FMIN used to be very popular, but their use has dwindled recently.

Some indices are especially sensitive to sample size. For example, fit indices overestimate the fit when the sample size is small, below 200, for example. Nevertheless, RMSEA and CFI seem to be less sensitive to sample size (Fan, Thompson, & Wang, 1999). For further information, see Comprehensive summary of SEM.

References

Anderson, J. C., & Gerbing, D. W. (1984). The effect of sampling error on convergence, improper solutions and goodness-of-fit indices for maximum likelihood confirmatory factor analysis. Psychometrika, 49, 155-173.
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238-246.
Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588-606.
Bentler, P. M., & Mooijaart, A. (1989). Choice of structural model via parsimony: A rationale based on precision. Psychological Bulletin, 106, 315-317.
Bollen, K. A. (1989). Structural equations with latent variables. NY: Wiley.
Bollen, K. A. (1990). Overall fit in covariance structure models: Two types of sample size effects. Psychological Bulletin, 107, 256-259.
Browne, M. W., & Cudeck, R. (1989). Single sample cross-validation indices for covariance structures. Multivariate Behavioral Research, 24, 445-455.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136-162). Newbury Park, CA: Sage.
Burnham, K. P., & Anderson, D. R. (1998). Model selection and inference: A practical information-theoretic approach. New York: Springer-Verlag.
Byrne, B. M. (1994). Structural equation modeling with EQS and EQS/Windows. Thousand Oaks, CA: Sage Publications.
Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9, 233-255.
Fan, X., Thompson, B., & Wang, L. (1999). Effects of sample size, estimation method, and model specification on structural equation modeling fit indexes. Structural Equation Modeling, 6, 56-83.
Hipp, J. R., & Bollen, K. A. (2003). Model fit in structural equation models with censored, ordinal, and dichotomous variables: Testing vanishing tetrads. Sociological Methodology, 33, 267-305.
Hu, L. T., & Bentler, P. M. (1995). Evaluating model fit. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 76-99). Thousand Oaks, CA: Sage.
Jaccard, J., & Wan, C. K. (1996). LISREL approaches to interaction effects in multiple regression. Thousand Oaks, CA: Sage Publications.
Joreskog, K. G. (1993). Testing structural equation models. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 294-316). Newbury Park, CA: Sage.
Kline, R. B. (1998). Principles and practice of structural equation modeling. NY: Guilford Press.
Marsh, H. W., Balla, J. R., & Hau, K. T. (1996). An evaluation of incremental fit indexes: A clarification of mathematical and empirical properties. In G. A. Marcoulides & R. E. Schumacker (Eds.), Advanced structural equation modeling techniques (pp. 315-353). Mahwah, NJ: Lawrence Erlbaum.
Marsh, H. W., Balla, J. R., & McDonald, R. P. (1988). Goodness-of-fit indexes in confirmatory factor analysis: The effect of sample size. Psychological Bulletin, 103, 391-410.
Marsh, H. W., & Hau, K. T. (1996). Assessing goodness of fit: Is parsimony always desirable? Journal of Experimental Education, 64, 364-390.
Raftery, A. E. (1995). Bayesian model selection in social research. In A. E. Raftery (Ed.) (pp. 111-164). Oxford: Blackwell.
Raykov, T. (2000). On the large-sample bias, variance, and mean squared error of the conventional noncentrality parameter estimator of covariance structure models. Structural Equation Modeling, 7, 431-441.
Raykov, T. (2005). Bias-corrected estimation of noncentrality parameters of covariance structure models. Structural Equation Modeling, 12, 120-129.
Schumacker, R. E., & Lomax, R. G. (2004). A beginner's guide to structural equation modeling (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Steiger, J. H. (1990). Structural model evaluation and modification: An interval estimation approach. Multivariate Behavioral Research, 25, 173-180.
Steiger, J. H. (2000). Point estimation, hypothesis testing and interval estimation using the RMSEA: Some comments and a reply to Hayduk and Glaser. Structural Equation Modeling, 7, 149-162.
Tucker, L. R., & Lewis, C. (1973). The reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38, 1-10.
Ullman, J. B. (2001). Structural equation modeling. In B. G. Tabachnick & L. S. Fidell, Using multivariate statistics (4th ed., pp. 653-771). Needham Heights, MA: Allyn & Bacon.

Created by Dr Simon Moss on 27/04/2009