NALDC Record Details:
Using R² to compare least-squares fit models: When it must fail
R² can be used correctly to select from among competing least-squares fit models when the data are fitted in common form and with common weighting. However, when models are compared by fitting data that have been mathematically transformed in different ways, R² is a flawed statistic, even when the data are properly weighted in accord with the transformations. The reason is that in its most commonly used form, R² can be expressed in terms of the excess variance (s²) and the total variance in y (s_y²). The first of these is either invariant or approximately so with proper weighting, but the second can vary substantially under data transformations. When given data are analyzed “as is” with different models and fixed weights, s_y² remains constant and R² is a valid statistic. In that case, however, s², and χ² in weighted fitting, are arguably better metrics for such comparisons.
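The abstract does not spell out the formula. As a minimal sketch, assuming the usual definition R² = 1 − SS_res/SS_tot with s² = SS_res/(n − p) and s_y² = SS_tot/(n − 1), R² can be written as 1 − (n − p)s² / ((n − 1)s_y²). The Python example below uses synthetic, purely illustrative data and unweighted straight-line fits (a simplification; the article also treats weighted fitting) to show how s_y² changes when y is replaced by ln y. The helper name `r2_parts` and all data values are assumptions made for illustration, not taken from the article.

```python
import numpy as np

def r2_parts(y, y_fit, n_params):
    """Return (R^2, s^2, s_y^2) for an unweighted fit with n_params parameters."""
    y = np.asarray(y, dtype=float)
    y_fit = np.asarray(y_fit, dtype=float)
    n = y.size
    ss_res = np.sum((y - y_fit) ** 2)      # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)   # total sum of squares about the mean
    s2 = ss_res / (n - n_params)           # excess (residual) variance
    sy2 = ss_tot / (n - 1)                 # total variance in y
    r2 = 1.0 - (n - n_params) * s2 / ((n - 1) * sy2)
    return r2, s2, sy2

# Synthetic data (hypothetical): exponential trend with a small
# deterministic perturbation standing in for noise.
x = np.linspace(0.0, 4.0, 9)
y = 2.0 * np.exp(0.8 * x) * (1.0 + 0.05 * np.cos(7.0 * x))

# Fit A: straight line fitted to y "as is".
slope_a, icpt_a = np.polyfit(x, y, 1)
r2_a, s2_a, sy2_a = r2_parts(y, slope_a * x + icpt_a, 2)

# Fit B: straight line fitted to the transformed data ln(y).
z = np.log(y)
slope_b, icpt_b = np.polyfit(x, z, 1)
r2_b, s2_b, sy2_b = r2_parts(z, slope_b * x + icpt_b, 2)

print(f"as is : R^2 = {r2_a:.4f}, s_y^2 = {sy2_a:.4g}")
print(f"ln(y) : R^2 = {r2_b:.4f}, s_y^2 = {sy2_b:.4g}")
```

The two R² values are computed against different denominators, because s_y² for y and for ln y are different quantities. That is the reason the abstract gives for why comparing R² across differently transformed data is unreliable.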
Bolster, Carl H.
Chemometrics and intelligent laboratory systems 2011 Feb. 15, v. 105, no. 2
Journal Articles, USDA Authors, Peer-Reviewed
Works produced by employees of the U.S. Government as part of their official duties are not copyrighted within the U.S. The content of this document is not copyrighted.
Agricultural Research Service