$$\sum_{i=1}^{n} \left(E\{\hat{Y}_i\} - \mu_i\right)^2 + \sum_{i=1}^{n} \sigma^2\{\hat{Y}_i\}$$
which is composed of a bias component (the first sum) and a variance component (the second sum).
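The criterion measure $\Gamma_p$ scales this quantity by $\sigma^2$:
$$\Gamma_p = \frac{1}{\sigma^2} \left[ \sum_{i=1}^{n} \left(E\{\hat{Y}_i\} - \mu_i\right)^2 + \sum_{i=1}^{n} \sigma^2\{\hat{Y}_i\} \right]$$
Note that $\sigma^2$ is unknown. Assuming that the model that includes all $P-1$ potential X variables is such that $MSE(X_1, \ldots, X_{P-1})$ is an unbiased estimator of $\sigma^2$, it can be shown that $\Gamma_p$ can be estimated by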
$$C_p = \frac{SSE_p}{MSE(X_1, \ldots, X_{P-1})} - (n - 2p)$$
where $SSE_p$ is the SSE for the model with $p-1$ X variables and $MSE(X_1, \ldots, X_{P-1})$ is the MSE for the model with all $P-1$ X variables. It can be shown that when there is no bias in the model with $p-1$ X variables, then
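$$E\{C_p\} \approx p$$
This follows because $E\{SSE_p\} = (n-p)\sigma^2$ for an unbiased model, so when $MSE(X_1, \ldots, X_{P-1})$ is close to $\sigma^2$, $C_p \approx (n-p) - (n-2p) = p$. Thus when $C_p$ values are plotted against $p$, unbiased models will fall near the line $C_p = p$.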
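The calculation of Cp for every subset of a small pool of predictors can be sketched as follows. This is only an illustration on simulated data: the helper names `sse` and `mallows_cp`, the data-generating model, and the use of NumPy's least-squares routine are assumptions made for the example, not part of the course material.

```python
import itertools
import numpy as np

def sse(X, y):
    """Error sum of squares from a least-squares fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def mallows_cp(X_pool, y, cols):
    """Cp for the subset model that uses predictor columns `cols` of X_pool.

    X_pool holds the P-1 potential X variables (no intercept column);
    an intercept is added to every model, so p = len(cols) + 1.
    """
    n, n_pool = X_pool.shape
    ones = np.ones((n, 1))
    P = n_pool + 1
    # MSE of the model with all P-1 potential X variables.
    mse_full = sse(np.hstack([ones, X_pool]), y) / (n - P)
    p = len(cols) + 1
    sse_p = sse(np.hstack([ones, X_pool[:, list(cols)]]), y)
    return sse_p / mse_full - (n - 2 * p)

# Simulated data (assumption): Y depends on X1 and X2 but not on X3.
rng = np.random.default_rng(0)
n = 50
X_pool = rng.normal(size=(n, 3))
y = 1.0 + 2.0 * X_pool[:, 0] - 1.0 * X_pool[:, 1] + rng.normal(size=n)

for k in range(X_pool.shape[1] + 1):
    for cols in itertools.combinations(range(X_pool.shape[1]), k):
        cp = mallows_cp(X_pool, y, cols)
        print(f"p = {k + 1}, columns {cols}: Cp = {cp:.2f}")
```

For the model containing all P-1 variables, Cp equals P exactly, because SSE divided by MSE is n - P, and (n - P) - (n - 2P) = P. Subsets that omit an important variable should show Cp well above p.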
$$SSE_p = \sum (Y_i - \hat{Y}_i)^2$$
$$PRESS_p = \sum (Y_i - \hat{Y}_{i(i)})^2$$
where the sums are for $i = 1$ to $n$. The difference is that in $PRESS_p$, $Y_i$ is compared to its predicted value from a regression from which observation $i$ was excluded, so that $\hat{Y}_{i(i)}$ is the "deleted predictor" (by analogy with the "deleted residual" of regression diagnostics). In fact, $SSE_p$ and $PRESS_p$ can also be written
$$SSE_p = \sum e_i^2$$
$$PRESS_p = \sum d_i^2$$
that is, $PRESS_p$ is the sum of the squared external residuals (or deleted residuals) $d_i$ discussed in Module 10 on diagnostics for outliers and influential cases.
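As a rough illustration, PRESS can be computed directly from its definition by refitting the regression n times, each time leaving one observation out. The sketch below also checks the result against the standard identity d_i = e_i / (1 - h_ii), where h_ii is the i-th diagonal element of the hat matrix; that identity is not stated in this section and is brought in here as an assumption of the example. The function names and the simulated data are likewise hypothetical.

```python
import numpy as np

def press_loo(X, y):
    """PRESS by brute force: refit n times, each time excluding case i."""
    n = len(y)
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        d_i = y[i] - X[i] @ beta  # deleted residual for case i
        total += d_i ** 2
    return float(total)

def press_hat(X, y):
    """PRESS from a single fit, using d_i = e_i / (1 - h_ii)."""
    H = X @ np.linalg.pinv(X)  # hat matrix X (X'X)^-1 X' for full-rank X
    e = y - H @ y              # ordinary residuals
    d = e / (1.0 - np.diag(H))
    return float(d @ d)

# Simulated data (assumption): intercept plus two X variables.
rng = np.random.default_rng(1)
n = 30
X = np.hstack([np.ones((n, 1)), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=n)

print("PRESS (leave-one-out refits):", press_loo(X, y))
print("PRESS (hat-matrix shortcut): ", press_hat(X, y))
```

The two values should agree up to rounding error. Because each |d_i| is at least as large as |e_i|, PRESS is never smaller than SSE for the same model.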