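Suppose the correctly specified model is

$$y = X_1\beta_1 + X_2\beta_2 + \varepsilon$$

where $X_1$ includes $p_1$ variables (columns) and $X_2$ includes $p_2$ variables (columns). For simplicity, assume that a constant term is included in $X_1$.

If $X_2$ is omitted and $y$ is regressed on $X_1$ alone, the OLS estimator of $\beta_1$ is

$$b_1 = (X_1'X_1)^{-1}X_1'y$$

$$b_1 = (X_1'X_1)^{-1}X_1'(X_1\beta_1 + X_2\beta_2 + \varepsilon)$$ (replacing $y$ by its value in the correctly specified model)

$$b_1 = \beta_1 + (X_1'X_1)^{-1}X_1'X_2\beta_2 + (X_1'X_1)^{-1}X_1'\varepsilon$$ (multiplying out and simplifying, since $(X_1'X_1)^{-1}X_1'X_1 = I$)

$$E\{b_1\} = \beta_1 + (X_1'X_1)^{-1}X_1'X_2\beta_2$$ (taking expectations, because $E\{\varepsilon\} = 0$ and $X_1$ and $X_2$ are treated as fixed matrices)

Thus $b_1$ is biased by the additive term $(X_1'X_1)^{-1}X_1'X_2\beta_2$. What is the nature of this bias? Suppose for simplicity that the omitted matrix $X_2$ consists of a single variable (i.e., $X_2$ has one column, so that $\beta_2$ is a scalar). Then $(X_1'X_1)^{-1}X_1'X_2$ is equal to the $p_1 \times 1$ vector, call it $b_{12}$, of estimated coefficients of the regression of the omitted variable $X_2$ on the variables in $X_1$. Thus

$$E\{b_1\} = \beta_1 + b_{12}\beta_2$$

so that the bias is equal to the product of $\beta_2$ (the effect of $X_2$ on $y$ in the true regression model) and $b_{12}$, the estimated coefficients of the regression of $X_2$ on $X_1$. The bias vanishes only if $\beta_2 = 0$ or if $b_{12} = 0$, that is, if $X_2$ has no effect on $y$ or is uncorrelated in the sample with every variable in $X_1$.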
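
The result can be checked numerically. The following is a minimal sketch in Python with NumPy, not part of the original derivation: it holds X1 and X2 fixed, draws new errors repeatedly, estimates the misspecified regression of y on X1 each time, and compares the average estimate with the true coefficients plus b12 times the true effect of X2. The sample size, coefficient values, number of replications, and random seed are all arbitrary assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed (assumption)

n = 200  # sample size (assumption)

# Fixed design: X1 holds a constant and one regressor; X2 is a single
# omitted variable constructed to be correlated with X1.
x1 = rng.normal(size=n)
X1 = np.column_stack([np.ones(n), x1])
X2 = 0.5 + 0.8 * x1 + rng.normal(scale=0.5, size=n)

# True coefficients of the correctly specified model (illustrative values).
beta1 = np.array([1.0, 2.0])
beta2 = 3.0

# b12: estimated coefficients of the regression of X2 on X1.
b12 = np.linalg.lstsq(X1, X2, rcond=None)[0]

# Expected value of the misspecified estimator according to the derivation.
predicted_mean = beta1 + b12 * beta2

# Monte Carlo: redraw the errors, keep X1 and X2 fixed, regress y on X1 only.
reps = 2000
estimates = np.empty((reps, X1.shape[1]))
for r in range(reps):
    e = rng.normal(size=n)
    y = X1 @ beta1 + X2 * beta2 + e
    estimates[r] = np.linalg.lstsq(X1, y, rcond=None)[0]

simulated_mean = estimates.mean(axis=0)

print("true beta1:            ", beta1)
print("predicted E{b1}:       ", predicted_mean)
print("simulated mean of b1:  ", simulated_mean)
```

Under these assumptions the simulated mean of b1 should sit close to the predicted value and well away from the true coefficients. Setting `beta2` to zero, or constructing `X2` so that it is unrelated to `x1`, should make the gap shrink toward zero.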