14 - Inference for Linear Regression
Confidence Interval

Alex John Quijano

12/03/2021

Previously on Statistics…

Inference on Linear Regression

Today, we will discuss the following:

Baby Weights (1/2)

Original data: weight of baby as a linear model of mother’s age. Notice that the relationship between mage and weight is not as strong as the relationship we saw previously between weeks and weight.

Baby Weights (2/2)

The least squares estimates of the intercept and slope are given in the estimate column. The observed slope is 0.0361.
term         estimate  std.error  statistic  p.value
(Intercept)  6.2295    0.708      8.79       <0.0001
mage         0.0361    0.024      1.50       0.1362

The population model is: \(y_{weight} = \beta_0 + \beta_1 x_{age} + e\) where \(y\) is the response, \(x\) is the predictor, \(\beta_0\) is the intercept, \(\beta_1\) is the slope, and \(e\) is the error term.

The least squares regression model uses the data to find a sample linear fit: \(\hat{y}_{weight} = b_0 + b_1 x_{age}\), where \(b_0 = 6.2295\) and \(b_1 = 0.0361\).
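To make the fitted line concrete, here is a minimal sketch that plugs a mother's age into the sample fit using the estimates from the table. The function name `predicted_weight` and the example age of 30 are illustrative choices, not part of the original slides.

```python
# Estimates from the least squares output above.
B0 = 6.2295  # intercept (pounds)
B1 = 0.0361  # slope (pounds per year of mother's age)

def predicted_weight(mother_age):
    """Predicted average baby weight (pounds) from the sample linear fit."""
    return B0 + B1 * mother_age

# Hypothetical example: a 30-year-old mother.
print(f"{predicted_weight(30):.4f}")  # 7.3125

# Each additional year of age shifts the prediction by the slope b_1.
print(f"{predicted_weight(31) - predicted_weight(30):.4f}")  # 0.0361
```

The second print shows what the slope means in practice: the predicted weight changes by \(b_1\) pounds for each one-year increase in age.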

CI by Bootstrapping (1/3)

CI by Bootstrapping (2/3)

Repeated bootstrap resamples of size 100 are taken from the original data. Each bootstrapped linear model is slightly different.
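The resampling idea can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the data are synthetic stand-ins (the slides' actual data set is not reproduced here), the helper names are hypothetical, and the interval shown is a simple percentile interval, which is one common way to turn bootstrapped slopes into a confidence interval.

```python
import random

def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def bootstrap_slopes(xs, ys, n_boot, rng):
    """Refit the line on n_boot resamples of (x, y) pairs drawn with replacement."""
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample of the same size n
        _, slope = least_squares([xs[i] for i in idx], [ys[i] for i in idx])
        slopes.append(slope)
    return slopes

def percentile_interval(values, level=0.95):
    """Approximate middle `level` share of the values, read off the sorted list."""
    ordered = sorted(values)
    alpha = (1 - level) / 2
    lower = ordered[int(alpha * len(ordered))]
    upper = ordered[int((1 - alpha) * len(ordered)) - 1]
    return lower, upper

# Synthetic stand-in for the original sample of 100 births (an assumption).
rng = random.Random(14)
ages = [rng.uniform(15, 45) for _ in range(100)]
weights = [6.2 + 0.04 * age + rng.gauss(0, 1.2) for age in ages]

slopes = bootstrap_slopes(ages, weights, n_boot=1000, rng=rng)
lower, upper = percentile_interval(slopes)
print(f"95% bootstrap percentile interval for the slope: ({lower:.4f}, {upper:.4f})")
```

Pairs are resampled together so that each mother's age stays attached to her baby's weight; resampling the two columns separately would destroy the relationship being estimated.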

CI by Bootstrapping (3/3)

CI by Theoretical Model

We can now construct the confidence interval in the usual way:

\[ \begin{aligned} b_1 &\pm t_{98}^* \times SE \\ 0.0361 &\pm 1.98 \times 0.024 \\ &(-0.0114, 0.0836) \end{aligned} \]

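We are 95% confident that a one-year increase in mother's age (mage) is associated with a change in predicted average baby weight of between \(-0.0114\) and \(0.0836\) pounds. Because the interval contains 0, the data are consistent with no linear relationship between mage and weight, which agrees with the p-value of 0.1362 in the regression table.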
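The arithmetic behind this interval can be checked with a short sketch. It takes the slope estimate and standard error from the regression table and the critical value 1.98 as given; the variable names are illustrative.

```python
b1 = 0.0361     # slope estimate from the regression table
se_b1 = 0.024   # standard error of the slope
t_star = 1.98   # critical value for 95% confidence, taken as given

margin = t_star * se_b1
lower, upper = b1 - margin, b1 + margin
print(f"({lower:.4f}, {upper:.4f})")  # (-0.0114, 0.0836)

# The interval contains 0, so a slope of zero is a plausible value.
print(lower < 0 < upper)  # True
```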

10-Minute Activity

Consider the following least squares output.

Summary of least squares fit for the Elmhurst College data, where we are predicting the gift aid by the university based on the family income of students (n = 50).
term           estimate  std.error  statistic  p.value
(Intercept)    24319.33  1291.45    18.83      <0.0001
family_income  -0.04     0.01       -3.98      0.0002

The sample linear fit is \(\hat{y}_{aid} = b_0 + b_1 x_{income}\) where \(b_0 = 24319.33\), \(b_1 = -0.04\).

  1. Construct a 90% confidence interval for the true slope of the linear model and interpret it in context.

  2. Considering the p-value shown in the table, what is the hypothesis testing conclusion? What significance level did you use, and does it matter?

  3. Does your 90% confidence interval conclusion agree with your hypothesis testing conclusion?