Let’s read in the Davis data.
davis.data <- read.table("./Davis.txt")
davis.data[12, "height"] <- 166
davis.data[12, "weight"] <- 57
# remove rows with missing data
davis.data <- davis.data[complete.cases(davis.data), ]
# take a look at the data
davis.data
## sex weight height reportedWeight reportedHeight
## 1 M 77 182 77 180
## 2 F 58 161 51 159
## 3 F 53 161 54 158
## 4 M 68 177 70 175
## 5 F 59 157 59 155
## 6 M 76 170 76 165
## 7 M 76 167 77 165
## 8 M 69 186 73 180
## 9 M 71 178 71 175
## 10 M 65 171 64 170
## 11 M 70 175 75 174
## 12 F 57 166 56 163
## 13 F 51 161 52 158
## 14 F 64 168 64 165
## 15 F 52 163 57 160
## 16 F 65 166 66 165
## 17 M 92 187 101 185
## 18 F 62 168 62 165
## 19 M 76 197 75 200
## 20 F 61 175 61 171
## 21 M 119 180 124 178
## 22 F 61 170 61 170
## 23 M 65 175 66 173
## 24 M 66 173 70 170
## 25 F 54 171 59 168
## 26 F 50 166 50 165
## 27 F 63 169 61 168
## 28 F 58 166 60 160
## 29 F 39 157 41 153
## 30 M 101 183 100 180
## 31 F 71 166 71 165
## 32 M 75 178 73 175
## 33 M 79 173 76 173
## 34 F 52 164 52 161
## 35 F 68 169 63 170
## 36 M 64 176 65 175
## 37 F 56 166 54 165
## 38 M 69 174 69 171
## 39 M 88 178 86 175
## 40 M 65 187 67 188
## 41 F 54 164 53 160
## 42 M 80 178 80 178
## 43 F 63 163 59 159
## 44 M 78 183 80 180
## 45 M 85 179 82 175
## 46 F 54 160 55 158
## 49 F 54 174 56 173
## 50 F 75 162 75 158
## 51 M 82 182 85 183
## 52 F 56 165 57 163
## 53 M 74 169 73 170
## 54 M 102 185 107 185
## 56 M 65 176 64 172
## 58 M 73 183 74 180
## 59 M 75 172 70 169
## 60 M 57 173 58 170
## 61 M 68 165 69 165
## 62 M 71 177 71 170
## 63 M 71 180 76 175
## 64 F 78 173 75 169
## 65 M 97 189 98 185
## 66 F 60 162 59 160
## 67 F 64 165 63 163
## 68 F 64 164 62 161
## 69 F 52 158 51 155
## 70 M 80 178 76 175
## 71 F 62 175 61 171
## 72 M 66 173 66 175
## 73 F 55 165 54 163
## 74 F 56 163 57 159
## 75 F 50 166 50 161
## 77 F 50 160 55 150
## 78 F 63 160 64 158
## 79 M 69 182 70 180
## 80 M 69 183 70 183
## 81 F 61 165 60 163
## 82 M 55 168 56 170
## 83 F 53 169 52 175
## 84 F 60 167 55 163
## 85 F 56 170 56 170
## 86 M 59 182 61 183
## 87 M 62 178 66 175
## 88 F 53 165 53 165
## 89 F 57 163 59 160
## 90 F 57 162 56 160
## 91 M 70 173 68 170
## 92 F 56 161 56 161
## 93 M 84 184 86 183
## 94 M 69 180 71 180
## 95 M 88 189 87 185
## 96 F 56 165 57 160
## 97 M 103 185 101 182
## 98 F 50 169 50 165
## 99 F 52 159 52 153
## 101 F 55 164 55 163
## 102 M 63 178 63 175
## 103 F 47 163 47 160
## 104 F 45 163 45 160
## 105 F 62 175 63 173
## 106 F 53 164 51 160
## 107 F 52 152 51 150
## 108 F 57 167 55 164
## 109 F 64 166 64 165
## 110 F 59 166 55 163
## 111 M 84 183 90 183
## 112 M 79 179 79 171
## 113 F 55 174 57 171
## 114 M 67 179 67 179
## 115 F 76 167 77 165
## 116 F 62 168 62 163
## 117 M 83 184 83 181
## 118 M 96 184 94 183
## 119 M 75 169 76 165
## 120 M 65 178 66 178
## 121 M 78 178 77 175
## 122 M 69 167 73 165
## 123 F 68 178 68 175
## 124 F 55 165 55 163
## 128 F 45 157 45 153
## 129 F 68 171 68 169
## 130 F 44 157 44 155
## 131 F 62 166 61 163
## 132 M 87 185 89 185
## 133 F 56 160 53 158
## 134 F 50 148 47 148
## 135 M 83 177 84 175
## 136 F 53 162 53 160
## 137 F 64 172 62 168
## 139 M 90 188 91 185
## 140 M 85 191 83 188
## 141 M 66 175 68 175
## 142 F 52 163 53 160
## 143 F 53 165 55 163
## 144 F 54 176 55 176
## 145 F 64 171 66 171
## 146 F 55 160 55 155
## 147 F 55 165 55 165
## 148 F 59 157 55 158
## 149 F 70 173 67 170
## 150 M 88 184 86 183
## 151 F 57 168 58 165
## 152 F 47 162 47 160
## 153 F 47 150 45 152
## 155 F 48 163 44 160
## 156 M 54 169 58 165
## 157 M 69 172 68 174
## 160 F 57 167 56 165
## 161 F 51 163 50 160
## 162 F 54 161 54 160
## 163 F 53 162 52 158
## 164 F 59 172 58 171
## 165 M 56 163 58 161
## 166 F 59 159 59 155
## 167 F 63 170 62 168
## 168 F 66 166 66 165
## 169 M 96 191 95 188
## 170 F 53 158 50 155
## 171 M 76 169 75 165
## 173 M 61 170 61 170
## 175 M 62 168 64 168
## 176 M 71 178 68 178
## 178 M 66 170 67 165
## 179 M 81 178 82 175
## 180 M 68 174 68 173
## 181 M 80 176 78 175
## 184 F 63 165 59 160
## 185 M 70 173 70 173
## 186 F 56 162 56 160
## 187 F 60 172 55 168
## 188 F 58 169 54 166
## 189 M 76 183 75 180
## 190 F 50 158 49 155
## 191 M 88 185 93 188
## 192 M 89 173 86 173
## 193 F 59 164 59 165
## 194 F 51 156 51 158
## 195 F 62 164 61 161
## 196 M 74 175 71 175
## 197 M 83 180 80 180
## 199 M 90 181 91 178
## 200 M 79 177 81 178
You need to fill out the following sections for answering the questions.
Construct \(X\), \(X^TX\), and \(Y\).
# construct the design matrix
X <-
# assign names to X column
colnames(X) <- c("intercept", "reportedWeight")
# construct the Gramian matrix
gramian.mat <-
gramian.mat
# calculate the inverse of the Gramian matrix
gramian.mat.inv <- solve(gramian.mat)
# construct the Y matrix
Y <- as.matrix(davis.data[, "weight"])
Q1. Now let’s calculate the least square estimates
# Now calculate the LS estimate
beta.estimate <-
beta.estimate
Q2. Now calculate the variance-covariance matrix to get the standard error of \(\hat{\beta}_1\)
# calculate sum of squared errors (or residuals)
sse <-
sse
# calculate mean squared errors
mse <-
mse
# calculate the variance-covariance matrix for beta hat
sigma.matrix <-
sigma.matrix
# standard error for beta_1
sqrt(sigma.matrix[2, 2])
Q3. Hypothesis test of \(H_0: \beta_1 = 0\)
t(0.025, 179) critical value (the design matrix has 181 rows / 181 observations):
qt(1.0 - 0.025, 179) #take a look at the documentation of "qt"
# t statistic
t.statistic <-
Q4. Mean weight estimate given reportedWeight = 72:
a.vector <- c(1, 72)
mean.weight.est <-
mean.weight.est
Q5. standard error:
theta.se <- sqrt()
theta.se
Q6. 95% confidence interval:
confidence.interval <-
confidence.interval
Use lm
to check
(lmod <- lm(weight ~ reportedWeight, data = davis.data))
##
## Call:
## lm(formula = weight ~ reportedWeight, data = davis.data)
##
## Coefficients:
## (Intercept) reportedWeight
## 2.847 0.957
summary(lmod)
##
## Call:
## lm(formula = weight ~ reportedWeight, data = davis.data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.5029 -1.0943 -0.1374 1.0884 6.3465
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.84707 0.80817 3.523 0.000542 ***
## reportedWeight 0.95699 0.01204 79.472 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.235 on 179 degrees of freedom
## Multiple R-squared: 0.9724, Adjusted R-squared: 0.9723
## F-statistic: 6316 on 1 and 179 DF, p-value: < 2.2e-16
predict(lmod, newdata = data.frame("1" = 1, "reportedWeight" = 72), interval = "confidence")
## fit lwr upr
## 1 71.75025 71.38966 72.11084