The Davis data set in the carData
package contains the measured and self-reported heights and weights of
200 men and women engaged in regular exercise. A few of the data values
are missing, and consequently there are only 183 complete cases for the
variables that are used in the analysis reported below.
First, take a quick look at the data:
library(carData)
summary(Davis)
##  sex         weight          height          repwt            repht
##  F:112   Min.   : 39.0   Min.   : 57.0   Min.   : 41.00   Min.   :148.0
##  M: 88   1st Qu.: 55.0   1st Qu.:164.0   1st Qu.: 55.00   1st Qu.:160.5
##          Median : 63.0   Median :169.5   Median : 63.00   Median :168.0
##          Mean   : 65.8   Mean   :170.0   Mean   : 65.62   Mean   :168.5
##          3rd Qu.: 74.0   3rd Qu.:177.2   3rd Qu.: 73.50   3rd Qu.:175.0
##          Max.   :166.0   Max.   :197.0   Max.   :124.00   Max.   :200.0
##                                          NA's   :17       NA's   :17
Davis data and explain what they represent
We focus here on the regression of weight on repwt. This problem has response \(y = \text{weight}\) and one predictor, repwt, from which we obtain the regressor \(x_1 = \text{repwt}\). We again construct the design matrix and response vector first.
X <- as.matrix(cbind(1, Davis[, "repwt"])) # column of 1s for the intercept, then repwt
Y <- as.matrix(Davis[, "weight"])
# Keep only the rows that are complete in both X and Y
# (read the documentation of the function complete.cases())
complete.rows <- complete.cases(X, Y)
# Subset X and Y; drop = FALSE keeps Y as a one-column matrix
X <- X[complete.rows, ]
Y <- Y[complete.rows, , drop = FALSE]
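With the design matrix and response vector restricted to complete rows, the least-squares coefficients are usually obtained from the normal equations, \(\hat\beta = (X^\top X)^{-1} X^\top Y\). The sketch below illustrates that calculation and a comparison against lm(). It assumes this is the hand calculation intended here, and it uses a small made-up data frame (toy, with columns y and x1) rather than the Davis values, so the numbers it produces are purely illustrative.

```r
# Hypothetical toy data (NOT the Davis values): one response, one predictor,
# with a missing predictor value in the third row.
toy <- data.frame(
  y  = c(2.1, 3.9, 6.2, 7.8, 10.1),
  x1 = c(1, 2, NA, 4, 5)
)

# Design matrix with an intercept column, and the response as a column matrix
X <- cbind(1, toy$x1)
Y <- as.matrix(toy$y)

# Keep only the rows that are complete in both X and Y
keep <- complete.cases(X, Y)
X <- X[keep, , drop = FALSE]
Y <- Y[keep, , drop = FALSE]

# Solve the normal equations (X'X) b = X'Y without forming an explicit inverse
beta_hat <- solve(crossprod(X), crossprod(X, Y))

# lm() omits rows with missing values under the default na.action,
# so its coefficients should agree with the matrix calculation
fit <- lm(y ~ x1, data = toy)
all.equal(as.numeric(beta_hat), unname(coef(fit)))
```

Solving the linear system with solve(A, b) is generally preferred over computing solve(A) %*% b, since it avoids forming the inverse explicitly. The same steps apply unchanged to the X and Y built from Davis above.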
lm() function and check your calculations above
Mandel to regress y on x1.