
Question 11.4: To study the association of the monthly average temperature ......

To study the association of the monthly average temperature (in °C, X) and hotel occupation (in %, Y ), we consider data from three cities: Polenca (Mallorca, Spain) as a summer holiday destination, Davos (Switzerland) as a winter skiing destination, and Basel (Switzerland) as a business destination.

(a) Interpret the following regression model output where the outcome is “hotel occupation” and “temperature” is the covariate.

(b) Interpret the following output where “city” is treated as a covariate and “hotel occupation” is the outcome.

(c) Interpret the following output and compare it with the output from b):

(d) In the following multiple linear regression model, both “city” and “temperature” are treated as covariates. How can the coefficients be interpreted?

(e) Now consider the regression model for hotel occupation and temperature fitted separately for each city: How can the results be interpreted and what are the implications with respect to the models estimated in (a)–(d)? How can the models be improved?

(f) Describe what the design matrix will look like if city, temperature, and the interaction between them are included in a regression model.
(g) If the model described in (f) is fitted the output is as follows:

Interpret the results.

(h) Summarize the association of temperature and hotel occupation by city— including 95% confidence intervals—using the interaction model. The covariance matrix is as follows:

Data (X = monthly average temperature in °C, Y = hotel occupation in %):

Month    Davos        Polenca      Basel
          X    Y       X    Y       X    Y
Jan      -6   91      10   13       1   23
Feb      -5   89      10   21       0   82
Mar       2   76      14   42       5   40
Apr       4   52      17   64       9   45
May       7   42      22   79      14   39
Jun      15   36      24   81      20   43
Jul      17   37      26   86      23   50
Aug      19   39      27   92      24   95
Sep      13   26      22   36      21   64
Oct       9   27      19   23      14   78
Nov       4   68      14   13       9    9
Dec       0   92      12   41       4   12

Output for (a):

            Estimate Std. Error t value Pr(>|t|)
(Intercept) 50.33459    7.81792   6.438 2.34e-07 ***
X            0.07717    0.51966   0.149    0.883

Output for (b):

            Estimate Std. Error t value Pr(>|t|)
(Intercept)  48.3333     7.9457   6.083 7.56e-07 ***
cityDavos     7.9167    11.2369   0.705    0.486
cityPolenca   0.9167    11.2369   0.082    0.935

Output for (c):

Analysis of Variance Table
Response: Y
          Df  Sum Sq Mean Sq F value Pr(>F)
city       2   450.1  225.03   0.297  0.745
Residuals 33 25001.2  757.61

Output for (d):

            Estimate Std. Error t value Pr(>|t|)
(Intercept)  44.1731    10.9949   4.018 0.000333 ***
X             0.3467     0.6258   0.554 0.583453
cityDavos     9.7946    11.8520   0.826 0.414692
cityPolenca  -1.1924    11.9780  -0.100 0.921326

Output for (e):

Davos:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  73.9397     4.9462  14.949 3.61e-08 ***
X            -2.6870     0.4806  -5.591 0.000231 ***

Polenca:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -22.6469    16.7849  -1.349  0.20701
X             3.9759     0.8831   4.502  0.00114 **

Basel:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   32.574     13.245   2.459   0.0337 *
X              1.313      0.910   1.443   0.1796

Output for (g):

              Estimate Std. Error t value Pr(>|t|)
(Intercept)    32.5741    10.0657   3.236 0.002950 **
X               1.3133     0.6916   1.899 0.067230 .
cityDavos      41.3656    12.4993   3.309 0.002439 **
cityPolenca   -55.2210    21.0616  -2.622 0.013603 *
X:cityDavos    -4.0003     0.9984  -4.007 0.000375 ***
X:cityPolenca   2.6626     1.1941   2.230 0.033388 *

Covariance matrix for (h):

           (Int.)      X    Davos  Polenca  X:Davos  X:Polenca
(Int.)     101.31  -5.73  -101.31  -101.31     5.73       5.73
X           -5.73   0.47     5.73     5.73    -0.47      -0.47
Davos     -101.31   5.73   156.23   101.31    -9.15      -5.73
Polenca   -101.31   5.73   101.31   443.59    -5.73     -22.87
X:Davos      5.73  -0.47    -9.15    -5.73     0.99       0.47
X:Polenca    5.73  -0.47    -5.73   -22.87     0.47       1.42

Step-by-Step

(a) The point estimate of β suggests that hotel occupation increases by 0.077 percentage points for each one-degree increase in temperature. However, the null hypothesis β = 0 cannot be rejected because p = 0.883 > 0.05. The pooled model therefore shows no evidence of an association between temperature and hotel occupation.
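
As a rough check of (a), the pooled least-squares line can be recomputed from the data table. The sketch below uses only the Python standard library; the helper name `simple_ols` is an illustrative choice, not something from the original solution, and the cities are pooled in the order Davos, Polenca, Basel.

```python
# Temperature (X, deg C) and hotel occupation (Y, %) from the data table,
# pooled over the three cities in the order Davos, Polenca, Basel.
x = [-6, -5, 2, 4, 7, 15, 17, 19, 13, 9, 4, 0,
     10, 10, 14, 17, 22, 24, 26, 27, 22, 19, 14, 12,
     1, 0, 5, 9, 14, 20, 23, 24, 21, 14, 9, 4]
y = [91, 89, 76, 52, 42, 36, 37, 39, 26, 27, 68, 92,
     13, 21, 42, 64, 79, 81, 86, 92, 36, 23, 13, 41,
     23, 82, 40, 45, 39, 43, 50, 95, 64, 78, 9, 12]


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


intercept, slope = simple_ols(x, y)
print(f"intercept = {intercept:.3f}, slope = {slope:.4f}")
```

The printed values should agree with the reported estimates 50.33 and 0.077 up to rounding. The standard error and p-value are not recomputed here.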

(b) The average hotel occupation is higher in Davos (by 7.9 percentage points) and in Polenca (by 0.9 percentage points) than in Basel (the reference category). However, these differences are not significant: neither H _{0} : β _{Davos} = 0 (p = 0.486) nor H _{0} : β _{Polenca} = 0 (p = 0.935) can be rejected.

(c) The analysis of variance table tells us that the null hypothesis of equal average hotel occupation in the three cities (β _{1} = β _{2} = 0) cannot be rejected (F = 0.297, p = 0.745). Note that in this example the overall F-test of the model in (b) would have given us the same result.
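
The sums of squares in the ANOVA table can be reproduced by hand. The following is a minimal sketch in plain Python, assuming the occupation values from the data table; the variable names are illustrative.

```python
# Hotel occupation (Y, %) per city, copied from the data table.
occupation = {
    "Davos":   [91, 89, 76, 52, 42, 36, 37, 39, 26, 27, 68, 92],
    "Polenca": [13, 21, 42, 64, 79, 81, 86, 92, 36, 23, 13, 41],
    "Basel":   [23, 82, 40, 45, 39, 43, 50, 95, 64, 78, 9, 12],
}

all_y = [v for ys in occupation.values() for v in ys]
grand_mean = sum(all_y) / len(all_y)
city_mean = {city: sum(ys) / len(ys) for city, ys in occupation.items()}

# Between-city and within-city (residual) sums of squares.
ss_between = sum(len(ys) * (city_mean[c] - grand_mean) ** 2
                 for c, ys in occupation.items())
ss_within = sum((v - city_mean[c]) ** 2
                for c, ys in occupation.items() for v in ys)

df_between = len(occupation) - 1          # 3 cities - 1 = 2
df_within = len(all_y) - len(occupation)  # 36 observations - 3 = 33
f_value = (ss_between / df_between) / (ss_within / df_within)

print(f"SS city = {ss_between:.1f}, SS residual = {ss_within:.1f}, F = {f_value:.3f}")
```

The p-value would additionally require the F distribution with 2 and 33 degrees of freedom, which the standard library does not provide, so it is left out of this sketch.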

(d) In the multiple regression model, the main conclusions of (a) and (b) do not change: testing H _{0} : β _{j} = 0 never leads to the rejection of the null hypothesis. We cannot show an association between temperature and hotel occupation (given the city), and we cannot show an association between city and hotel occupation (given the temperature).
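
The coefficients of the model in (d) can be recovered by solving the normal equations (X'X)β = X'y. The sketch below does this with a small Gauss-Jordan solver in plain Python; `solve` and `ols` are hypothetical helper names introduced for illustration, and Basel is taken as the reference category, as in the output.

```python
# (X, Y) per city, copied from the data table.
data = {
    "Davos":   ([-6, -5, 2, 4, 7, 15, 17, 19, 13, 9, 4, 0],
                [91, 89, 76, 52, 42, 36, 37, 39, 26, 27, 68, 92]),
    "Polenca": ([10, 10, 14, 17, 22, 24, 26, 27, 22, 19, 14, 12],
                [13, 21, 42, 64, 79, 81, 86, 92, 36, 23, 13, 41]),
    "Basel":   ([1, 0, 5, 9, 14, 20, 23, 24, 21, 14, 9, 4],
                [23, 82, 40, 45, 39, 43, 50, 95, 64, 78, 9, 12]),
}


def solve(a, b):
    """Solve a * beta = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


# Columns: intercept, temperature, Davos dummy, Polenca dummy.
rows, y = [], []
for city, (xs, ys) in data.items():
    for xi, yi in zip(xs, ys):
        rows.append([1, xi, int(city == "Davos"), int(city == "Polenca")])
        y.append(yi)

beta = ols(rows, y)
for name, b in zip(["(Intercept)", "X", "cityDavos", "cityPolenca"], beta):
    print(f"{name:12s} {b:9.4f}")
```

Solving the normal equations directly is adequate for a small, well-conditioned problem like this one; statistical software typically uses a QR decomposition instead.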

(e) Stratifying the data yields considerably different results compared to (a)–(d): In Davos, where tourists go skiing, each increase of 1 °C is associated with a drop in hotel occupation of 2.7 percentage points. The estimate \hat{\beta } ≈ −2.7 is significantly different from zero (p = 0.000231 < 0.05). In Polenca, a summer holiday destination, an increase of 1 °C is associated with an increase in hotel occupation of almost 4 percentage points. This estimate is also significantly different from zero (p = 0.00114 < 0.05). In Basel, a business destination, hotel occupation is somewhat higher at higher temperatures (\hat{\beta } = 1.3); however, the estimate is not significantly different from zero (p = 0.1796). While there is no overall association between temperature and hotel occupation (see (a) and (d)), there is an association if one looks at the cities separately. This suggests that an interaction between temperature and city should be included in the model.
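
The stratified fits in (e) are three separate simple regressions. A minimal sketch in plain Python, assuming the data table above (the helper `slope_and_intercept` is an illustrative name):

```python
# (X, Y) per city, copied from the data table.
data = {
    "Davos":   ([-6, -5, 2, 4, 7, 15, 17, 19, 13, 9, 4, 0],
                [91, 89, 76, 52, 42, 36, 37, 39, 26, 27, 68, 92]),
    "Polenca": ([10, 10, 14, 17, 22, 24, 26, 27, 22, 19, 14, 12],
                [13, 21, 42, 64, 79, 81, 86, 92, 36, 23, 13, 41]),
    "Basel":   ([1, 0, 5, 9, 14, 20, 23, 24, 21, 14, 9, 4],
                [23, 82, 40, 45, 39, 43, 50, 95, 64, 78, 9, 12]),
}


def slope_and_intercept(x, y):
    """Least-squares slope and intercept for one city."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


fits = {city: slope_and_intercept(x, y) for city, (x, y) in data.items()}
for city, (slope, intercept) in fits.items():
    print(f"{city:8s} slope = {slope:7.4f}, intercept = {intercept:8.4f}")
```

The slopes change sign between Davos and the other two cities, which is why the pooled slope in (a) ends up close to zero.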

(f) The design matrix contains a column of 1’s (to model the intercept), the temperature, and two dummy variables for the categorical variable “city”, which has three categories (Basel is the reference category). The matrix also contains the two interaction terms: the product of temperature and the Davos dummy, and the product of temperature and the Polenca dummy. The matrix has 36 rows because there are 36 observations: 12 for each city (rows 1–12 for Davos, 13–24 for Polenca, and 25–36 for Basel).

\begin{array}{r|cccccc}
\text{Row} & \text{Int.} & \text{Temp.} & \text{Davos} & \text{Polenca} & \text{Temp.}\times\text{Davos} & \text{Temp.}\times\text{Polenca} \\ \hline
1 & 1 & -6 & 1 & 0 & -6 & 0 \\
2 & 1 & -5 & 1 & 0 & -5 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
12 & 1 & 0 & 1 & 0 & 0 & 0 \\
13 & 1 & 10 & 0 & 1 & 0 & 10 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
24 & 1 & 12 & 0 & 1 & 0 & 12 \\
25 & 1 & 1 & 0 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
36 & 1 & 4 & 0 & 0 & 0 & 0
\end{array}
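
The construction of this design matrix can be sketched in a few lines of plain Python. The column order (intercept, temperature, Davos, Polenca, temperature × Davos, temperature × Polenca) and the row order (Davos, Polenca, Basel) follow the matrix above; the variable names are illustrative.

```python
# Monthly temperatures (deg C) per city, January to December, from the data table.
temperature = {
    "Davos":   [-6, -5, 2, 4, 7, 15, 17, 19, 13, 9, 4, 0],
    "Polenca": [10, 10, 14, 17, 22, 24, 26, 27, 22, 19, 14, 12],
    "Basel":   [1, 0, 5, 9, 14, 20, 23, 24, 21, 14, 9, 4],
}

design = []
for city, temps in temperature.items():
    davos = int(city == "Davos")      # dummy: 1 for Davos rows, else 0
    polenca = int(city == "Polenca")  # dummy: 1 for Polenca rows, else 0
    for t in temps:
        # Interaction columns are the products of temperature and each dummy.
        design.append([1, t, davos, polenca, t * davos, t * polenca])

print(len(design), "rows x", len(design[0]), "columns")
for i in (0, 11, 12, 23, 24, 35):
    print(i + 1, design[i])
```

Basel rows have zeros in all four city-related columns because Basel is the reference category.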

(g) Both interaction terms are significantly different from zero (p = 0.000375 and p = 0.033388). The effect of temperature therefore differs by city, and the effect of city depends on temperature. For the reference city of Basel, the slope of hotel occupation on temperature is estimated as 1.31; for Davos it is 1.31 − 4.00 = −2.69 and for Polenca 1.31 + 2.66 = 3.97. Note that these point estimates are identical to those in (e), where we fitted three separate regressions. They are just summarized in a different way.

(h) From (g) it follows that the point estimates for β _{temperature} are 1.31 for Basel, −2.69 for Davos, and 3.97 for Polenca. Confidence intervals for these estimates can be obtained via (11.29):

\left(\hat{\beta}_{i}+\hat{\beta}_{j}\right) \pm t_{n-p-1;\,1-\alpha/2} \cdot \hat{\sigma}_{\left(\hat{\beta}_{i}+\hat{\beta}_{j}\right)}.

We calculate t _{n−p−1;1−α/2} = t _{36−5−1;0.975} = t _{30;0.975} = 2.04. With Var(β _{temp}) = 0.478 (obtained via 0.6916² from the model output or from the second row and second column of the covariance matrix), Var(β _{temp:Davos}) = 0.997, Var(β _{temp:Polenca}) = 1.43, Cov(β _{temp}, β _{temp:Davos}) = −0.48, and Cov(β _{temp}, β _{temp:Polenca}) = −0.48, we obtain:

\hat{\sigma}_{\left(\hat{\beta}_{temp}+\hat{\beta}_{temp:Davos}\right)} = \sqrt{0.478 + 0.997 - 2 \cdot 0.48} \approx 0.72,

\hat{\sigma}_{\left(\hat{\beta}_{temp}+\hat{\beta}_{temp:Polenca}\right)} = \sqrt{0.478 + 1.43 - 2 \cdot 0.48} \approx 0.97,

\hat{\sigma}_{\hat{\beta}_{temp}} = \sqrt{0.478} \approx 0.69 \quad \text{(Basel is the reference category, so no interaction term enters)}.

The 95 % confidence intervals are therefore:

Davos: [−2.69 ± 2.04 · 0.72] ≈ [−4.2;−1.2],
Polenca: [3.97 ± 2.04 · 0.97] ≈ [2.0; 5.9],
Basel: [1.31 ± 2.04 · 0.69] ≈ [−0.1; 2.7].
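
The interval arithmetic can be retraced with a short script. This is a sketch under the same rounded inputs as the solution: the variances, the covariance of −0.48, and the quantile 2.04 are typed in as given above rather than recomputed, and the function name `slope_ci` is an illustrative choice.

```python
import math

t_crit = 2.04                    # t quantile with 30 df at 0.975, as used above
var_temp = 0.478                 # Var of the temperature coefficient (0.6916 squared)
var_inter = {"Davos": 0.997, "Polenca": 1.43}  # Var of the interaction coefficients
cov_temp_inter = -0.48           # Cov(temperature, interaction), same for both cities
slopes = {"Davos": -2.69, "Polenca": 3.97, "Basel": 1.31}


def slope_ci(city):
    """95% confidence interval for the temperature slope in the given city."""
    if city == "Basel":
        # Reference category: the slope is the temperature coefficient alone.
        variance = var_temp
    else:
        # Var(b_i + b_j) = Var(b_i) + Var(b_j) + 2 Cov(b_i, b_j)
        variance = var_temp + var_inter[city] + 2 * cov_temp_inter
    half_width = t_crit * math.sqrt(variance)
    return slopes[city] - half_width, slopes[city] + half_width


for city in slopes:
    lower, upper = slope_ci(city)
    print(f"{city:8s} [{lower:6.2f}; {upper:5.2f}]")
```

Because the script keeps the standard errors unrounded, its limits can differ from the intervals above in the last displayed digit.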
