Consider the joint PDF for the type of customer service X (0 = telephone hotline, 1 = email) and the satisfaction score Y (1 = unsatisfied, 2 = satisfied, 3 = very satisfied):
X\Y | 1 | 2 | 3 |
0 | 0 | 1/2 | 1/4 |
1 | 1/6 | 1/12 | 0 |
(a) Determine and interpret the marginal distributions of both X and Y.
(b) Calculate the 75 % quantile for the marginal distribution of Y.
(c) Determine and interpret the conditional distribution of satisfaction level for X = 1.
(d) Are the two variables independent?
(e) Calculate and interpret the covariance of X and Y.
(a) The marginal distribution of X is obtained from the row sums of the joint PDF, and the marginal distribution of Y from the column sums. For example, P(X = 1) = \sum_{j=1}^{J} p_{1j} = p_{1+} = 1/6 + 1/12 + 0 = 1/4.
The marginal distribution of X tells us what proportion of customers sought help via the telephone hotline (75 %) and via email (25 %). The marginal distribution of Y describes the satisfaction level, highlighting that more than half of the customers (7/12) were “satisfied”.
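The row and column sums can be checked with a few lines of Python. This is only a minimal sketch: the dictionary `joint` and the names `marginal_x` and `marginal_y` are illustrative choices, not part of the exercise, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Joint PDF from the exercise table: keys are (x, y), values are P(X = x, Y = y).
joint = {
    (0, 1): Fraction(0), (0, 2): Fraction(1, 2), (0, 3): Fraction(1, 4),
    (1, 1): Fraction(1, 6), (1, 2): Fraction(1, 12), (1, 3): Fraction(0),
}

# Marginal of X: sum each row over all y.
marginal_x = {x: sum(p for (xi, _), p in joint.items() if xi == x) for x in (0, 1)}
# Marginal of Y: sum each column over all x.
marginal_y = {y: sum(p for (_, yj), p in joint.items() if yj == y) for y in (1, 2, 3)}

for x, p in marginal_x.items():
    print(f"P(X = {x}) = {p}")
for y, p in marginal_y.items():
    print(f"P(Y = {y}) = {p}")
```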
(b) To determine the 75 % quantile of Y, we need the value y_{0.75} for which F(y_{0.75}) ≥ 0.75 and F(y) < 0.75 for all y < y_{0.75}. Y takes the values 1, 2, 3. The quantile cannot be y_{0.75} = 1 because F(1) = 1/6 < 0.75. The 75 % quantile is y_{0.75} = 2 because F(2) = 1/6 + 7/12 = 3/4 ≥ 0.75 and F(y) < 0.75 for every value y smaller than 2.
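The quantile rule used here (the smallest value whose cumulative probability reaches the target level) can be sketched as follows. The helper `quantile` is a hypothetical name introduced for illustration; the probabilities are the marginal distribution of Y from part (a).

```python
from fractions import Fraction

# Marginal distribution of Y from part (a).
marginal_y = {1: Fraction(1, 6), 2: Fraction(7, 12), 3: Fraction(1, 4)}

def quantile(pmf, alpha):
    """Return the smallest value y whose cumulative probability F(y) is >= alpha."""
    cumulative = Fraction(0)
    for y in sorted(pmf):
        cumulative += pmf[y]
        if cumulative >= alpha:
            return y
    raise ValueError("alpha exceeds the total probability mass")

y_075 = quantile(marginal_y, Fraction(3, 4))
print(y_075)  # 2
```

Working with exact fractions matters here because F(2) equals 0.75 exactly; floating-point sums could in principle fall just below the threshold.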
(c) We can calculate the conditional distribution using P(Y = y_{j} | X = 1) = p_{1j} / p_{1+} = p_{1j} / (1/6 + 1/12 + 0) = p_{1j} / (1/4). Therefore,
P(Y = 1 | X = 1) = \frac{1/6}{1/4} = \frac{2}{3},
P(Y = 2 | X = 1) = \frac{1/12}{1/4} = \frac{1}{3},
P(Y = 3 | X = 1) = \frac{0}{1/4} = 0.
Among those who used the email customer service, two-thirds were unsatisfied, one-third were satisfied, and no one was very satisfied.
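The same conditional probabilities follow from dividing the X = 1 row of the joint table by its row sum. The sketch below assumes an illustrative dictionary `joint` holding the table values; the variable names are not from the exercise.

```python
from fractions import Fraction

# Joint PDF from the exercise table: keys are (x, y), values are P(X = x, Y = y).
joint = {
    (0, 1): Fraction(0), (0, 2): Fraction(1, 2), (0, 3): Fraction(1, 4),
    (1, 1): Fraction(1, 6), (1, 2): Fraction(1, 12), (1, 3): Fraction(0),
}

x_given = 1  # condition on email customer service

# Row sum P(X = 1), the denominator of every conditional probability.
p_x = sum(p for (x, _), p in joint.items() if x == x_given)

# P(Y = y | X = 1) = P(X = 1, Y = y) / P(X = 1) for each y.
conditional = {y: p / p_x for (x, y), p in joint.items() if x == x_given}

for y, p in conditional.items():
    print(f"P(Y = {y} | X = {x_given}) = {p}")
```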
(d) As we know from (7.27), two discrete random variables are independent if P(X = x_{i}, Y = y_{j}) = P(X = x_{i})P(Y = y_{j}) for all i and j. However, in our example, P(X = 0, Y = 1) = 0, whereas P(X = 0)P(Y = 1) = \frac{3}{4} \cdot \frac{1}{6} = \frac{1}{8} ≠ 0. This means that X and Y are not independent.
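A single violating cell is enough to rule out independence, but the product rule can also be checked for every cell at once. The following is a sketch under the same assumptions as before: `joint`, `marginal_x`, `marginal_y`, and `violations` are illustrative names.

```python
from fractions import Fraction

# Joint PDF from the exercise table: keys are (x, y), values are P(X = x, Y = y).
joint = {
    (0, 1): Fraction(0), (0, 2): Fraction(1, 2), (0, 3): Fraction(1, 4),
    (1, 1): Fraction(1, 6), (1, 2): Fraction(1, 12), (1, 3): Fraction(0),
}

marginal_x = {x: sum(p for (xi, _), p in joint.items() if xi == x) for x in (0, 1)}
marginal_y = {y: sum(p for (_, yj), p in joint.items() if yj == y) for y in (1, 2, 3)}

# Collect every cell where the joint probability differs from the product of the marginals.
violations = [
    (x, y) for (x, y), p in joint.items()
    if p != marginal_x[x] * marginal_y[y]
]

independent = not violations
print(independent)  # False
```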
(e) The covariance of X and Y is defined as Cov(X, Y) = E(XY) − E(X)E(Y). We calculate
E(X) = 0 \cdot \frac{3}{4} + 1 \cdot \frac{1}{4} = \frac{1}{4},
E(Y) = 1 \cdot \frac{1}{6} + 2 \cdot \frac{7}{12} + 3 \cdot \frac{1}{4} = \frac{25}{12},
E(XY) = 0 \cdot 1 \cdot 0 + 1 \cdot 1 \cdot \frac{1}{6} + 0 \cdot 2 \cdot \frac{1}{2} + 1 \cdot 2 \cdot \frac{1}{12} + 0 \cdot 3 \cdot \frac{1}{4} + 1 \cdot 3 \cdot 0 = \frac{2}{6},
Cov(X, Y) = \frac{2}{6} − \frac{1}{4} \cdot \frac{25}{12} = \frac{16}{48} − \frac{25}{48} = −\frac{3}{16}.
Since Cov(X, Y) < 0, we conclude that there is a negative relationship between X and Y: the higher the values of X, the lower the values of Y, and vice versa. For example, those who use the email-based customer service (X = 1) are less satisfied than those who use the telephone customer service (X = 0). However, the values of X have no inherent order in this example (0 and 1 are merely labels), so care must be exercised in interpreting the covariance.
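The expectations and the covariance can be recomputed directly from the joint table. This is a minimal sketch with illustrative variable names (`joint`, `e_x`, `e_y`, `e_xy`, `cov`), using exact fractions so the result can be compared with the hand calculation.

```python
from fractions import Fraction

# Joint PDF from the exercise table: keys are (x, y), values are P(X = x, Y = y).
joint = {
    (0, 1): Fraction(0), (0, 2): Fraction(1, 2), (0, 3): Fraction(1, 4),
    (1, 1): Fraction(1, 6), (1, 2): Fraction(1, 12), (1, 3): Fraction(0),
}

# Expectations computed as probability-weighted sums over all cells.
e_x = sum(x * p for (x, _), p in joint.items())
e_y = sum(y * p for (_, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())

# Cov(X, Y) = E(XY) - E(X)E(Y)
cov = e_xy - e_x * e_y

print(e_x, e_y, e_xy, cov)  # 1/4 25/12 1/3 -3/16
```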
Marginal distribution of X (part (a)):
X | P(X = x_{i}) |
0 | 3/4 |
1 | 1/4 |

Marginal distribution of Y (part (a)):
Y | P(Y = y_{j}) |
1 | 1/6 |
2 | 7/12 |
3 | 1/4 |