Linear Correlation. Suppose that an experiment is conducted, and the resulting observations are recorded in two data vectors
x = \begin{pmatrix}x_1\\x_2\\\vdots\\x_n\end{pmatrix}, y = \begin{pmatrix}y_1\\y_2\\\vdots\\y_n\end{pmatrix}, and let e = \begin{pmatrix}1\\1\\\vdots\\1\end{pmatrix}.
Problem: Determine to what extent the y_i ’s are linearly related to the x_i ’s. That is, measure how close y is to being a linear combination β_0e + β_1x.
The cosine as defined in (5.4.1) does the job.
cos θ = \frac{x^T y}{||x|| \, ||y||}. (5.4.1)
To understand how, let μ_x and σ_x be the mean and standard deviation of the data in x. That is,
μ_x = \frac{\sum_i x_i}{n} = \frac{e^T x}{n} and σ_x = \sqrt{\frac{\sum_i (x_i − μ_x)^2}{n}} = \frac{||x − μ_x e||_2}{\sqrt{n}}.
The mean is a measure of central tendency, and the standard deviation measures the extent to which the data is spread. Frequently, raw data from different sources is difficult to compare because the units of measure are different—e.g., one researcher may use the metric system while another uses American units. To compensate, data is almost always first “standardized” into unitless quantities. The standardization of a vector x for which σ_x ≠ 0 is defined to be
z_x = \frac{x − μ_x e}{σ_x}.
Entries in z_x are often referred to as standard scores or z-scores. Every standardized vector z_x has the properties ||z_x|| = \sqrt{n}, μ_{z_x} = 0, and σ_{z_x} = 1. Furthermore, it is not difficult to verify that for vectors x and y with σ_x ≠ 0 and σ_y ≠ 0,
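These three properties can be checked numerically. The following is a minimal sketch in Python with NumPy; the data vector is hypothetical, and note that NumPy's default standard deviation divides by n, matching the definition of σ_x above.

```python
import numpy as np

# Hypothetical data vector (any vector with sigma_x != 0 works).
x = np.array([2.0, 4.0, 6.0, 8.0])
n = len(x)

mu_x = x.mean()            # mean: (sum_i x_i) / n
sigma_x = x.std()          # population std: ||x - mu_x e||_2 / sqrt(n)

z_x = (x - mu_x) / sigma_x # the standardization of x

# The three properties of every standardized vector:
print(np.linalg.norm(z_x)) # equals sqrt(n)
print(z_x.mean())          # equals 0
print(z_x.std())           # equals 1
```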
z_x = z_y \Longleftrightarrow ∃ constants β_0, β_1 such that y = β_0e + β_1x, where β_1 > 0,
z_x = −z_y \Longleftrightarrow ∃ constants β_0, β_1 such that y = β_0e + β_1x, where β_1 < 0.
• In other words, y = β_0e+β_1x for some β_0 and β_1 if and only if z_x = ±z_y, in which case we say y is perfectly linearly correlated with x.
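The equivalence for β_1 < 0 can be illustrated with a small numerical example (a Python/NumPy sketch; the particular constants β_0 = 10 and β_1 = −4 are arbitrary):

```python
import numpy as np

def standardize(v):
    # Population std (divide by n), matching sigma_x as defined above.
    return (v - v.mean()) / v.std()

x = np.array([1.0, 3.0, 5.0, 7.0])
y = 10.0 - 4.0 * x   # y = beta_0 e + beta_1 x with beta_1 = -4 < 0

# Since beta_1 < 0, the standardizations satisfy z_y = -z_x.
print(np.allclose(standardize(y), -standardize(x)))
```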
Since z_x varies continuously with x, the existence of a “near” linear relationship between x and y is equivalent to z_x being “close” to ±z_y in some sense. The fact that ||z_x|| = ||±z_y|| = \sqrt{n} means z_x and ±z_y differ only in orientation, so a natural measure of how close z_x is to ±z_y is cos θ, where θ is the angle between z_x and z_y. The number
ρ_{xy} = cos θ = \frac{z_x^T z_y}{||z_x|| ||z_y||} = \frac{z_x^T z_y}{n} = \frac{(x − μ_xe)^T (y − μ_ye)}{||x − μ_xe|| ||y − μ_ye||}
is called the coefficient of linear correlation, and the following facts are now immediate.
• ρ_{xy} = 0 if and only if z_x and z_y are orthogonal, in which case we say that x and y are completely uncorrelated.
• |ρ_{xy}| = 1 if and only if y is perfectly correlated with x. That is, |ρ_{xy}| = 1 if and only if there exists a linear relationship y = β_0e + β_1x with β_1 ≠ 0.
\vartriangleright When β_1 > 0, we say that y is positively correlated with x.
\vartriangleright When β_1 < 0, we say that y is negatively correlated with x.
• |ρ_{xy}| measures the degree to which y is linearly related to x. In other words, |ρ_{xy}| ≈ 1 if and only if y ≈ β_0e + β_1x for some β_0 and β_1.
\vartriangleright Positive correlation is measured by the degree to which ρ_{xy} ≈ 1.
\vartriangleright Negative correlation is measured by the degree to which ρ_{xy} ≈ −1.
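The facts above can be verified directly from the last expression for ρ_{xy}, which needs only the centered vectors x − μ_xe and y − μ_ye. A minimal Python/NumPy sketch with hypothetical data:

```python
import numpy as np

def rho(x, y):
    # Coefficient of linear correlation: the cosine of the angle
    # between the centered vectors x - mu_x e and y - mu_y e.
    xc = x - x.mean()
    yc = y - y.mean()
    return (xc @ yc) / (np.linalg.norm(xc) * np.linalg.norm(yc))

x = np.array([1.0, 2.0, 3.0, 4.0])

y_pos = 3.0 + 2.0 * x   # y = beta_0 e + beta_1 x with beta_1 > 0
y_neg = 3.0 - 2.0 * x   # same form with beta_1 < 0

print(rho(x, y_pos))    # 1: perfect positive correlation
print(rho(x, y_neg))    # -1: perfect negative correlation
```

The same value is returned by NumPy's built-in `np.corrcoef(x, y)[0, 1]`, which computes the identical quantity.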
If the data in x and y are plotted in ℝ^2 as points (x_i, y_i), then, as depicted in Figure 5.4.1, ρ_{xy} ≈ 1 means that the points lie near a straight line with positive slope, ρ_{xy} ≈ −1 means that the points lie near a line with negative slope, and ρ_{xy} ≈ 0 means that the points do not lie near any straight line.
If |ρ_{xy}| ≈ 1, then the theory of least squares as presented in §4.6 can be used to determine a “best-fitting” straight line.
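Such a best-fitting line can be computed as the least squares solution of the overdetermined system [e | x]β ≈ y. The sketch below uses NumPy's least squares routine on made-up, roughly linear data:

```python
import numpy as np

# Hypothetical data that is approximately linear (rho_xy near 1).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Build A = [e | x] and solve the least squares problem A beta ~ y,
# i.e., minimize ||y - (beta_0 e + beta_1 x)||_2.
A = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
beta0, beta1 = beta

print(beta0, beta1)   # intercept and slope of the best-fitting line
```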