# Mean as a Projection

This tutorial explains how the mean can be viewed as an orthogonal projection onto the one-dimensional subspace spanned by the all-ones vector, which serves as the basis vector of that subspace.

Suppose that $$\vec{y} \in \mathbb{R}^n$$ and that $$L \subset \mathbb{R}^n$$ is the subspace spanned by the all-ones vector $$\vec{x}$$, namely,

$$\vec{x}=\left[\begin{array}{c} 1\\ 1\\ \vdots\\ 1 \end{array} \right], \quad L =\{c \vec{x} : c \in \mathbb{R}\}$$

The question is which value of $$c$$ minimizes the distance between $$\vec{y}$$ and the point $$c\vec{x}$$ in the subspace $$L$$.

As discussed in another tutorial on orthogonal projection, the shortest distance between a vector and a subspace is attained by the orthogonal projection of the vector onto that subspace.
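
As a brief sketch of why this leads to a formula for $$c$$ (the residual symbol $$\vec{r}$$ is introduced here only for illustration): for the orthogonal projection, the residual $$\vec{r} = \vec{y} - c\vec{x}$$ must be perpendicular to $$\vec{x}$$, so their dot product vanishes.

$$\vec{x} \cdot (\vec{y} - c\vec{x}) = 0 \quad \Longrightarrow \quad \vec{x} \cdot \vec{y} - c \, (\vec{x} \cdot \vec{x}) = 0$$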

Thus, the coefficient of the projection is

$$c=\frac{\vec{x} \cdot \vec{y}}{\vec{x} \cdot \vec{x}}$$

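Since every entry of $$\vec{x}$$ is 1, we have $$\vec{x} \cdot \vec{y} = \sum_{i=1}^n y_i$$ and $$\vec{x} \cdot \vec{x} = n$$, so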

$$c=\frac{\vec{x} \cdot \vec{y}}{\vec{x} \cdot \vec{x}} =\frac{\sum_{i=1}^n y_i}{\sum_{i=1}^n 1}= \frac{\sum_{i=1}^n y_i}{n} =\bar{y}$$
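
The formula can be checked numerically. The following is a minimal sketch in plain Python, assuming a small made-up vector `y`; the helper name `projection_coefficient` is hypothetical and chosen only for this illustration.

```python
from statistics import mean


def projection_coefficient(x, y):
    """Return c = (x . y) / (x . x), the coefficient of the
    orthogonal projection of y onto the span of x."""
    xy = sum(xi * yi for xi, yi in zip(x, y))
    xx = sum(xi * xi for xi in x)
    return xy / xx


# Illustrative data (an assumption, not taken from the tutorial).
y = [2.0, 4.0, 9.0]
x = [1.0] * len(y)  # the all-ones vector

c = projection_coefficient(x, y)
projection = [c * xi for xi in x]  # the projected vector c * x

# The residual y - c*x should be orthogonal to x.
residual_dot_x = sum((yi - c * xi) * xi for xi, yi in zip(x, y))

print(c, mean(y), projection, residual_dot_x)
```

With the all-ones vector, the numerator is just the sum of the entries of `y` and the denominator is `n`, so `c` coincides with `mean(y)`, and every entry of the projected vector equals that mean.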

We can also prove the same result directly with calculus, by showing that the point $$c \vec{x}$$ of the subspace closest to $$\vec{y}$$ is the one whose coefficient $$c$$ equals the mean.

Based on the tutorial on orthogonal projection, the squared distance between the vector $$\vec{y}$$ and a point $$c\vec{x}$$ in the subspace is

$$\sum_i (cx_{i}-y_{i})^2$$

To minimize it, we take the derivative with respect to $$c$$ and set it to zero.

$$\frac{d}{dc} \sum_i (cx_{i}-y_{i})^2 =0$$

Since every $$x_i$$ equals 1, this becomes

$$\frac{d}{dc} \sum_i (c-y_{i})^2 =0$$

Differentiating each term gives

$$2\sum_i (c-y_{i})=0$$

Dividing both sides by 2,

$$\sum_i (c-y_{i})=0$$

Because $$c$$ is added $$n$$ times in the sum,

$$nc -\sum_i y_{i}=0$$

Solving for $$c$$,

$$c =\frac{\sum_i y_{i}}{n} =\bar{y}$$
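
The minimization can also be illustrated numerically. The sketch below is not a proof; it evaluates the squared distance on a coarse grid of candidate values around the mean, using the same made-up vector as an assumption, and the helper name `squared_distance` is hypothetical.

```python
def squared_distance(c, y):
    """Squared distance between y and c times the all-ones vector."""
    return sum((c - yi) ** 2 for yi in y)


# Illustrative data (an assumption, not taken from the tutorial).
y = [2.0, 4.0, 9.0]
y_bar = sum(y) / len(y)

# Candidate values of c from y_bar - 2 to y_bar + 2 in steps of 0.1.
candidates = [y_bar + step / 10 for step in range(-20, 21)]

# Pick the candidate with the smallest squared distance.
best = min(candidates, key=lambda c: squared_distance(c, y))

print(best, y_bar, squared_distance(best, y))
```

On this grid the smallest squared distance occurs at the candidate equal to `y_bar`. This agrees with the derivation: the second derivative of the squared distance with respect to $$c$$ is $$2n > 0$$, so the critical point is a minimum.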

Thus, projecting a vector $$\vec{y}$$ onto the subspace of constant vectors (the span of the all-ones vector $$\vec{x}$$) gives $$\bar{y}\,\vec{x}$$: the projection coefficient, and hence every entry of the projected vector, is the mean of $$\vec{y}$$.