The mean() function calculates the arithmetic mean in R. Its basic syntax is:
mean(x, trim = 0, na.rm = FALSE)
- x: An R object, typically a numeric vector.
- trim: A fraction between 0 and 0.5. It gives the share of observations to be trimmed from each end of the sorted x before the mean is computed. The default value is 0.
- na.rm: A logical value indicating whether NA values should be stripped before the computation proceeds. The default value is FALSE.
Example 1: Basic format
x <- c(4,5,8,44,32,22,45,3,8,99,100,120)
mean(x)
Output:
> x <- c(4,5,8,44,32,22,45,3,8,99,100,120)
> mean(x)
[1] 40.83333
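As a quick cross-check of Example 1, the arithmetic mean is the sum of the values divided by how many there are. The short sketch below recomputes it by hand with sum() and length(), both base R functions, using the same vector as above.

```r
# Same vector as in Example 1
x <- c(4, 5, 8, 44, 32, 22, 45, 3, 8, 99, 100, 120)

# Total of all observations and number of observations
total <- sum(x)
n <- length(x)

# Dividing the total by the count gives the same value as mean(x)
manual_mean <- total / n
print(manual_mean)
print(all.equal(manual_mean, mean(x)))
```

Here total is 490 and n is 12, so the manual result agrees with the 40.83333 reported by mean(x).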
Example 2: With trim
Example 2 shows how to use trim in mean(). trim removes the given fraction of observations from each end of the sorted vector before the mean is computed.
x <- c(4,5,8,44,32,22,45,3,8,99,100,120)
mean(x, trim=0.1)
Output:
> x <- c(4,5,8,44,32,22,45,3,8,99,100,120)
> mean(x, trim=0.1)
[1] 36.7
We can sort the vector and remove the values at both ends by hand to check whether we get exactly the same result.
# create a vector in R
x <- c(4,5,8,44,32,22,45,3,8,99,100,120)
# sort the vector
sort(x, decreasing = FALSE)
# calculate the number of the observations
print(length(x))
# calculate the number of observations we need to remove on both ends
print(length(x)*0.1)
Output:
> sort(x, decreasing = FALSE)
 [1]   3   4   5   8   8  22  32  44  45  99 100 120
> print(length(x))
[1] 12
> print(length(x)*0.1)
[1] 1.2
Based on the output above, 12 * 0.1 = 1.2. R rounds this number down, so 1 observation is removed from each end. In the sorted vector, these are 3 (the smallest value) and 120 (the largest value). We can therefore calculate the mean of the remaining 10 values manually and check that it matches the 36.7 shown above.
# manually calculate the mean
(sum(x)-3-120)/10
Output:
> (sum(x)-3-120)/10
[1] 36.7
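The manual check above can be generalized into a small function. The sketch below is only an illustration: manual_trimmed_mean is a hypothetical helper name, not part of base R, and it assumes a numeric vector without NA and a trim value of at least 0 and below 0.5. It drops floor(n * trim) values from each end of the sorted vector, which is the rounding-down step used in the example.

```r
# Hypothetical helper (not part of base R) that mimics mean(x, trim = ...)
# Assumes: x is numeric with no NA, and 0 <= trim < 0.5
manual_trimmed_mean <- function(x, trim = 0) {
  x <- sort(x)              # order the observations
  n <- length(x)
  k <- floor(n * trim)      # number of values to drop from EACH end
  if (k > 0) {
    x <- x[(k + 1):(n - k)] # keep only the middle part
  }
  sum(x) / length(x)
}

x <- c(4, 5, 8, 44, 32, 22, 45, 3, 8, 99, 100, 120)

# Compare the manual version with the built-in one
print(manual_trimmed_mean(x, trim = 0.1))
print(mean(x, trim = 0.1))

# 12 * 0.15 = 1.8 is also rounded down to 1, so the result does not change
print(mean(x, trim = 0.15))
```

All three calls print 36.7, because both 0.1 and 0.15 lead to exactly one value being removed from each end of this 12-element vector.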
Example 3: With NA
The following vector contains an NA. Because na.rm is FALSE by default, the NA makes mean() return NA.
x <- c(4,5,8,44,32,22,45,3,8,99,100,120,NA)
mean(x)
Output:
> x <- c(4,5,8,44,32,22,45,3,8,99,100,120,NA)
> mean(x)
[1] NA
To ignore the NA, set na.rm = TRUE so that it is removed before the mean is computed.
# create a vector in R
x <- c(4,5,8,44,32,22,45,3,8,99,100,120,NA)
# add na.rm=TRUE in mean()
mean(x,na.rm = TRUE)
Output:
> x <- c(4,5,8,44,32,22,45,3,8,99,100,120,NA)
> mean(x,na.rm = TRUE)
[1] 40.83333
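To see what na.rm = TRUE does in Example 3, the sketch below removes the NA by hand with is.na() and compares the result with the built-in argument. It also combines na.rm with trim on the same vector. The variable names are only illustrative.

```r
# Same vector as in Example 3, including one NA
x <- c(4, 5, 8, 44, 32, 22, 45, 3, 8, 99, 100, 120, NA)

# Count the missing values
n_missing <- sum(is.na(x))
print(n_missing)

# Drop the NA by hand, then take the mean of what is left
mean_by_hand <- mean(x[!is.na(x)])
print(mean_by_hand)

# Let mean() drop the NA itself
mean_na_rm <- mean(x, na.rm = TRUE)
print(mean_na_rm)

# na.rm and trim can be used together: the NA is removed first,
# then one value is trimmed from each end of the remaining 12
mean_trim_na_rm <- mean(x, trim = 0.1, na.rm = TRUE)
print(mean_trim_na_rm)
```

The first two means both equal 40.83333, and the last call returns 36.7, matching Example 2, since the vector without the NA is the same as the one used there.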