This post covers some caveats to keep in mind when calculating a mean value in R:
1. If a vector contains an NA value, then mean(vector) returns NA by default.
This is unlike many data management tools and languages, which skip missing values by default and would return, say, 1.5 for a vector such as c(1, 2, NA).
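A minimal sketch of point 1. The vector `x` and its values are assumptions chosen for illustration; they are not taken from the original post.

```r
# Hypothetical example vector; the values are an assumption for illustration.
x <- c(1, 2, NA)

# With default arguments, mean() propagates the missing value.
result <- mean(x)

print(result)         # NA
print(is.na(result))  # TRUE
```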
2. If numeric(0) is combined into a vector, it does not influence the mean() function in R, because c() simply drops zero-length elements.
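Point 2 can be checked with a short sketch; the values are again hypothetical.

```r
# Combining numeric(0) with other values adds no elements.
x <- c(1, 2, numeric(0))

n <- length(x)  # 2, not 3
m <- mean(x)    # 1.5

# The result is indistinguishable from the vector without numeric(0).
same <- identical(x, c(1, 2))  # TRUE

print(n)
print(m)
print(same)
```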
3. Therefore, we can take the mean of a vector that contains NA by dropping the NA values before calling mean(),
or simply by passing the option na.rm = TRUE to the function, which removes NA within the calculation.
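The original snippets for point 3 are not available, so the following is only a sketch of three common ways to do this in base R, using a hypothetical vector `x`. They are not necessarily the exact forms the post had in mind.

```r
# Hypothetical example vector.
x <- c(1, 2, NA)

# Option A: keep only the non-missing elements by logical indexing.
m1 <- mean(x[!is.na(x)])

# Option B: drop missing values with na.omit() from the stats package.
m2 <- mean(na.omit(x))

# Option C: let mean() remove NA itself.
# TRUE is spelled out because T is an ordinary variable that can be reassigned.
m3 <- mean(x, na.rm = TRUE)

print(c(m1, m2, m3))  # 1.5 1.5 1.5
```

Option C is usually the shortest, while options A and B are useful when the cleaned vector is needed for more than one calculation.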
4. In addition, we can inspect numeric(0) in R: it is a numeric vector of length zero. It appears when one filters a data frame or vector and no data satisfies the wanted condition.
Element-wise arithmetic with numeric(0) returns numeric(0), just as arithmetic with NA returns NA.
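A short sketch of point 4. The vector `v` and the threshold 10 are assumptions for illustration.

```r
# numeric(0) is a numeric vector with no elements.
z <- numeric(0)
print(length(z))      # 0
print(class(z))       # "numeric"
print(is.numeric(z))  # TRUE

# Filtering with a condition that nothing satisfies yields numeric(0).
v <- c(3, 5, 8)
empty <- v[v > 10]
print(identical(empty, numeric(0)))  # TRUE

# Element-wise arithmetic keeps the empty shape, while NA stays NA.
z_plus_one <- z + 1
na_plus_one <- NA + 1
print(length(z_plus_one))  # 0
print(na_plus_one)         # NA
```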
In summary, when we use functions to calculate statistics, we need to make sure our data does not contain any NA values, or handle them explicitly with na.rm = TRUE. numeric(0) values, on the other hand, are allowed when combined with other data and are often needed.
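To tie the points together, here is a small sketch with a hypothetical `scores` vector. It also shows one edge case worth knowing: the mean of a vector that is entirely empty is NaN, since there is nothing to average.

```r
# Hypothetical data with one missing value.
scores <- c(4, NA, 6, 9)

# An unhandled NA makes the whole result NA.
raw_mean <- mean(scores)                  # NA

# Handling NA explicitly gives the mean of 4, 6 and 9.
clean_mean <- mean(scores, na.rm = TRUE)  # 19 / 3

# A filter that matches nothing returns numeric(0) ...
none <- scores[!is.na(scores) & scores > 100]

# ... and combining it with other values leaves the mean unchanged.
combined_mean <- mean(c(4, none, 6))      # 5

# Edge case: the mean of an entirely empty vector is NaN.
empty_mean <- mean(none)

print(raw_mean)
print(clean_mean)
print(combined_mean)
print(empty_mean)
```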