- The correlation coefficient \(r\) always lies between -1 and 1
- Formula: \(r = \frac{\sum_{i=1}^{n} (x_i - \overline{x})(y_i - \overline{y})}{(n-1)\sqrt{\left(\frac{\sum_{i=1}^{n} (x_i - \overline{x})^2}{n-1}\right) \left(\frac{\sum_{i=1}^{n} (y_i - \overline{y})^2}{n-1}\right)}}\)
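- The denominator of the formula contains the standard deviations (dispersion) of \(x\) and \(y\); dividing by \(n-1\) (the degrees of freedom) makes the underlying variance estimates unbiased
- On the correlation matrix, every diagonal entry equals 1, since each variable is perfectly correlated with itself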
- We can use `corrplot` or heatmaps to visualize the correlation matrix
- Correlation is sensitive to outliers
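
The points in the list above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `pearson_r` and the data values are made up for illustration and do not come from the reference. It computes the sample Pearson coefficient by hand, compares it with `np.corrcoef`, and shows how a single outlier can change the result.

```python
import numpy as np

def pearson_r(x, y):
    """Sample Pearson correlation (hypothetical helper for illustration)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Sample covariance: sum of cross-deviations divided by n - 1
    cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)
    # Sample standard deviations (ddof=1 uses the n - 1 denominator)
    s_x = x.std(ddof=1)
    s_y = y.std(ddof=1)
    return cov_xy / (s_x * s_y)

# Made-up data: y is roughly 2 * x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2])
r_clean = pearson_r(x, y)

# Same data, but the last observation is replaced by an extreme outlier
y_outlier = y.copy()
y_outlier[-1] = -40.0
r_outlier = pearson_r(x, y_outlier)

# Correlation matrix of the three variables; its diagonal is all ones
corr_matrix = np.corrcoef(np.vstack([x, y, y_outlier]))

print("r without outlier:", r_clean)
print("r with outlier:   ", r_outlier)
print("diagonal:         ", np.diag(corr_matrix))
```

In this example a single bad point is enough to flip the sign of the coefficient, which is why a scatter plot is worth checking alongside the number.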
Spearman's rho / Kendall's tau
- They are based on ranks and are therefore robust to outliers
- In addition, they can handle some non-linearity, as long as the relationship is monotonic
- They are suitable for small datasets
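
To illustrate the rank-based idea, here is a minimal sketch assuming NumPy and data without ties. The helpers `ranks`, `spearman_rho`, and `kendall_tau` are hypothetical names written for this note: Spearman's rho is computed as the Pearson correlation of the ranks, and Kendall's tau as the tau-a variant (concordant minus discordant pairs, divided by the number of pairs). In practice, `scipy.stats.spearmanr` and `scipy.stats.kendalltau` provide these measures and also handle ties.

```python
import numpy as np

def ranks(values):
    """Rank values from 1 to n (assumes no ties, for simplicity)."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    r = np.empty(len(values), dtype=float)
    r[order] = np.arange(1, len(values) + 1)
    return r

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return np.corrcoef(ranks(x), ranks(y))[0, 1]

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            # +1 for a concordant pair, -1 for a discordant pair
            score += np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
    return score / (n * (n - 1) / 2)

# Made-up data for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y_curve = np.exp(x)  # monotonic but strongly non-linear
y_outlier = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 500.0])  # extreme last value

pearson_curve = np.corrcoef(x, y_curve)[0, 1]
spearman_curve = spearman_rho(x, y_curve)
kendall_curve = kendall_tau(x, y_curve)

pearson_out = np.corrcoef(x, y_outlier)[0, 1]
spearman_out = spearman_rho(x, y_outlier)

print("exp curve: pearson", pearson_curve, "spearman", spearman_curve, "kendall", kendall_curve)
print("outlier:   pearson", pearson_out, "spearman", spearman_out)
```

Because only the ordering of the values matters, both rank-based measures report a perfect monotonic association here, while Pearson's coefficient is pulled down by the curvature and by the extreme value.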
References
- Bruce, 2017, p30-45