Terms that appear frequently throughout this lesson are defined below:
Term | Definition |
Data quality | |
Bias |
Systematic error introduced by selecting or encouraging one outcome over others |
Validity |
Extent to which a measure actually represents what it claims to measure |
Reliability |
Degree to which results are stable and consistent |
Sensitivity |
Proportion of positives that are correctly identified |
Specificity |
Proportion of negatives that are correctly identified |
Data relationships | |
Correlated |
A statistical relationship existing between two variables or datasets that reflects a dependence between the two |
Independent |
The occurrence of one variable does not influence the probability of another variable |
Data parametrics | |
Parametric |
Data with an underlying normal distribution |
Nonparametric |
Data for which the probability distribution is unknown or known not to be normal |