| Term | Definition |
|---|---|
| Population | The entire collection of people, animals, cells, or other things from which we collect data |
| Parameter | A number that is calculated from an entire population |
| Sample | A subset or group drawn from the population |
| Statistic | A number or quantity that is calculated from a sample of data |
| Descriptive statistics | Statistics that describe the sample without attempting to generalize the results to other groups or the population |
| Inferential statistics | Statistics used to infer how likely it is that the results can be generalized to the population |
| Measure of central tendency | A single value that attempts to describe the central position of a set of data |
| Mean | The average value: the sum of all values divided by the number of values |
| Median | The middle value when the values are sorted in order |
| Mode | The most frequent value |
| Measure of dispersion/variation | A value that describes how the data are dispersed around the measure of central tendency, or the extent to which individual values differ from the mean, median, or mode |
| Standard deviation | Roughly, how much individual values differ from the mean on average; the square root of the variance |
| Variance | How far a set of numbers is spread out from the mean; for a sample, the sum of the squared differences between each value and the mean, divided by the number of values minus one |
| Range | The difference between the largest and smallest values in the data set |
| Interquartile range | A measure of the “middle fifty” percent of the data set, where the bulk of the values lie; the difference between the third and first quartiles |
| Outlier | An observation that is distant from the other observations |
| Frequency | The number of times a value appears in the data set |
| Frequency distribution | A table or graph that illustrates how frequently each value appears in the data set |
| Normal distribution | A symmetric, bell-shaped distribution for a continuous variable; about 68% of observations fall within 1 standard deviation of the mean, about 95% fall within 2 standard deviations of the mean, and about 99.7% fall within 3 standard deviations of the mean |
| Binomial distribution | The probability distribution of the number of times one outcome occurs in a fixed number of independent trials of a binomial variable (i.e. a variable that has only two possible values) with fixed probabilities that add up to one |
| Confidence interval | A range of values, estimated from a sample, that will contain the population parameter (for example, the population mean) a specified proportion of the time, typically either 95% or 99% of the time |
| Probability | The likelihood that an event will occur |
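
The measures of central tendency and dispersion defined above can be illustrated with a short calculation. The following is a minimal sketch using Python's standard `statistics` module; the `sample` values are hypothetical and chosen only for illustration, and the quartiles use the module's default method, which may differ slightly from the convention used in other software.

```python
import statistics
from collections import Counter

# Hypothetical sample of eight measurements (illustrative values only).
sample = [2, 4, 4, 5, 7, 9, 10, 15]

# Measures of central tendency.
mean = statistics.mean(sample)      # sum of values / number of values
median = statistics.median(sample)  # middle value of the sorted data
mode = statistics.mode(sample)      # most frequent value

# Measures of dispersion.
variance = statistics.variance(sample)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(sample)      # square root of the sample variance
value_range = max(sample) - min(sample)

# The same variance, written out from the definition in the table.
manual_variance = sum((x - mean) ** 2 for x in sample) / (len(sample) - 1)

# Quartiles and interquartile range (default quartile method of the module).
q1, q2, q3 = statistics.quantiles(sample, n=4)
iqr = q3 - q1

# Frequency distribution: how many times each value appears.
frequencies = Counter(sample)

print("mean:", mean, "median:", median, "mode:", mode)
print("variance:", variance, "standard deviation:", std_dev)
print("range:", value_range, "IQR:", iqr)
print("frequencies:", dict(frequencies))
```

Note that `statistics.variance` and `statistics.stdev` treat the data as a sample. If the data were the entire population, `statistics.pvariance` and `statistics.pstdev`, which divide by the number of values instead, would be the matching choices.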