2. Descriptive statistics
Measures of central tendency
Normal distribution: the symmetrical, bell-shaped distribution obtained when a large series of measurements is plotted on a frequency histogram. The mean is in the middle, with equal numbers of smaller and larger values on either side of it.
Mean: add all the measurements together, then divide by the number of measurements taken. The mean should only be used if the data approximate to a normal distribution and have an interval or ratio measurement scale.
Median: arrange the data in order and take the middle value (or the mean of the two middle values when the number of values is even). It can be used for data which are not normally distributed. Suitable for variables with an ordinal scale.
Mode: the value which occurs most often. Suitable for variables with a nominal scale.
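As a minimal sketch, the three measures can be computed with Python's standard `statistics` module. The data are the ten values used as the range example in this section; the variable names are illustrative only.

```python
import statistics

# Example series borrowed from the range example in this section.
data = [12, 17, 21, 23, 24, 24, 25, 26, 29, 31]

mean = statistics.mean(data)      # sum of the values / number of values
median = statistics.median(data)  # middle of the sorted data (mean of the two middle values here)
mode = statistics.mode(data)      # value that occurs most often

print(mean, median, mode)  # 23.2 24.0 24
```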
Measures of dispersion
Dispersion is the spread of data around the average.
If the data are normally distributed, use the interquartile range or the standard deviation. The usual way of expressing dispersion is as
mean ± interquartile range or mean ± standard deviation
If the data are not normally distributed, use the median and range.
Range is the distance between the highest and lowest value.
Example: 12 17 21 23 24 24 25 26 29 31
Range = 31 − 12 = 19 (the values run from 12 to 31)
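The same example can be checked in a few lines of Python. This is only a sketch; it reports both the extremes and the distance between them.

```python
data = [12, 17, 21, 23, 24, 24, 25, 26, 29, 31]

lowest, highest = min(data), max(data)
value_range = highest - lowest  # distance between highest and lowest value

print(f"{lowest} to {highest}, range = {value_range}")  # 12 to 31, range = 19
```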
Interquartile range is the part of the range that covers the middle 50% of the data.
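For the example series, the interquartile range might be computed as sketched below. Several conventions exist for locating quartiles; this sketch assumes the default ("exclusive") method of Python's `statistics.quantiles`, so other software or hand methods may give slightly different values.

```python
import statistics

data = [12, 17, 21, 23, 24, 24, 25, 26, 29, 31]

# quantiles(n=4) returns the three cut points Q1, Q2 (the median) and Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1  # width of the middle 50% of the data

print(q1, q3, iqr)  # 20.0 26.75 6.75
```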
If the variable has an interval or ratio scale and if the data are normally distributed, use the interquartile range or the standard deviation. Otherwise use the median and range to show dispersion.
Standard deviation is calculated using the formula below. In a normal distribution, about 68% of values are within one standard deviation, 95% within two standard deviations, and 99.7% within three standard deviations of the mean.
`s = sqrt((sum x^2 - ((sum x)^2)/n)/(n-1))`
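A short sketch of the calculation, using the computational form of the sample standard deviation: the sum of the squared values, minus the square of the sum of the values divided by n, all divided by n − 1. The example series and variable names are assumptions for illustration. The result is checked against `statistics.stdev`, and the 68 / 95 / 99.7% figures are reproduced from the standard normal curve with `statistics.NormalDist`.

```python
import math
import statistics

data = [12, 17, 21, 23, 24, 24, 25, 26, 29, 31]

n = len(data)
sum_x = sum(data)                    # sum of the values
sum_x_sq = sum(x * x for x in data)  # sum of the squared values

# Sample standard deviation: note (sum x)^2 / n, the square of the sum.
s = math.sqrt((sum_x_sq - sum_x ** 2 / n) / (n - 1))

# Cross-check against the library implementation.
assert math.isclose(s, statistics.stdev(data))
print(round(s, 2))  # 5.53

# Share of a normal distribution within k standard deviations of the mean.
standard_normal = statistics.NormalDist()  # mean 0, standard deviation 1
for k in (1, 2, 3):
    share = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(k, round(share * 100, 1))  # roughly 68.3, 95.4, 99.7
```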