Geography Fieldwork

A Level

4. Data analysis

Sophisticated data analysis will help you spot patterns, trends and relationships in your results. Data analysis can be qualitative and/or quantitative, and may include statistical tests. An example of a statistical test is outlined below.

Mann Whitney U test

Mann Whitney U is a statistical test that is used to test whether there is a significant difference between the medians of two sets of data.

The Mann Whitney U test can only be used if there are at least 6 values in each of the two data sets. The data do not need to be paired, and the test does not require a normal distribution.

There are 3 steps to take when using the Mann Whitney U test:

Step 1. State the null hypothesis

There is no significant difference between _______ and _______

Step 2. Calculate the Mann Whitney U statistic

`U_1= n_1 xx n_2 + 0.5 n_2 (n_2 + 1) - ∑ R_2`

`U_2 = n_1 xx n_2 + 0.5 n_1 (n_1 + 1) - ∑ R_1`

  • `n_1` is the number of values of `x_1`
  • `n_2` is the number of values of `x_2`
  • `∑ R_1` is the sum of the ranks given to the values of `x_1`
  • `∑ R_2` is the sum of the ranks given to the values of `x_2`
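
As a minimal sketch, the two formulas above can be written as a short Python function. The function name `u_statistics` and its argument names are chosen here for illustration only, and the rank sums in the example are made up rather than taken from real fieldwork.

```python
def u_statistics(n1, n2, rank_sum_1, rank_sum_2):
    """Return (U_1, U_2) using the two formulas above.

    n1, n2       -- number of values of x_1 and x_2
    rank_sum_1/2 -- sum of the ranks given to x_1 and x_2
    """
    u1 = n1 * n2 + 0.5 * n2 * (n2 + 1) - rank_sum_2
    u2 = n1 * n2 + 0.5 * n1 * (n1 + 1) - rank_sum_1
    return u1, u2


# Hypothetical illustration: 3 values ranked 1, 2, 4 and 4 values ranked 3, 5, 6, 7
u1, u2 = u_statistics(3, 4, 1 + 2 + 4, 3 + 5 + 6 + 7)

# Sanity check: U_1 and U_2 always add up to n_1 x n_2
assert u1 + u2 == 3 * 4
```

The sanity check at the end is a quick way to catch arithmetic slips when working by hand.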

Step 3. Test the significance of the result

Take U as the smaller of `U_1` and `U_2`. Compare this value of U against the critical value for U at a confidence level of 95% (significance level of p = 0.05).

If U is equal to or smaller than the critical value (p = 0.05), then REJECT the null hypothesis. There is a SIGNIFICANT difference between the 2 data sets.

If U is greater than the critical value, then ACCEPT the null hypothesis. There is NOT a significant difference between the 2 data sets.
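
The decision rule in Step 3 can be sketched in a few lines of Python. The function name `is_significant` is hypothetical, and the critical value is assumed to have been looked up in a printed table of critical values for the two sample sizes.

```python
def is_significant(u1, u2, critical_value):
    """Apply the Step 3 rule: reject the null hypothesis when the
    smaller of U_1 and U_2 is equal to or smaller than the critical value."""
    u = min(u1, u2)
    return u <= critical_value


# Hypothetical values: U_1 = 33, U_2 = 3, critical value 5 from a table
print(is_significant(33, 3, 5))  # True, so the null hypothesis is rejected
```

Note that a U exactly equal to the critical value still counts as significant.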

Worked example

A geographer was interested in whether there was a difference in cliff gradient between places with a beach and places with no beach. Here are the results.

| Cliff gradient where there is no beach (°) | Cliff gradient where there is a beach (°) |
| --- | --- |
| 20 | 15 |
| 35 | 21 |
| 32 | 36 |
| 16 | 12 |
| 41 | 10 |
| 23 | 18 |

Step 1. State the null hypothesis

There is no significant difference in cliff gradient between places with a beach and places with no beach.

Step 2. Calculate the Mann Whitney U statistic

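(a) Give each result a rank. Rank all 12 results together, from the smallest (rank 1) to the largest (rank 12), regardless of which column they are in. Then calculate the sum of the ranks for each of the two columns.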

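| No beach: cliff gradient (°) | No beach: rank | Beach: cliff gradient (°) | Beach: rank |
| --- | --- | --- | --- |
| 20 | 6 | 15 | 3 |
| 35 | 10 | 21 | 7 |
| 32 | 9 | 36 | 11 |
| 16 | 4 | 12 | 2 |
| 41 | 12 | 10 | 1 |
| 23 | 8 | 18 | 5 |
| TOTAL | 49 | TOTAL | 29 |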
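
The ranking in part (a) can be checked with a short Python sketch. The function name `rank_pooled` is hypothetical. Giving tied values the mean of the ranks they share is an assumption added here (a common convention); the cliff data contain no ties, so it does not affect this example.

```python
def rank_pooled(sample_1, sample_2):
    """Rank both samples together (1 = smallest) and return the
    ranks for each sample in their original order."""
    pooled = sorted(sample_1 + sample_2)
    rank_of = {}
    for value in set(pooled):
        # 1-based positions of this value in the sorted pooled list
        positions = [i + 1 for i, v in enumerate(pooled) if v == value]
        # Assumption: tied values share the mean of their positions
        rank_of[value] = sum(positions) / len(positions)
    ranks_1 = [rank_of[v] for v in sample_1]
    ranks_2 = [rank_of[v] for v in sample_2]
    return ranks_1, ranks_2


no_beach = [20, 35, 32, 16, 41, 23]
beach = [15, 21, 36, 12, 10, 18]

ranks_no_beach, ranks_beach = rank_pooled(no_beach, beach)
print(sum(ranks_no_beach), sum(ranks_beach))  # 49.0 29.0
```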

(b) Calculate `∑R_1` and `∑R_2`

`∑R_1` is the sum of the ranks in the first column (no beach) = `49`

`∑R_2` is the sum of the ranks in the second column (beach) = `29`

`n_1 = 6` and `n_2 = 6`

(c) Calculate `U_1` and `U_2`

`U_1 = 6 xx 6 + 0.5 xx 6 (6 + 1) - 29 = 28`

`U_2 = 6 xx 6 + 0.5 xx 6 (6 + 1) - 49 = 8`

Step 3. Test the significance of the result

In this example, `U_1 = 28` and `U_2 = 8`

`U` is the smaller of the two values, so `U=8`

The critical value at `p=0.05` significance level for `n_1=6` and `n_2=6` is `5`. Since our calculated value of U is greater than the critical value (`8>5`), the null hypothesis is not rejected.

In conclusion, there is no significant difference in cliff gradient between places with a beach and places with no beach.
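
As an optional cross-check on this conclusion, the sketch below counts every possible way of splitting the 12 ranks into two groups of 6 and works out how often the smaller U would be 8 or less if the null hypothesis were true. This exact counting approach is not part of the method described above; it is added here as an illustration, and the function name `exact_two_sided_p` is hypothetical. It assumes there are no tied values.

```python
from itertools import combinations


def exact_two_sided_p(sample_1, sample_2):
    """Return (observed U, proportion of all possible splits of the
    ranks that give a U at least as small). Assumes no tied values."""
    pooled = sample_1 + sample_2
    n1, n2 = len(sample_1), len(sample_2)
    order = sorted(pooled)
    ranks = [order.index(v) + 1 for v in pooled]  # rank 1 = smallest

    def smaller_u(rank_sum_1):
        u2 = n1 * n2 + 0.5 * n1 * (n1 + 1) - rank_sum_1
        u1 = n1 * n2 - u2  # because U_1 + U_2 = n_1 x n_2
        return min(u1, u2)

    observed = smaller_u(sum(ranks[:n1]))
    splits = list(combinations(ranks, n1))
    as_extreme = sum(1 for s in splits if smaller_u(sum(s)) <= observed)
    return observed, as_extreme / len(splits)


no_beach = [20, 35, 32, 16, 41, 23]
beach = [15, 21, 36, 12, 10, 18]

u, p = exact_two_sided_p(no_beach, beach)
print(u, round(p, 3))
```

A proportion above 0.05 agrees with the table-based result that the null hypothesis is not rejected.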