# Geography Fieldwork

A Level

# Data analysis

## 4. Data analysis

Sophisticated data analysis will help you spot patterns, trends and relationships in your results. Data analysis can be qualitative and/or quantitative, and may include statistical tests. An example of a statistical test is outlined below.

## Lorenz curves

The **Lorenz curve** is a graph showing how evenly distributed a variable is over space.

The diagonal black line represents a perfectly even distribution. The blue and red lines show uneven distributions. The further these coloured lines are from the black line, the more uneven is the distribution.

You can draw Lorenz curves based on ordinal data (see worked example 1 below) or interval data (see worked example 2 below).

### Worked example 1: Lorenz curve for ordinal data

There are 32844 Lower-layer Super Output Areas (LSOAs) in England. Each has been given an Index of Multiple Deprivation (IMD) score, and then ranked from 1 (the most deprived) to 32844 (the least deprived). The LSOAs can be divided into five quintiles. The table shows how many LSOAs are in each of the five quintiles for Barking and Dagenham and for Hillingdon.

Quintile of all LSOAs in England | All LSOAs in Barking and Dagenham | All LSOAs in Hillingdon |
---|---|---|
1st (most deprived 20%) | 66 | 6 |
2nd | 35 | 52 |
3rd | 8 | 30 |
4th | 0 | 31 |
5th (least deprived 20%) | 0 | 33 |
SUM | 109 | 152 |

From the raw data, it looks like there is a greater number of deprived LSOAs in Barking and Dagenham. In contrast, Hillingdon contains a more even distribution. Calculate the percentages for all three columns.

Quintile | England % | Barking and Dagenham raw data | Barking and Dagenham % | Hillingdon raw data | Hillingdon % |
---|---|---|---|---|---|
1st | 20 | 66 | 60.6 | 6 | 3.9 |
2nd | 20 | 35 | 32.1 | 52 | 34.2 |
3rd | 20 | 8 | 7.3 | 30 | 19.7 |
4th | 20 | 0 | 0 | 31 | 20.4 |
5th | 20 | 0 | 0 | 33 | 21.7 |
SUM | 100 | 109 | 100 | 152 | 100 |

Now calculate the cumulative percentages for all three columns.

Quintile | England % | England cumulative % | Barking and Dagenham raw data | Barking and Dagenham % | Barking and Dagenham cumulative % | Hillingdon raw data | Hillingdon % | Hillingdon cumulative % |
---|---|---|---|---|---|---|---|---|
1st | 20 | 20 | 66 | 60.6 | 60.6 | 6 | 3.9 | 3.9 |
2nd | 20 | 40 | 35 | 32.1 | 92.7 | 52 | 34.2 | 38.2 |
3rd | 20 | 60 | 8 | 7.3 | 100 | 30 | 19.7 | 57.9 |
4th | 20 | 80 | 0 | 0 | 100 | 31 | 20.4 | 78.3 |
5th | 20 | 100 | 0 | 0 | 100 | 33 | 21.7 | 100 |
SUM | 100 | 100 | 109 | 100 | 100 | 152 | 100 | 100 |
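The percentage and cumulative percentage steps can also be checked with a few lines of Python. This is only a minimal sketch: the function name `cumulative_percentages` and the `counts` dictionary are illustrative choices, and the counts are copied from the table of raw data.

```python
# LSOA counts per national deprivation quintile, copied from the raw data table.
counts = {
    "Barking and Dagenham": [66, 35, 8, 0, 0],
    "Hillingdon": [6, 52, 30, 31, 33],
}


def cumulative_percentages(values):
    """Return the running total of the values as percentages of the overall total."""
    total = sum(values)
    running = 0
    result = []
    for value in values:
        running += value
        result.append(round(100 * running / total, 1))
    return result


# England has 20% of its LSOAs in every quintile by definition.
england = cumulative_percentages([20, 20, 20, 20, 20])
print("England", england)
for borough, quintile_counts in counts.items():
    print(borough, cumulative_percentages(quintile_counts))
```

The England list gives the x-axis values and each borough list gives the y-axis values for its curve.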

Plot a scattergraph with axes as follows:

- x-axis: cumulative percentages for England
- y-axis: cumulative percentages for a single London Borough

The black line shows a perfectly even distribution. This shows the distribution of deprivation ranks in England. The further a line is from this, the more uneven the distribution. As suspected, Barking and Dagenham has a more uneven distribution of IMD ranks than Hillingdon.

### Worked example 2: Lorenz curve for interval data

Lorenz curves can also be constructed for interval data, but there are some extra steps.

Bristol City Council have divided up the city into 14 ‘Neighbourhood Areas’. For each Neighbourhood Area, the total population has been counted, plus the number of people with a ‘severe limiting long-term illness’.

This information can be used to help answer the question: do certain areas of Bristol contain a greater concentration of severely ill people than other areas? Or by contrast, are severely ill people evenly distributed throughout Bristol?

Name of Neighbourhood Area in Bristol | Total population of this Neighbourhood Area | Number of severely ill in this Neighbourhood Area |
---|---|---|

Ashley | 47782 | 3514 |

Avonmouth | 20237 | 2074 |

Bishopston | 36713 | 1383 |

Clifton | 41192 | 1537 |

Dundry View | 28771 | 3411 |

Filwood | 38778 | 3553 |

Bedminster | 22664 | 1864 |

Brislington | 22107 | 1686 |

Fishponds | 27575 | 3503 |

Henbury | 24253 | 2691 |

Hengrove | 28786 | 2963 |

Henleaze | 31412 | 2127 |

Horfield | 23912 | 2141 |

St Georges | 24052 | 2123 |

TOTAL | 418234 | 34570 |

Calculate the percentages for the ‘total population’ and ‘number of severely ill’ columns. This shows the percentage of Bristol’s population and number of severely ill people in each Neighbourhood Area. For example, Fishponds contains 6.59% of Bristol’s population and 10.13% of Bristol’s severely ill people.

Name of Neighbourhood Area in Bristol | Total population of this Neighbourhood Area | % of Bristol's population | Number of severely ill in this Neighbourhood Area | % of Bristol's severely ill |
---|---|---|---|---|

Ashley | 47782 | 11.42 | 3514 | 10.16 |

Avonmouth | 20237 | 4.84 | 2074 | 6.00 |

Bishopston | 36713 | 8.78 | 1383 | 4.00 |

Clifton | 41192 | 9.85 | 1537 | 4.45 |

Dundry View | 28771 | 6.88 | 3411 | 9.87 |

Filwood | 38778 | 9.27 | 3553 | 10.28 |

Bedminster | 22664 | 5.42 | 1864 | 5.39 |

Brislington | 22107 | 5.29 | 1686 | 4.88 |

Fishponds | 27575 | 6.59 | 3503 | 10.13 |

Henbury | 24253 | 5.80 | 2691 | 7.78 |

Hengrove | 28786 | 6.88 | 2963 | 8.57 |

Henleaze | 31412 | 7.51 | 2127 | 6.15 |

Horfield | 23912 | 5.72 | 2141 | 6.19 |

St Georges | 24052 | 5.75 | 2123 | 6.14 |

TOTAL | 418234 | 100 | 34570 | 100.00 |

Calculate the ratio between the two percentage columns.

`"ratio" = "% severely ill"/"% population"`

For example, in Ashley, the ratio is `10.16 -: 11.42 = 0.89`.

Name of Neighbourhood Area in Bristol | Total population of this Neighbourhood Area | % of Bristol's population | Number of severely ill in this Neighbourhood Area | % of Bristol's severely ill | Ratio of % severely ill to % population |
---|---|---|---|---|---|

Ashley | 47782 | 11.42 | 3514 | 10.16 | 0.89 |

Avonmouth | 20237 | 4.84 | 2074 | 6.00 | 1.24 |

Bishopston | 36713 | 8.78 | 1383 | 4.00 | 0.46 |

Clifton | 41192 | 9.85 | 1537 | 4.45 | 0.45 |

Dundry View | 28771 | 6.88 | 3411 | 9.87 | 1.43 |

Filwood | 38778 | 9.27 | 3553 | 10.28 | 1.11 |

Bedminster | 22664 | 5.42 | 1864 | 5.39 | 1.00 |

Brislington | 22107 | 5.29 | 1686 | 4.88 | 0.92 |

Fishponds | 27575 | 6.59 | 3503 | 10.13 | 1.54 |

Henbury | 24253 | 5.80 | 2691 | 7.78 | 1.34 |

Hengrove | 28786 | 6.88 | 2963 | 8.57 | 1.25 |

Henleaze | 31412 | 7.51 | 2127 | 6.15 | 0.82 |

Horfield | 23912 | 5.72 | 2141 | 6.19 | 1.08 |

St Georges | 24052 | 5.75 | 2123 | 6.14 | 1.07 |

TOTAL | 418234 | 100 | 34570 | 100.00 |

Rank the ratio column from highest number to lowest number. You can either do this by hand or by using the Sort command in Excel.

Name of Neighbourhood Area in Bristol | Total population of this Neighbourhood Area | % of Bristol's population | Number of severely ill in this Neighbourhood Area | % of Bristol's severely ill | Ratio of % severely ill to % population | Rank |
---|---|---|---|---|---|---|

Ashley | 47782 | 11.42 | 3514 | 10.16 | 0.89 | 11 |

Avonmouth | 20237 | 4.84 | 2074 | 6.00 | 1.24 | 5 |

Bishopston | 36713 | 8.78 | 1383 | 4.00 | 0.46 | 13 |

Clifton | 41192 | 9.85 | 1537 | 4.45 | 0.45 | 14 |

Dundry View | 28771 | 6.88 | 3411 | 9.87 | 1.43 | 2 |

Filwood | 38778 | 9.27 | 3553 | 10.28 | 1.11 | 6 |

Bedminster | 22664 | 5.42 | 1864 | 5.39 | 1.00 | 9 |

Brislington | 22107 | 5.29 | 1686 | 4.88 | 0.92 | 10 |

Fishponds | 27575 | 6.59 | 3503 | 10.13 | 1.54 | 1 |

Henbury | 24253 | 5.80 | 2691 | 7.78 | 1.34 | 3 |

Hengrove | 28786 | 6.88 | 2963 | 8.57 | 1.25 | 4 |

Henleaze | 31412 | 7.51 | 2127 | 6.15 | 0.82 | 12 |

Horfield | 23912 | 5.72 | 2141 | 6.19 | 1.08 | 7 |

St Georges | 24052 | 5.75 | 2123 | 6.14 | 1.07 | 8 |

TOTAL | 418234 | 100 | 34570 | 100.00 |

Rearrange the rows in the table according to the ranks that you have just made.

Neighbourhood Area | % total population | % severely ill | ratio | rank |
---|---|---|---|---|

Fishponds | 6.59 | 10.13 | 1.54 | 1 |

Dundry View | 6.88 | 9.87 | 1.43 | 2 |

Henbury | 5.80 | 7.78 | 1.34 | 3 |

Hengrove | 6.88 | 8.57 | 1.25 | 4 |

Avonmouth | 4.84 | 6.00 | 1.24 | 5 |

Filwood | 9.27 | 10.28 | 1.11 | 6 |

Horfield | 5.72 | 6.19 | 1.08 | 7 |

St Georges | 5.75 | 6.14 | 1.07 | 8 |

Bedminster | 5.42 | 5.39 | 1.00 | 9 |

Brislington | 5.29 | 4.88 | 0.92 | 10 |

Ashley | 11.42 | 10.16 | 0.89 | 11 |

Henleaze | 7.51 | 6.15 | 0.82 | 12 |

Bishopston | 8.78 | 4.00 | 0.46 | 13 |

Clifton | 9.85 | 4.45 | 0.45 | 14 |

Calculate cumulative figures for the two % columns.

Neighbourhood Area | total population % | total population cumulative % | severely ill % | severely ill cumulative % |
---|---|---|---|---|

Fishponds | 6.59 | 6.59 | 10.13 | 10.13 |

Dundry View | 6.88 | 13.47 | 9.87 | 20.00 |

Henbury | 5.80 | 19.27 | 7.78 | 27.78 |

Hengrove | 6.88 | 26.15 | 8.57 | 36.36 |

Avonmouth | 4.84 | 30.99 | 6.00 | 42.35 |

Filwood | 9.27 | 40.26 | 10.28 | 52.63 |

Horfield | 5.72 | 45.98 | 6.19 | 58.83 |

St Georges | 5.75 | 51.73 | 6.14 | 64.97 |

Bedminster | 5.42 | 57.15 | 5.39 | 70.36 |

Brislington | 5.29 | 62.44 | 4.88 | 75.24 |

Ashley | 11.42 | 73.86 | 10.16 | 85.40 |

Henleaze | 7.51 | 81.37 | 6.15 | 91.55 |

Bishopston | 8.78 | 90.15 | 4.00 | 95.55 |

Clifton | 9.85 | 100.00 | 4.45 | 100.00 |
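The whole sequence of steps for interval data (percentages, ratio, ranking, cumulative percentages) can be sketched in Python as follows. The variable names `areas`, `rows` and `points` are illustrative, and the figures are copied from the first Bristol table. Because the running totals here are built from unrounded percentages, an occasional value may differ from the table by 0.01.

```python
# (name, total population, number severely ill), copied from the first table.
areas = [
    ("Ashley", 47782, 3514), ("Avonmouth", 20237, 2074),
    ("Bishopston", 36713, 1383), ("Clifton", 41192, 1537),
    ("Dundry View", 28771, 3411), ("Filwood", 38778, 3553),
    ("Bedminster", 22664, 1864), ("Brislington", 22107, 1686),
    ("Fishponds", 27575, 3503), ("Henbury", 24253, 2691),
    ("Hengrove", 28786, 2963), ("Henleaze", 31412, 2127),
    ("Horfield", 23912, 2141), ("St Georges", 24052, 2123),
]

total_pop = sum(pop for _, pop, _ in areas)
total_ill = sum(ill for _, _, ill in areas)

# Percentages for each area, plus the ratio between them.
rows = []
for name, pop, ill in areas:
    pct_pop = 100 * pop / total_pop
    pct_ill = 100 * ill / total_ill
    rows.append((name, pct_pop, pct_ill, pct_ill / pct_pop))

# Rank from the highest ratio to the lowest.
rows.sort(key=lambda row: row[3], reverse=True)

# Running totals give the points to plot on the Lorenz curve.
points = []
cum_pop = 0.0
cum_ill = 0.0
for name, pct_pop, pct_ill, ratio in rows:
    cum_pop += pct_pop
    cum_ill += pct_ill
    points.append((name, round(cum_pop, 2), round(cum_ill, 2)))

for point in points:
    print(point)
```

Each tuple holds the area name, the x value (cumulative % total population) and the y value (cumulative % severely ill).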

Finally it is time to draw the Lorenz curve! Plot the cumulative % total population on the x-axis. Plot the cumulative % severely ill on the y-axis.

## Gini coefficient

Lorenz curves are a useful visual technique for presenting your data. But it is sometimes difficult to see how one uneven distribution compares to another. The **Gini coefficient** is a summary statistic that will provide a precise answer.

`"Gini coefficient" = "area of graph between the diagonal and the curve"/"area of graph above the diagonal"`

The result for the Gini coefficient ranges from 0 (completely even distribution) to 1 (completely uneven distribution).

### Worked example of Gini coefficient

There are 32844 Lower-layer Super Output Areas (LSOAs) in England. Each has been given an Index of Multiple Deprivation (IMD) score, and then ranked from 1 (the most deprived) to 32844 (the least deprived). The LSOAs can be divided into five quintiles. The table shows how many LSOAs are in each of the five quintiles for Barking and Dagenham and for Hillingdon.

Quintile of all LSOAs in England | All LSOAs in Barking and Dagenham | All LSOAs in Hillingdon |
---|---|---|
1st (most deprived 20%) | 66 | 6 |
2nd | 35 | 52 |
3rd | 8 | 30 |
4th | 0 | 31 |
5th (least deprived 20%) | 0 | 33 |
SUM | 109 | 152 |

Lorenz curves were plotted for the data.

To calculate the area of the graph above the diagonal, and the area of graph between the diagonal and the curve, you can count the number of squares on graph paper. Include fractions for part-squares.

- There are 625 squares shown.
- 312.5 squares are above the black diagonal line.
- There are 61 squares between the diagonal and the curve for Hillingdon.
- There are 109 squares between the diagonal and the curve for Barking and Dagenham.

`"Gini coefficient for Hillingdon" = 61-:312.5 = 0.20`

`"Gini coefficient for Barking" = 109-:312.5 = 0.35`
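The same arithmetic can be written as a short Python sketch. The function name `gini_from_squares` is an illustrative choice, and the square counts are the ones counted from the graph paper above.

```python
def gini_from_squares(squares_between, squares_above):
    """Gini coefficient as the ratio of two areas counted on graph paper."""
    return squares_between / squares_above


# Half of the 625 squares on the graph lie above the diagonal.
squares_above_diagonal = 625 / 2

for borough, squares in {"Hillingdon": 61, "Barking": 109}.items():
    print(borough, round(gini_from_squares(squares, squares_above_diagonal), 2))
```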

## Location Quotient

The Location Quotient is another mathematical technique for showing how unevenly distributed a variable is over space.

`"Location Quotient" = "% in one area"/"% of the whole population"`

Location Quotient (LQ) varies from 0 to infinity.

If LQ is less than 1, the variable is under-represented in a particular area. If LQ is greater than 1, the variable is over-represented in a particular area.

### Worked example

Bristol City Council have divided up the city into 14 ‘Neighbourhood Areas’. For each Neighbourhood Area, the number of people in different age bands has been counted. Here is the total number of people aged 16-24 for each area.

Name of Neighbourhood Area in Bristol | Total population of this Neighbourhood Area | Total number of people aged 16-24 |
---|---|---|

Ashley | 47782 | 7519 |

Avonmouth | 20237 | 2364 |

Bishopston | 36713 | 8351 |

Clifton | 41192 | 14003 |

Dundry View | 28771 | 3621 |

Filwood | 38778 | 4288 |

Bedminster | 22664 | 2762 |

Brislington | 22107 | 2294 |

Fishponds | 27575 | 5535 |

Henbury | 24253 | 2631 |

Hengrove | 28786 | 3137 |

Henleaze | 31412 | 4160 |

Horfield | 23912 | 3773 |

St Georges | 24052 | 2566 |

TOTAL | 418234 | 67004 |

Calculate the percentages for the ‘total population’ and ‘number aged 16-24’ columns. This shows the percentage of Bristol’s population and number of people aged 16-24 in each Neighbourhood Area.

For example, Avonmouth contains 3.53% of all the 16-24 year olds in Bristol. Be careful not to get confused here. This does not mean that 3.53% of Avonmouth’s population is aged 16-24.

Name of Neighbourhood Area in Bristol | Total population of this Neighbourhood Area | % of Bristol's population | Total number of people aged 16-24 in this Neighbourhood Area | % of Bristol's 16-24 year olds |
---|---|---|---|---|

Ashley | 47782 | 11.42 | 7519 | 11.22 |

Avonmouth | 20237 | 4.84 | 2364 | 3.53 |

Bishopston | 36713 | 8.78 | 8351 | 12.46 |

Clifton | 41192 | 9.85 | 14003 | 20.90 |

Dundry View | 28771 | 6.88 | 3621 | 5.40 |

Filwood | 38778 | 9.27 | 4288 | 6.40 |

Bedminster | 22664 | 5.42 | 2762 | 4.12 |

Brislington | 22107 | 5.29 | 2294 | 3.42 |

Fishponds | 27575 | 6.59 | 5535 | 8.26 |

Henbury | 24253 | 5.80 | 2631 | 3.93 |

Hengrove | 28786 | 6.88 | 3137 | 4.68 |

Henleaze | 31412 | 7.51 | 4160 | 6.21 |

Horfield | 23912 | 5.72 | 3773 | 5.63 |

St Georges | 24052 | 5.75 | 2566 | 3.83 |

TOTAL | 418234 | 100 | 67004 | 100 |

The Location Quotient is the ratio between the two percentage columns.

`"Location Quotient" = "% aged 16-24"/"% whole population"`

For example, in Avonmouth, the LQ is `3.53-:4.84 = 0.73`

Name of Neighbourhood Area in Bristol | Total population of this Neighbourhood Area | % of Bristol's population | Total number of people aged 16-24 in this Neighbourhood Area | % of Bristol's 16-24 year olds | Location Quotient |
---|---|---|---|---|---|

Ashley | 47782 | 11.42 | 7519 | 11.22 | 0.98 |

Avonmouth | 20237 | 4.84 | 2364 | 3.53 | 0.73 |

Bishopston | 36713 | 8.78 | 8351 | 12.46 | 1.42 |

Clifton | 41192 | 9.85 | 14003 | 20.90 | 2.12 |

Dundry View | 28771 | 6.88 | 3621 | 5.40 | 0.79 |

Filwood | 38778 | 9.27 | 4288 | 6.40 | 0.69 |

Bedminster | 22664 | 5.42 | 2762 | 4.12 | 0.76 |

Brislington | 22107 | 5.29 | 2294 | 3.42 | 0.65 |

Fishponds | 27575 | 6.59 | 5535 | 8.26 | 1.25 |

Henbury | 24253 | 5.80 | 2631 | 3.93 | 0.68 |

Hengrove | 28786 | 6.88 | 3137 | 4.68 | 0.68 |

Henleaze | 31412 | 7.51 | 4160 | 6.21 | 0.83 |

Horfield | 23912 | 5.72 | 3773 | 5.63 | 0.98 |

St Georges | 24052 | 5.75 | 2566 | 3.83 | 0.67 |

TOTAL | 418234 | 100 | 67004 | 100 | 1 |

The calculated figures show that people aged 16-24 are under-represented in a number of areas, such as Avonmouth, Brislington and St Georges. But people aged 16-24 are over-represented in other areas, such as Clifton, Bishopston and Fishponds. The LQ results show that the greatest concentration of young adults is in Clifton: can you find any other data to help explain this?
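The Location Quotient calculation above can be sketched in Python. The names `areas`, `location_quotient` and `lq` are illustrative, and the counts are copied from the first table of this worked example. The sketch works from unrounded percentages, so a value could occasionally differ from the table in the last decimal place.

```python
# (name, total population, people aged 16-24), copied from the first table.
areas = [
    ("Ashley", 47782, 7519), ("Avonmouth", 20237, 2364),
    ("Bishopston", 36713, 8351), ("Clifton", 41192, 14003),
    ("Dundry View", 28771, 3621), ("Filwood", 38778, 4288),
    ("Bedminster", 22664, 2762), ("Brislington", 22107, 2294),
    ("Fishponds", 27575, 5535), ("Henbury", 24253, 2631),
    ("Hengrove", 28786, 3137), ("Henleaze", 31412, 4160),
    ("Horfield", 23912, 3773), ("St Georges", 24052, 2566),
]

total_pop = sum(pop for _, pop, _ in areas)
total_young = sum(young for _, _, young in areas)


def location_quotient(group_count, area_pop):
    """Area's share of the group divided by its share of the whole population."""
    share_of_group = group_count / total_young
    share_of_population = area_pop / total_pop
    return share_of_group / share_of_population


lq = {name: round(location_quotient(young, pop), 2) for name, pop, young in areas}

# List the areas from most over-represented to most under-represented.
for name, value in sorted(lq.items(), key=lambda item: item[1], reverse=True):
    label = "over-represented" if value > 1 else "under-represented"
    print(name, value, label)
```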

## Index of Dissimilarity

The Index of Dissimilarity is used to compare the distribution of two variables, such as two socio-economic groups or two ethnic groups in a particular area.

`"Index of dissimilarity" = 1/2 ∑ |x_i/X-y_i/Y|`

- `x_i` is the population of group `x` in small area `i`
- `X` is the total population of group `x` in the whole area
- `y_i` is the population of group `y` in small area `i`
- `Y` is the total population of group `y` in the whole area

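It helps answer the question: are group `x` and group `y` spread across the small areas in the same proportions, or are they concentrated in different places? When `x_i/X` and `y_i/Y` are expressed as percentages, as in the worked examples below, the index ranges from 0 (complete integration) to 100 (complete segregation).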

### Worked example 1 of Index of Dissimilarity

Census 2011 data for wards in Sandwell (West Midlands) can be obtained from Neighbourhood Statistics. An extract is shown below.

Name of ward in Sandwell | Number of persons identifying their ethnicity as White in the ward (this is `x_i`) | Number of persons identifying their ethnicity as Asian in the ward (this is `y_i`) |
---|---|---|

Abbey | 9078 | 1271 |

Blackheath | 10808 | 870 |

Bristnall | 9064 | 1814 |

Charlemont with Grove Vale | 8903 | 1918 |

Cradley Heath and Old Hill | 11913 | 1009 |

Friar Park | 11335 | 619 |

Great Barr with Yew Tree | 8300 | 3105 |

Great Bridge | 10393 | 1626 |

Greets Green and Lyng | 6925 | 3244 |

Hateley Heath | 10295 | 2182 |

Langley | 10135 | 1448 |

Newton | 7879 | 2178 |

Old Warley | 9388 | 1399 |

Oldbury | 7648 | 4011 |

Princes End | 11847 | 369 |

Rowley | 10648 | 609 |

St Pauls | 4252 | 7822 |

Smethwick | 7128 | 4522 |

Soho and Victoria | 3854 | 6881 |

Tipton Green | 9262 | 2625 |

Tividale | 10616 | 913 |

Wednesbury North | 10331 | 1734 |

Wednesbury South | 9132 | 2232 |

West Bromwich Central | 6337 | 4857 |

TOTAL | 215471 (this is `X`) | 59258 (this is `Y`) |

Calculate the percentages for the ‘White’ and ‘Asian’ columns. This shows the percentage of Sandwell’s population of each ethnic group who live in each ward.

For example, there are 215471 people identifying as White in Sandwell. There are 11847 people identifying as White in Princes End.

`"% of Sandwell's White population who live in Princes End" = (11847/215471)xx100 = 5.50%`

This means that Princes End contains 5.50% of people identifying as White in Sandwell. Be careful not to get confused here. This does not mean that 5.50% of the population of Princes End is White.

Ward | White raw data | White % of Sandwell's White population (this is `x_i/X`) | Asian raw data | Asian % of Sandwell's Asian population (this is `y_i/Y`) |
---|---|---|---|---|

Abbey | 9078 | 4.21 | 1271 | 2.14 |

Blackheath | 10808 | 5.02 | 870 | 1.47 |

Bristnall | 9064 | 4.21 | 1814 | 3.06 |

Charlemont | 8903 | 4.13 | 1918 | 3.24 |

Cradley Heath | 11913 | 5.53 | 1009 | 1.70 |

Friar Park | 11335 | 5.26 | 619 | 1.04 |

Great Barr | 8300 | 3.85 | 3105 | 5.24 |

Great Bridge | 10393 | 4.82 | 1626 | 2.74 |

Greets Green | 6925 | 3.21 | 3244 | 5.47 |

Hateley Heath | 10295 | 4.78 | 2182 | 3.68 |

Langley | 10135 | 4.70 | 1448 | 2.44 |

Newton | 7879 | 3.66 | 2178 | 3.68 |

Old Warley | 9388 | 4.36 | 1399 | 2.36 |

Oldbury | 7648 | 3.55 | 4011 | 6.77 |

Princes End | 11847 | 5.50 | 369 | 0.62 |

Rowley | 10648 | 4.94 | 609 | 1.03 |

St Pauls | 4252 | 1.97 | 7822 | 13.20 |

Smethwick | 7128 | 3.31 | 4522 | 7.63 |

Soho and Victoria | 3854 | 1.79 | 6881 | 11.61 |

Tipton Green | 9262 | 4.30 | 2625 | 4.43 |

Tividale | 10616 | 4.93 | 913 | 1.54 |

Wednesbury N | 10331 | 4.79 | 1734 | 2.93 |

Wednesbury S | 9132 | 4.24 | 2232 | 3.77 |

West Bromwich C | 6337 | 2.94 | 4857 | 8.20 |

SUM | 215471 | 100.00 | 59258 | 100.00 |

Calculate `|x_i/X-y_i/Y|`

This is the difference between the two columns of percentages. Ignore any minus signs, so that every difference is recorded as a positive number.

Ward | White raw data | White % | Asian raw data | Asian % | Differences (this is `|x_i/X-y_i/Y|`) |
---|---|---|---|---|---|

Abbey | 9078 | 4.21 | 1271 | 2.14 | 2.07 |

Blackheath | 10808 | 5.02 | 870 | 1.47 | 3.55 |

Bristnall | 9064 | 4.21 | 1814 | 3.06 | 1.15 |

Charlemont | 8903 | 4.13 | 1918 | 3.24 | 0.90 |

Cradley Heath | 11913 | 5.53 | 1009 | 1.70 | 3.83 |

Friar Park | 11335 | 5.26 | 619 | 1.04 | 4.22 |

Great Barr | 8300 | 3.85 | 3105 | 5.24 | 1.39 |

Great Bridge | 10393 | 4.82 | 1626 | 2.74 | 2.08 |

Greets Green | 6925 | 3.21 | 3244 | 5.47 | 2.26 |

Hateley Heath | 10295 | 4.78 | 2182 | 3.68 | 1.10 |

Langley | 10135 | 4.70 | 1448 | 2.44 | 2.26 |

Newton | 7879 | 3.66 | 2178 | 3.68 | 0.02 |

Old Warley | 9388 | 4.36 | 1399 | 2.36 | 2.00 |

Oldbury | 7648 | 3.55 | 4011 | 6.77 | 3.22 |

Princes End | 11847 | 5.50 | 369 | 0.62 | 4.88 |

Rowley | 10648 | 4.94 | 609 | 1.03 | 3.91 |

St Pauls | 4252 | 1.97 | 7822 | 13.20 | 11.23 |

Smethwick | 7128 | 3.31 | 4522 | 7.63 | 4.32 |

Soho and Victoria | 3854 | 1.79 | 6881 | 11.61 | 9.82 |

Tipton Green | 9262 | 4.30 | 2625 | 4.43 | 0.13 |

Tividale | 10616 | 4.93 | 913 | 1.54 | 3.39 |

Wednesbury N | 10331 | 4.79 | 1734 | 2.93 | 1.87 |

Wednesbury S | 9132 | 4.24 | 2232 | 3.77 | 0.47 |

West Bromwich C | 6337 | 2.94 | 4857 | 8.20 | 5.26 |

SUM | 215471 | 100.00 | 59258 | 100.00 | 75.21 |

Calculate `∑ |x_i/X-y_i/Y|`

This is the sum of the differences column.

In this example, `∑ |x_i/X-y_i/Y| = 75.21`

Calculate `"Index of dissimilarity" = 1/2 ∑ |x_i/X-y_i/Y|`

In this example `"Index of dissimilarity" = 1/2 xx 75.21 = 37.61`

This means that 37.61% of the Asian population of Sandwell would need to move to a different ward in order to have the same relative distribution as the White population of Sandwell.
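The calculation can also be sketched in Python, working directly from the raw ward counts. The names `wards` and `index_of_dissimilarity` are illustrative, and the counts are copied from the extract above. Because the sketch does not round the percentages at each step, its result should land close to the worked figure but may not match it to the last decimal place.

```python
# (ward, White residents, Asian residents), copied from the extract above.
wards = [
    ("Abbey", 9078, 1271), ("Blackheath", 10808, 870),
    ("Bristnall", 9064, 1814), ("Charlemont with Grove Vale", 8903, 1918),
    ("Cradley Heath and Old Hill", 11913, 1009), ("Friar Park", 11335, 619),
    ("Great Barr with Yew Tree", 8300, 3105), ("Great Bridge", 10393, 1626),
    ("Greets Green and Lyng", 6925, 3244), ("Hateley Heath", 10295, 2182),
    ("Langley", 10135, 1448), ("Newton", 7879, 2178),
    ("Old Warley", 9388, 1399), ("Oldbury", 7648, 4011),
    ("Princes End", 11847, 369), ("Rowley", 10648, 609),
    ("St Pauls", 4252, 7822), ("Smethwick", 7128, 4522),
    ("Soho and Victoria", 3854, 6881), ("Tipton Green", 9262, 2625),
    ("Tividale", 10616, 913), ("Wednesbury North", 10331, 1734),
    ("Wednesbury South", 9132, 2232), ("West Bromwich Central", 6337, 4857),
]


def index_of_dissimilarity(pairs):
    """Half the sum of absolute differences between two sets of percentage shares."""
    total_x = sum(x for x, _ in pairs)
    total_y = sum(y for _, y in pairs)
    return 0.5 * sum(
        abs(100 * x / total_x - 100 * y / total_y) for x, y in pairs
    )


index = index_of_dissimilarity([(white, asian) for _, white, asian in wards])
print(round(index, 2))
```

Passing two identically distributed groups returns 0, and two groups that never share a ward return 100, which matches the range described earlier.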

### Worked example 2 of Index of Dissimilarity

Census 2011 data for wards in Sandwell (West Midlands) can be obtained from Neighbourhood Statistics. The Index of Dissimilarity has been calculated for ward-level data for the 7 largest ethnic groups of residents (excluding people of mixed ethnicity). A summary of the results is shown in the table.

 | White Other | Indian | Pakistani | Bangladeshi | Black Caribbean | Black African |
---|---|---|---|---|---|---|
White British | 31.76 | 37.39 | 54.01 | 57.81 | 33.86 | 33.48 |
White Other | | 18.68 | 39.80 | 45.22 | 15.96 | 21.07 |
Indian | | | 34.03 | 42.73 | 15.58 | 25.60 |
Pakistani | | | | 43.42 | 33.71 | 26.17 |
Bangladeshi | | | | | 48.79 | 54.75 |
Black Caribbean | | | | | | 18.16 |