The most accurate way of investigating the pebbles on a beach would be to measure every pebble.

However, you could well die of old age before you finished. To avoid this problem, it is expected that you will limit the amount of information that you collect, as long as you are careful that your *sample* is *representative* of the whole *population*.

These three words have a specific meaning in statistics.

- The *sample* is the limited number of measurements that you make
- The *population* is the total number of measurements that you could potentially take
- A totally *representative* sample will tell you everything you might need to know about the *population*

Sampling comes in three varieties: random sampling, systematic sampling and stratified sampling.

Random sampling is used where the study area is the same throughout. In a flat grassy field, you could assume that the environmental conditions do not change within the field, so it doesn’t matter where within the area you take your samples. In urban investigations, you might use random sampling if, for example, you are assessing a small number of sites in one particular housing estate for environmental quality (see urban inequalities).

Random sampling can be used to choose spots or areas as sites to sample. It is vitally important that you do not choose sample sites yourself, as this will introduce bias. Random sampling is achieved by generating two random numbers (from a random number table or a scientific calculator) and using them as co-ordinates. For a small area, such as a field, you could lay two 20m tape measures on the ground and use the co-ordinates to place a quadrat. For an urban area, you could use the co-ordinates to generate Ordnance Survey grid references.
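If you would rather generate the co-ordinates on a computer than from a random number table, the same idea can be sketched in Python. This is only an illustration: the function name `random_quadrat_sites`, the choice of five sites and the rounding to the nearest 0.1m are assumptions, and the 20m side matches the two tape measures described above.

```python
import random


def random_quadrat_sites(n_sites, side_m=20.0, seed=None):
    """Return n_sites random (x, y) positions in metres inside a square area.

    side_m is the length of each tape measure; seed is optional and only
    there so that the same list can be reproduced for a write-up.
    """
    rng = random.Random(seed)
    sites = []
    for _ in range(n_sites):
        x = round(rng.uniform(0, side_m), 1)  # distance along the first tape
        y = round(rng.uniform(0, side_m), 1)  # distance along the second tape
        sites.append((x, y))
    return sites


for x, y in random_quadrat_sites(5, seed=1):
    print(f"Place the quadrat {x} m along tape A and {y} m along tape B")
```

Because the computer picks the positions, you are not choosing the sample sites yourself, which is the point of the method.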

Random sampling should be free from bias. But it may be difficult to obtain a truly representative sample. The number of samples that you take (the *sampling size*) is important. This is considered in more detail below.

Systematic sampling is used when the study area includes an environmental gradient. You could sample along a line (e.g. at 10 equally spaced points on 3km of a river's course to investigate downstream changes in a river or every 20m along a line running inland in a sand dune system) or in every grid square within a defined area (e.g. within every 100m x 100m grid square within a small area for flood hazard mapping). Sample points should be evenly spaced or distributed.
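The spacing for a systematic sample is simple arithmetic, and a short Python sketch can list the sample points for you. It assumes that the first and last points sit at the two ends of the line, which the text above does not specify, and the function name `systematic_points` is made up for this example.

```python
def systematic_points(length_m, n_points):
    """Return evenly spaced distances (in metres) along a line.

    Assumes the first point is at 0 m and the last is at length_m.
    """
    if n_points < 2:
        raise ValueError("need at least two points to space them evenly")
    spacing = length_m / (n_points - 1)
    return [round(i * spacing, 1) for i in range(n_points)]


# 10 equally spaced points on 3 km (3000 m) of a river's course
river_sites = systematic_points(3000, 10)
print(river_sites)
```

If you would rather fix the interval than the number of points (for example every 20m along a dune transect), `range(0, transect_length + 1, 20)` gives the same kind of list, where `transect_length` is whatever length you measured in the field.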

Systematic sampling is quick and easy to do. But it is easy to miss variation. For example, if you are investigating downstream changes in a river by choosing equally spaced samples, you may not easily be able to pick out the effect of tributaries joining the river. If you are investigating a sand dune system, widely spaced intervals may mean that you miss some variations in vegetation, such as small dune slacks. The number of samples that you take (the *sampling size*) is important. This is considered in more detail below.

Stratified sampling is used when the study area includes significantly different parts (also known as subsets). You should make sure that the number of samples taken is representative of the importance of each subset within the total population. In a rivers investigation into the effect of stream ordering on discharge, a stratified sample would be to choose sites where two river segments of the same order join. In a sand dune investigation, a stratified sample would be to choose to sample where there is a break of slope rather than at equally spaced intervals.
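One common way of making the number of samples reflect the importance of each subset is to share them out in proportion to the size of each subset. The Python sketch below shows that calculation. The subset names and percentages are invented purely for illustration, and `proportional_allocation` is a hypothetical helper, not a standard function.

```python
def proportional_allocation(subset_sizes, total_samples):
    """Share total_samples between subsets in proportion to their size.

    subset_sizes maps each subset name to its size (area, length,
    percentage cover and so on). Results are rounded to whole samples,
    so check that they still add up to total_samples.
    """
    whole = sum(subset_sizes.values())
    return {
        name: round(total_samples * size / whole)
        for name, size in subset_sizes.items()
    }


# Invented example: three subsets covering 10%, 30% and 60% of a study area
plan = proportional_allocation(
    {"subset A": 10, "subset B": 30, "subset C": 60}, total_samples=20
)
print(plan)
```

With awkward proportions the rounded numbers may add up to one more or one fewer than you intended, so adjust the largest subset by hand if that happens.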

Stratified sampling should overcome the problem of missing variation that might arise with systematic sampling. But it may be difficult to get the background data that you need to apply stratified sampling appropriately.

How many readings (or replicates) should you take? There are two ways to look at this question:

- what is the *minimum* number of replicates that I need to collect so that I can carry out statistical tests?
- what is the *maximum* number of replicates above which the results do not change?

If you are looking for a correlation between two variables (e.g. hydraulic radius vs distance downstream OR rurality score vs distance from the city), you will need at least 12-15 pairs of measurements to carry out a statistical test like the Spearman's Rank Correlation Coefficient. The absolute minimum number is 10 pairs of measurements, but this is only really an option for students who are aiming for the lower grades.
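To see why the number of pairs matters, it can help to look at how the coefficient is worked out. The Python sketch below uses the usual textbook formula, r = 1 - 6Σd² / (n³ - n), where d is the difference between the two ranks in each pair and n is the number of pairs. The formula is only exact when there are no tied values, the data are made up for illustration, and the function names are hypothetical. The check on `min_pairs` simply enforces the minimum of 10 pairs mentioned above.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, counted from 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rank(x, y, min_pairs=10):
    """Spearman's rank correlation using 1 - 6 * sum(d^2) / (n^3 - n)."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of measurements")
    n = len(x)
    if n < min_pairs:
        raise ValueError(f"only {n} pairs; at least {min_pairs} are needed")
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d_squared) / (n ** 3 - n)


# Made-up data: 12 sites, distance downstream (km) and hydraulic radius (m)
distance_km = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
hydraulic_radius_m = [0.11, 0.14, 0.13, 0.18, 0.21, 0.20,
                      0.25, 0.24, 0.30, 0.33, 0.31, 0.38]
print(round(spearman_rank(distance_km, hydraulic_radius_m), 2))
```

You would still need to compare the result with a table of critical values for your number of pairs before drawing a conclusion.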

A more statistically valid approach, if you are carrying out random sampling, is to calculate the running mean. Find the mean of your first two readings, then the mean of the first three readings, the mean of the first four readings and so on. The mean values will fluctuate each time, but will gradually settle within a closer limit, until a point is reached where adding to the sample only has a very small effect on the mean. You can assume at this point that the sample size is adequate.

No. | Reading | Running mean | No. | Reading | Running mean |
---|---|---|---|---|---|
1 | 35.3 | 35.3 | 14 | 21.4 | 26.7 |
2 | 28.0 | 31.7 | 15 | 20.3 | 26.3 |
3 | 28.1 | 30.5 | 16 | 22.3 | 26.1 |
4 | 20.6 | 28.0 | 17 | 29.4 | 26.3 |
5 | 19.1 | 26.2 | 18 | 27.5 | 26.3 |
6 | 23.4 | 25.8 | 19 | 26.6 | 26.3 |
7 | 32.5 | 26.7 | 20 | 28.1 | 26.4 |
8 | 32.1 | 27.4 | 21 | 26.4 | 26.4 |
9 | 32.1 | 27.9 | 22 | 22.5 | 26.2 |
10 | 32.1 | 28.3 | 23 | 25.3 | 26.2 |
11 | 18.5 | 27.4 | 24 | 28.2 | 26.3 |
12 | 23.5 | 27.1 | 25 | 27.5 | 26.3 |
13 | 27.6 | 27.1 | | | |

In this example, the running mean has been calculated for pebble size measurements. After 25 measurements, adding to the sample is only having a small effect on the mean.
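The running mean is tedious to work out by hand, so here is a short Python sketch that does it for the 25 pebble readings in this example. The helper `first_stable_count` and its `tolerance` value are assumptions added for illustration: deciding how small a "very small effect" has to be is a judgement you must make and justify yourself.

```python
readings = [35.3, 28.0, 28.1, 20.6, 19.1, 23.4, 32.5, 32.1, 32.1, 32.1,
            18.5, 23.5, 27.6, 21.4, 20.3, 22.3, 29.4, 27.5, 26.6, 28.1,
            26.4, 22.5, 25.3, 28.2, 27.5]


def running_means(values):
    """Mean of the first reading, the first two, the first three, and so on."""
    means = []
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means


def first_stable_count(means, tolerance):
    """Smallest sample size after which every extra reading moves the
    running mean by less than tolerance. Returns None if no such size
    exists before the last reading."""
    for n in range(1, len(means)):
        changes = (abs(means[k] - means[k - 1]) for k in range(n, len(means)))
        if all(change < tolerance for change in changes):
            return n
    return None


means = running_means(readings)
for number, (reading, mean) in enumerate(zip(readings, means), start=1):
    print(f"{number:2d}  {reading:5.1f}  {mean:5.1f}")

print("Stable from sample size:", first_stable_count(means, tolerance=0.5))
```

A tighter tolerance will ask for more readings, and a looser one for fewer, so treat the result as a guide rather than a rule.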

Of course, the exact number of replicates that you choose will also be affected by the amount of time that you have available to carry out the survey.

- A - How long you have to collect your data
- B - How long it takes to collect one set of data (e.g. all the river measurements at one point on the river)
- Divide A by B. Is the result at least 12? If it is less than 12, you will need to collect less data at each site.
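The A divided by B check can be written as a few lines of Python if you want to try out different plans quickly. The times used here are invented, and `sites_possible` is a hypothetical name. The target of 12 comes from the list above.

```python
def sites_possible(time_available_min, time_per_site_min):
    """Whole number of sites that fit into the time available (A divided by B)."""
    return time_available_min // time_per_site_min


MINIMUM_SITES = 12

# Invented example: 6 hours in the field, 40 minutes of measuring per site
possible = sites_possible(6 * 60, 40)
if possible >= MINIMUM_SITES:
    print(f"{possible} sites fit: the plan is workable")
else:
    print(f"Only {possible} sites fit: collect less data at each site")
```

Remember to leave time for walking between sites, which this simple sum ignores.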

Recording your results systematically sounds obvious, but it isn't. Results can easily be missed if you don't have a systematic way of recording them. Construct tables (sometimes called *booking sheets*) to fill in at each sample site. An evaluation section which includes the comment '*This investigation was not successful because I lost half of my data*' will not score highly. The example below shows what you could produce.
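If you plan to type up your data afterwards, you could also lay out a blank booking sheet in advance as a CSV file, with one row per sample site. The Python sketch below is one possible layout only: the column names are hypothetical, so replace them with the measurements that your own investigation needs.

```python
import csv
import io


def blank_booking_sheet(columns, n_sites):
    """Return CSV text with a header row and one empty row per sample site."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["site"] + columns)
    for site in range(1, n_sites + 1):
        # Site number filled in, every measurement left blank for the field
        writer.writerow([site] + [""] * len(columns))
    return buffer.getvalue()


# Hypothetical columns for a river study with 12 sites
sheet = blank_booking_sheet(["width_m", "depth_m", "velocity_m_per_s"], 12)
print(sheet)
```

Printing the sheet or opening it in a spreadsheet gives you the same empty table at every site, so a missing reading shows up as an obvious gap.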

Copyright © 2010 Field Studies Council

Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Licence.