Introduction to forest sampling ©

Forests are large and variable, and the individual components (i.e. trees) tend to have a relatively small value. Consequently, samples rather than complete measurements are often necessary.

No matter how carefully and meticulously a scientist makes observations on a population, if the sample being tested is unrepresentative, the time spent is wasted. But if the sampling system is correctly designed and applied, estimates of the population will be obtained within a desired/specified precision.

The only way to reduce sampling error and increase the precision of the final estimate is to use a more efficient sampling technique or increase the size of sample. To eliminate sampling error, all units in a population must be measured. This is sometimes done when a population is small or very valuable - but the result may still be incorrect due to measurement error.

There are three general objectives in sampling for estimates of a variable:-

  1. to obtain an unbiased estimate of the population mean;
  2. to obtain as precise an estimate of the mean as is possible for the time and money spent;
  3. to assess the precision of estimate, i.e. standard error of the mean.

Two main approaches to sampling are available:-
  1. Non-random methods
  2. Methods of probability sampling

Subjective Sampling

Subjective sampling involves selection by personal judgment. It is liable to bias, the extent of which cannot be assessed. This also applies to HAPHAZARD selection, e.g. sticking a pin into a map while blindfolded.

Neither procedure is favoured for inventory because of the likelihood of bias and because knowledge of precision of the estimate is often just as important as the estimate itself.

Random Sampling - Unrestricted

This method is commonly referred to as simple random sampling. The selection method used is objective, i.e. selection is not influenced by the personal judgment of the selector.

The procedure involves selecting 'n' units out of a total of 'N' units in such a manner that every possible combination of 'n' units has an equal chance of selection. Commonly, each unit (plot) is identified by a number or rectangular coordinate. To locate plots randomly within a stand, the x, y co-ordinates of the plot's centre or one corner are selected from a table of random numbers or generated by computer. Locating an individual tree (if it is the unit) purely at random is difficult unless each has been numbered beforehand. One method that can be used is to select points randomly within a stand, locate the nearest tree to that point, and then select and measure its nearest neighbour.

Generally, selection is without replacement, i.e. a unit once drawn is ignored if re-drawn (contrast with procedure in Art Unions). This affects the procedure for computing the standard error of the estimate.
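The selection procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stand is taken to be divided into a rectangular grid of candidate plots identified by (x, y) indices, and the function name and grid dimensions are hypothetical rather than taken from any inventory manual.

```python
import random

def select_plots(n_cols, n_rows, n, seed=None):
    """Draw n distinct plot coordinates from an n_cols x n_rows grid.

    Selection is without replacement, so every combination of n plots
    has the same chance of being chosen.
    """
    rng = random.Random(seed)
    all_plots = [(x, y) for x in range(n_cols) for y in range(n_rows)]
    if n > len(all_plots):
        raise ValueError("sample size exceeds population size")
    return rng.sample(all_plots, n)

# Hypothetical stand of 20 x 15 candidate plots, 10 of which are sampled.
plots = select_plots(n_cols=20, n_rows=15, n=10, seed=1)
```

Because `random.sample` never returns the same element twice, a plot that has been drawn once cannot be drawn again.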

With simple random sampling, the estimates of the mean and variance are unbiased and the precision of the estimate of the mean can be assessed. However, this precision may be low if the sample is not well distributed over the population. It is also likely to be low if the population is variable and the intensity of sampling is low.
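As a hedged sketch of how the mean and its standard error might be computed from a simple random sample drawn without replacement, the function below applies the usual finite population correction (1 - n/N). The function name and the example figures are illustrative assumptions only.

```python
import math
import statistics

def srs_estimate(sample, population_size):
    """Return (mean, standard error) for a simple random sample
    drawn without replacement from population_size units."""
    n = len(sample)
    mean = statistics.mean(sample)
    variance = statistics.variance(sample)   # divisor is n - 1
    fpc = 1 - n / population_size            # finite population correction
    standard_error = math.sqrt(variance / n * fpc)
    return mean, standard_error

# Made-up plot volumes (m3/ha) from 5 plots out of 50 possible plots.
mean, se = srs_estimate([10, 12, 14, 16, 18], population_size=50)
```

When the sample is a small fraction of the population the correction is close to 1 and is often ignored.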

Simple random sampling has some disadvantages when used for inventory of extensive forest areas, namely:

Random Sampling - Restricted (= Stratified Random Sampling)

A population can often be separated into strata (relatively homogeneous units) so that the variation within each stratum is minimised at the expense of variation between the strata. Samples can then be taken from each stratum to obtain a more efficient estimate of the total population.

Restricted random sampling is generally more efficient than unrestricted (simple) random sampling because it distributes the sampling units more effectively over the population. This means that costs are minimised for a given precision of the population estimate.

There are many ways of stratifying a forest in advance of a forest inventory and some are more effective than others. The aim is to take advantage of certain types of information about the population with the express purpose of improving the precision of the estimate or usefulness of the sample. Thus, stratification could be based on an artificial grid, geographical divisions or on cadastral boundaries, but the most effective basis is usually some natural forest characteristic, e.g. forest type, age class, or some growth characteristic with which volume is correlated, viz. site index, predominant or top height, etc. Obviously, to group units of the population based on the similarity of some characteristic, prior knowledge of the population must be available from maps, aerial photographs, records of past surveys, etc.

Once the strata are established, each stratum is sampled at random and then the estimates for all strata are combined to give a population estimate.

There are three ways of allocating samples to strata -

  1. Proportional allocation
  2. Optimum allocation
  3. Neyman allocation.

Estimates from stratified random sampling are unbiased provided each stratum value is weighted according to the proportion the stratum forms of the whole population. The precision of the estimates can be assessed provided that at least two sampling units occur within each stratum.
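The weighting described above can be sketched as follows. This is a minimal illustration, assuming each stratum is supplied as a pair (number of units in the stratum, list of sampled values) and that the standard textbook variance formula with a finite population correction applies; the function name and data are hypothetical.

```python
import math
import statistics

def stratified_estimate(strata):
    """strata: list of (stratum_size, sample_values) pairs.

    Returns the weighted population mean and its standard error.
    """
    total_units = sum(size for size, _ in strata)
    mean = 0.0
    variance_of_mean = 0.0
    for size, values in strata:
        n_h = len(values)
        if n_h < 2:
            raise ValueError("each stratum needs at least two sampling units")
        weight = size / total_units               # proportion of the population
        mean += weight * statistics.mean(values)
        fpc = 1 - n_h / size                      # finite population correction
        variance_of_mean += weight ** 2 * statistics.variance(values) / n_h * fpc
    return mean, math.sqrt(variance_of_mean)

# Two made-up strata: 100 units with 3 plots, 300 units with 2 plots.
strat_mean, strat_se = stratified_estimate([(100, [10, 12, 14]), (300, [20, 24])])
```

The check for at least two units per stratum mirrors the condition stated in the text: with a single unit no within-stratum variance can be computed.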

Proportional Allocation

In proportional allocation, distribution of the sample is weighted by area alone, i.e. allocation is in proportion to the ratio of n/N where n is the number of areal units (e.g. ha) in a stratum and N is the total number of areal units in all strata. It is used when variances of the strata are known to be similar or they cannot be estimated prior to sampling. It is the most common form of allocation.
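A minimal sketch of proportional allocation, assuming stratum areas in hectares and a fixed total number of sampling units (both the function name and the figures are illustrative):

```python
def proportional_allocation(stratum_areas, total_sample):
    """Share total_sample among strata in proportion to stratum area."""
    total_area = sum(stratum_areas)
    return [total_sample * area / total_area for area in stratum_areas]

# Three hypothetical strata of 100, 300 and 600 ha sharing 50 plots.
allocation = proportional_allocation([100, 300, 600], total_sample=50)
```

The results are fractional in general; how they are rounded to whole plots is left to the planner.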

Optimum Allocation

Optimum allocation requires that estimates of both the within-stratum variances and the costs of sampling are available. Such information is often difficult to obtain but if available, e.g. in periodic management inventory after the initial inventory, it should be used.

Optimum allocation is designed to give most information per dollar spent, i.e. to cost the least for a given precision of the estimate or, for a given cost, to give minimum variance. Assuming that the cost per sampling unit in stratum i is ci, the estimation equation is:

$$n_i = n \, \frac{N_i s_i / \sqrt{c_i}}{\sum_{j=1}^{h} N_j s_j / \sqrt{c_j}}$$

where ni is the size of sample in the i-th stratum
  n  is the total number of units in the sample
  Ni is the number of elements in the i-th stratum
  ci is the cost of establishment of a unit in stratum i
  si is the standard deviation of the i-th stratum
  h  is the number of strata.

Neyman Allocation

Neyman allocation is a simplified form of optimum allocation. It is used when costs are uniform (or when they are unknown and one is prepared to assume that they are uniform) and variances in the strata differ. The sample is allocated so as to achieve for a given size of sample the smallest possible standard error of estimate of the population mean, i.e. the highest precision of the estimate. Distribution of the samples is weighted by the product of the stratum area and the within-stratum standard deviation of volume, basal area, etc. Thus, Neyman allocation can only be applied if the relevant standard deviation is known from a preliminary investigation.
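Optimum and Neyman allocation can be sketched with a single function. The sketch assumes the usual textbook weighting, in which a stratum's share of the sample is proportional to Ni x si / sqrt(ci); with no costs supplied, all costs are taken as equal and the result is the Neyman allocation. The function name and the example figures are hypothetical.

```python
import math

def allocate(sizes, std_devs, total_sample, costs=None):
    """Allocate total_sample among strata.

    sizes     number of units (or area) in each stratum, Ni
    std_devs  within-stratum standard deviations, si
    costs     cost per sampling unit, ci; None means uniform costs
              (Neyman allocation)
    """
    if costs is None:
        costs = [1.0] * len(sizes)
    scores = [N * s / math.sqrt(c) for N, s, c in zip(sizes, std_devs, costs)]
    total = sum(scores)
    return [total_sample * score / total for score in scores]

# Neyman: the more variable stratum gets more plots per unit area.
neyman = allocate([100, 300], [20, 10], total_sample=50)
# Optimum: the stratum that is four times as costly to sample gets fewer plots.
optimum = allocate([100, 200], [10, 5], total_sample=30, costs=[4, 1])
```

If all standard deviations are also equal, the same function gives the proportional allocation.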

Intensity of sampling within a stratum

The number of sampling units to select per stratum can be determined in one of two ways:
  1. decide on a fixed sampling intensity and number of sampling units prior to the inventory. In this approach, less emphasis is given in the planning stage to the desired precision of estimates.
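  2. calculate the number of units required for a chosen probability level and allowable sampling error, e.g. n = [C x t/e]^2, where C is the coefficient of variation (%), t is the value of Student's t for the chosen probability level, and e is the allowable sampling error (% of the mean).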

The desired accuracy should be selected after considering for what purpose the information is desired and what accuracy would suffice. Do not blindly follow some conventionally accepted error or probability. Keep in mind:

Suppose, for example, the coefficient of variation of a particular parameter of an infinite population is found from an initial sample of 10 plots to be 25%. We wish to know how many plots should be used in the main survey such that we can be sure 95% of the time that the mean will be within 10% of the true value. Thus:
C = 25%
t = 2.26 (Student's t with 9 df (n - 1) at the 0.05 probability level, i.e. 95% confidence)
and e = 10%

therefore n = [25 x 2.26/10]^2
            = 31.9
i.e. 32 plots would meet requirements.
This approach demands preliminary information about the expected mean and standard deviation of the population, and a special reconnaissance survey may be necessary to obtain it.
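The calculation in the worked example can be written as a small function. This is a sketch only: the function name is hypothetical, the population is assumed to be effectively infinite as in the example, and the t value must be supplied by the user for the degrees of freedom of the preliminary sample.

```python
import math

def plots_required(cv_percent, t_value, allowable_error_percent):
    """Number of sampling units needed: n = (C x t / e)^2, rounded up."""
    n = (cv_percent * t_value / allowable_error_percent) ** 2
    return math.ceil(n)

n_plots = plots_required(cv_percent=25, t_value=2.26, allowable_error_percent=10)
```

With the figures from the example (C = 25%, t = 2.26, e = 10%) the function returns 32. Halving the allowable error to 5% roughly quadruples the number of plots required, which shows how quickly cost rises with the precision demanded.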

The advantages of stratified random sampling over simple random sampling are:

The main disadvantages of stratified random sampling are:

The most common barrier to the use of stratified random sampling is lack of knowledge of the population and of the strata sizes.

Systematic Sampling

Here, the pattern of the position of samples relative to each other is determined in advance, i.e. the sample units are selected in a systematic, predictable way; for example, every tenth tree is counted, or plots are laid out on a regular grid pattern. With these patterns, the position of every unit is determined once that of the first is located. It is essential, therefore, that the position of the first unit be selected at random, otherwise the estimate of the mean will be liable to bias.
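A minimal sketch of systematic selection with a random start, assuming the units (e.g. trees along a line) are numbered from 0 and every k-th unit is taken. The function name and figures are illustrative.

```python
import random

def systematic_sample(population_size, interval, seed=None):
    """Return the indices of every interval-th unit after a random start."""
    rng = random.Random(seed)
    start = rng.randrange(interval)      # random start within the first interval
    return list(range(start, population_size, interval))

# Every tenth tree out of 100, starting from a randomly chosen tree.
indices = systematic_sample(population_size=100, interval=10, seed=3)
```

Only the first index is random; all the others follow from it, which is why the randomness of the start matters so much.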

Although there is no theoretical way of calculating the standard error of a systematically obtained estimate (a single systematic sample does not provide an unbiased estimate of variance because each unit does not have an equal chance of selection - a restriction imposed both by the sampling pattern and sampling intensity), in practice it is assumed that the sample units underlying a systematic selection are distributed randomly. In this situation the statistical formulae applicable to a random sample can be applied to the systematic sample. Calculated in this way, the maximum sampling error is estimated. It may considerably exceed the actual error.

When applying systematic sampling, one must allow for any natural pattern in the population and adjust for it to avoid gross errors, e.g. if the sampling pattern (frame) coincides with some natural periodicity of the population (recurring parallel ridges evenly spaced apart), or where the grid is coarse and the population changes systematically (e.g. tree growth on a slope). However, provided care is taken to avoid these mistakes, systematic sampling is a very practical method of measuring stands efficiently and accurately. Not least amongst its advantages is that no part of a stand remains unvisited.

One might argue that if a stratified random sample can be expected to show an improvement in precision over a simple random sample because of an improved distribution of sampling units, a better precision might be expected with a systematic sample than with a simple random selection. A systematic sample, while being statistically less desirable, is often much more practicable.

Systematic designs have certain advantages which explain their widespread use in forest inventory:

The larger the forest area to be inventoried, the greater is the amount of variation that can be expected and the more likely that a systematic sample will give a better estimate of the population mean than a simple random sample. Even for a stratified population, a systematic sample will probably yield a better estimate of the mean if the strata are large and exhibit considerable variation. As the homogeneity of the defined strata increases, the estimate from a random sample will more closely agree with that from a systematic sample. Stratified random sampling and systematic sampling are the most popular sampling systems in forest inventory.

The choice between the two depends on several factors. Systematic sampling is more practicable if:

Sequential sampling

Sequential sampling has been used successfully in regeneration, disease and insect surveys.

Multistage sampling

The population is first subdivided into a number of primary sampling units. Some of these units are then randomly selected as the sample of the first stage and these units are then subdivided into a series of secondary sampling units. Some of these secondary units are then randomly selected as the sample of the second stage. This process can be repeated for third and further stages.

The procedure has the advantage of concentrating work on a few specific primary samples after which less effort is usually needed to obtain the secondary samples.
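A two-stage version of this procedure can be sketched as below. The sketch assumes equal-probability selection at both stages and the same number of secondary units in every selected primary unit; the names (`two_stage_sample`, the compartment and plot labels) are hypothetical.

```python
import random

def two_stage_sample(primary_units, n_primary, n_secondary, seed=None):
    """primary_units maps each primary unit to its list of secondary units.

    Stage 1 picks n_primary primary units at random; stage 2 picks
    n_secondary secondary units at random inside each of them.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(primary_units), n_primary)
    return {unit: rng.sample(primary_units[unit], n_secondary) for unit in chosen}

# Hypothetical forest: 5 compartments, each containing 10 candidate plots.
forest = {f"compartment_{i}": [f"plot_{i}_{j}" for j in range(10)] for i in range(5)}
stage_sample = two_stage_sample(forest, n_primary=2, n_secondary=3, seed=7)
```

Field work is then confined to the two selected compartments, which is the concentration of effort referred to above.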

Multistage sampling is particularly useful where the primary samples vary in importance and the selection is made with a probability proportional to their contribution to the estimate of the mean. For example, in a nation-wide sawmill production study, the sawmills might be the primary sample, and a random selection of hours worked over a period of time might be the secondary sample.

Cluster sampling

Cluster sampling is a type of multistage sampling that consists of selecting primary sample points and establishing a number of closely located secondary sampling units (often 3 or 4) as a group or cluster at each selected primary point. The cluster then forms the primary sampling unit.

Cluster sampling has been used with success in tropical forests in remote areas involving difficult access. Under these circumstances, it may take several days to reach the location of a sample plot, and with little additional cost or effort, several plots in a group or cluster can be taken. The location of these clusters can be determined either systematically or randomly, although systematic distribution eliminates the possibility of calculating valid sampling errors. It is also worth pointing out that the use of a cluster with a fixed location of secondary units also eliminates the possibility of calculating a valid within-cluster sampling error.

Multiphase Sampling

In multiphase sampling, some of the same sampling units are employed at the different phases of sampling. This is different from the descending hierarchy of sampling units formed in multistage sampling. Application of multiphase sampling is common in designs for forest inventories of large areas involving use of aerial photographs. For example, a large number of sample points on photographs may be examined stereoscopically and classified as 'forest' or 'non-forest' as the first phase. A subsample of these photographic interpretation points may then be selected as plot centres for a second phase of detailed photointerpretation. A subsample of these photographic plots may then be selected as a third phase for field measurements.

Two-phase or double sampling is commonly used where the variable of interest is difficult or expensive to measure. A sample of the population is taken and a relatively quick and inexpensive measurement is made on each unit. The variable of interest is measured on a subsample of the units. From this subsample a relationship is then derived which can be used to predict values of the required variable for the whole sample. This relationship may be a simple ratio, as used when all trucks are weighed and a sample is scaled for the volume of their loads, or a more complex regression, as used when all trees are measured for diameter at breast height and a few for total height to derive a stand height curve (h/d).
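The truck example can be sketched as a simple ratio estimate. The figures are made up, and the function name is hypothetical; the sketch assumes volume is roughly proportional to weight, so that a single ratio fitted on the scaled subsample can be applied to all loads.

```python
def ratio_estimate_total(all_weights, sub_weights, sub_volumes):
    """Estimate total volume from the weights of all loads and a
    subsample of loads on which both weight and volume were measured."""
    ratio = sum(sub_volumes) / sum(sub_weights)   # volume per unit weight
    return ratio * sum(all_weights)

# Five loads weighed (tonnes); two of them also scaled for volume (m3).
total_volume = ratio_estimate_total(
    all_weights=[20, 22, 18, 25, 15],
    sub_weights=[20, 25],
    sub_volumes=[16, 20],
)
```

A regression fitted to the subsample would replace the single ratio where the relationship is not a straight line through the origin, as with the height/diameter curve.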

PPS List Sampling and 3P Sampling (= Poisson sampling)

Both these sampling methods involve variable probability sampling of individual trees or groups of trees (stands). This is a much more efficient technique than equal probability sampling.

In variable probability sampling, trees (or stands) are selected for measurement in proportion to a prediction or preliminary measurement of the principal characteristic of interest, usually volume or value.

PPS list sampling (sampling with probability proportional to size) can be a highly useful technique when there is an existing list of the entire population with a preliminary measurement or estimate of the variable of interest. It can be used if the inventory is conducted in two phases, but is more commonly utilised for updating old inventories. Because the basic parameters of the population are known, the exact sample size desired can be obtained and an exact variance calculated, which is not the case with 3P sampling. Thus from a statistical point of view, list sampling is a more efficient technique, but it is limited in its practical application.
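One simple form of list sampling can be sketched as follows. The sketch assumes draws are made with replacement and that the population total is estimated by averaging each measured value divided by its selection probability (the Hansen-Hurwitz estimator); other PPS schemes exist, and the function names and data here are illustrative only.

```python
import random

def pps_draw(sizes, n, seed=None):
    """Draw n list positions with replacement, with probability
    proportional to the preliminary size estimate of each unit."""
    rng = random.Random(seed)
    return rng.choices(range(len(sizes)), weights=sizes, k=n)

def pps_total(drawn, sizes, measured):
    """Estimate the population total from the measured values of the drawn units."""
    total_size = sum(sizes)
    expanded = [measured[i] * total_size / sizes[i] for i in drawn]
    return sum(expanded) / len(expanded)

sizes = [10, 20, 30, 40]                        # preliminary estimates on the list
drawn = pps_draw(sizes, n=3, seed=5)
measured = {i: 2 * sizes[i] for i in drawn}     # made-up field measurements
estimate = pps_total(drawn, sizes, measured)
```

In this contrived case the measured values are exactly proportional to the listed sizes, so every draw gives the same expanded value and the estimate equals the true total of 200. The closer the preliminary estimates track the true values, the smaller the sampling error.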

3P sampling has wide application in forestry particularly for timber sales and forest inventories establishing resource stocks. It has been tested extensively in the USA and Australia during the past two decades (1970s and 1980s) and its efficiency has been established beyond question. We can expect it to become more important in forest assessment in Australia, particularly for inventory of our native hardwood forests using techniques such as point-3P (= point Poisson), point model-based and point-list sampling.

Designing an inventory

In planning a forest inventory, other matters have to be considered before any thought is given to implementing the inventory. Some relevant matters are discussed below:

Size of sample

A number of factors determine the size of sample (number of sampling units) that can be established in the field, namely:

Of these factors, the last (time and $) is often the most critical and frequently overrides the desired number of units calculated using the standard formulae, i.e. due to cost, fewer units than desired have to be accepted.

The size of sample may be expressed as a given number of sampling units or as a sampling intensity, i.e. area of the sample expressed as a percent of the population area. It is preferable to express the size of sample both ways because if two forests of different area have the same mean and variance, the same number of sampling units will be required for a given precision of estimate but the intensity of sampling will be different. Obviously, the planner must compromise.
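The difference between the two ways of expressing sample size can be illustrated with a short calculation. The plot size and forest areas are made-up figures and the function name is hypothetical.

```python
def sampling_intensity(n_units, unit_area_ha, forest_area_ha):
    """Sampled area as a percentage of the population area."""
    return 100 * n_units * unit_area_ha / forest_area_ha

# The same 50 plots of 0.04 ha in a 100 ha forest and in a 1000 ha forest.
small_forest = sampling_intensity(50, 0.04, 100)
large_forest = sampling_intensity(50, 0.04, 1000)
```

The number of units is identical in both cases, but the intensity is about 2% in the smaller forest and about 0.2% in the larger one.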

Sampling unit

Three types of sampling unit are common in forest assessment and inventory, viz. strips (hence stripline or transect), plots, and point or angle count spots. Striplines are usually 5-40 m wide. The distance between the strips depends on a number of factors including sampling intensity, topography, forest composition, and other factors referred to above.

Strips are most convenient if information on topography and forest composition is also required as part of the survey and if dense undergrowth or difficult terrain necessitates spending a disproportionate amount of time on plot establishment. Conventionally, strips are run at right angles to the contours of the main topography of the area under survey because the fertility gradient usually runs in this direction. Thus, if an area is essentially of one topographic form, the bearing of the long side (plots or strips) can be specified in the prescription for plot location. If topography varies by zones, bearing of the long side may need to be specified by zones.

For a given sampling intensity, a strip survey may be faster than a survey based on plots because the ratio of working time on the units to the travelling time between units is greater for strips. However, the number of units and therefore the number of degrees of freedom for calculating sampling error are often far fewer for strips than plots: this may not be balanced by the reduction in the variation between units.

Strips and plots may be combined in what are called Line Plots. With these, topographical and forest type data are gathered from the strips and quantitative information (diameter, height, volume, etc.) is obtained from plots located at intervals along the strips.

Sampling units (plots) are established in forests by forest authorities for three main reasons:-

  1. To provide information on growth and yield of forest crops which is then used to construct yield models (yield tables). These are required both for routine management of forest crops and for long-term planning and production forecasting.
  2. To provide information on the effects of specific experimental treatments, e.g. different thinning regimes, initial plant spacings, fertiliser treatments, etc.
  3. To provide mensurational data which are used in devising new measurement methods and systems for general forest use.

Plot Size and Shape

Unbiased estimates of timber quantity can be obtained from any fixed-area plot size and plot shape (rectangular, square, circular, and narrow-width rectangular called striplines or transects); however, the optimum size and shape to use vary with forest conditions. For important surveys, it is well worth investigating in a pilot study the relative efficiency of different sizes and shapes by comparing the respective sampling errors and costs. Gambill et al. (1985) describe a method for doing this which minimises the total time involved on the survey and provides estimates within a specified level of precision. Their approach warrants serious consideration by anyone involved in designing efficient forest inventories.

A guiding principle in choosing the size of unit is to have it large enough to include a representative number of trees but small enough so that the time required for measurement is not excessive, i.e. the size of unit must be related to the elements of the population. One may need to use:

Concentric plots are sometimes used in mixed forest containing a wide range of tree sizes from mature veterans down to saplings/regeneration. A maximum of three concentric units can be handled. Measurement of the large trees is done on the plots of greatest radius whereas that of the smaller trees is confined to the inner plot of smallest radius.
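Because trees of different size are tallied on circles of different area, each tallied tree represents a different number of stems per hectare. The sketch below shows one way this might be handled; the diameter limits and radii are hypothetical, not recommendations, and the function name is an assumption.

```python
import math

def stems_per_hectare(tallied_dbhs_cm, size_classes):
    """Convert a concentric-plot tally to stems per hectare.

    size_classes: (minimum dbh in cm, plot radius in m) pairs, ordered
    from the largest size class down. A tree is assigned to the first
    class whose minimum dbh it reaches; smaller trees are ignored.
    """
    total = 0.0
    for dbh in tallied_dbhs_cm:
        for min_dbh, radius_m in size_classes:
            if dbh >= min_dbh:
                total += 10_000 / (math.pi * radius_m ** 2)   # stems/ha per tree
                break
    return total

# Large trees (>= 50 cm) on a 20 m circle, smaller trees (>= 10 cm) on a 10 m circle.
classes = [(50, 20.0), (10, 10.0)]
density = stems_per_hectare([60, 12, 15], classes)
```

A tree tallied on the small inner circle carries a larger expansion factor than one tallied on the outer circle.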

Except in very uniform populations, small plots yield more variable information than large plots but, because of their wider distribution for a given sampling intensity, they may give an increase in the precision of estimate of the mean, but at a cost! Again, a compromise must be sought.

Plot orientation

If a fertility gradient is recognizable, one should use rectangular plots aligned with the long axis parallel to the gradient: this leads to an increase in precision of the estimate. If no gradient is recognizable, square or circular plots are suitable.

As stated above, strips conventionally are run at right angles to the contours of the main topography of an area because the fertility gradient usually runs in this direction.

Establishment of units on the ground

The possibility of bias in location is very slight with systematic samples if the starting point is selected at random and located by survey method: the position of all other points is automatically fixed. Contrast this with random samples, the units of which are usually selected on paper. Various sources of bias in location may occur here due to:

If an operator is unable to locate a specific point on the ground, an alternative location should be selected by an objective means, e.g. move a certain distance in a certain direction, the distance and direction being determined at random.
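An objective rule for choosing an alternative location might look like the sketch below. It assumes plot positions are held as easting/northing coordinates in metres and that a maximum offset distance has been prescribed in advance; both the function name and these conventions are illustrative.

```python
import math
import random

def relocate(x, y, max_distance_m, seed=None):
    """Return a new point at a random bearing and random distance
    (up to max_distance_m) from the point that could not be located."""
    rng = random.Random(seed)
    bearing = rng.uniform(0, 360)             # degrees clockwise from north
    distance = rng.uniform(0, max_distance_m)
    dx = distance * math.sin(math.radians(bearing))   # offset to the east
    dy = distance * math.cos(math.radians(bearing))   # offset to the north
    return x + dx, y + dy

new_x, new_y = relocate(100.0, 200.0, max_distance_m=50, seed=2)
```

The point of drawing both values at random is that the operator's judgment plays no part in where the replacement plot falls.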

Special care should be taken when establishing plots in plantation forests where trees occur in rows that are not exactly parallel (most cases). Serious bias can occur in the plot estimates in such forests if an attempt is made to establish an exactly square or rectangular plot of a specified size. To avoid such bias, one should locate the plot corners mid-way between the rows. This usually results in both the plot size and shape only approximating that desired, but this is relatively unimportant.

Edge effects

Edge effects arise as a result of different micro-locality conditions and may extend from one metre to tens and even hundreds of metres from the forest boundary. If this "fringe area" is small relative to the area of the population, units which fall within it can be rejected without serious bias. On the other hand, if the "fringe area" is large (as one finds in comparatively small isolated stands), it cannot be ignored and must be taken into account in sampling. A special prescription detailing the use of partial plots may be needed as described by Grosenbaugh (1958). Other aspects of adjusting for edge effects at the forest boundary are addressed in some detail by Schreuder et al. (1993).

Errors in the quantitative estimate

Any errors in tree and stand measurement, species identification and in determining the area of sampling units will affect the estimate of the population mean and should be rigidly controlled.

One cannot over-emphasise the need for accurate determination of total forest area. Too often, area information is taken from erroneous land records, rough uncorrected maps, or aerial photographs. Errors arising from such sources may be such as to completely undermine the considerable care taken during the survey in establishing the sampling units, measuring the trees and stands, identifying the species, and so on, to derive accurate volume estimates.

Offsetting from the centre line can be a serious source of error in area determination with long narrow plots and striplines. An extension pole may be used in very narrow units (less than 5 m wide). In wider units, the offsets should be measured by tape. Errors arising from visual estimation can be large, and they are not necessarily compensating. All the sources of error mentioned above can be minimized by a special preliminary instruction (training) of operators and by frequent checking during measurement.

Recording Field Data

In the past, information gathered during forest assessment and inventory was recorded on specially prepared forms which were subsequently taken back to the office for processing. This often took weeks, months or years, and even then the data were not necessarily 'error free'. Some 20 years ago (early 1970s), a revolutionary development occurred in recording and processing field data which made it possible to record, edit and process field data almost instantaneously. The new technology involves portable electronic devices called portable data recorders (PDRs) or field computers.

These machines have drastically changed the habits of people collecting data, especially those involved in large organisations and large projects, and have minimised the errors arising from transcription and data entry. These and other benefits which accrue from using the machines are discussed by Nieman (1990). During the past decade, the ever increasing miniaturisation of electronic components and ever broadening availability and lower prices of PDRs have become extremely important to all sections of commerce and industry, not the least being forestry. However, assessing the array of machines now available to find the product that best suits the user's needs is a daunting task. Careful consideration must be given to a wide range of questions including:

These questions and many others were addressed at a special conference in Atlanta, Georgia, convened by the Society of American Foresters from March 6-7, 1990. The proceedings were summarised and discussed in detail by Wood (1990) who outlines the recommended procedure for selecting and purchasing a PDR. These developments cannot be ignored by anyone associated with large scale measurement projects. The PDR most widely used in Australian forestry is the 'Husky Hunter', but it is only one of a number of machines now available which are ideally suited for forestry applications.

References



Document URL: http://online.anu.edu.au/Forestry/mensuration/SAMPLE1.HTM
Editor: Cris Brack ©
Last Modified Date: Fri, 9 Feb 1996