Thursday, April 30, 2009

Probability Sampling - Multi Stage Random Sampling

The four methods we've covered so far -- simple, stratified, systematic and cluster -- are the simplest random sampling strategies. In most real applied social research, we would use sampling methods that are considerably more complex than these simple variations. The most important principle here is that we can combine the simple methods described earlier in a variety of useful ways that help us address our sampling needs in the most efficient and effective manner possible. When we combine sampling methods, we call this multi-stage sampling.

Multi-stage sampling is like cluster sampling, but involves selecting a sample within each chosen cluster, rather than including all units in the cluster. Thus, multi-stage sampling involves selecting a sample in at least two stages. In the first stage, large groups or clusters are selected. These clusters are designed to contain more population units than are required for the final sample.

In the second stage, population units are chosen from selected clusters to derive a final sample. If more than two stages are used, the process of choosing population units within clusters continues until the final sample is achieved.
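The two stages described above can be sketched in a few lines of Python. This is only a minimal illustration, not a survey-grade implementation: the population, the cluster names, and the function `two_stage_sample` are hypothetical, and every cluster is given the same chance of selection regardless of its size.

```python
import random


def two_stage_sample(clusters, n_clusters, n_units, seed=None):
    """Pick n_clusters clusters at random, then up to n_units units in each."""
    rng = random.Random(seed)
    # Stage 1: simple random sample of cluster names.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Stage 2: simple random sample of units inside each chosen cluster.
    return {
        name: rng.sample(clusters[name], min(n_units, len(clusters[name])))
        for name in chosen
    }


# Hypothetical population: 10 clusters of 50 units each.
population = {
    f"cluster_{c}": [f"unit_{c}_{u}" for u in range(50)] for c in range(10)
}
sample = two_stage_sample(population, n_clusters=3, n_units=5, seed=1)
```

Only the three chosen clusters need a list of their units; the other seven are never enumerated for sampling purposes.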

An example of multi-stage sampling is where, firstly, electoral sub-divisions (clusters) are sampled from a city or state. Secondly, blocks of houses are selected from within the electoral sub-divisions and, thirdly, individual houses are selected from within the selected blocks of houses.

The advantages of multi-stage sampling are convenience, economy and efficiency. Multi-stage sampling does not require a complete list of members in the target population, which greatly reduces sample preparation cost. The list of members is required only for those clusters used in the final stage. The main disadvantage of multi-stage sampling is the same as for cluster sampling: lower accuracy due to higher sampling error.
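One common way to put a number on this loss of accuracy is the design effect. The sketch below uses the standard approximation for clusters of equal size, deff = 1 + (m - 1) * rho, where m is the number of units sampled per cluster and rho is the intraclass correlation. This formula is not taken from the sources listed at the end of this post, and the input values are made up purely for illustration.

```python
def design_effect(units_per_cluster, intraclass_corr):
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1 + (units_per_cluster - 1) * intraclass_corr


def effective_sample_size(n_total, units_per_cluster, intraclass_corr):
    """Size of a simple random sample with roughly the same precision."""
    return n_total / design_effect(units_per_cluster, intraclass_corr)


# Hypothetical survey: 1000 respondents, 20 per cluster, rho = 0.05.
deff = design_effect(units_per_cluster=20, intraclass_corr=0.05)
n_eff = effective_sample_size(1000, 20, 0.05)
```

With these assumed values the design effect is about 1.95, so the 1000 clustered respondents carry roughly the information of about 513 respondents drawn by simple random sampling.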

Multi-stage random sampling is a sampling technique that requires at least two steps of sample selection. The multi-stage category includes stratified random sampling, cluster random sampling, and combinations of the two techniques.

With very large populations, it may be desirable to arrange the data into groups on one criterion (e.g. addresses by postcode area), select randomly from within these groups, and then select again from within that sample to obtain a random, representative number of specimens, such as dogs of each age group.

A multistage random sample is constructed by taking a series of simple random samples in stages. This type of sampling is often more practical than simple random sampling for studies requiring "on location" analysis, such as door-to-door surveys. In a multistage random sample, a large area, such as a country, is first divided into smaller regions (such as states), and a random sample of these regions is collected. In the second stage, a random sample of smaller areas (such as counties) is taken from within each of the regions chosen in the first stage. Then, in the third stage, a random sample of even smaller areas (such as neighborhoods) is taken from within each of the areas chosen in the second stage. If these areas are sufficiently small for the purposes of the study, then the researcher might stop at the third stage. If not, he or she may continue to sample from the areas chosen in the third stage, etc., until appropriately small areas have been chosen.
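The nested selection described above (regions, then smaller areas within them, and so on) lends itself to a short recursive sketch. The area hierarchy, the helper names `build_area` and `multistage_sample`, and the number of areas taken at each stage are all hypothetical choices made for illustration.

```python
import random


def build_area(name, branching):
    """Build a hypothetical nested area; branching[i] = children at level i."""
    if not branching:
        return {"name": name, "children": []}
    children = [
        build_area(f"{name}.{i}", branching[1:]) for i in range(branching[0])
    ]
    return {"name": name, "children": children}


def multistage_sample(area, sizes, rng):
    """Take a simple random sample of sub-areas at each stage in `sizes`."""
    if not sizes:
        return [area["name"]]
    k = min(sizes[0], len(area["children"]))
    selected = []
    for child in rng.sample(area["children"], k):
        selected.extend(multistage_sample(child, sizes[1:], rng))
    return selected


# Hypothetical country: 4 states, 5 counties each, 6 neighborhoods each.
country = build_area("country", [4, 5, 6])
rng = random.Random(42)
# Stage 1: 2 states; stage 2: 2 counties per state; stage 3: 3 neighborhoods.
neighborhoods = multistage_sample(country, [2, 2, 3], rng)
```

The result contains 2 x 2 x 3 = 12 neighborhoods. Adding a fourth entry to `sizes` (and a fourth level to the hierarchy) continues the process for one more stage.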

In social research we often face complex sampling problems, and multi-stage random sampling is one way to solve them. It is a complex technique that combines several sampling techniques, such as stratified random sampling, cluster random sampling and simple random sampling.

For example, suppose the sample location is Jakarta and we wish to draw the sample using multi-stage random sampling.
First, we divide Jakarta into clusters: South, West, North, East, and Central.
Second, we select some units from each cluster (South, West, North, East, and Central) using simple random sampling; these units are called “kelurahan” (urban villages). In other words, because every area is sampled, we are using stratified random sampling across the Jakarta area.
Third, we select some units from each “kelurahan” using simple random sampling; these are called “RW” (community units).
Fourth, as in the previous step, we select some units from each “RW” using simple random sampling; these are called “RT” (neighborhood units).
Fifth, we select units from each “RT” using simple random sampling; each one serves as a “starting point”.
Sixth, we select the sample using systematic random sampling from the “starting point”: we move from house to house to find the chosen respondents, based on a predetermined interval.

Scheme of Multi Stage Random Sampling
city -> kelurahan -> RW -> RT -> starting point -> unit residence -> respondent
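The last step of the Jakarta example, walking from a starting point and visiting every k-th house, can be sketched as follows. The house list, the interval of 5, and the function name `systematic_sample` are assumptions made for illustration; in the field, the interval would be fixed by the survey design.

```python
import random


def systematic_sample(houses, interval, start=None, rng=None):
    """Visit every `interval`-th house, beginning at index `start`.

    If no start is given, one is drawn at random from the first interval.
    """
    rng = rng or random.Random()
    if start is None:
        start = rng.randrange(interval)
    return houses[start::interval]


# Hypothetical RT with 40 houses, listed in walking order.
houses = [f"house_{i:03d}" for i in range(1, 41)]
# Starting point is the third house (index 2); every fifth house after it.
visited = systematic_sample(houses, interval=5, start=2)
```

Here `visited` holds 8 houses, beginning with `house_003` and `house_008`. Selecting the respondent within each visited house is a further step that this sketch does not cover.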

Such an approach is called multistage sampling. It can be much more economical than trying to sample directly from the population. For a national survey of persons in the United States, for example, the multistage approach needs a list of persons in a few dwellings, a list of dwellings in a few blocks, a list of blocks in a few counties, a list of counties in a few states, and a list of states. Constructing or obtaining these lists is much easier and much less error-prone than trying to construct a list of all persons in the United States, which we would need to sample persons directly in a single stage.


Source:
-. Australian Bureau of Statistics
-. http://statistics.berkeley.edu/~stark/
-. Wikipedia.com
-. www.socialresearchmethods.net/
-. www.fao.org/

2 comments:

nmontalva said...

This was very clarifying. I'm trying to figure out how to do a survey of a population which is dispersed in small rural villages clearly separated by geographical features. There are 173 villages, settled in well-established areas far apart from each other, and they are very heterogeneous in their number of members (a few with a thousand families, others with no more than 10, and every number in between).

It seems like multi-stage sampling is the most suitable, since it is impossible to get a list of members or households, but the list of villages is widely available and accurate.

But I still have lots of doubts about how to apply it in real life.

Firstly, once you have established what will be selected in each stage (in my case: 1st counties, 2nd villages in each county, 3rd households, 4th persons), how can you know how many should be selected each time? I mean, you cannot apply the standard simple random sampling considerations for calculating "sample size" to know how many counties (out of 15), villages (out of 173, but NOT homogeneously distributed across counties), households (unknown number, but certainly very heterogeneous), and persons (unknown number, but it should not be very heterogeneous, I presume) to select.

Secondly, it is very likely that each first-stage cluster will have a different number of elements, so elements in clusters with fewer elements will have a proportionally higher chance of being selected than elements in large clusters. I could solve this by weighting, but I do not know the exact number of persons in each cluster. (I have an approximate value, from a source which seems a bit out of date.)

Thank you!

Unknown said...

I think a systematic sampling method might work here.