Sampling,JICABadr


Clinical Epidemiology Unit Faculty of Medicine , Suez Canal University

Sample Selection & Sample Size
Badr Mesbah, Prof. of Pediatrics


Session objectives
• To know how to define the study target population and sampling frame
• To know the terminology used in sampling procedures
• To understand the importance of proper sampling procedure
• To know probability and non-probability sampling techniques
• To understand the principles of sample size calculations


Important statistical terms
Population: The entire group under study as defined by research objectives. Sometimes called the “universe” (the collection of all responses, measurements, and counts that are of interest).
Sample: A subset of the population.


Important statistical terms
Reference population (Target population): Aggregate or collection of units to which the study results apply.

Source population: The broad group of units from which the researcher will obtain the study units.


Important statistical terms
Sampling frame: List of potential units from which a sample will be drawn.

Study sample: The units (“subjects”) that are selected to take part in the study.

Study subjects: The units (“subjects”) that actually provided the data.


Sampling plan
Reference Population → Source Population → Sampling Frame → Selected Sample → Study subjects


Why sampling?
• To get information about large populations
• Lower cost
• Less field time
• More accuracy, i.e. a smaller study can do a better job of data collection
• Needed when it is impossible to study the whole population


Issues in sampling
• Representativeness
• Generalizability
• Feasibility
• Retention (minimum loss to follow-up)
• Ethical considerations


Basic Sampling classification Probability samples: Samples in which members of the population have a known chance (probability) of being selected Non-probability samples: Samples in which the chances (probability) of selecting members from the population are unknown


Basic Sampling classification
Sampling techniques:
• Non-probability: Convenience, Judgmental, Quota, Snowball
• Probability: Simple Random, Systematic, Stratified, Cluster


Non-probability sampling • Convenience Sampling It attempts to obtain a sample of convenient elements. Often, respondents are selected because they happen to be in the right place at the right time.

• Judgmental Sampling It is a form of convenience sampling in which the population elements are selected based on the judgment of the researcher.


Non-probability sampling
• Quota Sampling It may be viewed as two-stage restricted judgmental sampling. The first stage consists of developing control categories of population elements. In the second stage, sample elements are selected based on convenience or judgment.

• Snowball Sampling In this sample, an initial group of respondents is selected, usually at random. After being interviewed, these respondents are asked to identify others who belong to the target population of interest.


Probability sampling • Random sampling – Each subject has a known probability of being selected

• Allows application of statistical sampling theory to results, so we can: – Generalise – Test hypotheses


Probability sampling • Probability samples are generally the preferred choice • They help ensure – Representativeness – Precision


Simple Random Sampling
• Each element in the population has a known and equal probability of selection.
• Each possible sample of a given size (n) has a known and equal probability of being the sample actually selected.
• This implies that every element is selected independently of every other element.


Simple random sampling


Simple Random Sampling
• Advantages:
– Known and equal chance of selection
– Easy to apply when there is an electronic database

• Disadvantages:
– A complete accounting of the population is needed, and every population member must be given a unique designation
– Very inefficient when applied to a skewed population distribution (over- and under-sampling problems)
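As an illustration of the definition above, here is a minimal Python sketch of simple random sampling. The sampling frame of 500 numbered records and the function name `simple_random_sample` are hypothetical, chosen only for this example; the draw itself uses the standard library's `random.Random.sample`, which selects without replacement.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; each unit has the same chance n/N."""
    if n > len(frame):
        raise ValueError("sample size cannot exceed the sampling frame")
    rng = random.Random(seed)  # a fixed seed only makes the example repeatable
    return rng.sample(frame, n)


# Hypothetical sampling frame: every member needs a unique designation.
frame = [f"record-{i:03d}" for i in range(1, 501)]
sample = simple_random_sample(frame, n=20, seed=42)
print(sample[:5])
```

Note that the whole frame has to exist as a list before any unit can be drawn, which is the "complete accounting" requirement listed among the disadvantages.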


Systematic sampling The sample is chosen by selecting a random starting point and then picking every ith element in succession from the sampling frame. The sampling interval, i, is determined by dividing the population size N by the sample size n and rounding to the nearest integer.


Systematic sampling (procedure)
1. Select a suitable sampling frame
2. Assign each element a number from 1 to N (population size)
3. Determine the sampling interval i: i = N/n. If i is a fraction, round to the nearest integer
4. Select a random number, r, between 1 and i, as explained in simple random sampling
5. The elements with the following numbers will comprise the systematic random sample: r, r+i, r+2i, r+3i, r+4i, ..., r+(n-1)i
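The procedure above can be sketched in a few lines of Python. The frame of 1,000 numbered elements and the function name `systematic_sample` are assumptions made for the example, not part of the lecture.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Pick a random start r in 1..i, then every i-th element of the frame."""
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and the frame size")
    # Sampling interval i = N/n, rounded to an integer (at least 1).
    # Python's round() sends exact .5 cases to the nearest even integer.
    i = max(1, round(N / n))
    r = random.Random(seed).randint(1, i)  # random start, inclusive of 1 and i
    positions = [r + k * i for k in range(n)]  # r, r+i, ..., r+(n-1)i
    # If rounding made i slightly too large, late positions can exceed N;
    # those are dropped here, so the sample may be a little smaller than n.
    return [frame[p - 1] for p in positions if p <= N]


# Hypothetical frame: elements numbered 1..1000, target sample of 50.
frame = list(range(1, 1001))
sample = systematic_sample(frame, n=50, seed=7)
print(sample[:5])
```

With N = 1000 and n = 50 the interval is exactly 20, so every selected element is 20 positions after the previous one.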


Systematic sampling


Systematic sampling
Advantages:
• Known and equal chance of any element being selected
• Efficiency: there is no need to designate (assign a number to) every population member, just those early on the list (unless there is a very large sampling frame)
• Less expensive and faster than SRS

Disadvantages:
• Small loss in sampling precision
• Potential “periodicity” problems


Stratified Sampling A two-step process in which the population is partitioned into subpopulations, or strata. The strata should be mutually exclusive and collectively exhaustive in that every population element should be assigned to one and only one stratum and no population elements should be omitted.


Stratified Sampling Elements are selected from each stratum by a random procedure, usually SRS. A major objective of stratified sampling is to increase precision without increasing cost.

• Types of Strata variables: – Geographic (region, province, rural/urban, etc…) – Non-geographic (age, sex, income, etc…)


Stratified Sampling
Advantage:
• More accurate overall sample of a skewed population

Disadvantage:
• More complex sampling plan requiring different sample sizes for each stratum
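A minimal sketch of stratified sampling with SRS inside each stratum is shown below. It assumes proportional allocation (each stratum contributes in proportion to its share of the population), which is only one of several possible allocation rules and is not prescribed by the slides. The urban/rural population and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n, seed=None):
    """Partition units into strata, then draw an SRS from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        # Each unit lands in exactly one stratum (mutually exclusive, exhaustive).
        strata[stratum_of(unit)].append(unit)
    N = len(units)
    sample = {}
    for name, members in strata.items():
        # Proportional allocation (an assumption of this sketch); rounding can
        # make the stratum sizes add up to slightly more or less than n.
        n_h = min(len(members), max(1, round(n * len(members) / N)))
        sample[name] = rng.sample(members, n_h)
    return sample


# Hypothetical population: 700 urban and 300 rural households.
units = [("urban", i) for i in range(700)] + [("rural", i) for i in range(300)]
sample = stratified_sample(units, stratum_of=lambda u: u[0], n=100, seed=3)
print({name: len(chosen) for name, chosen in sample.items()})
```

Each stratum needs its own sample size, which is the added complexity noted under the disadvantage above.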


Cluster Sampling Involves selecting the sample units in groups.

The target population is first divided into mutually exclusive and collectively exhaustive subpopulations, or clusters. Then a random sample of clusters is selected, based on a probability sampling technique such as SRS.


Cluster Sampling For each selected cluster, either all the elements are included in the sample (one-stage) or a sample of elements is drawn probabilistically (two-stage). Elements within a cluster should be as heterogeneous as possible, but clusters themselves should be as homogeneous as possible. Ideally, each cluster should be a small-scale representation of the population.
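The one-stage and two-stage variants can be sketched as follows. The villages, household labels, and the function name `cluster_sample` are hypothetical, and SRS is assumed at both stages.

```python
import random


def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Select clusters by SRS, then take all or some elements from each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: pick clusters
    sample = {}
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            sample[name] = list(members)  # one-stage: keep every element
        else:
            k = min(per_cluster, len(members))
            sample[name] = rng.sample(members, k)  # two-stage: SRS within cluster
    return sample


# Hypothetical population: 10 villages (clusters) of 40 households each.
clusters = {
    f"village-{v}": [f"v{v}-household-{h}" for h in range(1, 41)]
    for v in range(1, 11)
}
one_stage = cluster_sample(clusters, n_clusters=3, seed=1)
two_stage = cluster_sample(clusters, n_clusters=3, per_cluster=10, seed=1)
print(sum(len(v) for v in one_stage.values()),
      sum(len(v) for v in two_stage.values()))
```

Only the list of clusters is needed up front; household lists are required only for the clusters that are actually selected.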


Cluster Sampling
Advantages:
• Economic efficiency: faster and less expensive than SRS
• Does not require a list of all members of the universe

Disadvantage:
• Cluster specification error: the more homogeneous the cluster chosen, the more imprecise the sample results


Developing a Sampling Plan Sample plan: Definite sequence of steps that the researcher goes through in order to draw and ultimately arrive at the final sample


Developing a Sampling Plan Step 1: Define the relevant population. Specify the descriptors, geographic locations, and time for the sampling units.

Step 2: Obtain a population list to generate the sample frame


Developing a Sampling Plan Step 3: Design the sample plan (size and method). Determine the specific sampling method to be used. All necessary steps must be specified (sample frame, n, … recontacts, and replacements).

Step 4: Draw the sample. Select the sample units and gather the information.


Developing a Sampling Plan Step 5: Assess the sample.

Sample validation – compare sample profile with population profile; check nonresponders


Errors in Sample

• Systematic error (or bias)
• Sampling error (random error)


Type 1 error • The probability of finding a difference between our sample and the population when there really isn’t one • Known as α (or “type 1 error”) • Usually set at 5% (or 0.05)


Type 2 error • The probability of not finding a difference that actually exists between our sample and the population • Known as β (or “type 2 error”)

• Power is (1 – β) and is usually set at 80%


Sample Size

How many subjects should be studied? - Empiric choice - Feasibility (time, resources, frequency of cases)


Sample Size

• Large or small sample size - ambiguous - unmeasurable - relative


Sample Size • Depends on: 1 - Difference to be detected (effect size): A large sample size is needed to detect a minute difference Sample size is inversely related to the effect size


Sample Size • Depends on: 2 – variability of measurement: Reflected by the standard deviation or the variance Sample size is directly related to the standard deviation


Sample Size • Depends on: 3 – Level of significance: Type 1 error, which has been arbitrarily set to 5% or 0.05 Sample size is inversely related to α error


Sample Size • Depends on: 4 – Power of the study The power of a study is the probability that it will yield a statistically significant result when a true difference exists Related to type 2 error (power = 1 – β) Sample size is directly related to the power
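To see how the four factors interact, the sketch below uses a common normal-approximation formula for comparing the means of two equal-sized groups with a two-sided test: n per group = 2 × ((z for α/2 + z for power) × SD / difference)². This formula is not given in the slides and is an assumption of the example, as are the function name `n_per_group` and the numbers used; other study designs need other formulas.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate subjects per group to detect a mean difference `delta`."""
    if delta == 0:
        raise ValueError("the difference to detect must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Hypothetical inputs: standard deviation of 10, difference to detect of 5.
base = n_per_group(delta=5, sd=10)
smaller_effect = n_per_group(delta=2.5, sd=10)        # smaller difference
more_variable = n_per_group(delta=5, sd=20)           # larger standard deviation
stricter_alpha = n_per_group(delta=5, sd=10, alpha=0.01)
higher_power = n_per_group(delta=5, sd=10, power=0.90)
print(base, smaller_effect, more_variable, stricter_alpha, higher_power)
```

Every variation requires more subjects than the base case, matching the four relationships on the slides: a smaller effect size, a larger standard deviation, a smaller α, and a higher power each increase the sample size.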


Thank You

