What Is a Sample?
A sample refers to a smaller, manageable version of a larger group. It is a subset containing the characteristics of a larger population. Samples are used in statistical testing when population sizes are too large for the test to include all possible members or observations. A sample should represent the population as a whole and not reflect any bias toward a specific attribute.
There are several sampling techniques used by researchers and statisticians, each with its own benefits and drawbacks.
- In statistics, a sample is an analytic subset of a larger population.
- The use of samples allows researchers to conduct their studies with more manageable data and in a timely manner.
- Randomly drawn samples do not have much bias if they are large enough, but achieving such a sample may be expensive and time-consuming.
- In simple random sampling, every entity in the population has an equal chance of being selected, while stratified random sampling divides the overall population into smaller groups before sampling.
A sample is a set of observations taken from a population, ideally without bias. In simple terms, a population is the total number of observations (e.g., individuals, animals, items, data, etc.) contained in a given group or context. A sample, in other words, is a portion, part, or fraction of the whole group, and acts as a subset of the population. Samples are used in a variety of settings where research is conducted. Scientists, marketers, government agencies, economists, and research groups are among those who use samples for their studies and measurements.
Using whole populations for research comes with challenges. Researchers may have problems gaining ready access to entire populations. And, because of the nature of some studies, researchers may have difficulties getting the results they need in a timely fashion. This is why samples are used. Studying a smaller number of people who represent the entire population can still produce valid results while reducing time and resources.
Samples used by researchers must resemble the broader population in order to make accurate inferences or predictions. All the participants in the sample should share the characteristics that define the population under study. So, if the study is about male college freshmen, the sample should be a small percentage of males who fit this description. Similarly, if a research group conducts a study on the sleep patterns of single women over 50, the sample should only include women within this demographic.
Consider a team of academic researchers who want to know how many students studied for less than 40 hours for the CFA exam and still passed. Since more than 200,000 people take the exam globally each year, reaching out to each and every exam participant would burn time and resources.
In fact, by the time the data from the population has been collected and analyzed, a couple of years would have passed, making the analysis worthless since a new population would have emerged. What the researchers can do instead is take a sample of the population and get data from this sample.
To achieve an unbiased sample, the selection has to be random so that everyone in the population has an equal chance of being added to the sample group. This is similar to a lottery draw and is the basis for simple random sampling.
Types of Sampling
Simple Random Sampling
Simple random sampling works well when the population is relatively homogeneous with respect to the characteristics under study. If the researchers don’t care whether their sample subjects are all male, all female, or some combination of both sexes, simple random sampling may be a good selection technique.
Let's say there were 200,000 test-takers who sat for the CFA exam in 2021, of whom 40% were women and 60% were men. A random sample of 1,000 test-takers drawn from this population would therefore be expected to contain roughly 400 women and 600 men, although the exact split varies from draw to draw.
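The draw described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the population list is synthetic, and the function name `draw_simple_random_sample` and the seed value are assumptions made for the example.

```python
import random


def draw_simple_random_sample(population, sample_size, seed=None):
    """Draw a simple random sample without replacement.

    Every member of the population has the same chance of being picked.
    """
    rng = random.Random(seed)
    return rng.sample(population, sample_size)


# Synthetic population mirroring the example: 200,000 test-takers,
# 40% women ("F") and 60% men ("M"). These are not real CFA records.
population = ["F"] * 80_000 + ["M"] * 120_000

sample = draw_simple_random_sample(population, 1_000, seed=42)
women = sample.count("F")
men = sample.count("M")

# The split is close to 400/600 but is not guaranteed to match it exactly.
print(f"women: {women}, men: {men}")
```

Running the sketch with different seeds shows the counts fluctuating around 400 and 600, which is why a simple random sample only approximates the population's composition.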
But what about cases where knowing the ratio of men to women that passed a test after studying for less than 40 hours is important? Here, a stratified random sample would be preferable to a simple random sample.
Stratified Random Sampling
This type of sampling, also referred to as proportional random sampling or quota random sampling, divides the overall population into smaller groups. These are known as strata. People within the strata share similar characteristics.
What if age were an important factor that researchers would like to include in their data? Using the stratified random sampling technique, they could create layers or strata for each age group. The selection from each stratum would have to be random so that everyone in the bracket has an equal chance of being included in the sample. For example, two participants, Alex and David, are 22 and 24 years old, respectively. The sample selection cannot pick one over the other based on some preferential mechanism. They both should have an equal chance of being selected from their age group. The strata could look something like this:
|Strata (Age)|Number of People in Population|Number to Be Included in Sample|
From the table, the population has been divided into age groups. For example, 30,000 people within the age range of 20 to 24 years old took the CFA exam in 2021. Using this same proportion, the sample group will have (30,000 ÷ 200,000) × 1,000 = 150 test-takers that fall within this group. Alex or David—or both or neither—may be included among the 150 random exam participants of the sample.
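The proportional calculation above can be sketched in Python as follows. Only the 20 to 24 stratum (30,000 people) comes from the text; the other age brackets and their sizes are hypothetical placeholders chosen so the total is 200,000, and the function names are assumptions for illustration.

```python
import random


def proportional_allocation(strata_sizes, sample_size):
    """Return how many units to draw from each stratum, proportional to its size."""
    total = sum(strata_sizes.values())
    # Rounding can make the counts sum to slightly more or less than
    # sample_size when the proportions do not divide evenly.
    return {
        name: round(size * sample_size / total)
        for name, size in strata_sizes.items()
    }


def stratified_sample(strata_members, allocation, seed=None):
    """Draw a simple random sample inside each stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, allocation[name])
        for name, members in strata_members.items()
    }


# "20-24" is taken from the text; all other strata are hypothetical.
strata_sizes = {
    "20-24": 30_000,
    "25-29": 70_000,
    "30-34": 40_000,
    "35-39": 30_000,
    "40+": 30_000,
}

allocation = proportional_allocation(strata_sizes, 1_000)
print(allocation["20-24"])  # 150, i.e. (30,000 / 200,000) * 1,000

# Stand-in member IDs for each stratum (0 .. size - 1).
strata_members = {name: range(size) for name, size in strata_sizes.items()}
sample = stratified_sample(strata_members, allocation, seed=7)
```

Because the draw inside each stratum is itself a simple random sample, two people in the same bracket, such as Alex and David, have the same chance of being picked.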
There are many more strata that could be compiled when designing a sample. Some researchers might stratify by the job functions, countries, marital status, etc., of the test-takers when deciding how to create the sample.
Examples of Samples
In 2021, the population of the world was nearly 7.9 billion, out of which 49.6% were female and 50% were male. The total number of people in any given country can also be a population. The total number of students in a city can be taken as a population, and the total number of dogs in a city is also a population. Samples can be taken from these populations for research purposes.
Following our CFA exam example, the researchers could take a sample of 1,000 CFA participants from the total 200,000 test-takers (the population) and run the required analysis on this group. The mean of this sample would be used to estimate the corresponding average for all CFA exam takers, such as the share who passed even though they studied for less than 40 hours.
The sample group taken should not be biased. This means that if the sample mean of the 1,000 CFA exam participants is 50, the population mean of the 200,000 test-takers should also be approximately 50.
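A small simulation illustrates this. The sketch below assumes a synthetic population of 200,000 values centered near 50; the distribution, its spread, and the seed are all made up for the example and do not represent real exam data.

```python
import random
import statistics

rng = random.Random(0)

# Synthetic population of 200,000 measurements with a mean near 50.
# The normal shape and the spread of 15 are assumptions for illustration.
population = [rng.gauss(50, 15) for _ in range(200_000)]

# Unbiased selection: a simple random sample of 1,000 members.
sample = rng.sample(population, 1_000)

population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

With random selection, the two means typically differ by well under one unit here. A biased selection rule, such as keeping only the largest values, would break that agreement.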
Why Do Analysts Use Samples Instead of Measuring the Population?
Often, a population is too large or extensive to measure every member, and doing so would be expensive and time-consuming. A sample allows for inferences to be made about the population using statistical methods.
What Is a Simple Random Sample?
This sampling method uses respondents or data points that are randomly selected from the larger population, with each member having an equal chance of selection. Random selection guards against selection bias, and a larger sample size reduces random sampling error.
Why Do Random Samples Allow for Inference?
The laws of statistics imply that accurate measurements and assessments can be made about a population by using a sample. Analysis of variance (ANOVA), linear regression, and more advanced modeling techniques are valid because of the law of large numbers and the central limit theorem.
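The effect of these two results can be seen in a short simulation. This is a sketch under stated assumptions: the population is synthetic and deliberately skewed (exponential), and the sample sizes, repetition count, and helper name `sample_means` are arbitrary choices for illustration.

```python
import random
import statistics


def sample_means(population, sample_size, num_samples, rng):
    """Means of repeated simple random samples of a fixed size."""
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]


rng = random.Random(1)

# Synthetic, strongly skewed population (exponential with mean near 1).
population = [rng.expovariate(1.0) for _ in range(20_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

centers = {}
spreads = {}
for n in (5, 50, 500):
    means = sample_means(population, n, 500, rng)
    centers[n] = statistics.fmean(means)
    spreads[n] = statistics.stdev(means)
    # The spread of the sample means shrinks roughly like pop_sd / sqrt(n).
    print(n, round(centers[n], 3), round(spreads[n], 3), round(pop_sd / n**0.5, 3))
```

As the sample size grows, the sample means cluster more tightly around the population mean (law of large numbers), and their distribution looks increasingly bell-shaped even though the population itself is skewed (central limit theorem).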
How Large of a Sample Do You Need?
This will depend on the size of the population and the type of analysis you'd like to do (e.g., what confidence intervals you are using). Power analysis is a technique for mathematically evaluating the smallest sample size needed based on your needs. Another rule of thumb is that your sample should be large enough, but no more than 10% as large as the population.
A sample statistic (or just statistic) is any number computed from sample data. Examples include the sample mean, median, sample standard deviation, and percentiles. A statistic is a random variable because it is based on data obtained by random sampling, which is a random experiment.
What Is a Sample and What Are Its Types in Statistics?
Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students. In statistics, sampling allows you to test a hypothesis about the characteristics of a population.
What Is an Example of Sample Data?
A sample data set contains a part, or a subset, of a population. The size of a sample is always less than the size of the population from which it is taken. Formulas for sample statistics, such as the sample variance, use n - 1 in the denominator. Example: The sample may be "SOME people living in the US."
What Are the Types of Samples?
There are two main types of sampling: probability sampling and non-probability sampling. The main difference between the two types of sampling is how the sample is selected from the population.
What Are the 5 Types of Samples?
There are five types of sampling: random, systematic, convenience, cluster, and stratified.
What Are 3 Examples of Sample vs. Population?
|Population|Sample|
|Songs from the Eurovision Song Contest|Winning songs from the Eurovision Song Contest that were performed in English|
There are four primary random (probability) sampling methods: simple random sampling, systematic sampling, stratified sampling, and cluster sampling.
How Many Types of Sample Sizes Are There?
There are two types of sample size determination: one finds the number of participants needed to be representative of a population, and the other finds the number needed to achieve statistical power.
What Is a Sample in Statistics?
A sample is a group of elements chosen from the population. The features that describe the population are called parameters, and the properties of the sample data are known as statistics. Both population and sample are important concepts in statistics.
What Is the Best Example of a Sample in Statistics?
A sample is just a part of a population. For example, let's say your population was every American, and you wanted to find out how much the average person earns. Time and finances stop you from knocking on every door in America, so you choose to ask 1,000 random people. These 1,000 people are your sample.
The word "example" is used for an illustration in support of a claim, while the word "sample" denotes a specimen or model.
What Type of Data Is Sample Data?
Data sampling is a statistical analysis technique used to select, manipulate, and analyze a representative subset of data points to identify patterns and trends in the larger data set being examined.
How Do I Choose a Sample Type?
A sampling method can be chosen based on whether sampling bias needs to be controlled; a random sampling method is often preferred over a non-random method for this reason. Random sampling methods include simple, systematic, stratified, and cluster sampling.
What Is a Sample Mean?
A sample mean is the average of a set of sample data. It is a measure of central tendency and is also used in calculating the standard deviation and variance of a data set. The sample mean has a variety of uses, including estimating population averages.
What Are the 4 Sampling Strategies?
The four main probability sampling methods are 1) simple random, 2) stratified random, 3) cluster, and 4) systematic. In non-probability sampling, the elements that make up the sample are selected by nonrandom methods. This type of sampling is less likely than probability sampling to produce representative samples.
What Are the 4 Types of Population?
There are four main characteristics of population structure: age, gender, ethnicity, and density.
What Is the Population Mean vs. the Sample Mean?
In statistics, there are two different averages: the sample mean and the population mean. The sample mean considers only a selected number of observations drawn from the population data. The population mean, on the other hand, considers all the observations in the population to compute the average value.
Why Do We Use Samples?
Samples are used to make inferences about populations. Collecting data from a sample is more practical, cost-effective, convenient, and manageable than collecting it from an entire population.
What Are the Two Types of Sampling?
Probability sampling is a sampling technique in which samples from a larger population are chosen using a method based on the theory of probability. Non-probability sampling is a sampling technique in which the researcher selects samples based on subjective judgment rather than random selection.
What Are the 5 Types of Non-Probability Sampling?
- Convenience or haphazard sampling
- Volunteer sampling
- Judgement sampling
- Quota sampling
- Snowball or network sampling
- Crowdsourcing
- Web panels
An example of a simple random sample would be the names of 25 employees being chosen out of a hat from a company of 250 employees. In this case, the population is all 250 employees, and the sample is random because each employee has an equal chance of being chosen.
What Is an Example of Sample Size?
In statistics, the sample size is the number of individual observations included in a study or experiment. For example, if we survey 50 people who watch TV in a city, the sample size is 50.
What Is the Most Common Sample Size?
A good maximum sample size is usually around 10% of the population, as long as this does not exceed 1,000. For example, in a population of 5,000, 10% would be 500. In a population of 200,000, 10% would be 20,000. This exceeds 1,000, so in this case the maximum would be 1,000.
What Are Large and Small Samples?
Large sample theory is the study of the sampling distribution of a statistic for large samples. By a common convention, a sample of size 30 or more (n ≥ 30) is known as a large sample. For large samples, the sampling distributions of many statistics are approximately normal, so a Z-test can be used.
What Are the 4 Sampling Techniques in Statistics?
Collect unbiased data using these four types of random sampling techniques: systematic, stratified, cluster, and simple random sampling.
What Is the Sample Mean?
The sample mean is a statistic obtained by calculating the arithmetic average of the values of a variable in a sample. If the sample is drawn from probability distributions having a common expected value, then the sample mean is an estimator of that expected value.
What Is the Best Sample Type?
Simple random sampling is one of the best probability sampling techniques for saving time and resources. It is a reliable method of obtaining information in which every member of the population has the same chance of being chosen, purely by chance.
What Are the Types of Sampling?
There are four main types of probability sampling: simple random, cluster, systematic, and stratified.
Is the Sample Mean the Same as the Population Mean?
The sample mean is the average of sample values picked from the population. The result resembles the population mean to a certain extent. The population mean is the central tendency for the entire group. Compared to the population, the sample size is small.
What Is an Example of a Sample in a Study?
In research terms, a sample is a group of people, objects, or items that are taken from a larger population for measurement. The sample should be representative of the population to ensure that we can generalise the findings from the research sample to the population as a whole.
The most straightforward way to sample data is with simple random sampling. Essentially, the subset is built of observations that were chosen from a larger set purely by chance; each observation has the same chance of being selected from the larger set. Simple random sampling is easy to implement.
How Does Sampling Work?
The theory behind sampling is based on the concept of the simple random sample. In a simple random sample, individuals are selected from the population in a completely random fashion. This implies that all individuals have an identical (nonzero) probability of being selected for the sample.
How Do You Measure a Sample?
- Determine the population size (if known).
- Determine the confidence interval.
- Determine the confidence level.
- Determine the standard deviation (a standard deviation of 0.5 is a safe choice where the figure is unknown)
- Convert the confidence level into a Z-Score.
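The list above stops before the final calculation. One commonly used way to combine these inputs is Cochran's formula for a proportion, optionally adjusted with a finite population correction; the sketch below assumes that formula, which the text itself does not name. The function name `required_sample_size` and its parameter names are hypothetical, and "confidence interval" in the list is interpreted here as the margin of error.

```python
import math
from statistics import NormalDist


def required_sample_size(confidence_level, margin_of_error,
                         std_dev=0.5, population_size=None):
    """Estimate a sample size with Cochran's formula (an assumed choice).

    std_dev=0.5 is the conservative default suggested in the list above.
    """
    # Convert the confidence level into a two-sided Z-score.
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)

    # Cochran's formula for an effectively infinite population.
    n0 = (z ** 2) * std_dev * (1 - std_dev) / margin_of_error ** 2

    if population_size is None:
        return math.ceil(n0)

    # Finite population correction when the population size is known.
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)


# 95% confidence, margin of error of 5 percentage points.
unbounded = required_sample_size(0.95, 0.05)
bounded = required_sample_size(0.95, 0.05, population_size=200_000)
print(unbounded, bounded)
```

For a population as large as 200,000, the correction changes the result very little, which is why the required sample size depends far more on the margin of error and confidence level than on the population size.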