2. Describing Data with Tables and Graphs

Frequency Distributions

Topic summary

Frequency distributions organize data into classes, showing the frequency of measurements within each class. To create a frequency distribution, calculate the class width using the formula $\frac{\max - \min}{\text{number of classes}}$, rounding the result up to a convenient value. Determine lower and upper class limits, ensuring that the classes do not overlap. Finally, calculate relative frequencies by dividing the frequency of each class by the total number of measurements, then multiplying by 100 to express the result as a percentage.

Intro to Frequency Distributions

Intro to Frequency Distributions Video Summary

Understanding how to visualize data effectively is crucial in data analysis, and one of the foundational tools for this is the frequency distribution. A frequency distribution is essentially a table that organizes data into classes, showing how many measurements fall into each category. This organization helps in analyzing both qualitative and quantitative data.

To create a frequency distribution, you first need to define your classes. For example, if you have a dataset representing the time students spend studying, you might create classes such as 20-29 minutes, 30-39 minutes, and so on. In this case, you would identify the frequency of data points that fall within each class. For instance, if you have a range of study times from 20 to 75 minutes, you would count how many students fall into each class, marking tallies or counts for clarity.
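The tallying step can be sketched in Python. This is a minimal illustration, not the lesson's own method: the study times below are hypothetical values chosen to fit the 20 to 75 minute range, and the helper name `class_label` is an assumption introduced only for this example. It also assumes whole-number data and a class width of 10.

```python
from collections import Counter

# Hypothetical study times in minutes (not the lesson's actual dataset).
study_times = [20, 25, 31, 38, 42, 47, 47, 55, 63, 75]

def class_label(value, first_lower=20, width=10):
    """Return the 'lower-upper' label of the class that contains value."""
    # Integer division finds how many whole class widths lie below the value.
    lower = first_lower + ((value - first_lower) // width) * width
    return f"{lower}-{lower + width - 1}"

# Count how many data points fall into each class.
frequencies = Counter(class_label(t) for t in study_times)

for label, count in frequencies.items():
    print(label, count)
```

Because each value maps to exactly one label, no data point can be counted in two classes.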

Each class in a frequency distribution has specific characteristics. The lower class limit is the smallest value in the class, while the upper class limit is the largest. For the class 20-29, the lower limit is 20 and the upper limit is 29. The class midpoint is calculated by averaging the lower and upper limits, which for the 20-29 class would be calculated as:

Midpoint = $\frac{20 + 29}{2} = 24.5$

The class width is the difference between the lower limits of consecutive classes. For example, the class width between 20-29 and 30-39 is:

Class Width = $30 - 20 = 10$

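Note that the class width is not the difference between the upper and lower limits of the same class. For the class 20-29 that difference is $29 - 20 = 9$, yet the class contains ten whole-number values (20 through 29), so subtracting within a single class understates the width by one.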
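As a quick check of these definitions, the following Python sketch computes midpoints and the class width for the study-time classes. The third class, 40-49, is an assumed continuation of the "and so on" pattern, and the variable names are illustrative only.

```python
# Classes written as (lower limit, upper limit).
# 40-49 is an assumed continuation of the pattern in the text.
classes = [(20, 29), (30, 39), (40, 49)]

# Midpoint: average of a class's own lower and upper limits.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Width: gap between consecutive lower limits, not upper minus lower.
class_width = classes[1][0] - classes[0][0]

# Number of whole-number values the first class can hold (20 through 29).
values_in_first_class = len(range(classes[0][0], classes[0][1] + 1))

print(midpoints)              # [24.5, 34.5, 44.5]
print(class_width)            # 10
print(values_in_first_class)  # 10
```

The last line shows why the width is 10 rather than 9: the class 20-29 holds ten distinct whole-number values.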

Once the frequency distribution is established, you may also need to calculate the relative frequency, which expresses the frequency of each class as a percentage of the total number of observations. This is done using the formula:

Relative Frequency = $\frac{f}{n} \times 100$

where $f$ is the frequency of the class and $n$ is the total number of observations. For example, if there are 10 total observations and a class has a frequency of 1, the relative frequency would be:

Relative Frequency = $\frac{1}{10} \times 100 = 10\%$
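The same calculation can be applied to every class at once. The sketch below is one possible way to do it in Python; the function name `relative_frequencies` and the class frequencies are hypothetical, chosen so that the total is 10 observations as in the example above.

```python
def relative_frequencies(frequencies):
    """Return each class frequency as a percentage of the total count."""
    n = sum(frequencies)  # total number of observations
    return [100 * f / n for f in frequencies]

# Hypothetical class frequencies that add up to 10 observations.
class_frequencies = [1, 3, 4, 2]
percentages = relative_frequencies(class_frequencies)

print(percentages)  # [10.0, 30.0, 40.0, 20.0]
```

A useful sanity check is that the percentages add up to 100.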

By organizing data into a frequency distribution and calculating relative frequencies, you can gain valuable insights into the distribution of your data, making it easier to visualize and analyze trends.

Problem

Use the frequency distribution below to find the class width and class midpoints.

Class width = 10, class midpoints = 10, 20, 30, 40, 50, 60, 70

Class width = 10, class midpoints = 10, 21, 32, 43, 54, 65, 76

Class width = 11, class midpoints = 10, 21, 32, 43, 54, 65, 76

Class width = 11, class midpoints = 10, 20, 30, 40, 50, 60, 70

Problem

The following data set shows the number of overtime hours that 12 employees worked in a month. Construct a frequency distribution, using a lower class limit of 3 and a class width of 4.

Intro to Frequency Distributions Example 1

Intro to Frequency Distributions Example 1 Video Summary

In this example, we construct a frequency distribution from hourly customer counts at a cafe over a fourteen-hour period. We start with a lower class limit of 15 and a class width of 5, which define the classes: the first class is 15-19, the second is 20-24, and the pattern continues up to 40-44. Since the highest customer count is 41, which falls in the class 40-44, we stop there; any further class would contain no data values.

Next, we identify each upper class limit by subtracting one from the next lower class limit (equivalently, by adding the class width minus one to the class's own lower limit). This gives us upper limits of 19, 24, 29, 34, 39, and 44. With our classes established, we can now tally the customer counts into their respective classes. For instance, counts of 15, 24, and 30 go into the classes 15-19, 20-24, and 30-34. After counting, we record the frequency of each class: 1, 2, 4, 3, 2, and 0. Each frequency counts hours rather than customers, out of the 14 hours observed.
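The class-building step can be sketched in Python from the three numbers given in the example: a first lower limit of 15, a class width of 5, and a highest count of 41. The function name `build_classes` is an assumption for illustration, and the sketch assumes whole-number data.

```python
def build_classes(first_lower, width, max_value):
    """List (lower, upper) class limits until the maximum value is covered."""
    classes = []
    lower = first_lower
    while lower <= max_value:
        # Upper limit is one less than the next lower limit.
        classes.append((lower, lower + width - 1))
        lower += width
    return classes

cafe_classes = build_classes(first_lower=15, width=5, max_value=41)
print(cafe_classes)
# [(15, 19), (20, 24), (25, 29), (30, 34), (35, 39), (40, 44)]
```

The loop stops once the next lower limit (45) exceeds the highest count, which is why 40-44 is the last class.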

To further analyze the data, we calculate the relative frequencies, which represent the proportion of each class relative to the total number of observations. The formula for relative frequency is given by:

$f / n$

where f is the frequency of the class and n is the total number of observations (14 in this case). For example, the relative frequency for the first class (1 customer) is:

$1 / 14 \approx 0.071$

Continuing this process for each class, we find the relative frequencies as decimals: 0.071, 0.143, 0.286, 0.214, 0.143, and 0.000.

Finally, we address the question of what percentage of the time the cafe serves 30 or more customers per hour. This involves summing the relative frequencies of the classes that represent 30 customers and above, which are the classes 30-34, 35-39, and 40-44. Adding these relative frequencies (0.286 + 0.214 + 0.000) gives us 0.500. Converting this to a percentage results in 50%. Thus, the cafe serves 30 or more customers per hour 50% of the time during the observed period.
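The "30 or more" calculation can be illustrated in Python. The hourly counts below are invented for this sketch (14 values with a maximum of 41) and are not the lesson's raw data, so the resulting class frequencies should not be read as the example's actual table. The variable names are likewise illustrative.

```python
# Hypothetical hourly customer counts for 14 hours (invented for illustration).
hourly_counts = [15, 22, 24, 25, 26, 28, 29, 30, 31, 33, 35, 38, 40, 41]
n = len(hourly_counts)

lower_limits = [15, 20, 25, 30, 35, 40]
width = 5

# Frequency of each class: counts between its lower and upper limit, inclusive.
frequencies = [
    sum(1 for c in hourly_counts if lower <= c <= lower + width - 1)
    for lower in lower_limits
]

# Share of hours in classes whose lower limit is 30 or more.
hours_30_plus = sum(f for lower, f in zip(lower_limits, frequencies) if lower >= 30)
share_30_plus = hours_30_plus / n

print(frequencies)             # [1, 2, 4, 3, 2, 2]
print(f"{share_30_plus:.0%}")  # 50%
```

Summing the class frequencies first and dividing once avoids the small rounding drift that can appear when rounded relative frequencies are added.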

How to Create Frequency Distributions

How to Create Frequency Distributions Video Summary

Frequency distributions are essential tools for organizing data into classes, allowing for easier analysis of the frequency of occurrences within specified ranges. When constructing a frequency distribution, particularly when only the number of classes is provided, a systematic approach is necessary to determine the class limits and widths.

The first step in this process is to calculate the class width. The class width is the difference between two consecutive lower class limits (or, equivalently, two consecutive upper class limits). When only the number of classes is given, you can estimate it with the formula:

Class Width = $\frac{\text{Maximum} - \text{Minimum}}{\text{Number of Classes}}$

For example, if the maximum value in your dataset is 115 and the minimum is 5, and you need to create 8 classes, the calculation would be:

Class Width = $\frac{115 - 5}{8} = \frac{110}{8} = 13.75$

Since class widths are typically rounded to whole numbers or convenient values, you might choose to round 13.75 up to 14 or 15, depending on the context. In this case, using 15 as the class width simplifies calculations.

Next, you will determine the lower class limits. The first lower class limit is often set at or below the minimum value of the dataset. In this example, you can use 5 as the first lower limit. Subsequent lower class limits are found by adding the class width to the previous lower limit:

Lower Class Limits: 5, 20, 35, 50, 65, 80, 95, 110

After establishing the lower class limits, the upper class limits can be calculated. The first upper class limit is found by subtracting 1 from the second lower class limit to avoid overlap. Continuing this pattern, the upper class limits would be:

Upper Class Limits: 19, 34, 49, 64, 79, 94, 109, 124
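The steps above can be sketched in Python using the numbers from this example (maximum 115, minimum 5, 8 classes). The choice of 15 as the class width follows the text; the variable names are illustrative, and the sketch assumes whole-number data so that subtracting 1 prevents overlap.

```python
import math

maximum, minimum, num_classes = 115, 5, 8

raw_width = (maximum - minimum) / num_classes  # 13.75
smallest_whole_width = math.ceil(raw_width)    # 14, the minimum whole-number width
class_width = 15                               # the convenient value chosen in the text

# Each lower limit is the previous one plus the class width.
lower_limits = [minimum + i * class_width for i in range(num_classes)]

# Each upper limit is one less than the next lower limit.
upper_limits = [lower + class_width - 1 for lower in lower_limits]

print(raw_width)     # 13.75
print(lower_limits)  # [5, 20, 35, 50, 65, 80, 95, 110]
print(upper_limits)  # [19, 34, 49, 64, 79, 94, 109, 124]
```

The last class, 110-124, contains the maximum value of 115, so the eight classes cover the whole dataset.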

With both lower and upper class limits defined, you can now tally the frequency of data points that fall within each class. This involves counting how many data values lie between each pair of limits. For instance, if you have data values that fall between 5 and 19, you would count those occurrences and record them accordingly.

After completing the frequency counts for all classes, you will have a complete frequency distribution that summarizes the dataset effectively. This structured approach not only aids in organizing data but also enhances the clarity of analysis, making it easier to interpret results and draw conclusions.

How to Create Frequency Distributions Example 2

How to Create Frequency Distributions Example 2 Video Summary

In constructing a frequency distribution for a dataset representing the sales in dollars of 15 sales representatives, the first step is to determine the class width. This is calculated by taking the maximum value from the dataset, subtracting the minimum value, and then dividing by the number of classes. In this case, the data are grouped into five classes, the maximum sales figure is $1,223, and the minimum is $819. The formula for class width (CW) can be expressed as:

$$ CW = \frac{\text{Max} - \text{Min}}{\text{Number of Classes}} $$

Substituting the values, we have:

$$ CW = \frac{1223 - 819}{5} = \frac{404}{5} = 80.8 $$

Since class widths are typically rounded to whole numbers for practical purposes, we round 80.8 up to 81. This means each class will span 81 units.

Next, we establish the lower class limits. The first lower class limit is set at the minimum value of the dataset, which is $819. The subsequent lower class limits are calculated by adding the class width to the previous lower limit. Thus, the lower class limits are:

First: $819
Second: $900 ($819 + $81)
Third: $981 ($900 + $81)
Fourth: $1,062 ($981 + $81)
Fifth: $1,143 ($1,062 + $81)
Sixth: $1,224 ($1,143 + $81). This value exceeds the maximum data value, so it does not start a class; it is needed only to find the fifth upper class limit.

To find the upper class limit of each class, subtract one from the next lower class limit. The upper class limits are thus:

First: $899 ($900 - 1)
Second: $980 ($981 - 1)
Third: $1,061 ($1,062 - 1)
Fourth: $1,142 ($1,143 - 1)
Fifth: $1,223 ($1,224 - 1)
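The class limits above can be reproduced with a short Python sketch. The function name `class_limits` is hypothetical, and the sketch assumes whole-dollar data and that the class width is rounded up to the next whole number, as in this example.

```python
import math

def class_limits(minimum, maximum, num_classes):
    """Return the class width plus lower and upper limits for each class."""
    # Round the raw width up to the next whole number.
    width = math.ceil((maximum - minimum) / num_classes)
    lowers = [minimum + i * width for i in range(num_classes)]
    # Each upper limit is one less than the next lower limit.
    uppers = [lower + width - 1 for lower in lowers]
    return width, lowers, uppers

width, lowers, uppers = class_limits(minimum=819, maximum=1223, num_classes=5)

print(width)   # 81
print(lowers)  # [819, 900, 981, 1062, 1143]
print(uppers)  # [899, 980, 1061, 1142, 1223]
```

One caveat of this sketch: if the raw width happens to be a whole number already, rounding up changes nothing and the maximum value would fall just outside the last class, so a slightly larger width would be needed in that case.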

With the class limits established, the next step is to tally the frequencies of the sales figures that fall within each class. By analyzing the dataset, we can categorize each sales figure into its respective class. The final frequency distribution table will show the number of sales representatives whose sales fall within each defined class range.

After counting the frequencies, the completed frequency distribution will provide a clear overview of the sales performance across the defined classes, allowing for better analysis and understanding of the sales data.

Problem

An economist is analyzing the monthly unemployment rates (as a %) across different cities. The lowest was 16% and the highest was 71%. Without constructing a table, find the class width if this data is divided into 7 classes. Then write the lower and upper class limits for each class.

Class width = 7; lower class limits = 16, 23, 30, 37, 44, 51, 58, 65; Upper class limits = 24, 32, 40, 48, 56, 64, 72

Class width = 7; lower class limits = 16, 24, 32, 40, 48, 56, 64; Upper class limits = 24, 32, 40, 48, 56, 64, 72

Class width = 8; lower class limits = 16, 24, 32, 40, 48, 56, 64; Upper class limits = 23, 31, 39, 47, 55, 63, 71

Class width = 8; lower class limits = 16, 24, 32, 40, 48, 56, 64; Upper class limits = 24, 32, 40, 48, 56, 64, 72

Here’s what students ask on this topic:

What is a frequency distribution and why is it important in data analysis?

A frequency distribution is a table that organizes data into classes and shows the frequency of measurements within those classes. It is crucial in data analysis because it helps to visualize and understand the distribution of data, making it easier to identify patterns, trends, and outliers. By organizing data into classes, frequency distributions simplify complex data sets, allowing for more effective analysis and interpretation. They are foundational for creating histograms and other graphical representations, which are essential tools for communicating data insights in business and research contexts.

How do you calculate the class width in a frequency distribution?

To calculate the class width in a frequency distribution, use the formula $\frac{\max - \min}{\text{number of classes}}$. This formula involves subtracting the minimum value from the maximum value in the data set and dividing the result by the number of classes you wish to use. The class width determines the range of values within each class and keeps the classes evenly spaced. If the result is a decimal, round it up to a convenient whole number; rounding down could leave the maximum value outside the last class.

What are lower and upper class limits in a frequency distribution?

Lower and upper class limits are the boundaries that define the range of values within each class in a frequency distribution. The lower class limit is the smallest value that can be included in a class, while the upper class limit is the largest value that can be included. These limits are crucial for ensuring that classes do not overlap, which would lead to double counting of data points. To find the lower class limits, start with a number less than or equal to the minimum data value and add the class width to determine subsequent limits. The upper class limits are found by subtracting one from the next lower class limit.

How do you calculate relative frequencies in a frequency distribution?

Relative frequencies in a frequency distribution are calculated by dividing the frequency of each class by the total number of measurements, denoted as $n$. The formula is $\frac{f}{n}$, where $f$ is the frequency of the class. This result is then multiplied by 100 to convert it into a percentage. Relative frequencies provide a clearer understanding of the data distribution by showing the proportion of the total data that falls within each class, making it easier to compare different classes and identify significant patterns.

What is the difference between frequency distribution and relative frequency distribution?

A frequency distribution is a table that shows the number of measurements within each class, while a relative frequency distribution expresses these frequencies as percentages of the total number of measurements. The relative frequency distribution provides a more intuitive understanding of the data by showing the proportion of the total data that falls within each class. This is particularly useful for comparing classes and understanding the overall distribution of data. While frequency distribution focuses on raw counts, relative frequency distribution emphasizes the relative importance of each class in the context of the entire data set.

Your Statistics for Business tutors

Patrick Ford

Physics and Math Lead Instructor

Colleen Daly