##### Calculating averages and ranges for grouped continuous data

Some data is discrete and can only take on certain values. For example, if you throw an ordinary die then you can get one of the numbers 1, 2, 3, 4, 5 or 6. If you count the number of red cars in a car park then the result can only be a whole number.

Some data is continuous and can take on any value in a given range. For example, heights of people and the temperature of a liquid are continuous measurements.

Continuous data can be difficult to process effectively unless it is summarized. For instance, if you measure the heights of 100 children you could end up with 100 different results. You can group the data into frequency tables to make the process more manageable; this is now grouped data. The groups (or classes) can be written using inequality symbols. For example, if you want to create a class for heights ($h$ cm) between 120 cm and 130 cm you could write:

$\displaystyle 120\le h<130$

This means that h is greater than or equal to 120 but strictly less than 130. The next class could be:

$\displaystyle 130\le h<140$

Notice that 130 is not included in the first class but is included in the second. This is to avoid any confusion over where to put values at the boundaries.
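The boundary rule can be checked with a short Python sketch. The helper name `class_for` and the list of boundaries (120 to 170 in steps of 10) are assumptions made for illustration. The sketch only shows that a value on a boundary, such as 130, falls into exactly one class.

```python
from bisect import bisect_right

# Assumed class boundaries: 120 <= h < 130, 130 <= h < 140, ..., 160 <= h < 170
boundaries = [120, 130, 140, 150, 160, 170]

def class_for(height):
    """Return (lower, upper) for the class containing height, or None if it lies outside all classes."""
    if height < boundaries[0] or height >= boundaries[-1]:
        return None
    # bisect_right counts the boundaries that are <= height,
    # so the class starts at the last such boundary.
    i = bisect_right(boundaries, height) - 1
    return boundaries[i], boundaries[i + 1]

print(class_for(129.9))  # (120, 130)
print(class_for(130))    # (130, 140)
```

Because each class includes its lower boundary and excludes its upper boundary, no height can belong to two classes.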

The example shows how a grouped frequency table is used to find the estimated __mean__ and range, and also to find the modal class and the median class.

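Example 1: The heights of 100 children were measured in cm and the results recorded in the table below:

| Height ($h$ cm) | Frequency |
|---|---|
| $\displaystyle 120\le h<130$ | 12 |
| $\displaystyle 130\le h<140$ | 16 |
| $\displaystyle 140\le h<150$ | 38 |
| $\displaystyle 150\le h<160$ | 24 |
| $\displaystyle 160\le h<170$ | 10 |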

Find an estimate for the __mean__ height of the children, the modal class, the median class and an estimate for the range.

None of the children’s heights are known exactly, so you use the midpoint of each group as a best estimate of the height of each child in a particular class. For example, the 12 children in the $\displaystyle 120\le h<130$ class have heights that lie between 120 cm and 130 cm, and that is all that you know. Halfway between 120 cm and 130 cm is $\displaystyle \frac{120+130}{2}=125$ cm.

A good estimate of the total height of the 12 children in this class is 12 × 125 = 1500 cm (frequency × midpoint).

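So, extend your table to include midpoints and then totals for each class:

| Height ($h$ cm) | Frequency | Midpoint | Frequency × midpoint |
|---|---|---|---|
| $\displaystyle 120\le h<130$ | 12 | 125 | 1500 |
| $\displaystyle 130\le h<140$ | 16 | 135 | 2160 |
| $\displaystyle 140\le h<150$ | 38 | 145 | 5510 |
| $\displaystyle 150\le h<160$ | 24 | 155 | 3720 |
| $\displaystyle 160\le h<170$ | 10 | 165 | 1650 |
| Total | 100 | | 14540 |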

An estimate for the __mean__ height of the children is then:

$\displaystyle \frac{1500+2160+5510+3720+1650}{12+16+38+24+10}=\frac{14540}{100}=145.4\text{ cm}$
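The same calculation can be sketched in Python as a way of checking the arithmetic. The class boundaries and frequencies are those of the worked example. The variable names are chosen here for illustration.

```python
# (lower boundary, upper boundary, frequency) for each class in the example
classes = [
    (120, 130, 12),
    (130, 140, 16),
    (140, 150, 38),
    (150, 160, 24),
    (160, 170, 10),
]

total_frequency = 0
total_height = 0.0
for lower, upper, frequency in classes:
    midpoint = (lower + upper) / 2       # best estimate for every child in the class
    class_total = frequency * midpoint   # estimated total height for the class
    print(f"{lower} <= h < {upper}: midpoint {midpoint}, total {class_total}")
    total_frequency += frequency
    total_height += class_total

estimated_mean = total_height / total_frequency
print(estimated_mean)  # 145.4
```

The result is only an estimate because every child is treated as if their height were exactly the midpoint of their class.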

To find the median class you need to find where the 50th and 51st children would be placed in an ordered list of heights. Notice that the first two frequencies add to give 28, meaning that the 28th child in the ordered list would be the tallest in the $\displaystyle 130\le h<140$ class. The total of the first three frequencies is 66, meaning that the 29th to 66th children, including the 50th and 51st, are all in the $\displaystyle 140\le h<150$ class. This makes $\displaystyle 140\le h<150$ the median class.
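The running totals used above are cumulative frequencies. A minimal Python sketch of the same reasoning follows. The helper name `class_of_position` and the text labels are assumptions made for illustration.

```python
from itertools import accumulate

labels = ["120 <= h < 130", "130 <= h < 140", "140 <= h < 150",
          "150 <= h < 160", "160 <= h < 170"]
frequencies = [12, 16, 38, 24, 10]

n = sum(frequencies)                        # 100 children
cumulative = list(accumulate(frequencies))  # [12, 28, 66, 90, 100]

def class_of_position(position):
    """Index of the class holding the child at this 1-based position in the ordered list."""
    for index, running_total in enumerate(cumulative):
        if position <= running_total:
            return index
    raise ValueError("position is larger than the number of children")

# With an even number of values the median lies between positions n/2 and n/2 + 1.
middle_positions = [n // 2, n // 2 + 1] if n % 2 == 0 else [(n + 1) // 2]
median_classes = {labels[class_of_position(p)] for p in middle_positions}
print(median_classes)  # {'140 <= h < 150'}
```

Both middle positions land in the same class here, so there is a single median class.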

The class with the highest frequency is the modal class. In this case it is the same class as the median class: $\displaystyle 140\le h<150$

The shortest child could be as small as 120 cm and the tallest could be almost 170 cm. The best estimate of the range is, therefore, 170 − 120 = 50 cm.
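The modal class and the estimated range can be read off the table in the same way. This Python sketch uses the classes and frequencies of the worked example. The variable names are chosen here for illustration.

```python
# (lower boundary, upper boundary, frequency) for each class in the example
classes = [
    (120, 130, 12),
    (130, 140, 16),
    (140, 150, 38),
    (150, 160, 24),
    (160, 170, 10),
]

# Modal class: the class with the highest frequency
modal_lower, modal_upper, modal_frequency = max(classes, key=lambda c: c[2])
print(f"Modal class: {modal_lower} <= h < {modal_upper}")  # Modal class: 140 <= h < 150

# Estimated range: upper boundary of the last class minus lower boundary of the first
estimated_range = classes[-1][1] - classes[0][0]
print(estimated_range)  # 50
```

The true range may be smaller than 50 cm, since the shortest and tallest children need not sit right at the outer class boundaries.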

Tip! You may be asked to explain why your calculations only give an estimate. Remember that you don’t have the exact data, only frequencies and classes.