Graphical presentation of Data

GRAPHICAL PRESENTATION OF DATA

The presentation of quantitative data by graphs and charts are termed as graphical presentation.

It gives the reader a nice overview of the essential features of the data. Graphs are designed to give an intuitive feeling of the data at a glance.

Therefore graphs:

- Should be self-explanatory
- Must have title
- Must have labeled axis
- Should mention unit of observation
- Should be simple & clean

Advantages of Graph Representation

It is easy to read
It is easy to understand by all.
It shows relationship between two or more sets of observations.
It is universally applicable
It is attractive in representation
It helps in proper estimation, evaluation, and interpretation of the characteristics of items and individuals
It has more lasting effect on brain
It simplifies complex data
It indicates trend, and therefore, helps in forecasting.

Disadvantages of Graph Representation

It is time consuming.
Finer details may be lost during preparation
It represents only approximate values.

Graphical Presentation of Statistical data:

Grouped and ungrouped data may be presented as:

Line Graphs

A line chart or line graph is a type of chart which displays information as a series of data points called ‘markers’ connected by straight line segments.
These are drawn on the plane paper by plotting the data concerning one variable on the horizontal x-axis (abscissa) and other variable of data on y-axis (ordinate). Which intersect at a point called origin.
With the help of such graphs the effect of one variable upon another variable during and experimental study may be clearly demonstrated.
According to data for corresponding X, Y values (in pairs), we will find a pint on the graph paper. The points thus generated are then jointed by pieces of straight lines successfully. The figure thus formed is called line diagram or graph.

Example

In the experimental sciences, data collected from experiments are often visualized by a graph. For example, if one were to collect data on the speed of a body at certain points in time, one could visualize the data by a data table such as the following:

Elapsed Time (s)	Speed (m s⁻¹)
0	0
1	3
2	7
3	12
4	20
5	30
6	45

Graph of Speed Vs Time

Bar Diagram:

A bar diagram is a graph on which the data are represented in the form of bar and it is useful in comparing qualitative or quantitative data of discrete type.
It consists of a number of equally spaced rectangular areas with equal width and originate from a horizontal base line (x-axis)
The length of the bar is proportional to the value it represents. It should be seen that the bars are neither too short nor too long.
They are shaded or coloured suitably.
The mars may be vertical or horizontal in a bar diagram. If the bare are placed horizontally, it is called horizontal bar diagram, when bares are placed vertically it is called a vertical bar diagram.
It is used with discrete qualitative variables and provides a visual comparison of figures.

Types of Bar Diagram

There are three types of bar diagram

Simple bar diagram
Multiple or grouped bar diagram
Component bar subdivided bar diagram.

Simple bar chart:

Represent one type of data (variable).

Example:

Following is an example of bar chart which shows educational status of certain area.

Multiple Bar charts:

Such charts are useful for direct comparison between two or more sets of data. The technique of drawing such a chart is same as that of a single bar chart with a difference that each set of data is represented in different shades or colors on the same scale. An index explaining shades or colors must be given.

Example:
Draw a multiple bar chart to represent the import and export of Canada (values in $) for the years 1991 to 1995.

Years	Imports	Exports
1991	7930	4260
1992	8850	5225
1993	9780	6150
1994	11720	7340
1995	12150	8145

Simple bar chart showing the import and export of Canada from 1991 – 1995.

Component bar chart:

Sub-divided or component bar chart is used to represent data in which the total magnitude is divided into different or components.

In this diagram, first we make simple bars for each class taking total magnitude in that class and then divide these simple bars into parts in the ratio of various components. This type of diagram shows the variation in different components within each class as well as between different classes. Different shades or colours are used to distinguish the various components and should be given with the diagram. It is also known as staked chart.

Example:
The table below shows the quantity in hundred kgs of Wheat, Barley and Oats produced on a certain form during the years 1991 to 1994.

Years	Wheat	Barley	Oats	Total
1991	34	18	27	79
1992	43	14	24	81
1993	43	16	27	86
1994	45	13	34	92

Pie Chart

It is a circular graph whose area is subdivided into sectors by radii in such a way that the areas of the sectors are proportional to the angles at the centre.
The area of the circle represents the total value and the different sectors of the circle represent the different parts.
It is generally used for comparing the relation between the various components of a value and between components and the total value.
The data is expressed as percentage. Each component it expressed as percentage of the total value.

Working procedure:

Plot a circle of an appropriate size. The angle of a circle total is 360^o.
Convert the given value of the components of an item in percentage of the total value of the item.

Value of component

Area = x 360

Total value of item

It the pie chart largest sector remains at the top and other in sequence running clockwise.
Measure with protector, the points on a circle representing the size of each sector. Label each sector for identification.

Example:

A family’s weekly expenditure on its house mortgage (finance), food and fuel is as follows: Draw pie chart:

Expense	$
Mortage	300
Food	225
Fuel	75

Histogram

It is the most common form of diagrammatic representation of grouped frequency distribution of both continuous and discontinuous type, in which the frequencies are represented in the forms of bars. The area and more especially the height of each rectangle is proportional to the frequency.

Working Procedure:

Convert the data in exclusive series from inclusive series. (Make class boundaries if classes do not coincide; discontinuous class interval)
Take class intervals (class boundaries) and plot in x-axis.
Take two extra class intervals one below and one above the given grouped intervals.
Plot separate rectangles for each class interval. The base of each rectangle is the width of the class interval and the height is the respective frequency of that class.
Frequencies are plotted on y-axis.

Age	Class Boundaries	Frequency
30-39	29.5–39.5	11
40-49	39.5–49.5	46
50-59	49.5–59.5	70
60-69	59.5–69.5	45
70-79	69.5–79.5	16
80-89	79.5–89.5	1

Frequency Polygon:

It is an area diagram represented in the form of curve obtained by joining the middle points of the tops of the rectangles in a histogram or joining the mid-point of class intervals at the height of frequencies by straight lines.

Cumulative Frequency Polygon (Ogive)

The graphical representation of a cumulative frequency distribution where the cumulative frequencies are plotted against the corresponding class boundaries and the successive points are joined by straight lines, the line diagram or curve is obtained known as ogive or cumulative frequency polygon.

Working procedure:

The upper limits of the classes are represented along x-axis.
The cumulative frequency of a particular class is taken along the y-axis.

Class interval	Class Boundries	f	c.f
151- 155	150.5 – 155.5	8	8
156 – 160	155.5 – 160.5	7	15
161 – 165	160.5 – 165.5	15	30
166 – 170	165.5 – 170.5	9	39
171 – 175	170.5 – 175.5	9	48
176 – 180	175.5 – 180.5	2	50

The points corresponding to cumulative frequency at each upper limit of the classes are joined by a free hand curve.

Stem-and-Leaf Displays

A stem-and-leaf display bears a strong resemblance to a histogram and serves the same purpose. It provides information regarding the range of the data set, shows the location of the highest concentration of measurements, and reveals the presence or absence of symmetry. An advantage of the stem-and-leaf display over the histogram is the fact that it preserves the information contained in the individual measurements.

Another advantage of stem-and-leaf displays is the fact that they can be constructed during the tallying process, so the intermediate step of preparing an ordered array is eliminated.

Working procedure:

To construct a stem-and-leaf display we partition each measurement into two parts.
The first part is called the stem, and the second part is called the leaf.
The stem consists of one or more of the initial digits of the measurement, and the leaf is composed of one or more of the remaining digits.
The stems form an ordered column with the smallest stem at the top and the largest at the bottom. We include in the stem column all stems within the range of the data even when a measurement with that stem is not in the data set.
The rows of the display contain the leaves, ordered and listed to the right of their respective stems.
The stems are separated from their leaves by a vertical line.

Example:

The following example illustrates the construction of a stem-and-leaf display.

44, 46, 47, 49, 63, 64, 66, 68, 68, 72, 72, 75, 76, 81, 84, 88, 106

Stem Leaves

4 4, 6, 7, 9

6 3, 4, 6, 8, 8

7 2, 2, 5, 6

8 1, 4, 8

10 6

Key: 6|3=63

Leaf unit: 1.0

Stem unit: 10.0

Box-and-Whisker Plots

A useful visual device for communicating the information contained in a data set is the box-and-whisker plot. The construction of a box and- whisker plot (sometimes called, simply, a box plot) makes use of the quartiles of a data set and may be accomplished by following these five steps:

Represent the variable of interest on the horizontal axis.
Draw a box in the space above the horizontal axis in such a way that the left end of the box aligns with the first quartile Q₁ and the right end of the box aligns with the third quartile Q₃
Divide the box into two parts by a vertical line that aligns with the median
Draw a horizontal line called a whisker from the left end of the box to a point that aligns with the smallest measurement in the data set.
Draw another horizontal line, or whisker, from the right end of the box to a point that aligns with the largest measurement in the data set.

Examination of a box-and-whisker plot for a set of data reveals information regarding the amount of spread, location of concentration, and symmetry of the data.

Example:

The following example illustrates the construction of a box-and-whisker plot.

The smallest and largest measurements are 14.6 and 44, respectively.

First quartile Q₁= 27.25, the median Q₂ =31.1and the third quartileQ₃= 33.525.

Measure of Central Tendency

Central tendency or central position or statistical averages reflects the central point or the most characteristic value of a set of measurements. The measure of central tendency describes the one score that best represents the entire distribution,

(OR)

A single figure that describes the entire series of observations with their varying sizes, occupying a central position.

The most common measures of central tendency are

Mean
Median
Mode

Characteristics of Central Tendency:

It should be rigidly defined
An average should be properly defined so that it has one and only one interpretation.
The average should not depend on the personal prejudice and bias of the investigator.
It should be based onall items
It should be easily understand.
It should not be unduly affected by the extreme value.
It should be least affected by the fluctuation of the sampling.
It should be easy to interpret.
It should be easily subjected to further mathematical calculations.

Measure of Central Tendency

If n ≤ 15

Direct Method

If n > 15

Frequency Distribution Method

Simple /Ungrouped Frequency Distribution

(Range ≤ 20 digits)

Grouped Frequency Distribution

(Range > 20 digits)

Mean:

It is defined as a value which is obtained by dividing the sum of all the values by the numbers of observations. Thus arithmetic mean of a set of values x₁, x₂, x₃, x_4.. . . .x_n is denoted by (read as “x bar”) and is calculated as:

= = (Direct Method)

Where sign ∑ stands for the sum and “n” is the number of observations.

Example:

The grades of a student in five examinations were 67, 75, 81, 87, 90 find the arithmetic mean of grades.

Solution:

Here, = = 80

Thus, the mean grade is 80.

Method of Finding Mean

If x₁, x₂, x₃, x₄, ….x_n are the values of different observations andf₁, f₂, f₃, f₄, ….f_nare their frequencies_,then,

Or. A.M. =

Example 2. The number of children of 80 families in a village are given below:

No. of Children/Family	1	2	3	4	5	6
No. of Families	8	10	10	25	20	7

Calculate mean.

Solution: let x_i represent the number of children per family and f_i represent the number of families. The calculations are presented in the following table:

No. of Children/Family (x_i)	No. of Families (f_i)	f_ix_i
1	8	8
2	10	20
3	10	30
4	25	100
5	20	100
6	7	42
	n=∑f_i =80	∑f_ix_i = 300

Thus = = = 3.75

Methods of Finding Arithmetic mean for Grouped Data

Let x₁, x₂, x₃, x_4.. . . .x_nbe mid-points of the class intervals with corresponding frequencies f₁, f₂, f₃, f₄, ….f_n . Then the arithmetic mean is obtained by dividing the sum of the product of “f “ and “x” by the total of all frequencies.

Thus:

A.M. = =

Example:

Given below are the heights of (in inches) of 200 students. Find A.M.

Height (inches)	30-35	35-40	40-45	45-50	50-55	55-60
No. of Students	28	32	36	46	36	22

Solution:

Height (Inches)	Mid points (x)	Frequency (f)	fx
30-35	32.5	28	910
35-40	37.5	32	1200
40-45	42.5	36	1530
45-50	47.5	46	2185
50-55	52.5	36	1890
55-60	57.5	22	1265
Total:	—	∑f = 200	∑fx = 8980

= = = 44.90 (inches).

Example: Given below are the weights (in kgs) of 100 students. Find Mean Weight:

Weight	70-74	75-79	80-84	85-89	90-94
No. of Students	10	24	46	12	8

Solution:

Weight (Kg)	Mid-Points (x)	Frequency (f)	fx
70 – 74	72	10	720
75 – 79	77	24	1848
80 – 84	82	46	3772
85 – 89	87	12	1044
90 – 94	92	8	736
Total:	—	∑f = 100	∑fx = 8120

= = = 81.20

Here, Mean Weight is 31.2 kgs.

Merits of Mean

It has the simplest average formula which is easily understandable and easy to compute.
It is so rigidly defined by mathematical formula that everyone gets same result for single problem.
Its calculation is based on all the observations.
It is least affected by sampling fluctuations.
It is a typical i.e. it balances the value at either side.
It is the best measure to compare two or more series.(data)
Mean is calculated on value and does not depend upon any position.
Mathematical centre of a distribution
Good for interval & ratio scale
Does not ignore any information
Inferential statistics is based on mathematical properties of the mean.
It is based on all the observations.
It is easy to calculate and simple to understand.
It is relatively stable and amendable to mathematical treatment.

Demerits of Mean

It cannot be calculated if all the values are not known.
The extreme values have greater affect on it.
It cannot be determined for the qualitative data.
It may not exist in data.

Median:

It is the middle most point or the central value of the variable in a set of observation when observations are arranged in either order of their magnitudes.

It is the value in a series, which divides the series into two equal parts, one consisting of all values less and the other all values greater than it.

Median for Ungrouped data

Median of “n” observations, x₁, x₂, x₃,…x_n can be obtained as follows:

When “n” is an odd number,

Median = ()^th observation

When “n” is an even number,

Median is the average of ()^thand ()^thobservations.

Simply use ()th observation. It will the average

The median for the discrete frequency distribution can be obtained as above, Using a cumulative frequency distribution.

Problem

Find the median of the following data:

12, 2, 16, 8, 14, 10, 6

Step 1: Organize the data, or arrange the numbers from smallest to largest.

2, 6, 8, 10, 12, 14, 16

Step 2: count number of observation in data (n)

.n = 7

Step 3: Since the number of data values is odd, the median will be found in the position.

Median term (m) =

7 + 1 8

= = = 4^th value

2 2

Step 4: In this case, the median is the value that is found in the fourth position of the organized data, therefore

Median = 10

Problem

Median for even data:

Find the median of the following data:

7, 9, 3, 4, 11, 1, 8, 6, 1, 4

Step 1: Organize the data, or arrange the numbers from smallest to largest.

1, 1, 3, 4, 4, 6, 7, 8, 9, 11

Step 2: Since the number of data values is even, the median will be the mean value of the numbers found before and after the position.

Step 3: The number found before the 5.5 position is 4 and the number found after the 5.5 position is 6. Now, you need to find the mean value.

1, 1, 3, 4, 4, 6, 7, 8, 9, 11

Example:

The following are the runs made by a batsman in 7 matches:

8, 12, 18, 13, 16, 5, 20.Find the median.

Solution: Writing the runs in ascending order.

5, 8, 12, 13, 16, 18, 20

As n=7

Median= ()^thitem = ()4^th item.

Hence, Median is13 runs.

Example:

Following are the marks (out of 100) obtained by 10 students in English:

23, 15, 35, 41, 48, 5, 8, 9, 11, 51. Find the median mark.

Solution: arranging the marks in ascending order. The marks are:

5, 8, 9, 11, 15, 23, 35, 41, 48, 51

As n= 10

So, median = [] item.

Or, Median = [15+23] = = 19 marks.

Alternative Method:

Median term(m) = ()^th value

= 11/2 = 5.5^th value

5, 8, 9, 11, 15, 23, 35, 41, 48, 51

M1 M2

Median =

Median = = 19

Median for Grouped data

It is obtained by the following formula:

Median = l₁ +()

Where, l₁ = lower class limit of median class.

l₂ = upper class limit of median class

f = frequency of median class.

m = or

C = cumulative frequency preceding the median class.

n = total frequency, i.e. ∑f.

Example:

Find the median height of 200 students in given data

Solution:

Class interval	Frequency (f)	C.F
30-35	28	28
35-40	32	28+32=60
40-45	36	60+36=96
45-50	46	96+46=142
50-55	36	142+36=178
55-60	22	178+22=200 n

Median =

As 100.5 ^th item lies in (45-50), it is the median class with l₁ = 45, l₂= 50 ,f= 46, C= 96

Median = l₁ +()

Median = 45 + (

= 45 +

= 45 + 0.489

= 45.489

Thus, median height is 45.489 inches.

2^nd Method:

l + (

Where, l = lower class boundary of median class.

w = width of median class.

f = frequency of median class.

n = total frequency, i.e. ∑f.

c = cumulative frequency preceding the median class.

Example:

Following are the weights in kgs of 100 students. Find the median weight.

Weights (kgs)	70-74	75-79	80-84	85-89	90-94
No of students.	10	24	46	12	8

Solution: As class boundaries are not given so, first of all we make class boundaries by using procedure.

Weight (kgs)	No. of students	Class boundaries	C.F
70-74	10	69.5-74.5	10
75-80	24	74.5-79.5	34
80-84	46	79.5-84.5	80
85-89	12	84.5-89.5	92
90-94	8	89.5-94.5	100

Median =

As 50^th item lies in (79.5-84.5), it is the median class with h= 5, f= 46, C= 34

Median = l + (, we find

Median = 79.5 + (

= 79.5 +

Hence, median weight is 81.24 kg.

Merits of Median:

It is easily understood although it is not so popular as mean.
It is not influenced or affected by the variation in the magnitude or the extremes items.
The value of the median can be graphically ascertained by ogives.
It is the best measure for qualitative data such as beauty, intelligence etc.
The median indicated the value of middle item in the distribution i.e. middle most item is the median
It can be determined even by inspection in many cases.
Good with ordinal data
Easier to compute than the mean

Demerits of Median:

For the calculation of median, data must be arranged.
Median being a positional average, cannot be dependent on each and every observations.
It is not subject to algebraic treatment.
Median is more affected or influenced by samplings fluctuations that the arithmetic mean.
May not exist in data.
It is not rigorously defined.
It does not use values of all observations.

Mode:

Mode is considered as the value in a series which occurs most frequently (has the highest frequency)

The mode of distribution is the value at the point around which the items tend to be most heavily concentrated. It may be regarded as the most typical value.

The word modal is often used when referring to the mode of a data set.
If a data set has only one value that occurs most often, the set is called unimodal.
A data set that has two values that occur with the same greatest frequency is referred to as bimodal.
When a set of data has more than two values that occur with the same greatest frequency, the set is called multimodal.

Mode for Ungrouped data

Example 1. The grades of Jamal in eight monthly tests were 75, 76, 80, 80, 82, 82, 82, 85.Find the mode of his grades.

Solution: As 82 is repeated more than any other number, so clearly mode is 82.

Example 2. Ten students were asked about the number of questions they have solved out of 20 questions, last week. Records were 13, 14, 15, 11, 16, 10, 19, 20, 18, 17. Find the modes.

Solution: it is obvious that the data contain no mode, as none of the numbers is repeated. Sometimes data contains several modes.

If x = 10, 15, 15, 15, 20, 20, 20, 25 then the data contains two modes i.e. 15 and 20.

Mode for grouped data

Mode for the grouped data can be calculated by the following formula:

Mode=

(OR)

Mode=

(OR)

Mode=

l₁= lower limit (class boundary) of the modal class.

l₂ = upper limit of the modal class

f_m= frequency of the modal class

f₁= frequency associated with the class preceding the modal class.

f₂ = frequency associated with the class following the modal class

h = (size of modal class)

The class with highest frequency is called the “Modal Class”.

Example 3. Find the mode for the heights of 200 students in given data

Height (inches)	Frequency
30-35	28
35-40	32
40-45	36 ()
45-50	46 ()
50-55	36 ()
55-60	22
	∑f=200

Solution:

Mode=

Mode = 47. 5

Merits of Mode:

It can be obtained by inspection.
It is not affected by extreme values.
This average can be calculated from open end classes.
The score comes from the data set
Good for nominal data
Good when there are two ‘typical‘ scores
Easiest to compute and understand
It can be used to describe qualitative phenomenon
The value of mode can also be found graphically.

Demerits of Mode

Mode has no significance unless a large number of observations are available.
It cannot be treated algebraically.
It is a peculiar measure of central tendency.
For the calculation of mode, the data must be arranged in the form of frequency distribution.
It is not rigidly define measure.
Ignores most of the information in a distribution
Small samples may not have a mode.
It is not based on all the observations.

Empirical Relationship b/w

Skewness:

Data distributions may be classified on the basis of whether they are symmetric or asymmetric. If a distribution is symmetric, the left half of its graph (histogram or frequency polygon) will be a mirror image of its right half. When the left half and right half of the graph of a distribution are not mirror images of each other, the distribution is asymmetric.

If the graph (histogram or frequency polygon) of a distribution is asymmetric, the distribution is said to be skewed. The mean, median and mode do not fall in the middle of the distribution.

Types of Skewness

Positive skewness: If a distribution is not symmetric because its graph extends further to the right than to the left, that is, if it has a long tail to the right, we say that the distribution is skewed to the right or is positively skewed. In positively skewed distribution Mean > Median > Mode. The positive skewness indicates that the mean is more influenced than the median and mode, by the few extremely high value. Positively skewed distribution have positive value because mean is greater than mode
Negative skewness: If a distribution is not symmetric because its graph extends further to the left than to the right, that is, if it has a long tail to the left, we say that the distribution is skewed to the left or is negatively skewed. In negatively skewed distribution Mean < Median < Mode. Negatively skewed distribution have negative value because mean is less than mode.

KURTOSIS

Kurtosis is a measure of the degree to which a distribution is “peaked” or flat in comparison to a normal distribution whose graph is characterized by a bell-shaped appearance.

Platykurtic curve: when the frequency distribution curve is flatter than the normal bell shaped curve
Leptokurtic curve: when the frequency distribution curve is more peaked than the normal bell shaped curve
Mesokurtic curve: the normal bell shaped distribution curve.

KurtosisPict

4 thoughts on “Graphical presentation of Data”

예스벳88 on June 5, 2020 at 9:43 am said:

Best view you can finde , in this side of world!

Reply ↓
آموزش ویندوز ده on August 30, 2020 at 12:03 am said:

Hello I’ve been reading your site for a while now and finally got the bravery to go ahead and give you a shout out from Lubbock Tx! Just wanted to say keep up the great work! many thanks

Reply ↓
- Bherulal on August 30, 2020 at 11:57 am said:
  
  Thanks a lot dear for appreciation
  
  Reply ↓
دانلود نرم افزار طراحی دو بعدی برای کامپیوتر on November 22, 2020 at 1:51 am said:

Hola I’ll be sure to bookmark it and return to read extra of your useful information. Thank you for the post. I’ll definitely return vielen dank

Reply ↓

Nursing Tutorials

Graphical presentation of Data

Method of Finding Mean

Methods of Finding Arithmetic mean for Grouped Data

Median for Ungrouped data

Mode for Ungrouped data

4 thoughts on “Graphical presentation of Data”

Leave a Reply Cancel reply