Thursday, August 8, 2024

Probability and statistics Chapter-2

 

Unit 2 Summarizing and Describing the Numerical Data

Measures of central tendency

An “average” is a single value that represents the entire distribution. It lies between the two extreme observations (the largest and the smallest) and gives an idea of where the values concentrate in the central part of the distribution. Such single values are known as “measures of central tendency” or “measures of location”. Thus, measures of central tendency are used to describe the middle or centre of a data set.

Various Measures of Central Tendency

The following are the measures of central tendency or measures of location:

1.  Arithmetic mean 

(i)    Simple Arithmetic Mean ($\bar{X}$)

(ii)  Weighted Arithmetic Mean ($\bar{X}_w$)

2.  Median (Md)

3.  Mode(Mo)

4.  Geometric mean(G. M. ) and

5.  Harmonic mean(H. M. )

Note: Geometric mean (G.M.) and Harmonic mean (H.M.) are beyond our syllabus.

Arithmetic Mean (A.M.)

The arithmetic mean is the most popular and widely used measure of central tendency. It is also called simply ‘the mean’ or ‘the average’. It is considered an ideal measure of central tendency because it satisfies almost all the requisites of an ideal measure of central tendency given by Prof. Yule.

Arithmetic mean may either be 

(i) Simple arithmetic mean or (ii) Weighted arithmetic mean 

Simple arithmetic mean

In the simple arithmetic mean, all the items in the distribution are treated as equally important. It is denoted by $\bar{X}$ (X bar).

Calculation of Arithmetic mean:

Individual Series

(i)    Direct method

$\bar{X} = \dfrac{\sum X}{n}$

where ∑X = the sum of the observations and n = the number of observations.

(ii)  Short-cut method (assumed mean method or change of origin method)

$\bar{X} = a + \dfrac{\sum d}{n}$

where a = assumed mean (assumed value), d = X − a = deviation of an item from the assumed mean, and n = number of observations.

There is no hard and fast rule for selecting a, but it is best to take a value between the highest and lowest observations.
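The two methods for an individual series can be checked against each other with a short Python sketch. The function names are illustrative only, and the data are the 14 observations of Example 1 in these notes; the assumed mean 45 is an arbitrary choice.

```python
def mean_direct(values):
    """Direct method: sum of observations divided by their count."""
    return sum(values) / len(values)


def mean_short_cut(values, assumed_mean):
    """Short-cut method: a + (sum of deviations d = X - a) / n."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)


data = [55, 39, 45, 55, 41, 35, 60, 40, 55, 35, 37, 55, 55, 65]
print(mean_direct(data))         # 48.0
print(mean_short_cut(data, 45))  # 48.0, the choice of a does not change the result
```

Both methods give the same mean; the short-cut method only reduces the size of the numbers handled by hand.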

Discrete Series

(i)    Direct method

$\bar{X} = \dfrac{\sum fX}{N}$

where N = ∑f = total frequency.

(ii)  Short-cut method (assumed mean method, coding method or change of origin method)

$\bar{X} = a + \dfrac{\sum fd}{N}$

where a = assumed mean, d = X − a = deviation of an item from the assumed mean, and N = ∑f = total frequency.

Continuous Series (Grouped Data)  

(i)    Direct method

$\bar{X} = \dfrac{\sum fX}{N}$

where X = mid-value of the class interval and N = ∑f = total frequency.

Mid-value (X) = $\dfrac{\text{lower limit} + \text{upper limit}}{2}$

(ii)    Short-cut method (assumed mean method, coding method or change of origin method)

$\bar{X} = a + \dfrac{\sum fd}{N}$

where a = assumed mean, d = X − a = deviation of a mid-value from the assumed mean, and N = ∑f = total frequency.

(iii)  Step-deviation method (change of origin and scale method, or coding method)

$\bar{X} = a + \dfrac{\sum fd'}{N} \times h$

where $d' = \dfrac{X - a}{h}$, X = mid-value, a = assumed mean, and h = class size (class width).

Note:

(i)    For unequal class sizes, h is taken as a common factor of the deviations.

(ii)  For the mean, the classes need not have equal size and need not be exclusive (i.e. adjusted).
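As a rough check on the grouped-data formulas above, the sketch below computes the mean by the direct and step-deviation methods. The function names are illustrative, the classes and frequencies are those of Example 8 in these notes (unequal class sizes), and a = 55 with common factor h = 5 is an assumed choice.

```python
def midpoints(class_limits):
    """Mid-value of each class: (lower limit + upper limit) / 2."""
    return [(lower + upper) / 2 for lower, upper in class_limits]


def grouped_mean_direct(mids, freqs):
    """Direct method: sum(f * X) / N."""
    return sum(f * x for x, f in zip(mids, freqs)) / sum(freqs)


def grouped_mean_step_deviation(mids, freqs, assumed_mean, h):
    """Step-deviation method: a + (sum(f * d') / N) * h, with d' = (X - a) / h."""
    total = sum(freqs)
    sum_fd = sum(f * (x - assumed_mean) / h for x, f in zip(mids, freqs))
    return assumed_mean + sum_fd / total * h


classes = [(10, 20), (20, 40), (40, 70), (70, 90), (90, 100)]
freqs = [15, 20, 30, 20, 15]
mids = midpoints(classes)  # [15.0, 30.0, 55.0, 80.0, 95.0]
print(grouped_mean_direct(mids, freqs))                 # 55.0
print(grouped_mean_step_deviation(mids, freqs, 55, 5))  # 55.0
```

Because the class sizes here are unequal, h = 5 is simply a common factor of all deviations from the assumed mean, as the note above describes.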

Weighted Arithmetic Mean

The simple arithmetic mean assumes that all the items in the distribution are equally important. In practice this may not be so: some items are more important than others. When weights are assigned to the individual items according to their relative importance or priority, the arithmetic mean calculated with these weights is called the weighted arithmetic mean.

The weighted arithmetic mean is given by

$\bar{X}_w = \dfrac{\sum wX}{\sum w}$

where X = value of the variable (for example, a rate or per-unit value) and w = given weight, proportion or frequency.
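A minimal sketch of the weighted mean formula follows. The marks and credit-hour weights are made-up illustrative numbers, not data from these notes.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w * X) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


marks = [80, 70, 90]  # hypothetical marks in three subjects
credits = [2, 3, 5]   # hypothetical weights (credit hours)
print(weighted_mean(marks, credits))  # 82.0
print(sum(marks) / len(marks))        # 80.0, the simple mean ignores the weights
```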

 

Combined mean or Mean of combined Series

For two groups or two series

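Combined mean, $\bar{X}_{12} = \dfrac{n_1\bar{X}_1 + n_2\bar{X}_2}{n_1 + n_2}$

For three groups or three series

Combined mean, $\bar{X}_{123} = \dfrac{n_1\bar{X}_1 + n_2\bar{X}_2 + n_3\bar{X}_3}{n_1 + n_2 + n_3}$

where

$n_1$ = size of first group

$n_2$ = size of second group

$n_3$ = size of third group

$\bar{X}_1$ = mean of first group

$\bar{X}_2$ = mean of second group

$\bar{X}_3$ = mean of third group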

Corrected mean

Correct mean, $\bar{X} = \dfrac{\text{Correct } \sum X}{n}$

where

Incorrect ∑X = n × incorrect mean

Correct ∑X = Incorrect ∑X − incorrect items + correct items
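The correction steps can be sketched as follows. The function name and the numbers (10 observations, a reported mean of 50, one value 35 misread as 53) are hypothetical, and the sketch assumes the number of observations n itself is correct.

```python
def corrected_mean(n, incorrect_mean, wrong_items, right_items):
    """Recover the correct mean when some items were recorded wrongly."""
    incorrect_sum = n * incorrect_mean
    correct_sum = incorrect_sum - sum(wrong_items) + sum(right_items)
    return correct_sum / n


# Hypothetical case: 35 was wrongly recorded as 53.
print(corrected_mean(10, 50, wrong_items=[53], right_items=[35]))  # 48.2
```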

Median or Positional average (Md)

The variate value which divides the total number of observations into two equal parts is called the median. It is denoted by Md.

  The median is a suitable measure of central tendency (average) for qualitative characteristics such as knowledge, intelligence, beauty, honesty and talent, or for attributes such as good, bad and defective.

  It is also the more appropriate average for open-ended classified data.

Note (for continuous series):

(i) The classes should be of the exclusive type.

(ii) For calculating the median, the class sizes need not be equal.

Calculation of median depends upon the given series

  For individual series

First, arrange the given observations in ascending order of magnitude.

Median (Md) = value of the $\left(\dfrac{n+1}{2}\right)$th item

where n = number of observations.

  For discrete series

  First, arrange the given data in ascending order of magnitude.

  Obtain the less-than cumulative frequency (c.f.).

Median (Md) = value of the $\left(\dfrac{N+1}{2}\right)$th item

where N = ∑f = total frequency.

  For continuous series

  Prepare the less-than cumulative frequency distribution.

  Find $\dfrac{N}{2}$.

  Locate the cumulative frequency equal to or just greater than $\dfrac{N}{2}$ and note the corresponding class.

  That class contains the median and is called the median class.

  $Md = L + \dfrac{\frac{N}{2} - c.f.}{f} \times h$

 where

        N = ∑f = total frequency

        L = lower limit of the median class

        f = frequency of the median class

        h = width (class size) of the median class

        c.f. = less-than cumulative frequency of the class preceding the median class
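For the discrete-series rule, "value of the ((N + 1)/2)th item" means the first X whose less-than cumulative frequency reaches that position. A small sketch of this lookup is given below; the helper names are illustrative, and the data are the telephone-call frequencies of Example 3 in these notes.

```python
def discrete_item_value(values, freqs, position):
    """Return the X whose less-than c.f. is the first to reach `position`."""
    cumulative = 0
    for x, f in sorted(zip(values, freqs)):
        cumulative += f
        if cumulative >= position:
            return x
    raise ValueError("position exceeds the total frequency")


def discrete_median(values, freqs):
    """Median of a discrete series: value of the ((N + 1) / 2)th item."""
    total = sum(freqs)
    return discrete_item_value(values, freqs, (total + 1) / 2)


calls = [0, 1, 2, 3, 4, 5, 6]
freqs = [15, 22, 28, 35, 42, 34, 24]
print(discrete_median(calls, freqs))  # 4  (c.f. 142 is the first to reach 100.5)
```

The same lookup serves quartiles, deciles and percentiles of a discrete series when a different position is passed in.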


 

Mode or Modal value or Most repeated value or most usual value (Mo)

The mode is the variate value which repeats the maximum number of times.

It is used to find the most common size of items such as pen drives, shoes, T-shirts and other ready-made garments.

Calculation of mode

The mode for various distributions is given below.

For individual series:

Mode = value of the variable X which repeats the maximum number of times

For discrete series

Case I: If the distribution is regular and unimodal (i.e. only one maximum frequency),

Mode = value of the variable X corresponding to the maximum frequency

Case II: When the distribution is regular and bimodal or multimodal, the mode can be determined by using the empirical relation

$Mo = 3Md - 2\bar{X}$

Case III: When the distribution is irregular, the mode is determined by the grouping method.

For continuous series (grouped frequency distribution)

Case I: If the distribution is regular and unimodal, the mode is calculated by the following formula.

$Mo = L + \dfrac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h$

where

$f_1$ = maximum frequency (frequency of the modal class)

$f_0$ = frequency of the class preceding the modal class

$f_2$ = frequency of the class following the modal class

L = lower limit of the modal class

h = class size (width) of the modal class

Case II: When the distribution is regular and bimodal or multimodal, the mode can be determined by using the empirical relation

$Mo = 3Md - 2\bar{X}$

Case III: When the distribution is irregular, the mode is determined by the grouping method.

Note: Case III is beyond our syllabus.

Note: For the mode, the class sizes must be equal and the class intervals must be exclusive.
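The Case I formula for a continuous series can be sketched as below. The function name is illustrative; the sketch assumes a regular, unimodal distribution with equal class width and exclusive lower limits, and it treats a missing neighbouring class as frequency 0. The data are the adjusted classes of Example 6 in these notes.

```python
def grouped_mode(lower_limits, freqs, h):
    """Mode = L + (f1 - f0) / (2*f1 - f0 - f2) * h for the modal class."""
    i = freqs.index(max(freqs))                     # position of the modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0               # preceding class frequency
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # following class frequency
    return lower_limits[i] + (f1 - f0) / (2 * f1 - f0 - f2) * h


lower_limits = [3.795, 3.895, 3.995, 4.095, 4.195, 4.295, 4.395, 4.495]
freqs = [3, 8, 14, 19, 28, 18, 10, 8]
print(round(grouped_mode(lower_limits, freqs, 0.1), 2))  # 4.24
```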

Note: To construct class intervals if mid-values are given

If only the mid-values of the distribution are given, the class intervals must be constructed first.

Class size (h) = difference between two successive mid-values

Subtract $\dfrac{h}{2}$ from the first mid-value to get the lower limit of the first class interval, and add $\dfrac{h}{2}$ to the same mid-value to get its upper limit. The other class intervals are constructed in the same way.

The Partition Values

The values which divide the total number of observations into a number of equal parts are called partition values. Thus, median may also be regarded as a particular partition value because it divides the given data into two equal parts.

Depending on the number of equal parts, the important partition values are

  Quartiles (four equal parts)

  Deciles (ten equal parts)

  Percentiles (hundred equal parts)

Note:

(i)    For all series, first arrange the given data in ascending order of magnitude.

(ii)   For all partition values, the class sizes need not be equal, but the classes must be exclusive.

(iii)  Q1 = P25,   Md = Q2 = D5 = P50,   Q3 = P75

Quartiles

Individual series

After arranging the given data in ascending order of magnitude, the quartiles are obtained from

$Q_i$ = value of the $\left(\dfrac{i(n+1)}{4}\right)$th item

where i = 1, 2, 3 and n = number of observations.

Discrete series

$Q_i$ = value of the $\left(\dfrac{i(N+1)}{4}\right)$th item

where N = ∑f = total frequency and i = 1, 2, 3.

Continuous series or grouped frequency distribution

$Q_i = L + \dfrac{\frac{iN}{4} - c.f.}{f} \times h$

where i = 1, 2, 3

$\dfrac{iN}{4}$ = the size used to locate the ith quartile's class

L = lower limit of the ith quartile's class

f = frequency of the ith quartile's class

h = class size (width) of the ith quartile's class

c.f. = cumulative frequency of the class preceding the ith quartile's class
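For an individual series, a fractional position such as the 2.25th item is read as the 2nd item plus 0.25 of the gap to the 3rd item. The sketch below applies this positional rule to quartiles, deciles and percentiles alike; the function names are illustrative, and the data are those of Example 2 in these notes.

```python
def item_at(sorted_values, position):
    """Value of the `position`-th item (1-based), interpolating linearly."""
    if position < 1 or position > len(sorted_values):
        raise ValueError("position falls outside the data")
    whole = int(position)
    frac = position - whole
    value = sorted_values[whole - 1]
    if frac > 0:
        value += frac * (sorted_values[whole] - value)
    return value


def partition_value(values, i, parts):
    """i-th partition value: parts=4 quartiles, 10 deciles, 100 percentiles."""
    data = sorted(values)
    return item_at(data, i * (len(data) + 1) / parts)


data = [8, 6, 5, 4, 10, 15, 3, 16]
q1 = partition_value(data, 1, 4)
d3 = partition_value(data, 3, 10)
p65 = partition_value(data, 65, 100)
print(round(q1, 2), round(d3, 2), round(p65, 2))  # 4.25 4.7 9.7
```

Very low or very high partition values (for instance P1 with only eight observations) fall outside the data under this rule, which is why the sketch raises an error in that case.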

Deciles:

Individual series

After arranging the given data in ascending order of magnitude,

$D_j$ = value of the $\left(\dfrac{j(n+1)}{10}\right)$th item

where j = 1, 2, 3, ..., 9 and n = number of observations.

Discrete series

$D_j$ = value of the $\left(\dfrac{j(N+1)}{10}\right)$th item

where N = ∑f = total frequency and j = 1, 2, 3, ..., 9.

Continuous series or grouped frequency distribution

$D_j = L + \dfrac{\frac{jN}{10} - c.f.}{f} \times h$

where j = 1, 2, 3, ..., 9

$\dfrac{jN}{10}$ = the size used to locate the jth decile's class

L = lower limit of the jth decile's class

f = frequency of the jth decile's class

h = class size (width) of the jth decile's class

c.f. = cumulative frequency of the class preceding the jth decile's class

Percentiles:

The variate values which divide the total number of observations into 100 equal parts are called percentiles.

Case I: To find the highest (maximum) value for a given percentage at the bottom, such as the failed students, the lowest earners, the poorest or the shortest.

For example, the highest income of the poorest 40% of the people is given by the 40th percentile, P40.

Case II: To find the limits (range) of a middle percentage.

For example, the limits of income of the middle 50% of families are given by the 25th and 75th percentiles, P25 and P75.

Case III: To find the lowest (minimum) value for a given percentage at the top, such as the passed students, the richest, the highest earners, the longest or the tallest.

For example, the lowest income of the richest 40% of the people is given by the 60th percentile, P60.

 

       

 

Individual series

After arranging the given data in ascending order of magnitude,

$P_k$ = value of the $\left(\dfrac{k(n+1)}{100}\right)$th item

where k = 1, 2, 3, ..., 99 and n = number of observations.

Discrete series

$P_k$ = value of the $\left(\dfrac{k(N+1)}{100}\right)$th item

where N = ∑f = total frequency and k = 1, 2, 3, ..., 99.

Continuous series or grouped frequency distribution

$P_k = L + \dfrac{\frac{kN}{100} - c.f.}{f} \times h$

where k = 1, 2, 3, ..., 99

$\dfrac{kN}{100}$ = the size used to locate the kth percentile's class

L = lower limit of the kth percentile's class

f = frequency of the kth percentile's class

h = class size (width) of the kth percentile's class

c.f. = cumulative frequency of the class preceding the kth percentile's class

Note: Q1 = P25,   Md = Q2 = D5 = P50,   Q3 = P75
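The grouped-data formulas for the median, quartiles, deciles and percentiles share one pattern: locate the class whose cumulative frequency first reaches iN/parts, then interpolate inside it. A sketch of that shared pattern is given below; the function name is illustrative, exclusive classes are assumed, and the data are the marks distribution of Example 8 in these notes.

```python
def grouped_partition_value(lower_limits, widths, freqs, i, parts):
    """L + (i*N/parts - c.f.) / f * h for the class containing the target size."""
    total = sum(freqs)
    target = i * total / parts
    cumulative = 0  # less-than c.f. of the classes before the current one
    for lower, h, f in zip(lower_limits, widths, freqs):
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * h
        cumulative += f
    raise ValueError("target size exceeds the total frequency")


lower_limits = [10, 20, 40, 70, 90]
widths = [10, 20, 30, 20, 10]  # unequal class sizes are allowed
freqs = [15, 20, 30, 20, 15]

p25 = grouped_partition_value(lower_limits, widths, freqs, 25, 100)
p75 = grouped_partition_value(lower_limits, widths, freqs, 75, 100)
median = grouped_partition_value(lower_limits, widths, freqs, 1, 2)
print(round(p25, 2), round(median, 2), round(p75, 2))  # 30.0 55.0 80.0
```

Passing parts = 2, 4, 10 or 100 gives the median, quartiles, deciles or percentiles respectively, which mirrors the identities in the note above.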

 

Measure of Variation (Measures of Dispersion)

The variability or scatter of the items about a central value is called dispersion, and a quantity that measures it is called a measure of dispersion or measure of variation.

Thus, measures of dispersion are descriptive statistical measures used to quantify the variation, spread, scatter or deviation of the data from the central value. They give an idea of the homogeneity or heterogeneity of the distribution.

 Measures of Dispersion

The various measures of dispersion are as follows.

1.  Range

2.  Quartile deviation or Semi-interquartile range 

3.  Mean deviation or Average deviation.

4.  Standard deviation 

5.  Lorenz curve

6.  Gini's coefficient

Note: Mean deviation (average deviation), the Lorenz curve and Gini's coefficient are beyond our syllabus.

Range

Range is the simplest of all the measures of dispersion. It is defined as the difference between largest (maximum) value and smallest (minimum) value for the given observations of the distribution.  

For all series

 Range (R) = L − S

   where L = largest item (observation) and S = smallest item (observation).

The corresponding relative measure of dispersion is known as the coefficient of range and is given by

  Coefficient of range = $\dfrac{L - S}{L + S}$

Quartile Deviation or Semi-interquartile Range (Q.D.)

Quartile deviation is a measure of dispersion based on the upper quartile Q3 and the lower quartile Q1. The difference between the upper quartile Q3 and the lower quartile Q1 is known as the inter-quartile range.

Inter-quartile range = Q3 − Q1

Half of the inter-quartile range is called the semi-interquartile range, which is also known as the quartile deviation.

 Quartile Deviation (Q.D.) = $\dfrac{Q_3 - Q_1}{2}$

Coefficient of Q.D. = $\dfrac{Q_3 - Q_1}{Q_3 + Q_1}$

Note:

Quartile deviation (Q.D.) is the most suitable measure of dispersion for open-end classes.

A smaller coefficient of Q.D. implies more uniformity (less variability).

A greater coefficient of Q.D. implies less uniformity (greater variability).

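For individual series

Quartile deviation (Q.D.) = $\dfrac{Q_3 - Q_1}{2}$

where $Q_1$ = value of the $\left(\dfrac{n+1}{4}\right)$th item, $Q_3$ = value of the $\left(\dfrac{3(n+1)}{4}\right)$th item, and n = number of observations.

For discrete series

Quartile deviation (Q.D.) = $\dfrac{Q_3 - Q_1}{2}$

where $Q_1$ = value of the $\left(\dfrac{N+1}{4}\right)$th item, $Q_3$ = value of the $\left(\dfrac{3(N+1)}{4}\right)$th item, and N = ∑f = total frequency.

For continuous series

Quartile deviation (Q.D.) = $\dfrac{Q_3 - Q_1}{2}$

where

$Q_1 = L + \dfrac{\frac{N}{4} - c.f.}{f} \times h$

$Q_3 = L + \dfrac{\frac{3N}{4} - c.f.}{f} \times h$

N = ∑f = total frequency

Its relative measure is known as the coefficient of quartile deviation and is given by

Coefficient of quartile deviation = $\dfrac{Q_3 - Q_1}{Q_3 + Q_1}$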
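Once Q1 and Q3 are known, the quartile deviation and its coefficient follow directly. The short sketch below uses illustrative function names and the quartiles Q1 = 41.5 and Q3 = 53.5 obtained in the PU 2018 problem later in these notes.

```python
def quartile_deviation(q1, q3):
    """Semi-interquartile range: (Q3 - Q1) / 2."""
    return (q3 - q1) / 2


def coefficient_of_qd(q1, q3):
    """Relative measure: (Q3 - Q1) / (Q3 + Q1)."""
    return (q3 - q1) / (q3 + q1)


print(quartile_deviation(41.5, 53.5))           # 6.0
print(round(coefficient_of_qd(41.5, 53.5), 4))  # 0.1263
```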

 

Standard Deviation: 

Standard deviation is defined as “the positive square root of the arithmetic mean of the squares of the deviations of the given set of observations from their arithmetic mean.” It is usually denoted by the Greek letter σ (sigma).

Standard deviation is regarded as the best (ideal) measure of dispersion because it satisfies almost all the requisites of a good measure of dispersion.

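For Individual Series

(i)    S.D. ($\sigma$) = $\sqrt{\dfrac{\sum X^2}{n} - \left(\dfrac{\sum X}{n}\right)^2}$ - direct method

(ii)  S.D. ($\sigma$) = $\sqrt{\dfrac{\sum d^2}{n} - \left(\dfrac{\sum d}{n}\right)^2}$ - short-cut method

where d = X − a, a = assumed mean, and n = number of observations.

For Discrete Series

(i)    S.D. ($\sigma$) = $\sqrt{\dfrac{\sum fX^2}{N} - \left(\dfrac{\sum fX}{N}\right)^2}$ - direct method

(ii)  S.D. ($\sigma$) = $\sqrt{\dfrac{\sum fd^2}{N} - \left(\dfrac{\sum fd}{N}\right)^2}$ - short-cut method

where d = X − a, a = assumed mean, and N = ∑f = total frequency.

For Continuous Series

(i)     S.D. ($\sigma$) = $\sqrt{\dfrac{\sum fX^2}{N} - \left(\dfrac{\sum fX}{N}\right)^2}$ - direct method

(ii)    S.D. ($\sigma$) = $\sqrt{\dfrac{\sum fd^2}{N} - \left(\dfrac{\sum fd}{N}\right)^2}$ - short-cut method

(iii)  S.D. ($\sigma$) = $h \times \sqrt{\dfrac{\sum fd'^2}{N} - \left(\dfrac{\sum fd'}{N}\right)^2}$ - step-deviation method

where d = X − a, $d' = \dfrac{X - a}{h}$, X = mid-value, a = assumed mean, N = ∑f = total frequency, and h = class size (class width).

Note: For unequal class sizes, h is taken as a common factor.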
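The direct and short-cut formulas for an individual series give the same standard deviation, because shifting the origin does not change the spread. The sketch below illustrates this; the function names are illustrative, the assumed mean 79 is an arbitrary choice, and the data are the reactor temperatures from the T.U. 2017 problem in these notes.

```python
import math


def sd_direct(values):
    """Population S.D.: sqrt(sum(X^2)/n - (sum(X)/n)^2)."""
    n = len(values)
    return math.sqrt(sum(x * x for x in values) / n - (sum(values) / n) ** 2)


def sd_short_cut(values, assumed_mean):
    """Same formula applied to the deviations d = X - a."""
    return sd_direct([x - assumed_mean for x in values])


temps = [78.1, 79.2, 78.9, 80.2, 78.3, 78.8, 79.4]
print(round(sd_direct(temps), 4))         # 0.6534
print(round(sd_short_cut(temps, 79), 4))  # 0.6534
```

On a computer the short-cut form is also numerically gentler, since it subtracts two small numbers instead of two large, nearly equal ones.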

 

Variance

The square of the standard deviation is known as the variance. It is denoted by $\sigma^2$ and given by

$\sigma^2 = V(X)$, where V(X) = variance of the variable X

$\Rightarrow \sigma = \sqrt{V(X)}$

Coefficient of Variation (C.V.)

The coefficient of variation is 100 times the coefficient of standard deviation. In other words, the coefficient of standard deviation expressed as a percentage is known as the coefficient of variation. Symbolically,

C.V. = $\dfrac{\sigma}{\bar{X}} \times 100\%$

It is a relative measure of dispersion, so it is independent of the units of measurement and is always expressed as a percentage. C.V. is therefore well suited to comparing two or more distributions with regard to their variability, consistency, uniformity, homogeneity, equitability, stability, etc.

The coefficient of variation (C.V.) is used to compare the variability of two or more distributions (series) as follows:

Less C.V. is considered as | More C.V. is considered as
More consistent | Less consistent
More homogeneous | Less homogeneous
More uniform | Less uniform
More stable | Less stable
More representative of the mean | Less representative of the mean
More equitable | Less equitable
Less variable | More variable
Less disparity | More disparity
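A comparison by C.V. might look like the sketch below. The two score lists are made-up numbers chosen so that both series have the same mean, and the function name is illustrative; the population standard deviation σ is used, as in the formula above.

```python
import statistics


def coefficient_of_variation(values):
    """C.V. = (population S.D. / mean) * 100, in percent."""
    return statistics.pstdev(values) / statistics.fmean(values) * 100


series_a = [50, 52, 48, 51, 49]  # hypothetical scores, mean 50
series_b = [30, 70, 45, 60, 45]  # hypothetical scores, mean 50

cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)
print(round(cv_a, 2), round(cv_b, 2))  # 2.83 27.57
print("A is more consistent" if cv_a < cv_b else "B is more consistent")
```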

 

Sample standard deviation (s):  

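A standard deviation which is based on sample observations is called the sample standard deviation. It is denoted by ‘s’.

(i)      $s = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n - 1}}$

(ii)     $s = \sqrt{\dfrac{\sum X^2 - n\bar{X}^2}{n - 1}}$

(iii)    $s = \sqrt{\dfrac{1}{n - 1}\left[\sum X^2 - \dfrac{(\sum X)^2}{n}\right]}$

Sample coefficient of variation (C.V.) = $\dfrac{s}{\bar{X}} \times 100\%$

 Sample variance ($s^2$):

 The square of the sample standard deviation is called the sample variance. It is denoted by $s^2$.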
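The sample standard deviation differs from σ only in dividing by n − 1 instead of n. The sketch below, with an illustrative function name, computes it from the definition and compares it with Python's built-in `statistics.stdev`, which uses the same n − 1 divisor. The data are the reactor temperatures from the T.U. 2017 problem in these notes.

```python
import math
import statistics


def sample_sd(values):
    """s = sqrt(sum((X - mean)^2) / (n - 1))."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


temps = [78.1, 79.2, 78.9, 80.2, 78.3, 78.8, 79.4]
print(round(sample_sd(temps), 4))          # agrees with statistics.stdev(temps)
print(round(statistics.pstdev(temps), 4))  # population S.D. is slightly smaller
```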

 

Combined Standard deviation:

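For two groups (two series)

$\sigma_{12} = \sqrt{\dfrac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2}}$

where $d_1 = \bar{X}_1 - \bar{X}_{12}$ and $d_2 = \bar{X}_2 - \bar{X}_{12}$, with $\bar{X}_{12}$ the combined mean of the two groups.

For three groups (three series)

Combined standard deviation is

$\sigma_{123} = \sqrt{\dfrac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2) + n_3(\sigma_3^2 + d_3^2)}{n_1 + n_2 + n_3}}$

where $d_1 = \bar{X}_1 - \bar{X}_{123}$, $d_2 = \bar{X}_2 - \bar{X}_{123}$ and $d_3 = \bar{X}_3 - \bar{X}_{123}$, with $\bar{X}_{123}$ the combined mean of the three groups.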
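The combined standard deviation formula can be checked by pooling the raw data of the groups and computing σ directly. The sketch below does this for two small made-up groups; the function name is illustrative, and the group standard deviations are population values (divisor n), as the formula assumes.

```python
import math
import statistics


def combined_sd(sizes, means, sds):
    """sqrt(sum(n_i * (sd_i^2 + d_i^2)) / sum(n_i)), with d_i = mean_i - combined mean."""
    total = sum(sizes)
    combined_mean = sum(n * m for n, m in zip(sizes, means)) / total
    pooled = sum(n * (s ** 2 + (m - combined_mean) ** 2)
                 for n, m, s in zip(sizes, means, sds))
    return math.sqrt(pooled / total)


group1 = [2, 4, 6]          # hypothetical data
group2 = [10, 12, 14, 16]   # hypothetical data
groups = [group1, group2]

result = combined_sd([len(g) for g in groups],
                     [statistics.fmean(g) for g in groups],
                     [statistics.pstdev(g) for g in groups])
print(round(result, 4))
print(round(statistics.pstdev(group1 + group2), 4))  # same value from the pooled data
```

The same function handles three or more groups, since the formula simply gains one term per group.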

                             

    Five-Number Summary

The five-number summary gives five descriptive measures of a data set: the smallest value (X smallest), the first or lower quartile (Q1), the median (Md or Q2), the third or upper quartile (Q3) and the largest value (X largest). The five-number summary is therefore

(X smallest,   Q1,   Median,   Q3,   X largest)

 

The Box-and-Whisker Plot

A five-number summary can be represented in a diagram known as a box-and-whisker plot. A box-and-whisker plot is thus a graphical representation of the data based on the five-number summary, that is, the smallest value, Q1, Md, Q3 and the largest value. It is a graphical method of assessing the skewness of the distribution.

The vertical line at the left side of the box marks the location of Q1 and the vertical line at the right side of the box marks the location of Q3. Thus, the box contains the middle 50% of the values. The lower 25% of the data are represented by a line (known as a whisker) connecting the left side of the box to the smallest value, X smallest. Similarly, the upper 25% of the data are represented by a whisker connecting the right side of the box to X largest.

 

Comparison of left-skewed, right-skewed and symmetric distributions

1. The distance from X smallest to the median versus the distance from the median to X largest

Left skewed: the distance from X smallest to the median is greater than the distance from the median to X largest.

Right skewed: the distance from X smallest to the median is less than the distance from the median to X largest.

Symmetric: both distances are the same.

2. The distance from X smallest to Q1 versus the distance from Q3 to X largest

Left skewed: the distance from X smallest to Q1 is greater than the distance from Q3 to X largest.

Right skewed: the distance from X smallest to Q1 is less than the distance from Q3 to X largest.

Symmetric: both distances are the same.

3. The distance from Q1 to the median versus the distance from the median to Q3

Left skewed: the distance from Q1 to the median is greater than the distance from the median to Q3.

Right skewed: the distance from Q1 to the median is less than the distance from the median to Q3.

Symmetric: both distances are the same.
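The five-number summary and the first comparison above can be sketched in a few lines. The helper names are illustrative, the quartiles use the positional (n + 1) rule of these notes, at least three observations are assumed, and only comparison 1 is used to name the direction of skewness. The data are the ten values from the PU 2018 problem in these notes.

```python
def item_at(sorted_values, position):
    """Value of the `position`-th item (1-based), interpolating linearly."""
    whole = int(position)
    frac = position - whole
    value = sorted_values[whole - 1]
    if frac > 0:
        value += frac * (sorted_values[whole] - value)
    return value


def five_number_summary(values):
    """(smallest, Q1, median, Q3, largest) using the (n + 1) positional rule."""
    data = sorted(values)
    n = len(data)
    if n < 3:
        raise ValueError("need at least three observations")
    return (data[0], item_at(data, (n + 1) / 4), item_at(data, (n + 1) / 2),
            item_at(data, 3 * (n + 1) / 4), data[-1])


def skew_direction(summary):
    """Compare smallest-to-median with median-to-largest (comparison 1 only)."""
    smallest, _, median, _, largest = summary
    left, right = median - smallest, largest - median
    if left > right:
        return "left-skewed"
    if left < right:
        return "right-skewed"
    return "symmetric"


accounts = [43, 37, 50, 51, 58, 105, 52, 45, 45, 10]
summary = five_number_summary(accounts)
print(summary)                  # (10, 41.5, 47.5, 53.5, 105)
print(skew_direction(summary))  # right-skewed
```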

 

 

 

 

 

 

 

Numerical problems

Example1: Compute mean, median and mode of the following data

55, 39, 45, 55, 41, 35, 60, 40, 55, 35, 37, 55, 55, 65

Solution:

Arranging given data in ascending order:

X: 35, 35, 37, 39, 40, 41, 45, 55, 55, 55, 55, 55, 60, 65

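Mean, $\bar{X} = \dfrac{\sum X}{n} = \dfrac{672}{14} = 48$

Median (Md) = value of the $\left(\dfrac{n+1}{2}\right)$th item

= value of the $\left(\dfrac{14+1}{2}\right)$th item

= value of the 7.5th item

= $\dfrac{\text{7th item} + \text{8th item}}{2} = \dfrac{45 + 55}{2}$

= 50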

Mode (Mo) = Value of variable X which repeats maximum number of times

                     = 55

 

Example 2: Find Q1, D3 and P65 from the given data: 8, 6, 5, 4, 10, 15, 3, 16

Solution

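Here, the number of observations is n = 8.

First, the data are arranged in ascending order: 3, 4, 5, 6, 8, 10, 15, 16.

Q1 = value of the $\left(\dfrac{n+1}{4}\right)$th item

= value of the $\left(\dfrac{8+1}{4}\right)$th item

= value of the 2.25th item

= 2nd item + 0.25 (3rd item − 2nd item)

= 4 + 0.25 (5 − 4) = 4.25

D3 = value of the $\left(\dfrac{3(n+1)}{10}\right)$th item

= value of the $\left(\dfrac{3(8+1)}{10}\right)$th item

= value of the 2.7th item

= 2nd item + 0.7 (3rd item − 2nd item)

= 4 + 0.7 (5 − 4) = 4.7

P65 = value of the $\left(\dfrac{65(n+1)}{100}\right)$th item

= value of the $\left(\dfrac{65(8+1)}{100}\right)$th item

= value of the 5.85th item

= 5th item + 0.85 (6th item − 5th item)

= 8 + 0.85 (10 − 8) = 9.7

Hence Q1 = 4.25, D3 = 4.7 and P65 = 9.7.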

 Example 3: The number of telephone calls received at an exchange for 200 successive one-minute intervals are given below.

No. of calls | 0 | 1 | 2 | 3 | 4 | 5 | 6 | Total
Frequency | 15 | 22 | 28 | 35 | 42 | 34 | 24 | 200

Compute the mean, median and mode.

Solution:

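No. of calls (X) | Frequency (f) | Less than c.f. | fX
0 | 15 | 15 | 0
1 | 22 | 37 | 22
2 | 28 | 65 | 56
3 | 35 | 100 | 105
4 | 42 | 142 | 168
5 | 34 | 176 | 170
6 | 24 | 200 | 144
Total | N = ∑f = 200 | | ∑fX = 665

Mean, $\bar{X} = \dfrac{\sum fX}{N} = \dfrac{665}{200} = 3.325$

Median (Md) = value of the $\left(\dfrac{N+1}{2}\right)$th item

= value of the $\left(\dfrac{200+1}{2}\right)$th item

= value of the 100.5th item

The c.f. just greater than 100.5 is 142, which corresponds to X = 4.

So, Md = 4

Mode (Mo) = value of the variable X corresponding to the maximum frequency (42)

= 4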

 

Example 4: Find upper quartile and upper decile from the given data. Also obtain P77.

X | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11
f | 2 | 5 | 8 | 10 | 12 | 8 | 6 | 4 | 3 | 2 | 1

Solution

Calculation of partition values

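X | f | Less than c.f.
1 | 2 | 2
2 | 5 | 7
3 | 8 | 15
4 | 10 | 25
5 | 12 | 37
6 | 8 | 45
7 | 6 | 51
8 | 4 | 55
9 | 3 | 58
10 | 2 | 60
11 | 1 | 61
Total | N = 61 |

For Q3:

Q3 = value of the $\left(\dfrac{3(N+1)}{4}\right)$th item

= value of the $\left(\dfrac{3(61+1)}{4}\right)$th item

= value of the 46.5th item

The c.f. just greater than 46.5 is 51.

So, the upper quartile Q3 = 7.

For D9:

D9 = value of the $\left(\dfrac{9(N+1)}{10}\right)$th item

= value of the $\left(\dfrac{9(61+1)}{10}\right)$th item

= value of the 55.8th item

The c.f. just greater than 55.8 is 58.

So, the upper decile D9 = 9.

For P77:

P77 = value of the $\left(\dfrac{77(N+1)}{100}\right)$th item

= value of the $\left(\dfrac{77(61+1)}{100}\right)$th item

= value of the 47.74th item

The c.f. just greater than 47.74 is 51.

So, P77 = 7.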

 

Example 5: The lengths of power failures, in minutes, are recorded in the following table.

Power failure time | 22 | 23 | 24 | 25 | 26 | 27 | 28 | Total
Frequency | 2 | 5 | 7 | 10 | 4 | 3 | 2 | 33

Find Q3, D2 and P40 and interpret the results.

Solution:

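Power failure time (X) | Frequency (f) | Less than c.f.
22 | 2 | 2
23 | 5 | 7
24 | 7 | 14
25 | 10 | 24
26 | 4 | 28
27 | 3 | 31
28 | 2 | 33
Total | N = 33 |

For Q3:

Q3 = value of the $\left(\dfrac{3(N+1)}{4}\right)$th item

= value of the $\left(\dfrac{3(33+1)}{4}\right)$th item

= value of the 25.5th item

The c.f. just greater than 25.5 is 28.

So, the upper quartile Q3 = 26 minutes.

For D2:

D2 = value of the $\left(\dfrac{2(N+1)}{10}\right)$th item

= value of the $\left(\dfrac{2(33+1)}{10}\right)$th item

= value of the 6.8th item

The c.f. just greater than 6.8 is 7.

So, D2 = 23 minutes.

For P40:

P40 = value of the $\left(\dfrac{40(N+1)}{100}\right)$th item

= value of the $\left(\dfrac{40(33+1)}{100}\right)$th item

= value of the 13.6th item

The c.f. just greater than 13.6 is 14.

So, P40 = 24 minutes.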

Example 6: The lengths in meters of 100 VGA cables used in a company are measured to the nearest 0.01 meter and the results are given below.

Length in meter | Frequency
3.80-3.89 | 3
3.90-3.99 | 8
4.00-4.09 | 14
4.10-4.19 | 19
4.20-4.29 | 28
4.30-4.39 | 18
4.40-4.49 | 10
4.50-4.59 | 8

Find the value of mean, mode and median.

Solution:

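Correction factor = $\dfrac{\text{lower limit of second class} - \text{upper limit of first class}}{2} = \dfrac{3.90 - 3.89}{2} = 0.005$

Length in meter | Frequency (f) | Less than c.f. | Mid-value (X) | fX
3.80-3.89 | 3 | 3 | 3.845 | 11.535
3.90-3.99 | 8 | 11 | 3.945 | 31.56
4.00-4.09 | 14 | 25 | 4.045 | 56.63
4.10-4.19 | 19 | 44 | 4.145 | 78.755
4.20-4.29 | 28 | 72 | 4.245 | 118.86
4.30-4.39 | 18 | 90 | 4.345 | 78.21
4.40-4.49 | 10 | 100 | 4.445 | 44.45
4.50-4.59 | 8 | 108 | 4.545 | 36.36
Total | N = ∑f = 108 | | | ∑fX = 456.36

Mean, $\bar{X} = \dfrac{\sum fX}{N} = \dfrac{456.36}{108} \approx 4.226$ meters

For mode:

The given frequency distribution is regular and unimodal, and the maximum frequency is 28. So, the modal class is 4.20-4.29, whose exclusive class is 4.195-4.295.

L = 4.195, h = 0.1, f1 = 28, f0 = 19, f2 = 18

Mode (Mo) = $L + \dfrac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h$

= $4.195 + \dfrac{28 - 19}{2 \times 28 - 19 - 18} \times 0.1$

= 4.24 meters

For median (Md):

$\dfrac{N}{2} = \dfrac{108}{2} = 54$. The c.f. just greater than 54 is 72.

∴ The median class is 4.20-4.29, whose exclusive class is 4.195-4.295.

L = 4.195, f = 28, h = 0.1, c.f. = 44

Median (Md) = $L + \dfrac{\frac{N}{2} - c.f.}{f} \times h$

= $4.195 + \dfrac{54 - 44}{28} \times 0.1$

= 4.23 meters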

Example 7:  The percentage age distribution of urban male population of Nepal from 2011 census is given below:

Age group | Male population (%)
0-4 | 11.8
5-9 | 12.9
10-14 | 12.5
15-19 | 11.2
20-24 | 10.7
25-29 | 8.9
30-34 | 7.2
35-39 | 6.2
40-44 | 4.7
45-49 | 4.0
50-54 | 2.9
55-59 | 2.3
60 and above | 4.7

Compute the first and third quartiles, 8th decile and 70th percentile.

Solution: 

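Correction factor = $\dfrac{\text{lower limit of second class} - \text{upper limit of first class}}{2} = \dfrac{5 - 4}{2} = 0.5$

Age group | Male population (f) | Less than c.f.
0-4 | 11.8 | 11.8
5-9 | 12.9 | 24.7
10-14 | 12.5 | 37.2
15-19 | 11.2 | 48.4
20-24 | 10.7 | 59.1
25-29 | 8.9 | 68.0
30-34 | 7.2 | 75.2
35-39 | 6.2 | 81.4
40-44 | 4.7 | 86.1
45-49 | 4.0 | 90.1
50-54 | 2.9 | 93.0
55-59 | 2.3 | 95.3
60 & above | 4.7 | 100
Total | N = ∑f = 100 |

For the lower or first quartile (Q1):

$\dfrac{N}{4} = \dfrac{100}{4} = 25$. The c.f. just greater than 25 is 37.2.

Q1 lies in class 10-14, whose exclusive (adjusted) class is 9.5-14.5.

L = 9.5, f = 12.5, c.f. = 24.7, h = 5

$Q_1 = L + \dfrac{\frac{N}{4} - c.f.}{f} \times h$

= $9.5 + \dfrac{25 - 24.7}{12.5} \times 5$

= 9.62

For the third quartile (Q3):

$\dfrac{3N}{4} = \dfrac{3 \times 100}{4} = 75$. The c.f. just greater than 75 is 75.2.

Q3 lies in class 30-34, whose exclusive (adjusted) class is 29.5-34.5.

L = 29.5, f = 7.2, c.f. = 68, h = 5

$Q_3 = L + \dfrac{\frac{3N}{4} - c.f.}{f} \times h$

= $29.5 + \dfrac{75 - 68}{7.2} \times 5$

= 34.36

8th decile (D8):

$\dfrac{8N}{10} = \dfrac{8 \times 100}{10} = 80$. The c.f. just greater than 80 is 81.4.

D8 lies in class 35-39, whose exclusive class is 34.5-39.5.

L = 34.5, f = 6.2, c.f. = 75.2, h = 5

$D_8 = L + \dfrac{\frac{8N}{10} - c.f.}{f} \times h$

= $34.5 + \dfrac{80 - 75.2}{6.2} \times 5$

= 38.37

70th percentile (P70):

$\dfrac{70N}{100} = \dfrac{70 \times 100}{100} = 70$. The c.f. just greater than 70 is 75.2.

P70 lies in class 30-34, whose exclusive class is 29.5-34.5.

L = 29.5, f = 7.2, c.f. = 68, h = 5

$P_{70} = L + \dfrac{\frac{70N}{100} - c.f.}{f} \times h$

= $29.5 + \dfrac{70 - 68}{7.2} \times 5$

= 30.89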

Example 8: The marks distribution of 100 students of a college is as follows.

Marks | 10-20 | 20-40 | 40-70 | 70-90 | 90-100
No. of students | 15 | 20 | 30 | 20 | 15

(i)     Find the highest mark of the weakest 30% of the students.

(ii)    Find the lowest mark of the top 40% of the students.

(iii)  Find the lowest mark of the top 20% of the students.

(iv)  Find the limits and range of marks of the middle 50% of students.

Solution:

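Marks | No. of students (f) | Less than c.f.
10-20 | 15 | 15
20-40 | 20 | 35
40-70 | 30 | 65
70-90 | 20 | 85
90-100 | 15 | 100
Total | N = ∑f = 100 |

(i) The highest mark of the weakest 30% of the students is given by P30.

$\dfrac{30N}{100} = \dfrac{30 \times 100}{100} = 30$. The c.f. just greater than 30 is 35.

P30 lies in class 20-40.

L = 20, f = 20, c.f. = 15, h = 20

$P_{30} = L + \dfrac{\frac{30N}{100} - c.f.}{f} \times h$

= $20 + \dfrac{30 - 15}{20} \times 20$

= 35 marks

(ii) The lowest mark of the top 40% of the students is given by P60.

$\dfrac{60N}{100} = \dfrac{60 \times 100}{100} = 60$. The c.f. just greater than 60 is 65.

P60 lies in class 40-70.

L = 40, f = 30, c.f. = 35, h = 30

$P_{60} = L + \dfrac{\frac{60N}{100} - c.f.}{f} \times h$

= $40 + \dfrac{60 - 35}{30} \times 30$

= 65 marks

(iii) The lowest mark of the top 20% of the students is given by P80.

$\dfrac{80N}{100} = \dfrac{80 \times 100}{100} = 80$. The c.f. just greater than 80 is 85.

P80 lies in class 70-90.

L = 70, f = 20, c.f. = 65, h = 20

$P_{80} = L + \dfrac{\frac{80N}{100} - c.f.}{f} \times h$

= $70 + \dfrac{80 - 65}{20} \times 20$

= 85 marks

(iv) The limits of marks of the middle 50% of students are given by P25 and P75.

25th percentile (P25)

$\dfrac{25N}{100} = 25$. The c.f. just greater than 25 is 35.

P25 lies in class 20-40.

L = 20, f = 20, c.f. = 15, h = 20

$P_{25} = L + \dfrac{\frac{25N}{100} - c.f.}{f} \times h$

= $20 + \dfrac{25 - 15}{20} \times 20$

= 30 marks

75th percentile (P75)

$\dfrac{75N}{100} = 75$. The c.f. just greater than 75 is 85.

P75 lies in class 70-90.

L = 70, f = 20, c.f. = 65, h = 20

$P_{75} = L + \dfrac{\frac{75N}{100} - c.f.}{f} \times h$

= $70 + \dfrac{75 - 65}{20} \times 20$

= 80 marks

Lower limit, P25 = 30 marks

Upper limit, P75 = 80 marks

Range = P75 − P25 = 80 − 30 = 50 marks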

 

T.U. 2017 (Spring)

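1. (b) The temperature in a chemical reactor was measured every half hour under the same conditions. The results were 78.1, 79.2, 78.9, 80.2, 78.3, 78.8, 79.4. Calculate the mean, median, lower quartile, upper quartile, standard deviation and coefficient of variation.

Solution:

Arranging the given data in ascending order of magnitude:

Temperature (X) | X²
78.1 | 6099.61
78.3 | 6130.89
78.8 | 6209.44
78.9 | 6225.21
79.2 | 6272.64
79.4 | 6304.36
80.2 | 6432.04
∑X = 552.9 | ∑X² = 43674.19

Mean, $\bar{X} = \dfrac{\sum X}{n} = \dfrac{552.9}{7} \approx 78.986$

Median (Md) = value of the $\left(\dfrac{n+1}{2}\right)$th item

= value of the $\left(\dfrac{7+1}{2}\right)$th item

= value of the 4th item

= 78.9

Mode (Mo) = value of the variable X which repeats the maximum number of times

= no mode, since no value repeats

Lower quartile, Q1 = value of the $\left(\dfrac{n+1}{4}\right)$th item

= value of the $\left(\dfrac{7+1}{4}\right)$th item

= value of the 2nd item

= 78.3

Upper quartile, Q3 = value of the $\left(\dfrac{3(n+1)}{4}\right)$th item

= value of the $\left(\dfrac{3(7+1)}{4}\right)$th item

= value of the 6th item

= 79.4

Standard deviation, $\sigma = \sqrt{\dfrac{\sum X^2}{n} - \left(\dfrac{\sum X}{n}\right)^2}$

= $\sqrt{\dfrac{43674.19}{7} - \left(\dfrac{552.9}{7}\right)^2}$

= $\sqrt{6239.17 - 6238.743}$

= 0.6534

Coefficient of variation, C.V. = $\dfrac{\sigma}{\bar{X}} \times 100\%$

= $\dfrac{0.6534}{78.986} \times 100\%$

= 0.8272%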

PU 2018 (Spring)

1. (a) The following data set represents the number of new computer accounts registered during ten consecutive days.

43, 37, 50, 51, 58, 105, 52, 45, 45, 10

i.  Compute the mean, median and standard deviation.

ii.              Draw a box and whisker plot and identify whether it is skewed or not. Solution:

Arranging the given data in ascending order of magnitude.

No. of new computer accounts (X)

 

𝑋2

10

37

43

45

45

50

51

52

58

105

100

1369

1849

2025

2025

2500

2601

2704

3364

11025

 

∑ 𝑋 = 496

 

∑ 𝑋2 = 29562

 

(i) Mean, X̄ = ∑X / n

            = 496 / 10

            = 49.6

Median (Md.) = Value of ((n + 1)/2)th item

             = Value of ((10 + 1)/2)th item

             = Value of 5.5th item

             = (Value of 5th item + Value of 6th item) / 2

             = (45 + 50) / 2

             = 47.5

OR

Median (Md.) = Value of ((n + 1)/2)th item

             = Value of ((10 + 1)/2)th item

             = Value of 5.5th item

             = Value of 5th item + 0.5 (6th item − 5th item)

             = 45 + 0.5 (50 − 45)

             = 45 + 0.5 × 5

             = 47.5

Mode (Mo) = Value of the variable X which repeats the maximum number of times

          = 45

Standard deviation, σ = √[∑X²/n − (∑X/n)²]

                      = √[29562/10 − (496/10)²]

                      = √(2956.2 − 2460.16)

                      = √496.04

                      = 22.272

(ii) To construct box and whisker plot:

At first we have to find five number summary

Smallest value = 10

Largest value = 105

Lower quartile, Q1 = Value of ((n + 1)/4)th item

                   = Value of ((10 + 1)/4)th item

                   = Value of 2.75th item

                   = Value of 2nd item + 0.75 (3rd item − 2nd item)

                   = 37 + 0.75 (43 − 37)

                   = 37 + 0.75 × 6

                   = 41.5

Upper quartile, Q3 = Value of (3(n + 1)/4)th item

                   = Value of (3(10 + 1)/4)th item

                   = Value of 8.25th item

                   = Value of 8th item + 0.25 (9th item − 8th item)

                   = 52 + 0.25 (58 − 52)

                   = 52 + 0.25 × 6

                   = 53.5

Hence, the five-number summary (smallest, Q1, Md, Q3, largest) is

(10, 41.5, 47.5, 53.5, 105)

(i) Length of left whisker (i.e. the distance from the smallest value to Q1) = 41.5 − 10 = 31.5

Length of right whisker (i.e. the distance from Q3 to the largest value) = 105 − 53.5 = 51.5

(ii) The distance from the smallest value to the Md = 47.5 − 10 = 37.5

The distance from Md to the largest value = 105 − 47.5 = 57.5

Since the length of the left whisker < the length of the right whisker, and the distance from the smallest value to the Md < the distance from Md to the largest value, the distribution is positively skewed (i.e. right skewed).
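The hand calculation above can be cross-checked with a short script. This is only a sketch: the helper names `item_value` and `five_number_summary` are made up for illustration, and the quartile positions follow the (n + 1)/4 rule used in these notes, which is not the default in every statistics package.

```python
import math
import statistics

def item_value(sorted_data, position):
    """Value of the position-th item (1-based), interpolating between
    neighbouring items when the position is fractional."""
    lower = math.floor(position)
    fraction = position - lower
    value = sorted_data[lower - 1]
    if fraction > 0:
        value += fraction * (sorted_data[lower] - sorted_data[lower - 1])
    return value

def five_number_summary(data):
    """(smallest, Q1, Md, Q3, largest) using the (n + 1) position rule."""
    x = sorted(data)
    n = len(x)
    q1 = item_value(x, (n + 1) / 4)
    md = item_value(x, (n + 1) / 2)
    q3 = item_value(x, 3 * (n + 1) / 4)
    return x[0], q1, md, q3, x[-1]

accounts = [43, 37, 50, 51, 58, 105, 52, 45, 45, 10]
summary = five_number_summary(accounts)
mean = statistics.fmean(accounts)
sd = statistics.pstdev(accounts)  # divides by n, as in the formula above
left_whisker = summary[1] - summary[0]
right_whisker = summary[4] - summary[3]
print(summary, mean, round(sd, 3), left_whisker, right_whisker)
```

A longer right whisker than left whisker is the same signal of positive skewness that the solution reads from the box plot.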

 

PU 2017 (Spring)

Q.No.1 (a) Over a period of 40 days  the percentage relative humidity in a vegetable storage building was measured. Mean daily values were recorded as shown below:

60   63   64   71   67   73   79   80   83   81
86   90   96   98   98   99   89   80   77   78
71   79   74   84   85   82   90   78   79   79
78   80   82   83   86   81   80   76   66   74

(i) Prepare a stem and leaf display for these data. Show the leaves sorted in order of increasing magnitude on each stem.

(ii) Draw a box plot for these data and interpret the data in a practical manner.

Solution:

(i)    Arranging the given data in ascending order of magnitude:

Percentage relative humidity(X):

60, 63, 64, 66, 67, 71, 71, 73, 74, 74, 76, 77, 78, 78,78,79,79,79,79, 80, 80, 80, 80, 81, 81,82, 82, 83, 83, 84, 85, 86, 86, 89, 90, 90, 96, 98, 98,99

Stem and leaf display

Stem | Leaves
6    | 0 3 4 6 7
7    | 1 1 3 4 4 6 7 8 8 8 9 9 9 9
8    | 0 0 0 0 1 1 2 2 3 3 4 5 6 6 9
9    | 0 0 6 8 8 9

The stem and leaf display shows the ordered values from the smallest to the largest (i.e. leaves sorted in order of increasing magnitude on each stem) and where the data are concentrated.
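As an illustration, the display above can be generated programmatically. The function name `stem_and_leaf` and its `stem_width` parameter are hypothetical; the sketch assumes two-digit whole numbers, so the tens digit is the stem and the units digit is the leaf.

```python
from collections import defaultdict

def stem_and_leaf(data, stem_width=10):
    """Group sorted values by stem (value // stem_width); the remainder is the leaf."""
    stems = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, stem_width)
        stems[stem].append(leaf)
    return dict(stems)

humidity = [60, 63, 64, 71, 67, 73, 79, 80, 83, 81,
            86, 90, 96, 98, 98, 99, 89, 80, 77, 78,
            71, 79, 74, 84, 85, 82, 90, 78, 79, 79,
            78, 80, 82, 83, 86, 81, 80, 76, 66, 74]

display = stem_and_leaf(humidity)
for stem in sorted(display):
    # One line per stem, leaves already in increasing order
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```

Because the values are sorted before grouping, the leaves on each stem come out in increasing order, as the question requires.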

(ii)  To construct box and whisker plot (Box plot):

At first we have to find five number summary

Smallest value = 60

Largest value = 99

Q1 = Value of ((n + 1)/4)th item

   = Value of ((40 + 1)/4)th item

   = Value of 10.25th item

   = Value of 10th item + 0.25 (11th item − 10th item)

   = 74 + 0.25 (76 − 74)

   = 74 + 0.25 × 2

   = 74.5

Median (Md.) = Value of ((n + 1)/2)th item

             = Value of ((40 + 1)/2)th item

             = Value of 20.5th item

             = Value of 20th item + 0.5 (21st item − 20th item)

             = 80 + 0.5 (80 − 80)

             = 80 + 0.5 × 0

             = 80

Upper quartile, Q3 = Value of (3(n + 1)/4)th item

                   = Value of (3(40 + 1)/4)th item

                   = Value of 30.75th item

                   = Value of 30th item + 0.75 (31st item − 30th item)

                   = 84 + 0.75 (85 − 84)

                   = 84 + 0.75 × 1

                   = 84.75

Hence, the five-number summary (smallest, Q1, Md, Q3, largest) is (60, 74.5, 80, 84.75, 99)

(i) Length of left whisker (i.e. the distance from the smallest value to Q1) = 74.5 − 60 = 14.5

Length of right whisker (i.e. the distance from Q3 to the largest value) = 99 − 84.75 = 14.25

(ii) The distance from the smallest value to the Md = 80 − 60 = 20

The distance from Md to the largest value = 99 − 80 = 19

Since the length of the left whisker > the length of the right whisker, and the distance from the smallest value to the Md > the distance from Md to the largest value, the distribution is negatively skewed (i.e. left skewed). In practical terms, high values of percentage relative humidity in the vegetable storage building occur frequently and are concentrated on the right side, while low values occur less frequently and form the left tail.

 

 

 

PU 2014 (Spring)

Q.No.2:The following are the number of minutes that a person had to wait for the bus to work on 15 working days :

10, 1, 13, 9, 5, 9, 2, 10, 3, 8, 6, 17, 2, 10, 15

Draw a box plot and interpret the result.

Solution: To construct a box and whisker plot (box plot), we first have to find the five-number summary.

Arranging the given data in ascending order of magnitude

1,  2,   2,   3,    5,   6,    8,   9,    9,   10,  10,   10,    13,    15,    17   

           Smallest value = 1 minute

             Largest value = 17 minutes

Q1 = Value of ((n + 1)/4)th item

   = Value of ((15 + 1)/4)th item

   = Value of 4th item

   = 3

Median (Md.) = Value of ((n + 1)/2)th item

             = Value of ((15 + 1)/2)th item

             = Value of 8th item

             = 9

Q3 = Value of (3(n + 1)/4)th item

   = Value of (3(15 + 1)/4)th item

   = Value of 12th item

   = 10

Hence, the five-number summary (smallest, Q1, Md, Q3, largest) is (1, 3, 9, 10, 17)

(i) Length of left whisker (i.e. the distance from the smallest value to Q1) = 3 − 1 = 2

Length of right whisker (i.e. the distance from Q3 to the largest value) = 17 − 10 = 7

(ii) The distance from the smallest value to the Md = 9 − 1 = 8

The distance from Md to the largest value = 17 − 9 = 8

The length of the left whisker < the length of the right whisker, but the distance from the smallest value to the Md = the distance from Md to the largest value. Therefore, the distribution is not uniformly distributed.

 

PU 2016(Fall)

1.(b) A random sample was taken of the thickness of insulation in transformer windings, and the following thicknesses (in millimetres) were recorded:

18   21   22   29   25   31   37   38   41   39
44   48   54   56   56   57   47   38   35   36
29   37   32   42   43   40   48   36   37   37

(i) Prepare a stem and leaf display for these data.

(ii) Prepare a box plot for these data.

Solution: Arranging the given data in ascending order of magnitude:

18, 21, 22, 25, 29, 29, 31, 32, 35, 36, 36, 37, 37, 37, 37, 38, 38, 39, 40, 41, 42, 43, 44, 47, 48, 48, 54, 56, 56, 57

 

(i) Stem and leaf display

Stem | Leaves
1    | 8
2    | 1 2 5 9 9
3    | 1 2 5 6 6 7 7 7 7 8 8 9
4    | 0 1 2 3 4 7 8 8
5    | 4 6 6 7

(ii)              To construct box and whisker plot (Box plot):

     At first we have to find five number summary

           Smallest value = 18 millimetres

           Largest value = 57 millimetres

Q1 = Value of ((n + 1)/4)th item

   = Value of ((30 + 1)/4)th item

   = Value of 7.75th item

   = Value of 7th item + 0.75 (8th item − 7th item)

   = 31 + 0.75 (32 − 31)

   = 31 + 0.75 × 1

   = 31.75 millimetres

Median (Md.) = Value of ((n + 1)/2)th item

             = Value of ((30 + 1)/2)th item

             = Value of 15.5th item

             = Value of 15th item + 0.5 (16th item − 15th item)

             = 37 + 0.5 (38 − 37)

             = 37 + 0.5 × 1

             = 37.5 millimetres

Q3 = Value of (3(n + 1)/4)th item

   = Value of (3(30 + 1)/4)th item

   = Value of 23.25th item

   = Value of 23rd item + 0.25 (24th item − 23rd item)

   = 44 + 0.25 (47 − 44)

   = 44 + 0.25 × 3

   = 44.75 millimetres

Hence, the five-number summary (smallest, Q1, Md, Q3, largest) is (18, 31.75, 37.5, 44.75, 57)

(i) Length of left whisker (i.e. the distance from the smallest value to Q1) = 31.75 − 18 = 13.75

Length of right whisker (i.e. the distance from Q3 to the largest value) = 57 − 44.75 = 12.25

(ii) The distance from the smallest value to the Md = 37.5 − 18 = 19.5

The distance from Md to the largest value = 57 − 37.5 = 19.5

The length of the left whisker > the length of the right whisker, but the distance from the smallest value to the Md = the distance from Md to the largest value.

Therefore, the distribution is slightly left skewed (i.e. the distribution is not uniformly distributed).

 

PU 2018 (Fall)

1 (a) An investigator wants to study the speed of cars on the Araniko Highway. He collected the speeds of 30 vehicles, which were:

35, 37, 42, 45, 47, 48, 50, 55, 67, 70, 75, 80, 90, 95, 94, 48, 55, 60, 71, 63, 70, 65, 80, 55, 40, 35, 36, 85, 79, 30.

(i) Present the above data in a stem and leaf display.

(ii) Construct a continuous frequency distribution using Sturges' rule, construct the cumulative frequency curve and find the median speed, the speed of the first 25% of vehicles and the speed of the first 75% of vehicles. Also compute the percentage of vehicles whose speed lies between 40 and 70 km.

Solution: Arranging the given data in ascending order of magnitude:

Speed (in km) X: 30, 35, 35, 36, 37, 40, 42, 45, 47, 48, 48, 50, 55, 55, 55, 60, 63, 65, 67, 70, 70, 71, 75, 79, 80, 80, 85, 90, 94, 95

  

Stem and leaf display

Stem | Leaves
3    | 0 5 5 6 7
4    | 0 2 5 7 8 8
5    | 0 5 5 5
6    | 0 3 5 7
7    | 0 0 1 5 9
8    | 0 0 5
9    | 0 4 5

(ii) Since the class size (h) is not given, we first need to find the approximate number of class intervals (k) and the class size (h).

Number of observations, n = 30

S = Smallest value = 30

L = Largest value = 95

By Sturges' formula,

Number of classes, k = 1 + 3.322 log n

                     = 1 + 3.322 log 30

                     = 1 + 3.322 × 1.4771

                     = 5.9069 ≈ 6

Class width or class size, h = (L − S)/k = (95 − 30)/6 = 10.83 ≈ 11

Continuous frequency distribution:

Speed (in km)   Tally bar   Frequency (f)
30 – 41         |||| |      6
41 – 52         |||| |      6
52 – 63         ||||        4
63 – 74         |||| |      6
74 – 85         ||||        4
85 – 96         ||||        4
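The class construction above can be sketched in code. The names `sturges_classes` and `frequency_table` are illustrative only, and rounding the width up with `math.ceil` is an assumption chosen so that the classes always cover the largest value; the notes simply round 10.83 to 11. Each class includes its lower limit and excludes its upper limit.

```python
import math

def sturges_classes(data):
    """Number of classes k by Sturges' rule and a class width covering the range."""
    n = len(data)
    k = round(1 + 3.322 * math.log10(n))
    width = math.ceil((max(data) - min(data)) / k)
    return k, width

def frequency_table(data, start, width, k):
    """Count values into k classes [lower, upper) beginning at start."""
    counts = [0] * k
    for value in data:
        index = int((value - start) // width)
        counts[index] += 1
    return [(start + i * width, start + (i + 1) * width, counts[i])
            for i in range(k)]

speeds = [35, 37, 42, 45, 47, 48, 50, 55, 67, 70, 75, 80, 90, 95, 94,
          48, 55, 60, 71, 63, 70, 65, 80, 55, 40, 35, 36, 85, 79, 30]

k, h = sturges_classes(speeds)
table = frequency_table(speeds, min(speeds), h, k)
for lower, upper, f in table:
    print(f"{lower} - {upper}: {f}")
```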

Less than cumulative frequency distribution

Speed (in km)   Less than c.f.
Less than 41    6
Less than 52    12
Less than 63    16
Less than 74    22
Less than 85    26
Less than 96    30

From the cumulative frequency curve,

Median speed, Md = 58 km

The speed of the first 25% of vehicles is given by

P25 = Q1 = 44 km

and the speed of the first 75% of vehicles is given by

P75 = Q3 = 76 km

The number of vehicles whose speed lies between 40 and 70 km

= 4 + 5 + 5

= 14

The percentage of vehicles whose speed lies between 40 and 70 km

= (14/30) × 100%

= 46.67%

Note: Q1 = P25, Md = Q2 = D5 = P50, P75 = Q3

PU 2018 (Spring)

Q.No.1.(b): After the implementation of an economic program to uplift the economic condition of a community following information were found.

Monthly income (Rs. 000)              4-6   6-8   8-10   10-12   12-14   14-16   16-18
After the plan (no. of families)      8     65    37     15      15      5       5

Construct an ogive to find 

(i)    Find the number of families whose monthly income is between Rs. 8,000 to Rs. 14,000

(ii)  Find the number of families whose monthly income is above Rs. 12,000

Solution: Less than cumulative frequency distribution

Monthly income (Rs. 000)   Less than c.f.
Less than 6                8
Less than 8                73
Less than 10               110
Less than 12               125
Less than 14               140
Less than 16               145
Less than 18               150

(i) The number of families whose monthly income is between Rs. 8,000 and Rs. 14,000 = 140 − 73 = 67

(ii) The number of families whose monthly income is above Rs. 12,000 = 150 − 125 = 25

PU 2015 (Spring)

Q.No.1 (a): The following table shows length of eighty bally bridge:

68   84   73   82   68   90   62   88   76   93
73   79   75   73   60   93   71   59   85   75
61   65   88   87   74   62   95   78   63   72
66   78   75   75   94   77   69   74   68   60
96   78   82   61   75   95   60   79   83   71
79   62   89   97   78   85   76   65   71   75
65   80   67   57   88   78   62   76   53   74
86   67   73   81   72   63   76   75   85   77

With reference to the above table:

(i) Construct the grouped frequency distribution having class width 10.

(ii) Draw the less than ogive and the more than ogive on the same graph and hence locate the median.

(iii) With the help of the less than ogive, find the number of bridges having length less than 65 metres.

Solution: (i) Grouped frequency distribution having class width 10

Length of bally bridge   Tally bar                          Frequency (f)
50 – 60                  |||                                3
60 – 70                  |||| |||| |||| |||| |              21
70 – 80                  |||| |||| |||| |||| |||| |||| |||  33
80 – 90                  |||| |||| ||||                     15
90 – 100                 |||| |||                           8
                                                            N = ∑f = 80

(ii) Less than and more than cumulative frequency distribution

Length of bally bridge   Less than c.f.   Length of bally bridge   More than c.f.
Less than 60             3                More than 50             80
Less than 70             24               More than 60             77
Less than 80             57               More than 70             56
Less than 90             72               More than 80             23
Less than 100            80               More than 90             8

From the ogive curves, median (Md) = 75 metres

(iii) With the help of the less than ogive, the number of bridges having length less than 65 metres = 10 + 3 = 13

 

PU 2017 (Fall)

Q.No.1(a): Following data represents the tensile strength of steel-rod manufactured by company A located at Biratnagar.

65   36   49   84   79   56   28   43   67   36
43   78   37   40   68   72   70   55   62   82
88   50   60   56   57   46   39   57   22   65
59   48   76   74   80   69   51   40   56   45
35   21   62   52   63   32   86   64   53   34

Construct a frequency distribution and represent the data by means of a cumulative frequency curve. Identify the median and first quartile from the curve. Also interpret the result of the 1st quartile.

Solution: 

Since the class size (h) is not given, we first need to find the approximate number of class intervals (k) and the class size (h).

Number of observations, n = 50

S = Smallest value = 21

L = Largest value = 88

By Sturges' formula,

Number of classes, k = 1 + 3.322 log n

                     = 1 + 3.322 log 50

                     = 1 + 3.322 × 1.6990

                     = 6.644 ≈ 7

Class width or class size, h = (L − S)/k = (88 − 21)/7 = 9.57 ≈ 10

Frequency distribution

Tensile strength of steel rod   Tally bar       Frequency (f)
20 – 30                         |||             3
30 – 40                         |||| ||         7
40 – 50                         |||| |||        8
50 – 60                         |||| |||| |     11
60 – 70                         |||| ||||       10
70 – 80                         |||| |          6
80 – 90                         ||||            5
                                                N = ∑f = 50

 

Less than cumulative frequency distribution

Tensile strength of steel rod   Less than c.f.
Less than 30                    3
Less than 40                    10
Less than 50                    18
Less than 60                    29
Less than 70                    39
Less than 80                    45
Less than 90                    50

Less than ogive curve (or less than cumulative frequency curve)

From the ogive curve,

Median (Md) = 59

First quartile (Q1) = 44

Thus, the first quartile indicates that the tensile strength of the first 25% of steel rods is at most 44.

PU 2013 (Fall)

(1) (a) Following information shows the daily wage of workers of certain hydropower company, prepare suitable ogive that helps to give the answers of following questions

Daily wages      0-20   20-40   40-60   60-80   80-100
No. of workers   41     51      64      38      7

(i)     About what wage above that 50% workers earn?

(ii)    What would be the daily wage limit of middle 30% workers?

Additional question:

(iii)  If 20% workers are the lowest earners, find the highest wage of them.

(iv)  If 10% workers are the highest earners, find the lowest wage of highest 10 % of the workers.

Solution:

Less than cumulative frequency distribution

Daily wages     No. of workers (Less than c.f.)
Less than 20    41
Less than 40    92
Less than 60    156
Less than 80    194
Less than 100   201

(i) From the ogive curve,

The wage above which 50% of the workers earn is Rs. 42.

(ii) The daily wage limits of the middle 30% of workers are given by P35 and P65.

From the ogive curve,

Lower limit, P35 = Rs. 28

Upper limit, P65 = Rs. 52

(iii) The highest wage of the 20% lowest earners (workers) is given by P20.

From the ogive curve,

P20 = Rs. 19

(iv) The lowest wage of the highest 10% of the workers is given by P90.

P90 = Rs. 73
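Reading a percentile off a less than ogive amounts to linear interpolation between the plotted cumulative frequency points. The sketch below makes that explicit; the function name `ogive_percentile` and its arguments are hypothetical, and it assumes the curve is drawn with straight segments starting from zero at the lower limit of the first class. Values read by eye from a hand-drawn curve can differ from the interpolated ones by a few units.

```python
def ogive_percentile(upper_limits, cum_freq, lower_start, percent):
    """Value below which `percent` % of the items fall, found by linear
    interpolation along the less than ogive."""
    target = percent / 100 * cum_freq[-1]
    xs = [lower_start] + list(upper_limits)   # x-coordinates of plotted points
    cs = [0] + list(cum_freq)                 # cumulative frequencies
    for i in range(1, len(xs)):
        if target <= cs[i]:
            slope = (xs[i] - xs[i - 1]) / (cs[i] - cs[i - 1])
            return xs[i - 1] + (target - cs[i - 1]) * slope
    return xs[-1]

limits = [20, 40, 60, 80, 100]      # upper class limits of daily wages
cf = [41, 92, 156, 194, 201]        # less than cumulative frequencies

readings = {p: ogive_percentile(limits, cf, 0, p) for p in (20, 35, 50, 65, 90)}
for p, wage in readings.items():
    print(f"P{p} is about Rs. {wage:.1f}")
```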

PU 2013 (Fall)

1. (b) The following information shows the income distribution.

Income ($ 000)   0-10   10-20   20-30   30-40   40-50   50-60
No. of persons   5      10      18      23      7       6

Construct the less than ogive. Also use it to find (i) the number of persons having income less than $35,000 and (ii) the percentage of persons having income between $20,000 and $50,000.

Additional:

(iii) Percentage of persons having income more than $40,000

Solution:

Less than cumulative frequency distribution

Income ($ 000)   No. of persons (Less than c.f.)
Less than 10     5
Less than 20     15
Less than 30     33
Less than 40     56
Less than 50     63
Less than 60     69

(i) The number of persons having income less than $35,000 = 10 + 10 + 10 + 10 + 5 = 45

(ii) The number of persons having income between $20,000 and $50,000 = 63 − 15 = 48

The percentage of persons having income between $20,000 and $50,000

= (48/69) × 100%

= 69.57%

(iii) The number of persons having income more than $40,000

= 69 − 56

= 13

The percentage of persons having income more than $40,000

= (13/69) × 100%

= 18.84%

PU 2016(spring)

Q.No.1 (a) From the following frequency distribution,

Income (Rs. 000)   0-10   10-20   20-30   30-40   40-50   50-60
No. of persons     5      10      18      23      7       6

Construct an ogive that will help you to find the number of persons having income:

(i) Less than Rs. 35,000

(ii) Between Rs. 20,000 and Rs. 50,000

(iii) More than Rs. 25,000

Solution:

Less than cumulative frequency distribution

Income (Rs. 000)   No. of persons (Less than c.f.)
Less than 10       5
Less than 20       15
Less than 30       33
Less than 40       56
Less than 50       63
Less than 60       69

 

 Less than ogive curve (or  Less than cumulative frequency curve)


 


From ogive curve,  

(i) The number of persons having income less than Rs. 35,000 = 10 + 10 + 10 + 10 + 5 = 45

(ii) The number of persons having income between Rs. 20,000 and Rs. 50,000 = 63 − 15 = 48

(iii) The number of persons having income more than Rs. 25,000

= 7 + 10 + 10 + 10 + 9 = 46

 

PU 2015 1(a): 

The test scores of the students in probability and statistics are listed below. Construct a stem-and leaf plot of the scores.

92    78   73   89   98   89   83   75   83    94    99    69     71     96     67    81     73    88   86    82   63   73    76    82    84    89     92      95     78     87

Also, find the lowest score of the best 25% of the students.

Solution:

Arranging the given data in ascending order of magnitude:

63, 67, 69, 71, 73, 73, 73, 75, 76, 78, 78, 81, 82, 82, 83, 83, 84, 86, 87, 88, 89, 89, 89, 92, 92, 94, 95, 96, 98, 99

Stem and leaf plot

Stem | Leaves
6    | 3 7 9
7    | 1 3 3 3 5 6 8 8
8    | 1 2 2 3 3 4 6 7 8 9 9 9
9    | 2 2 4 5 6 8 9

The lowest score of the best 25% of the students is given by P75.

P75 = Value of (75(n + 1)/100)th item

    = Value of (75(30 + 1)/100)th item

    = Value of 23.25th item

    = Value of 23rd item + 0.25 (24th item − 23rd item)

    = 89 + 0.25 (92 − 89)

    = 89 + 0.25 × 3

    = 89.75 scores

PU 2013(Spring)

1. (a) The weight (in lbs) of 40 boys in a class are as follows:

138    172    145    147    150     119      158      152       168        142

157    147     102   144    165     136      164      163        128       135

126    150     146    148    145    125      146      153        138       156

173    140     135    149    140    144      132      154        142       135

(i) Construct a frequency distribution.

(ii) Draw the less than ogive and find the number of boys whose weight is less than 165 lbs.

Solution:

Since the class size (h) is not given, we first need to find the approximate number of class intervals (k) and the class size (h).

Number of observations, n = 40

S = Smallest value = 102

L = Largest value = 173

By Sturges' formula,

Number of classes, k = 1 + 3.322 log n

                     = 1 + 3.322 log 40

                     = 1 + 3.322 × 1.602

                     = 6.322 ≈ 6

Class width or class size, h = (L − S)/k = (173 − 102)/6.322 = 11.23 ≈ 11

Frequency distribution

Weight (in lbs)   Tally bar          No. of boys (f)
102 – 113         |                  1
113 – 124         |                  1
124 – 135         ||||               4
135 – 146         |||| |||| ||||     14
146 – 157         |||| |||| ||       12
157 – 168         ||||               5
168 – 179         |||                3
                                     N = ∑f = 40

(ii) Less than cumulative frequency distribution

Weight (in lbs)   Less than c.f.
Less than 113     1
Less than 124     2
Less than 135     6
Less than 146     20
Less than 157     32
Less than 168     37
Less than 179     40

PU 2013(spring)

Q.No.1.(b): From the following distribution of marks of 500 students of a college, find the minimum pass mark if only 20% of the students had failed, and also the minimum mark obtained by the top 25% of the students. Represent the data by a histogram.

Marks             0-20   20-40   40-50   50-60   60-80   80-100
No. of students   50     100     150     90      60      50

Solution:

Marks     No. of students (f)   Less than c.f.
0-20      50                    50
20-40     100                   150
40-50     150                   300
50-60     90                    390
60-80     60                    450
80-100    50                    500
          N = ∑f = 500

 

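If 20% of the students failed, i.e. 80% of the students passed, the minimum pass mark is the mark below which the lowest 20% of the students lie, which is given by P20.

20th percentile (P20)

20N/100 = (20 × 500)/100 = 100

P20 lies in class 20-40,

L = 20, f = 100, c.f. = 50, h = 20

P20 = L + ((20N/100 − c.f.)/f) × h

    = 20 + ((100 − 50)/100) × 20

    = 30 marks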

The minimum mark obtained by the top 25% of the students is given by P75.

75th percentile (P75)

75N/100 = (75 × 500)/100 = 375

P75 lies in class 50-60,

L = 50, f = 90, c.f. = 300, h = 10

P75 = L + ((75N/100 − c.f.)/f) × h

    = 50 + ((375 − 300)/90) × 10

    = 58.33 marks
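The percentile formula for a continuous frequency distribution, P = L + ((pN/100 − c.f.)/f) × h, can be written as a small function. This is a minimal sketch; the name `grouped_percentile` and the tuple representation of the classes are assumptions made for illustration.

```python
def grouped_percentile(classes, freqs, percent):
    """P = L + ((percent*N/100 - c.f.) / f) * h, where c.f. is the cumulative
    frequency of the class preceding the percentile class."""
    n_total = sum(freqs)
    target = percent * n_total / 100
    cum = 0  # cumulative frequency before the current class
    for (lower, upper), f in zip(classes, freqs):
        if cum + f >= target:
            return lower + (target - cum) / f * (upper - lower)
        cum += f
    raise ValueError("percent must lie between 0 and 100")

mark_classes = [(0, 20), (20, 40), (40, 50), (50, 60), (60, 80), (80, 100)]
students = [50, 100, 150, 90, 60, 50]

p20 = grouped_percentile(mark_classes, students, 20)
p75 = grouped_percentile(mark_classes, students, 75)
p80 = grouped_percentile(mark_classes, students, 80)
print(round(p20, 2), round(p75, 2), round(p80, 2))
```

Because the class width is taken from each class's own limits, the function also handles unequal class intervals such as those in this question.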

For histogram

This is a case of unequal class intervals, therefore the frequencies must be adjusted. The class size of the third and fourth class intervals is 10, while that of the first, second, fifth and sixth is 20, which is double 10. So, the frequencies of the first, second, fifth and sixth classes are divided by 2, i.e. 50/2 = 25, 100/2 = 50, 60/2 = 30 and 50/2 = 25.


PU2014 (fall)

1. (a) Represent the following data by means of histogram, frequency curve and polygon.

Salaries        300-310   310-320   320-330   330-350   350-370   370-400
No. of worker   7         19        28        15        12        12

Solution:

This is a case of unequal class intervals; therefore the frequencies must be adjusted. The class size of the first three class intervals is 10, while that of the fourth and fifth is 20, which is double 10. So, the frequencies of the fourth and fifth classes are divided by 2, i.e. 15/2 = 7.5 and 12/2 = 6. Also, the class size of the last class is 30, which is 3 times 10, so the frequency of the last class is divided by 3, i.e. 12/3 = 4.

1. (a) The daily wages of workers of a factory are given below:

Wages (Rs.)      300-310   310-320   320-330   330-350   350-370   370-410
No. of workers   8         10        20        18        16        12

(i) Construct a histogram and frequency polygon for the data.

(ii) Draw an ogive for the data and estimate the median wage.

Solution:

(i) This is a case of unequal class intervals; therefore the frequencies must be adjusted. The class size of the first three class intervals is 10, while that of the fourth and fifth is 20, which is double 10. So, the frequencies of the fourth and fifth classes are divided by 2, i.e. 18/2 = 9 and 16/2 = 8. Also, the class size of the last class is 40, which is 4 times 10, so the frequency of the last class is divided by 4, i.e. 12/4 = 3.
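The adjustment described above divides each frequency by the ratio of its class width to the smallest class width. A minimal sketch, with the hypothetical helper name `adjusted_frequencies`:

```python
def adjusted_frequencies(classes, freqs):
    """Bar heights for a histogram with unequal class widths: each frequency is
    divided by (class width / smallest class width)."""
    widths = [upper - lower for lower, upper in classes]
    base = min(widths)
    return [f / (w / base) for f, w in zip(freqs, widths)]

wage_classes = [(300, 310), (310, 320), (320, 330),
                (330, 350), (350, 370), (370, 410)]
workers = [8, 10, 20, 18, 16, 12]

heights = adjusted_frequencies(wage_classes, workers)
print(heights)
```

With this scaling the area of each bar, rather than its height, stays proportional to the class frequency.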

(ii) Less than cumulative frequency distribution

Wages (Rs.)     No. of workers (Less than c.f.)
Less than 310   8
Less than 320   18
Less than 330   38
Less than 350   56
Less than 370   72
Less than 410   84

Note: For an ogive curve, the class sizes need not be equal, so no adjustment is needed.

 

 

 

 

 

 

Md = Rs. 337

Example 9: From the following data, obtain the interquartile range, Q.D. and coefficient of Q.D.

Daily production: 25, 20, 23, 18, 22, 17, 26

Solution:

Arranging the given data in ascending order, we get,

17, 18, 20, 22, 23, 25, 26

Now,

Q1 = value of ((n + 1)/4)th item

   = value of ((7 + 1)/4)th item

   = value of 2nd item

   = 18

Q3 = value of (3(n + 1)/4)th item

   = value of (3(7 + 1)/4)th item

   = value of 6th item

   = 25

Interquartile range = Q3 − Q1

                    = 25 − 18

                    = 7

Quartile deviation or semi-interquartile range (Q.D.) = (Q3 − Q1)/2

                                                      = (25 − 18)/2

                                                      = 3.5

Coefficient of Q.D. = (Q3 − Q1)/(Q3 + Q1) = (25 − 18)/(25 + 18) = 7/43 = 0.163

Example 10: Find the interquartile range, Q.D. and coefficient of Q.D. from the following series

X: 9, 10, 5, 6, 7, 2, 8, 4

Solution: Arranging the given data in ascending order

X: 2, 4, 5, 6, 7, 8, 9, 10

Q1 = value of ((n + 1)/4)th item

   = value of ((8 + 1)/4)th item

   = value of 2.25th item

   = value of 2nd item + 0.25 (3rd item − 2nd item)

   = 4 + 0.25 (5 − 4)

   = 4.25

Q3 = value of (3(n + 1)/4)th item

   = value of (3(8 + 1)/4)th item

   = value of 6.75th item

   = value of 6th item + 0.75 (7th item − 6th item)

   = 8 + 0.75 (9 − 8)

   = 8.75

Interquartile range = Q3 − Q1

                    = 8.75 − 4.25

                    = 4.5

Quartile deviation or semi-interquartile range (Q.D.) = (Q3 − Q1)/2

                                                      = (8.75 − 4.25)/2

                                                      = 2.25

Coefficient of Q.D. = (Q3 − Q1)/(Q3 + Q1) = (8.75 − 4.25)/(8.75 + 4.25) = 4.5/13 = 0.346

Example11:Compute the quartile deviation of the following distribution giving the screen size of Laptop available in Nepalese Laptop Market.

Size of Screen (cm)   No. of Laptop   Size of Screen (cm)   No. of Laptop
9.5                   1               13.5                  200
10.0                  8               14.0                  250
10.5                  20              14.5                  280
11.0                  30              15.0                  245
11.5                  50              15.5                  80
12.0                  95              16.0                  40
12.5                  110             16.5                  35
13.0                  150             17.0                  5

Solution:

Size of screen (cm) X   No. of Laptop (f)   Less than c.f.
9.5                     1                   1
10                      8                   9
10.5                    20                  29
11                      30                  59
11.5                    50                  109
12                      95                  204
12.5                    110                 314
13                      150                 464
13.5                    200                 664
14                      250                 914
14.5                    280                 1194
15                      245                 1439
15.5                    80                  1519
16                      40                  1559
16.5                    35                  1594
17                      5                   1599
                        N = ∑f = 1599

Q1 = value of ((N + 1)/4)th item

   = value of ((1599 + 1)/4)th item

   = value of 400th item

   = 13 cm (the c.f. just greater than or equal to 400 is 464)

Q3 = value of (3(N + 1)/4)th item

   = value of (3(1599 + 1)/4)th item

   = value of 1200th item

   = 15 cm (the c.f. just greater than or equal to 1200 is 1439)

Quartile deviation (Q.D.) = (Q3 − Q1)/2

                          = (15 − 13)/2

                          = 1 cm
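For a discrete frequency series like this one, the quartile is the first value whose cumulative frequency reaches the required position. The sketch below expresses that lookup with the standard library; the function name `discrete_quartiles` is hypothetical, and no interpolation is done between distinct values, which is sufficient here because both positions are whole numbers.

```python
from bisect import bisect_left
from itertools import accumulate

def discrete_quartiles(values, freqs):
    """Q1, Q3 and quartile deviation of a discrete frequency series."""
    cum = list(accumulate(freqs))        # less than c.f. column
    n_total = cum[-1]

    def item(position):
        # First value whose cumulative frequency is >= the position
        return values[bisect_left(cum, position)]

    q1 = item((n_total + 1) / 4)
    q3 = item(3 * (n_total + 1) / 4)
    return q1, q3, (q3 - q1) / 2

sizes = [9.5 + 0.5 * i for i in range(16)]      # 9.5, 10.0, ..., 17.0 cm
laptops = [1, 8, 20, 30, 50, 95, 110, 150, 200, 250, 280, 245, 80, 40, 35, 5]

q1, q3, qd = discrete_quartiles(sizes, laptops)
print(q1, q3, qd)
```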

Example12. The following frequency distribution represents the weight of 200 laptops.

Weight in lbs   Frequency   Weight in lbs   Frequency
4-5             20          8-9             32
5-6             24          9-10            24
6-7             35          10-11           8
7-8             48          11-12           2

Compute the first three quartiles and the quartile deviation.

Solution:

 

 

 

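Weight in lbs   Frequency (f)   Less than c.f.
4-5             20              20
5-6             24              44
6-7             35              79
7-8             48              127
8-9             32              159
9-10            24              183
10-11           8               191
11-12           2               193
                N = ∑f = 193

For lower quartile or first quartile (Q1)

N/4 = 193/4 = 48.25

Q1 lies in class 6-7

L = 6, f = 35, c.f. = 44, h = 1

Q1 = L + ((N/4 − c.f.)/f) × h

   = 6 + ((48.25 − 44)/35) × 1

   = 6.12 lbs

For 2nd quartile (Q2)

N/2 = 193/2 = 96.5

Q2 lies in class 7-8

L = 7, f = 48, c.f. = 79, h = 1

Q2 = L + ((N/2 − c.f.)/f) × h

   = 7 + ((96.5 − 79)/48) × 1

   = 7.36 lbs

For 3rd quartile (Q3)

3N/4 = (3 × 193)/4 = 144.75

Q3 lies in class 8-9

L = 8, f = 32, c.f. = 127, h = 1

Q3 = L + ((3N/4 − c.f.)/f) × h

   = 8 + ((144.75 − 127)/32) × 1

   = 8.55 lbs

Quartile deviation (Q.D.) = (Q3 − Q1)/2

                          = (8.55 − 6.12)/2

                          = 1.215 lbs

Coefficient of Q.D. = (Q3 − Q1)/(Q3 + Q1) = (8.55 − 6.12)/(8.55 + 6.12) = 2.43/14.67 = 0.166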

Example 13:  The scores obtained by 10 students in Statistics I of an IT college are given below. Compute range and standard deviation

55    35       60      55       55        65      40      45      35      42

Solution:

Score (X)     X²
55            3025
35            1225
60            3600
55            3025
55            3025
65            4225
40            1600
45            2025
35            1225
42            1764
∑X = 487      ∑X² = 24739

Range (R) = L − S

          = 65 − 35

          = 30 scores

Standard deviation, σ = √[∑X²/n − (∑X/n)²]

                      = √[24739/10 − (487/10)²]

                      = √(2473.9 − 2371.69)

                      = √102.21

                      = 10.11 scores

Example 14: Find standard deviation (S.D.) and variance of the following data.

Variable (X)    10   14   15   18   20
Frequency (f)   3    5    7    6    4

Solution:

Variable (X)   Frequency (f)   fX           fX²
10             3               30           300
14             5               70           980
15             7               105          1575
18             6               108          1944
20             4               80           1600
               N = ∑f = 25     ∑fX = 393    ∑fX² = 6399

Standard deviation, σ = √[∑fX²/N − (∑fX/N)²]

                      = √[6399/25 − (393/25)²]

                      = √(255.96 − 247.1184)

                      = √8.8416

                      = 2.973

Variance, σ² = ∑fX²/N − (∑fX/N)²

             = 8.8416
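The same calculation can be sketched in a few lines, following the formula σ = √[∑fX²/N − (∑fX/N)²] term by term. The function name `frequency_sd` is made up for illustration.

```python
import math

def frequency_sd(values, freqs):
    """Standard deviation and variance of a discrete frequency distribution,
    using sigma^2 = sum(f*X^2)/N - (sum(f*X)/N)^2."""
    n_total = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(values, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))
    variance = sum_fx2 / n_total - (sum_fx / n_total) ** 2
    return math.sqrt(variance), variance

sd, variance = frequency_sd([10, 14, 15, 18, 20], [3, 5, 7, 6, 4])
print(round(sd, 3), round(variance, 4))
```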

Example 15: The frequency distribution of time required to open the operating system of 200 computers is given below.

Time in seconds   No. of computers   Time in seconds   No. of computers
0-4               2                  20-24             48
5-9               20                 25-29             32
10-14             35                 30-34             18
15-19             40                 35-39             5

Compute the standard deviation.

Solution: Let a = 17.5

Customer service time (in minutes)   No. of customers (f)   Mid. value (X)   d′ = (X − a)/h   fd′          fd′²
0-5                                  2                      2.5              -3               -6           18
5-10                                 8                      7.5              -2               -16          32
10-15                                26                     12.5             -1               -26          26
15-20                                30                     17.5             0                0            0
20-25                                28                     22.5             1                28           28
25-30                                6                      27.5             2                12           24
                                     N = ∑f = 100                                             ∑fd′ = -8    ∑fd′² = 128

Standard deviation, σ = √[∑fd′²/N − (∑fd′/N)²] × h

                      = √[128/100 − (−8/100)²] × 5

                      = √(1.28 − 0.0064) × 5

                      = 5.64 minutes
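The step-deviation method in the table above can be sketched as follows. The function name `step_deviation_sd` and its parameters are assumptions for illustration; the data are the mid values and frequencies from the solution table.

```python
import math

def step_deviation_sd(midpoints, freqs, assumed_mean, width):
    """sigma = sqrt(sum(f*d'^2)/N - (sum(f*d')/N)^2) * h, with d' = (X - a)/h."""
    n_total = sum(freqs)
    d = [(x - assumed_mean) / width for x in midpoints]
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))
    return math.sqrt(sum_fd2 / n_total - (sum_fd / n_total) ** 2) * width

mid_values = [2.5, 7.5, 12.5, 17.5, 22.5, 27.5]
customers = [2, 8, 26, 30, 28, 6]

sd = step_deviation_sd(mid_values, customers, assumed_mean=17.5, width=5)
# A different assumed mean changes the intermediate sums but not the result
sd_other_origin = step_deviation_sd(mid_values, customers, assumed_mean=12.5, width=5)
print(round(sd, 2), round(sd_other_origin, 2))
```

The second call illustrates that the standard deviation is independent of the choice of origin a, which is why any convenient mid value can be taken as the assumed mean.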

Example 16: The following data gives on temperature of Kathmandu for a week in summer. Compute the range and quartile deviation.

Day          Sun   Mon   Tue   Wed   Thu   Fri   Sat
Temp. (°C)   34    35    32    35    36    34    35

Solution:

Range (R) = L − S

          = 36 − 32

          = 4

Arranging the given data in ascending order

Temp. (°C) X: 32, 34, 34, 35, 35, 35, 36

Q1 = value of ((n + 1)/4)th item

   = value of ((7 + 1)/4)th item

   = value of 2nd item

   = 34

Q3 = value of (3(n + 1)/4)th item

   = value of (3(7 + 1)/4)th item

   = value of 6th item

   = 35

Quartile deviation (Q.D.) = (Q3 − Q1)/2

                          = (35 − 34)/2

                          = 0.5

Example 17: The number of runs scored by two group of cricket players in a test match are

Group A   10    25   85   72   115   80   52   45   30   10
Group B   120   15   30   35   42    65   80   34   25   15

Test which group is more consistent.

Solution:

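For Group A

No. of runs (X)   X²
10                100
25                625
85                7225
72                5184
115               13225
80                6400
52                2704
45                2025
30                900
10                100
∑X = 524          ∑X² = 38488

X̄ = ∑X/n = 524/10 = 52.4

S.d., σ = √[∑X²/n − (∑X/n)²] = √(3848.8 − 2745.76) = √1103.04 = 33.212

C.V. (Group A) = (σ / X̄) × 100%

               = (33.212/52.4) × 100%

               = 63.38%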

 

 

 

 

For Group B

No. of runs (X)   X²
120               14400
15                225
30                900
35                1225
42                1764
65                4225
80                6400
34                1156
25                625
15                225
∑X = 461          ∑X² = 31145

X̄ = ∑X/n = 461/10 = 46.1

S.d., σ = √[∑X²/n − (∑X/n)²] = √(3114.5 − 2125.21) = √989.29 = 31.453

C.V. (Group B) = (σ / X̄) × 100%

               = (31.453/46.1) × 100%

               = 68.23%

Since C.V. (Group A) < C.V. (Group B), group A is more consistent.
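The comparison can be recomputed directly from the raw scores. This is a minimal sketch using Python's standard `statistics` module; the helper name `coefficient_of_variation` is hypothetical, and `pstdev` is used because the notes divide by n rather than n − 1.

```python
import statistics

def coefficient_of_variation(data):
    """C.V. = (standard deviation / mean) * 100, with the divide-by-n form of S.D."""
    return statistics.pstdev(data) / statistics.fmean(data) * 100

group_a = [10, 25, 85, 72, 115, 80, 52, 45, 30, 10]
group_b = [120, 15, 30, 35, 42, 65, 80, 34, 25, 15]

cv_a = coefficient_of_variation(group_a)
cv_b = coefficient_of_variation(group_b)
# The group with the smaller C.V. is the more consistent one
more_consistent = "Group A" if cv_a < cv_b else "Group B"
print(round(cv_a, 2), round(cv_b, 2), more_consistent)
```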


Example 18: The following data represent the scores made in an intelligence test by two groups of students from Section A and Section B of a college.

| Student no. | Section A | Section B | Student no. | Section A | Section B |
|---|---|---|---|---|---|
| 1 | 9 | 10 | 6 | 8 | 8 |
| 2 | 8 | 8 | 7 | 5 | 7 |
| 3 | 10 | 6 | 8 | 6 | 8 |
| 4 | 6 | 8 | 9 | 7 | 5 |
| 5 | 7 | 9 | 10 | 8 | 8 |

Test which group is more consistent.

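Example 19: What are the roles of measures of dispersion in descriptive statistics? The following table gives the frequency distribution of thickness of computer chips (in nanometres) manufactured by two companies.

| Thickness of computer chips | 5 | 10 | 15 | 20 | 25 | 30 |
|---|---|---|---|---|---|---|
| No. of chips, Company A | 10 | 15 | 24 | 20 | 18 | 13 |
| No. of chips, Company B | 12 | 18 | 20 | 22 | 24 | 4 |

Which company may be considered more consistent in terms of thickness of computer chips? Apply appropriate descriptive statistics.

Solution:

For Company A

| Thickness of computer chips (X) | No. of chips (f) | d = X − 15 | fd | fd² |
|---|---|---|---|---|
| 5 | 10 | -10 | -100 | 1000 |
| 10 | 15 | -5 | -75 | 375 |
| 15 | 24 | 0 | 0 | 0 |
| 20 | 20 | 5 | 100 | 500 |
| 25 | 18 | 10 | 180 | 1800 |
| 30 | 13 | 15 | 195 | 2925 |
| | N = ∑f = 100 | | ∑fd = 300 | ∑fd² = 6600 |

$$\text{Mean } (\bar{X}) = a + \frac{\sum f d}{N} = 15 + \frac{300}{100} = 18$$

$$\text{S.D. } (\sigma) = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} = \sqrt{\frac{6600}{100} - \left(\frac{300}{100}\right)^2} = \sqrt{57} = 7.55$$

$$\text{C.V. (Company A)} = \frac{\sigma}{\bar{X}} \times 100\% = \frac{7.55}{18} \times 100\% = 41.94\%$$

For Company B

| Thickness of computer chips (X) | No. of chips (f) | d = X − 15 | fd | fd² |
|---|---|---|---|---|
| 5 | 12 | -10 | -120 | 1200 |
| 10 | 18 | -5 | -90 | 450 |
| 15 | 20 | 0 | 0 | 0 |
| 20 | 22 | 5 | 110 | 550 |
| 25 | 24 | 10 | 240 | 2400 |
| 30 | 4 | 15 | 60 | 900 |
| | N = ∑f = 100 | | ∑fd = 200 | ∑fd² = 5500 |

$$\text{Mean } (\bar{X}) = a + \frac{\sum f d}{N} = 15 + \frac{200}{100} = 17$$

$$\text{S.D. } (\sigma) = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} = \sqrt{\frac{5500}{100} - \left(\frac{200}{100}\right)^2} = \sqrt{51} = 7.141$$

$$\text{C.V. (Company B)} = \frac{\sigma}{\bar{X}} \times 100\% = \frac{7.141}{17} \times 100\% = 42.01\%$$

Since C.V. (Company A) < C.V. (Company B), Company A is considered more consistent than Company B in terms of thickness of computer chips.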
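The assumed-mean (change of origin) computation for a discrete frequency distribution can be sketched in Python as follows. The function name `cv_assumed_mean` is hypothetical; the assumed mean a = 15 is the one used in the tables above.

```python
from math import sqrt


def cv_assumed_mean(values, freqs, assumed):
    """Mean, S.D. and C.V. (%) via deviations d = X - a from an assumed mean a."""
    n = sum(freqs)
    sum_fd = sum(f * (x - assumed) for x, f in zip(values, freqs))
    sum_fd2 = sum(f * (x - assumed) ** 2 for x, f in zip(values, freqs))
    mean = assumed + sum_fd / n
    sd = sqrt(sum_fd2 / n - (sum_fd / n) ** 2)
    return mean, sd, sd / mean * 100


thickness = [5, 10, 15, 20, 25, 30]
company_a = [10, 15, 24, 20, 18, 13]
company_b = [12, 18, 20, 22, 24, 4]

mean_a, sd_a, cv_a = cv_assumed_mean(thickness, company_a, assumed=15)
mean_b, sd_b, cv_b = cv_assumed_mean(thickness, company_b, assumed=15)
print(mean_a, round(cv_a, 2))
print(mean_b, round(cv_b, 2))
```

The two coefficients are very close, so the conclusion rests on a small difference; the choice of assumed mean does not affect the result.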

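Example 20: The following table shows the monthly expenditure of families in ward no. 1 and ward no. 2 of Kathmandu Metropolitan City in a certain locality.

| Expenditure (in 000 Rs.) | 0-5 | 5-10 | 10-15 | 15-20 | 20-25 | 25-30 |
|---|---|---|---|---|---|---|
| No. of families (ward no. 1) | 5 | 12 | 50 | 20 | 10 | 3 |
| No. of families (ward no. 2) | 7 | 15 | 40 | 18 | 12 | 8 |

Which ward has the more uniform expenditure?

Solution: The ward whose coefficient of variation (C.V.) is smaller has the more uniform expenditure.

For ward no. 1

| Expenditure (in 000 Rs.) | No. of families (ward no. 1) f | Mid. value (x) | d′ = (x − 12.5)/5 | fd′ | fd′² |
|---|---|---|---|---|---|
| 0-5 | 5 | 2.5 | -2 | -10 | 20 |
| 5-10 | 12 | 7.5 | -1 | -12 | 12 |
| 10-15 | 50 | 12.5 | 0 | 0 | 0 |
| 15-20 | 20 | 17.5 | 1 | 20 | 20 |
| 20-25 | 10 | 22.5 | 2 | 20 | 40 |
| 25-30 | 3 | 27.5 | 3 | 9 | 27 |
| | N = 100 | | | ∑fd′ = 27 | ∑fd′² = 119 |

$$\bar{X} = a + \frac{\sum f d'}{N} \times h = 12.5 + \frac{27}{100} \times 5 = 13.85$$

$$\sigma = \sqrt{\frac{\sum f d'^2}{N} - \left(\frac{\sum f d'}{N}\right)^2} \times h = \sqrt{\frac{119}{100} - \left(\frac{27}{100}\right)^2} \times 5 = \sqrt{1.1171} \times 5 = 5.2847$$

$$\text{C.V. (Ward no. 1)} = \frac{\sigma}{\bar{X}} \times 100\% = \frac{5.2847}{13.85} \times 100\% = 38.16\%$$

For ward no. 2

| Expenditure (in 000 Rs.) | No. of families (ward no. 2) f | Mid. value (x) | d′ = (x − 12.5)/5 | fd′ | fd′² |
|---|---|---|---|---|---|
| 0-5 | 7 | 2.5 | -2 | -14 | 28 |
| 5-10 | 15 | 7.5 | -1 | -15 | 15 |
| 10-15 | 40 | 12.5 | 0 | 0 | 0 |
| 15-20 | 18 | 17.5 | 1 | 18 | 18 |
| 20-25 | 12 | 22.5 | 2 | 24 | 48 |
| 25-30 | 8 | 27.5 | 3 | 24 | 72 |
| | N = 100 | | | ∑fd′ = 37 | ∑fd′² = 181 |

$$\bar{X} = a + \frac{\sum f d'}{N} \times h = 12.5 + \frac{37}{100} \times 5 = 14.35$$

$$\sigma = \sqrt{\frac{\sum f d'^2}{N} - \left(\frac{\sum f d'}{N}\right)^2} \times h = \sqrt{\frac{181}{100} - \left(\frac{37}{100}\right)^2} \times 5 = \sqrt{1.6731} \times 5 = 6.4674$$

$$\text{C.V. (Ward no. 2)} = \frac{\sigma}{\bar{X}} \times 100\% = \frac{6.4674}{14.35} \times 100\% = 45.07\%$$

Since C.V. (Ward no. 1) < C.V. (Ward no. 2), the people of ward no. 1 have more uniform expenditure than those of ward no. 2.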
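For a continuous series, the step-deviation method can be sketched in Python. The function name `grouped_cv` is hypothetical, and the sketch assumes equal class widths and an assumed mean taken at the mid-value 12.5 of the 10-15 class.

```python
from math import sqrt


def grouped_cv(class_limits, freqs, assumed_mid):
    """Mean, S.D. and C.V. (%) by the step-deviation method."""
    width = class_limits[0][1] - class_limits[0][0]       # h, assumed equal for all classes
    mids = [(low + high) / 2 for low, high in class_limits]
    steps = [(m - assumed_mid) / width for m in mids]      # d' = (x - a) / h
    n = sum(freqs)
    sum_fd = sum(f * d for f, d in zip(freqs, steps))
    sum_fd2 = sum(f * d * d for f, d in zip(freqs, steps))
    mean = assumed_mid + sum_fd / n * width
    sd = sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * width
    return mean, sd, sd / mean * 100


classes = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25), (25, 30)]
ward_1 = [5, 12, 50, 20, 10, 3]
ward_2 = [7, 15, 40, 18, 12, 8]

mean_1, sd_1, cv_1 = grouped_cv(classes, ward_1, assumed_mid=12.5)
mean_2, sd_2, cv_2 = grouped_cv(classes, ward_2, assumed_mid=12.5)
print(round(mean_1, 2), round(sd_1, 4), round(cv_1, 2))
print(round(mean_2, 2), round(sd_2, 4), round(cv_2, 2))
```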

 

 

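Example 21: The following table gives the lives of two bike models:

| Life (in years) | 0-2 | 2-4 | 4-6 | 6-8 | 8-10 |
|---|---|---|---|---|---|
| No. of bikes, Model T1 | 1 | 9 | 12 | 11 | 8 |
| No. of bikes, Model T2 | 5 | 7 | 11 | 19 | 9 |

Which model of bike has greater uniformity?

Solution:

We have to compute the coefficient of variation (C.V.) to determine the uniformity.

Computation of Sum of Values for Mean and S.D.

For Model T1

| Life (in years) | No. of bikes (f) | Mid. value (x) | d′ = (x − 5)/2 | fd′ | fd′² |
|---|---|---|---|---|---|
| 0-2 | 1 | 1 | -2 | -2 | 4 |
| 2-4 | 9 | 3 | -1 | -9 | 9 |
| 4-6 | 12 | 5 | 0 | 0 | 0 |
| 6-8 | 11 | 7 | 1 | 11 | 11 |
| 8-10 | 8 | 9 | 2 | 16 | 32 |
| | N = ∑f = 41 | | | ∑fd′ = 16 | ∑fd′² = 56 |

$$\bar{X} = a + \frac{\sum f d'}{N} \times h = 5 + \frac{16}{41} \times 2 = 5 + 0.78 = 5.78 \text{ years}$$

$$\sigma = \sqrt{\frac{\sum f d'^2}{N} - \left(\frac{\sum f d'}{N}\right)^2} \times h = \sqrt{\frac{56}{41} - \left(\frac{16}{41}\right)^2} \times 2 = 2.203 \text{ years}$$

$$\text{C.V. (Model } T_1) = \frac{\sigma}{\bar{X}} \times 100\% = \frac{2.203}{5.78} \times 100\% \approx 38.1\%$$

For Model T2

| Life (in years) | No. of bikes (f) | Mid. value (x) | d′ = (x − 5)/2 | fd′ | fd′² |
|---|---|---|---|---|---|
| 0-2 | 5 | 1 | -2 | -10 | 20 |
| 2-4 | 7 | 3 | -1 | -7 | 7 |
| 4-6 | 11 | 5 | 0 | 0 | 0 |
| 6-8 | 19 | 7 | 1 | 19 | 19 |
| 8-10 | 9 | 9 | 2 | 18 | 36 |
| | N = ∑f = 51 | | | ∑fd′ = 20 | ∑fd′² = 82 |

$$\bar{X} = a + \frac{\sum f d'}{N} \times h = 5 + \frac{20}{51} \times 2 = 5 + 0.7843 = 5.7843 \text{ years}$$

$$\sigma = \sqrt{\frac{\sum f d'^2}{N} - \left(\frac{\sum f d'}{N}\right)^2} \times h = \sqrt{\frac{82}{51} - \left(\frac{20}{51}\right)^2} \times 2 = 2.4116 \text{ years}$$

$$\text{C.V. (Model } T_2) = \frac{\sigma}{\bar{X}} \times 100\% = \frac{2.4116}{5.7843} \times 100\% \approx 41.69\%$$

Since C.V. (Model T1) < C.V. (Model T2), bike model T1 has greater uniformity than model T2.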

 

PU 2014 (Spring), 2015 (Spring), 2018 (Fall)

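b) The lives of two models (A and B) of refrigerators in a recent survey are shown below:

| Life (no. of years) | No. of refrigerators, Model A | No. of refrigerators, Model B |
|---|---|---|
| 0-2 | 5 | 2 |
| 2-4 | 16 | 7 |
| 4-6 | 13 | 12 |
| 6-8 | 7 | 19 |
| 8-10 | 5 | 9 |
| 10-12 | 4 | 1 |

i. What is the average life of each model of these refrigerators?

ii. Which model has greater uniformity?

Solution: We have to compute the coefficient of variation (C.V.) to determine the uniformity.

For Model A

| Life (in years) | No. of refrigerators (f) | Mid. value (x) | fx | fx² |
|---|---|---|---|---|
| 0-2 | 5 | 1 | 5 | 5 |
| 2-4 | 16 | 3 | 48 | 144 |
| 4-6 | 13 | 5 | 65 | 325 |
| 6-8 | 7 | 7 | 49 | 343 |
| 8-10 | 5 | 9 | 45 | 405 |
| 10-12 | 4 | 11 | 44 | 484 |
| | N = ∑f = 50 | | ∑fx = 256 | ∑fx² = 1706 |

$$\bar{X} = \frac{\sum f x}{N} = \frac{256}{50} = 5.12 \text{ years}$$

$$\sigma = \sqrt{\frac{\sum f x^2}{N} - \left(\frac{\sum f x}{N}\right)^2} = \sqrt{\frac{1706}{50} - (5.12)^2} = \sqrt{7.9056} = 2.812 \text{ years}$$

$$\text{C.V. (Model A)} = \frac{\sigma}{\bar{X}} \times 100\% = \frac{2.812}{5.12} \times 100\% = 54.92\%$$

For Model B

| Life (in years) | No. of refrigerators (f) | Mid. value (x) | fx | fx² |
|---|---|---|---|---|
| 0-2 | 2 | 1 | 2 | 2 |
| 2-4 | 7 | 3 | 21 | 63 |
| 4-6 | 12 | 5 | 60 | 300 |
| 6-8 | 19 | 7 | 133 | 931 |
| 8-10 | 9 | 9 | 81 | 729 |
| 10-12 | 1 | 11 | 11 | 121 |
| | N = ∑f = 50 | | ∑fx = 308 | ∑fx² = 2146 |

$$\bar{X} = \frac{\sum f x}{N} = \frac{308}{50} = 6.16 \text{ years}$$

$$\sigma = \sqrt{\frac{\sum f x^2}{N} - \left(\frac{\sum f x}{N}\right)^2} = \sqrt{\frac{2146}{50} - (6.16)^2} = \sqrt{4.9744} = 2.2303 \text{ years}$$

$$\text{C.V. (Model B)} = \frac{\sigma}{\bar{X}} \times 100\% = \frac{2.2303}{6.16} \times 100\% = 36.21\%$$

(i) The average lives of the two models are X̄ (Model A) = 5.12 years and X̄ (Model B) = 6.16 years.

(ii) Since C.V. (Model A) > C.V. (Model B), refrigerator model B has greater uniformity than model A.

PU 2015 (Fall)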

1. b) Lives of two models A & B of objects in a recent survey are:

| Life | 0-2 | 2-4 | 4-6 | 6-8 | 8-10 | 10-12 |
|---|---|---|---|---|---|---|
| Model A | 5 | 16 | 13 | 7 | 5 | 4 |
| Model B | 2 | 7 | 12 | 19 | 9 | 1 |

Which model has greater uniformity?
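The distribution here is the same one used in the refrigerator problem above, so the direct method (working with fx and fx² rather than deviations) applies unchanged. A minimal Python sketch, with the hypothetical helper name `direct_method` and class mid-values 1, 3, ..., 11 taken from the class limits:

```python
from math import sqrt


def direct_method(mids, freqs):
    """Mean, S.D. and C.V. (%) of a grouped series by the direct method."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(mids, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(mids, freqs))
    mean = sum_fx / n
    sd = sqrt(sum_fx2 / n - mean ** 2)
    return mean, sd, sd / mean * 100


mids = [1, 3, 5, 7, 9, 11]            # mid-values of 0-2, 2-4, ..., 10-12
model_a = [5, 16, 13, 7, 5, 4]
model_b = [2, 7, 12, 19, 9, 1]

mean_a, sd_a, cv_a = direct_method(mids, model_a)
mean_b, sd_b, cv_b = direct_method(mids, model_b)

# The model with the smaller C.V. has the greater uniformity.
more_uniform = "Model A" if cv_a < cv_b else "Model B"
print(mean_a, round(sd_a, 4), round(cv_a, 2))
print(mean_b, round(sd_b, 4), round(cv_b, 2))
print(more_uniform)
```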

PU 2016 (Fall)

1. (a) For a computer-controlled lathe whose performance was below par, workers recorded the following causes and their frequencies:

| Cause | Frequency |
|---|---|
| Power fluctuation | 6 |
| Controller not stable | 22 |
| Operator error | 13 |
| Worn tool not replaced | 2 |
| Other | 5 |

Construct a Pareto chart.

(i) What percentage of the cases are due to an unstable controller?

(ii) What percentage of the cases are due to either an unstable controller or operator error?

Solution:

Arrange the data in descending order of frequency and obtain the cumulative frequencies and percentage cumulative frequencies as follows:

| Categories | Frequency | Cumulative frequency | % cumulative frequency |
|---|---|---|---|
| Controller not stable | 22 | 22 | 46 |
| Operator error | 13 | 35 | 73 |
| Power fluctuation | 6 | 41 | 85 |
| Other | 5 | 46 | 96 |
| Worn tool not replaced | 2 | 48 | 100 |

(i) The percentage of cases due to an unstable controller is

$$\frac{22}{48} \times 100 = 45.83\%$$

(ii) The number of cases due to either an unstable controller or operator error = 22 + 13 = 35, so the percentage of such cases is

$$\frac{35}{48} \times 100 = 72.92\%$$
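The Pareto table can be built programmatically; the bars of the chart are the frequencies in this order and the line is the cumulative percentage. The following Python sketch only computes the table (no plotting library is assumed), and the variable names are illustrative.

```python
causes = {
    "Power fluctuation": 6,
    "Controller not stable": 22,
    "Operator error": 13,
    "Worn tool not replaced": 2,
    "Other": 5,
}

total = sum(causes.values())

# Sort categories by frequency, largest first, as a Pareto chart requires.
ordered = sorted(causes.items(), key=lambda item: item[1], reverse=True)

rows = []
running = 0
for cause, freq in ordered:
    running += freq                                   # cumulative frequency
    rows.append((cause, freq, running, round(100 * running / total)))

for row in rows:
    print(row)

share_controller = 100 * causes["Controller not stable"] / total
share_controller_or_operator = (
    100 * (causes["Controller not stable"] + causes["Operator error"]) / total
)
print(round(share_controller, 2), round(share_controller_or_operator, 2))
```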

PU 2016 (Spring)

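1. (b) An analysis of monthly wages paid to the workers in two firms A and B belonging to the same industry gives the following results (use population formulas):

| | Firm A | Firm B |
|---|---|---|
| No. of workers | 500 | 600 |
| Average monthly wages (Rs) | 186 | 175 |
| Variance of distribution of wages | 81 | 100 |

i. Which firm, A or B, has a larger wage bill?

ii. In which firm, A or B, is there greater variability in individual wages?

iii. Calculate (a) the average monthly wages and (b) the variance of the distribution of wages of all the workers in firms A and B taken together.

Solution:

| | Firm A | Firm B |
|---|---|---|
| No. of workers | n1 = 500 | n2 = 600 |
| Mean | X̄1 = Rs 186 | X̄2 = Rs 175 |
| Variance | σ1² = 81 | σ2² = 100 |
| S.D. | σ1 = √81 = 9 | σ2 = √100 = 10 |

(i) Since the mean is the total divided by the number of workers, the wage bill is the product of the two.

For firm A: $$\bar{X}_1 = \frac{\sum X_1}{n_1}$$

or, ∑X1 = n1 × X̄1 = 500 × 186 = Rs. 93000

For firm B: $$\bar{X}_2 = \frac{\sum X_2}{n_2}$$

or, ∑X2 = n2 × X̄2 = 600 × 175 = Rs. 105000

Since ∑X1 < ∑X2, firm B has a larger wage bill than firm A.

(ii)

$$\text{C.V. (Firm A)} = \frac{\sigma_1}{\bar{X}_1} \times 100\% = \frac{9}{186} \times 100\% = 4.839\%$$

$$\text{C.V. (Firm B)} = \frac{\sigma_2}{\bar{X}_2} \times 100\% = \frac{10}{175} \times 100\% = 5.714\%$$

Since C.V. (Firm A) < C.V. (Firm B), there is greater variability in individual wages in firm B than in firm A.

(iii) (a) The average monthly wage of all the workers in firms A and B taken together is given by

$$\bar{X}_{12} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2} = \frac{500 \times 186 + 600 \times 175}{500 + 600} = \frac{198000}{1100} = \text{Rs. } 180$$

(b) The variance of the distribution of wages of all the workers in firms A and B taken together is

$$\sigma_{12}^2 = \frac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2} = \frac{500(81 + 36) + 600(100 + 25)}{1100} = \frac{133500}{1100} = 121.36$$

where d1 = X̄1 − X̄12 = 186 − 180 = 6 and d2 = X̄2 − X̄12 = 175 − 180 = −5.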
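The combined mean and combined variance formulas generalise to any number of groups. A minimal Python sketch, assuming each group is described by a hypothetical `(n, mean, variance)` tuple:

```python
def combined_mean_variance(groups):
    """Combined mean and variance from (n, mean, variance) summaries."""
    total_n = sum(n for n, _, _ in groups)
    mean = sum(n * m for n, m, _ in groups) / total_n
    # Each group contributes its own variance plus the squared deviation
    # d = (group mean - combined mean), weighted by its size.
    variance = sum(n * (v + (m - mean) ** 2) for n, m, v in groups) / total_n
    return mean, variance


firm_a = (500, 186, 81)
firm_b = (600, 175, 100)

mean_ab, var_ab = combined_mean_variance([firm_a, firm_b])
print(mean_ab, round(var_ab, 2))
```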

Example

For a group of 200 candidates, the mean and standard deviation were found to be 40 and 15. Later it was discovered that the score 53 had been misread as 35. Find the corrected mean and standard deviation.

Solution:

Given:

n = 200

Mean (X̄) = 40

Standard deviation (σ) = 15

Wrong observation (i.e. wrong score) = 35

Correct observation (i.e. correct score) = 53

Corrected mean (X̄ correct) = ?

Corrected standard deviation (σ correct) = ?

We know,

$$\bar{X} = \frac{\sum X}{n}$$

or, 40 = ∑X / 200

or, ∑X = 200 × 40

or, ∑X = 8000

Corrected ∑X = ∑X – Wrong observation + Correct observation

= 8000 – 35 + 53 = 8018

$$\text{Correct mean} = \frac{\text{Corrected } \sum X}{n} = \frac{8018}{200} = 40.09$$

Again,

$$\text{S.D. } (\sigma) = \sqrt{\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2}$$

$$\text{or, } 15 = \sqrt{\frac{\sum X^2}{200} - (40)^2}$$

Squaring both sides,

$$\text{or, } 225 = \frac{\sum X^2}{200} - 1600$$

$$\text{or, } 1825 = \frac{\sum X^2}{200}$$

or, ∑X² = 1825 × 200 = 365000

Corrected ∑X² = ∑X² – (Wrong observation)² + (Correct observation)²

= 365000 – (35)² + (53)² = 366584

$$\text{Corrected S.D.} = \sqrt{\frac{366584}{200} - (40.09)^2} = \sqrt{1832.92 - 1607.2081} = \sqrt{225.7119} = 15.02$$
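The correction procedure (recover ∑X and ∑X² from the reported mean and S.D., swap the wrong observations for the correct ones, then recompute) can be sketched in Python. The function name `corrected_mean_sd` is hypothetical; it accepts lists so that more than one misread value can be corrected at once.

```python
from math import sqrt


def corrected_mean_sd(n, mean, sd, wrong, correct):
    """Corrected mean and S.D. after replacing misread observations."""
    sum_x = n * mean                          # from mean = sum(X) / n
    sum_x2 = n * (sd ** 2 + mean ** 2)        # from sd^2 = sum(X^2)/n - mean^2
    sum_x += sum(correct) - sum(wrong)
    sum_x2 += sum(c * c for c in correct) - sum(w * w for w in wrong)
    new_mean = sum_x / n
    new_sd = sqrt(sum_x2 / n - new_mean ** 2)
    return new_mean, new_sd


new_mean, new_sd = corrected_mean_sd(200, 40, 15, wrong=[35], correct=[53])
print(round(new_mean, 2), round(new_sd, 2))
```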

 

 

 

Example

The mean and standard deviation of a set of 100 workers were found to be 40 and 12 respectively. On checking, it was found that two observations were wrongly taken as 23 and 15 instead of 43 and 18.

Calculate the correct mean and standard deviation. Also, find the correct variance.

Solution:

Given:

Total no. of observations (n) = 100

Mean (X̄) = 40

Standard deviation (σ) = 12

Wrong observations = 23 and 15

Correct observations = 43 and 18

We know,

$$\bar{X} = \frac{\sum X}{n}$$

or, 40 = ∑X / 100

or, ∑X = 100 × 40

or, ∑X = 4000

Corrected ∑X = ∑X – Wrong observations + Correct observations

= 4000 – 23 – 15 + 43 + 18 = 4023

$$\text{Correct mean} = \frac{\text{Corrected } \sum X}{n} = \frac{4023}{100} = 40.23$$

Again,

$$\text{S.D. } (\sigma) = \sqrt{\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2}$$

$$\text{or, } 12 = \sqrt{\frac{\sum X^2}{100} - (40)^2}$$

Squaring both sides,

$$\text{or, } 144 = \frac{\sum X^2}{100} - 1600$$

$$\text{or, } 1744 = \frac{\sum X^2}{100}$$

or, ∑X² = 1744 × 100 = 174400

Corrected ∑X² = ∑X² – (Wrong observations)² + (Correct observations)²

= 174400 – (23)² – (15)² + (43)² + (18)²

= 174400 – 529 – 225 + 1849 + 324

= 175819

$$\text{Corrected S.D.} = \sqrt{\frac{175819}{100} - (40.23)^2} = \sqrt{1758.19 - 1618.4529} = \sqrt{139.7371} = 11.82$$

$$\text{Correct variance } (\sigma^2_{\text{correct}}) = \frac{175819}{100} - (40.23)^2 = 139.737$$

Hence, corrected mean (X̄) = 40.23, corrected standard deviation = 11.82 and correct variance = 139.737.

Additional question

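A factory produces two types of CFL bulbs, A and B. The following results were obtained relating to their life:

| | Bulb A | Bulb B |
|---|---|---|
| No. of bulbs | 100 | 90 |
| Average length of life | 900 hours | 1000 hours |
| Variance | 121 | 144 |

(a) Compare the variability of life of the two types of CFL bulbs.

(b) Calculate the standard deviation of both types of CFL bulbs taken together.

(c) Also compute the coefficient of variation of both types of CFL bulbs taken together.

Solution:

(a)

| | Bulb A | Bulb B |
|---|---|---|
| No. of bulbs | n1 = 100 | n2 = 90 |
| Mean | X̄1 = 900 hours | X̄2 = 1000 hours |
| S.D. | σ1 = √121 = 11 | σ2 = √144 = 12 |

$$\text{C.V. (Bulb A)} = \frac{\sigma_1}{\bar{X}_1} \times 100\% = \frac{11}{900} \times 100\% = 1.222\%$$

$$\text{C.V. (Bulb B)} = \frac{\sigma_2}{\bar{X}_2} \times 100\% = \frac{12}{1000} \times 100\% = 1.2\%$$

Since C.V. (Bulb A) > C.V. (Bulb B), the life of Bulb A is more variable than that of Bulb B. That is, the life of Bulb B is more consistent than that of Bulb A.

(b) The standard deviation of both types of CFL bulbs taken together (i.e. the combined standard deviation) is

$$\sigma_{12} = \sqrt{\frac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2}} = \sqrt{\frac{100(121 + 2243.77) + 90(144 + 2770.08)}{190}} = \sqrt{2624.97} = 51.23 \text{ hours}$$

where

$$\bar{X}_{12} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2} = \frac{100 \times 900 + 90 \times 1000}{190} = 947.3684$$

d1 = X̄1 − X̄12 = 900 − 947.3684 = −47.3684

d2 = X̄2 − X̄12 = 1000 − 947.3684 = 52.6316

(c) The coefficient of variation of both types of CFL bulbs taken together (i.e. the combined C.V.) is

$$\text{Combined C.V.} = \frac{\sigma_{12}}{\bar{X}_{12}} \times 100\% = \frac{51.23}{947.3684} \times 100\% = 5.41\%$$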

 

PU2014 (fall)

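1. (b) The first of two groups has 100 items with mean 45 and variance 49. If the combined group has 250 items with mean 51 and variance 130, find the mean and standard deviation of the second group.

Solution:

| | First group | Second group | Combined group |
|---|---|---|---|
| No. of items | n1 = 100 | n2 = 250 − 100 = 150 | n1 + n2 = 250 |
| Mean | X̄1 = 45 | X̄2 = ? | X̄12 = 51 |
| Variance | σ1² = 49 | σ2² = ? | σ12² = 130 |

$$\bar{X}_{12} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2}$$

$$\text{or, } 51 = \frac{100 \times 45 + 150 \bar{X}_2}{250}$$

or, 4500 + 150 X̄2 = 12750

or, 150 X̄2 = 12750 − 4500

or, 150 X̄2 = 8250

or, X̄2 = 8250 / 150

X̄2 = 55

And

$$\sigma_{12}^2 = \frac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2}$$

where

d1 = X̄1 − X̄12 = 45 − 51 = −6

d2 = X̄2 − X̄12 = 55 − 51 = 4

$$\text{or, } 130 = \frac{100(49 + 36) + 150(\sigma_2^2 + 16)}{250}$$

or, 8500 + 150 σ2² + 2400 = 130 × 250

or, 10900 + 150 σ2² = 32500

or, 150 σ2² = 32500 − 10900

or, 150 σ2² = 21600

or, σ2² = 21600 / 150

or, σ2² = 144

σ2 = 12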
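Solving the combined-group formulas for the unknown second group can be sketched in Python. The function name `second_group_stats` is hypothetical; it simply rearranges the combined mean and combined variance formulas, so the result can be checked by substituting it back into them.

```python
from math import sqrt


def second_group_stats(n1, mean1, var1, n_total, mean_total, var_total):
    """Size, mean, variance and S.D. of the second group from combined summaries."""
    n2 = n_total - n1
    # From the combined mean: n_total * mean_total = n1 * mean1 + n2 * mean2
    mean2 = (n_total * mean_total - n1 * mean1) / n2
    d1 = mean1 - mean_total
    d2 = mean2 - mean_total
    # From the combined variance:
    # n_total * var_total = n1 * (var1 + d1^2) + n2 * (var2 + d2^2)
    var2 = (n_total * var_total - n1 * (var1 + d1 ** 2)) / n2 - d2 ** 2
    return n2, mean2, var2, sqrt(var2)


n2, mean2, var2, sd2 = second_group_stats(100, 45, 49, 250, 51, 130)
print(n2, mean2, var2, sd2)

# Substituting back reproduces the given combined variance.
check = (100 * (49 + (45 - 51) ** 2) + n2 * (var2 + (mean2 - 51) ** 2)) / 250
print(check)
```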

 

 

              The End.