Basic Statistics – III – Introduction to Probability

Probability is a simple concept, but a powerful one when applied correctly. A very simple example: when a coin is tossed, there are two possible outcomes, head and tail. If the coin has no bias, both outcomes are equally likely, and we say each has a probability of 50% (0.5). Let us extend this to another common game. If we throw a die, there are six possible outcomes, because the die has 6 sides numbered 1 to 6. Here the probability of each side is 1/6, meaning each side has a chance of about 16.67% (0.1667). In both cases, each outcome is an event, and its chance of occurrence is called its probability.
There are many definitions of probability.
The simplest is “the measure of the likelihood of an occurrence”. The classical definition of probability is stated below.
If there are “n” exhaustive, mutually exclusive and equally likely events, and “m” of them are favorable to an event “E”, then the probability of occurrence of event “E”, denoted by Pr[E], is

Pr[E] = m/n
Here the n events need to fulfill 3 conditions: 1 – Mutually Exclusive (events are mutually exclusive if there is no possibility of them occurring together. Ex: head and tail of the same coin), 2 – Collectively Exhaustive (all the possible events are taken into account. Ex: for the coin there are 2), and 3 – Equally Likely (there is no bias towards any event).
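To make the formula concrete, here is a minimal sketch in Python. The function name `classical_probability` is my own choice for illustration, and it simply assumes the three conditions above already hold for the outcomes being counted.

```python
from fractions import Fraction


def classical_probability(favorable, total):
    """Return Pr[E] = m/n, assuming n equally likely, mutually
    exclusive and collectively exhaustive outcomes."""
    if total <= 0 or not 0 <= favorable <= total:
        raise ValueError("need n > 0 and 0 <= m <= n")
    return Fraction(favorable, total)


# Fair coin: 1 favorable outcome (head) out of 2.
p_head = classical_probability(1, 2)
# Fair die: 1 favorable side out of 6.
p_six = classical_probability(1, 6)
# Fair die, event "even number": sides 2, 4 and 6 are favorable.
p_even = classical_probability(3, 6)

print(p_head, float(p_head))          # 1/2 0.5
print(p_six, round(float(p_six), 4))  # 1/6 0.1667
print(p_even)                         # 1/2
```

Using `Fraction` keeps the result exact (1/6 rather than a rounded decimal), which makes it easier to see the m/n structure.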
Statistical (Empirical) definition of Probability:
If an experiment is repeated many times under identical conditions, then the limit of the ratio of the number of times that an event happens (m) to the total number of trials (n), as the number of trials increases indefinitely, is called the probability of the event happening.
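The empirical definition can be illustrated with a small simulation. This is only a sketch: it uses Python's pseudo-random generator as a stand-in for a real coin, and the function name `relative_frequency` and the fixed seed are my own assumptions.

```python
import random


def relative_frequency(trials, seed=0):
    """Toss a simulated fair coin `trials` times and return m/n,
    where m is the number of heads and n the number of trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(1 for _ in range(trials) if rng.random() < 0.5)
    return heads / trials


# The ratio m/n should settle closer to 0.5 as n grows.
for n in (10, 100, 10_000, 1_000_000):
    print(n, relative_frequency(n))
```

With few trials the ratio can be far from 0.5; with many trials it tends to stay close to it, which is the behaviour the definition describes.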
Please share this on Facebook. if you like this blog, like us on https://www.facebook.com/excellencementor

Originally posted 2014-02-23 11:10:00.

Basic Statistics – II What is a Variable and What are variable types

A variable is a characteristic, number, or quantity that increases or decreases over time, or takes different values in different situations. Variables are the basic units used in statistics for measuring, collecting and analyzing data. Variables can be classified into different categories depending on their use at the point of analysis. The different variable types are:

Dependent and Independent Variable types 

An independent variable can take any value and can be controlled and measured. These are the inputs used for the study. They are also called factors.
A dependent variable cannot be controlled; it can only be measured. It is generally the output of changes made to the independent variables. The value of the dependent variable depends on its relationship with the independent variable. These are called responses.
Note that the roles of dependent and independent variable are not fixed. A dependent variable in one experiment or study may become a factor in a different experiment or study.
For example, the heat generated depends on the amount of fuel burnt. In this case, heat is the dependent variable and amount of fuel is the independent variable.
In a different experiment, the time taken to completely evaporate a substance depends on the amount of heat supplied. In this case, the time taken is the dependent variable and amount of heat is the independent variable. Note that amount of heat is dependent in one experiment and independent in the other.

Qualitative and Quantitative Variable types

Variables are also classified according to the type of the data they represent. This classification depends on the type of the value associated with the variable.
A qualitative variable describes characteristics in a non-numerical form. These are also called categorical variables. Examples of the values a categorical variable can take are good, bad, red, blue, light, heavy, etc. The corresponding variables are result, color, weight, etc. When the categories have no natural order, as with color, the variable is also called a nominal variable.
A quantitative variable has a numerical value associated with it. This is a counted or a measured value. These are also called numerical variables. Examples of the values are numbers such as 0, -1, 1, 2, etc. Examples of such variables are height, weight, etc.
Note that the same variable can be qualitative or quantitative depending on the values it takes. For example, if height is given a measured value such as 1.72 meters, height is a quantitative variable. If the same height is expressed as a comparative value such as tall or short, height is a qualitative variable.
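The height example can be sketched in a few lines of Python. The cut-off values and the extra “medium” label are purely illustrative assumptions of mine, not standard definitions.

```python
def height_category(height_m, short_below=1.60, tall_from=1.80):
    """Map a measured height in meters (quantitative) to a label
    (qualitative). The cut-offs are illustrative assumptions."""
    if height_m < short_below:
        return "short"
    if height_m >= tall_from:
        return "tall"
    return "medium"


heights_m = [1.52, 1.72, 1.85]                     # quantitative values
labels = [height_category(h) for h in heights_m]   # qualitative values
print(labels)  # ['short', 'medium', 'tall']
```

The same three people are described both ways; only the kind of value recorded for the variable has changed.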

Discrete and Continuous Variables.

A discrete variable is typically the result of counting. It can take only specific, separate values, which may include negative and fractional values. Examples of discrete variables are number of people, charge on an electron, etc. As a rule of thumb, if the variable has the prefix “number of”, it can be treated as a discrete variable.
A continuous variable can take any value within a specified range. This is generally a measured value. Examples of continuous variables are speed, height, distance, etc.
Discrete and continuous variables are subsets of the numerical variable types.

Binomial, Nominal and Ordinal Variables.

A binomial variable (also called a binary variable) can take only two possible values. There is no third option available. Examples are the result of a test (pass or fail), the result of tossing a coin (head or tail), etc.
A nominal variable can take several unordered values. Examples are color (red, blue, green) and type of bank account (savings, checking, etc.).
An ordinal variable can take any of several ordered values. There is a clear order among the values assigned. Examples are height (tall, short) or the response in a satisfaction survey (excellent, good, poor, etc.).
Binomial, nominal and ordinal variables are subsets of the qualitative variable types.

Originally posted 2014-01-01 18:25:00.

Sixth principle of SPC – causes for Variation

According to the sixth principle of SPC, a frequency distribution will deviate from the normal distribution only in the presence of an assignable cause.
A frequency distribution is a tally of measurements that shows the number of times each measurement occurs in the tally. From this frequency distribution we can see whether only chance causes are present in the process or whether any assignable causes are acting.
If there is a distortion from the normal curve, we can say that assignable causes are present. This finding can help us to find the causes and address them.
The presence of assignable causes will tend to distort the shape, the center or the spread of the distribution, as seen earlier. This indication forms the basis of various techniques used in Statistical Process Control.
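As a rough illustration of such a distortion, the sketch below compares the skewness of a simulated stable process with the same process under a made-up disturbance. Skewness is my own choice of indicator here, not a technique named in this post, and the process values (mean 10.0, spread 0.1, a +0.5 shift on about 10% of parts) are hypothetical.

```python
import random
from statistics import mean, pstdev


def skewness(values):
    """Sample skewness: close to 0 for a symmetric, bell-shaped tally."""
    m = mean(values)
    s = pstdev(values)
    return sum(((x - m) / s) ** 3 for x in values) / len(values)


rng = random.Random(42)  # fixed seed so the run is repeatable

# Only chance causes: measurements scatter normally around 10.0.
stable = [rng.gauss(10.0, 0.1) for _ in range(2000)]

# Hypothetical assignable cause: about 10% of parts get an extra +0.5.
disturbed = [x + (0.5 if rng.random() < 0.1 else 0.0) for x in stable]

print("stable:   ", round(skewness(stable), 2))
print("disturbed:", round(skewness(disturbed), 2))
```

The stable data should give a value near zero, while the disturbed data gives a clearly positive value because the shifted parts stretch one side of the curve.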

Originally posted 2012-05-02 01:31:00.

Fifth Principle of SPC – shape of the distribution

The Fifth Principle of SPC states that it is possible to determine the shape of the distribution from the measurements of any process. We can learn about what the process is doing, against what we want the process to do. For this we need to compare the measured output of the process with the design specifications. The process can be altered if we do not like the comparison, especially if we see a variation.
We need to address the variation so that it falls in the required pattern. Variation is mainly of 2 types: common cause variation and special cause variation.
If the variation in output is caused only by common causes, the output will vary in a normal and predictable manner. In such cases, the process is said to be “stable” or “in a state of Statistical Control”.  While the individual measurements may differ from each other, they tend to follow a Normal Distribution.
The normal distribution is characterized by the following

  • Location (Typical Value)
  • Spread – Amount by which the values differ from the center.

The shape of the distribution will deviate from the normal curve in case of any unusual occurrences. The causes of these changes are called assignable causes.

The presence of assignable causes will result in a difference from the usual normal curve, either in shape, or in spread, or a combination of both.
Some of these changes are shown below.
[Figure: non-normal distribution]
[Figure: normal distribution]
[Figure: non-normal distribution]

The above findings lead us to the sixth principle of SPC: variation due to assignable causes tends to distort the normal distribution curve.

Originally posted 2012-05-02 01:24:00.

Fourth Principle of SPC – the shape is like a bell

The Fourth Principle of SPC is a logical extension of the third principle, which was covered in my last post. There it was said that most measurements will be clustered around the middle. In fact, statisticians have shown that we can make fairly accurate predictions of the percentage of measurements in the various sections of the frequency distribution curve.

[Figure: Frequency curve with normal distribution]
As you can see in this graph, most measurements fall close to the middle. This is applicable in general. You will find that about 68.2% (34.1% + 34.1%) of the measurements are in the two middle sections of this graph.
About 27.2% (13.6% + 13.6%) of the measurements will fall within the next two sections after the middle sections.
About 4.2% (2.1% + 2.1%) will fall in the two outside sections.
A very small percentage of the measurements (about 0.3%) will fall outside these sections. This may seem a bit odd, but it is a proven fact. However, it holds only in the absence of external influences on the process.
The curve shown above is called a normal distribution. In fact, many statistical theories are centered on the normal distribution.
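These percentages can be checked with Python's standard library. The sketch assumes that each section of the graph is one standard deviation wide, which is the usual way such a curve is drawn; the helper name `share_between` is my own.

```python
from statistics import NormalDist

# Standard normal curve: center 0, one "section" = 1 standard deviation.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def share_between(lo, hi):
    """Fraction of measurements expected between lo and hi."""
    return std_normal.cdf(hi) - std_normal.cdf(lo)


middle = 2 * share_between(0, 1)    # two middle sections
next_two = 2 * share_between(1, 2)  # next two sections
outer = 2 * share_between(2, 3)     # two outside sections
beyond = 1 - (middle + next_two + outer)

for name, value in [("middle", middle), ("next two", next_two),
                    ("outer", outer), ("beyond", beyond)]:
    print(f"{name}: {value:.1%}")
# middle: 68.3%
# next two: 27.2%
# outer: 4.3%
# beyond: 0.3%
```

The small differences from the figures quoted above (68.3% against 68.2%, 4.3% against 4.2%) come from rounding each half separately before adding.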
The above example leads to our fifth principle of Statistical Process Control (SPC): it is possible to determine the shape of the distribution curve for parts/output produced by any process.

Originally posted 2012-03-03 02:14:00.

Third Principle of SPC – Things Vary in a pattern

The Third Principle of SPC is an extension of the second one. In the last post, on the Second Principle of SPC, I mentioned that we notice a feature when the measurements of the output are analysed.

If we want to see the pattern, all we need to do is plot the individual data points or measurements onto a tally sheet. We will see a definite pattern begin to form after several measurements are plotted.
An easy way to demonstrate this is to roll a pair of dice about 50 times or more and record the totals on a tally sheet. The pattern we see is a frequency distribution.
We can make a frequency distribution curve by enclosing the tally marks in a curved line. The curve will have more measurements at the middle and fewer as we go away from the middle. It can be seen that the curve looks like a bell.
Whenever one takes a group of measurements, a frequency distribution curve appears.
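If you do not have dice at hand, the experiment can be simulated. This is a small sketch using Python's pseudo-random generator in place of real dice; the function name `roll_pair_tally` and the fixed seed are my own assumptions.

```python
import random
from collections import Counter


def roll_pair_tally(rolls=50, seed=1):
    """Roll a simulated pair of dice `rolls` times and count how
    often each total (2 to 12) comes up."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return Counter(rng.randint(1, 6) + rng.randint(1, 6)
                   for _ in range(rolls))


tally = roll_pair_tally()

# Print a simple tally sheet: one mark per roll, one row per total.
for total in range(2, 13):
    print(f"{total:>2} | {'x' * tally[total]}")
```

Turn the printout on its side and the marks tend to pile up around the middle totals and thin out towards 2 and 12. With only 50 rolls the shape is rough; increasing `rolls` makes it smoother.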

This is explained by the fourth basic principle of Statistical Process Control (SPC).

Originally posted 2012-03-03 01:01:00.

Second Principle of SPC – Variation can be measured

In the first principle of SPC, we discussed how the same thing done by us gives different outputs. The second principle builds on the first and states that the variation in the process can be measured.
Some variation is always inherent in our job, and it is acceptable as long as the variation is within the tolerance. However, variation tends to increase over time. We need to measure and monitor our job to see that the variation stays well within normal expectations. If we do not make an effort to do so, we land in trouble and the consequences add to the costs.
While it is always desirable to measure the output of a process, it becomes necessary to do so in order to know when trouble is brewing.
The measurements can be of the characteristics of the output. These can be continuous variables such as dimensions, or attribute variables like colour, shape, finish, etc.
After collecting the information as described above, we must analyse it to see if things are OK. When we check the measurements of the output, we will quickly notice a feature. This feature is the basis of the third principle of Statistical Process Control: things vary according to a definite pattern.
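As a small illustration of measuring variation against a tolerance, here is a sketch in Python. The shaft diameters and the tolerance limits are made-up numbers, and `variation_summary` is a hypothetical helper, not part of any SPC package.

```python
from statistics import mean, stdev


def variation_summary(measurements, lower_limit, upper_limit):
    """Summarise the variation in a set of measurements and list
    the values that fall outside the tolerance limits."""
    out_of_tolerance = [x for x in measurements
                        if not lower_limit <= x <= upper_limit]
    return {
        "mean": mean(measurements),
        "stdev": stdev(measurements),
        "range": max(measurements) - min(measurements),
        "out_of_tolerance": out_of_tolerance,
    }


# Hypothetical shaft diameters in mm; nominal 10.00, tolerance +/- 0.05.
diameters = [10.01, 9.98, 10.03, 9.99, 10.06, 10.00, 9.97]
summary = variation_summary(diameters, 9.95, 10.05)

for key, value in summary.items():
    print(key, value)
```

Here one part (10.06 mm) is outside the tolerance. Tracking the mean and the spread over time is what tells us whether the variation is growing before more parts go out of tolerance.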

Originally posted 2012-02-28 01:36:00.