How to Compare Data Sets - ANOVA
In 1920, Sir Ronald A. Fisher invented a statistical way to compare data sets. Fisher called his method the Analysis of Variance, which was later abbreviated to ANOVA. This method eventually evolved into Six Sigma data set comparisons.

An ANOVA is a guide for determining whether or not an event was most likely due to the random chance of natural variation. Conversely, the same method provides guidance in saying, with a 95% level of confidence, that a certain factor (X) or factors (X, Y, and/or Z) were the more likely reason for the event. The F ratio is the test statistic produced by an ANOVA: it compares the variation between data sets with the variation within them, and the probability that a difference arose by chance is read from it. It was named for Fisher. The orthogonal array and the Results Project, DMAIC designed experiment's cube were also his inventions.

An ANOVA can be, and ought to be, used to evaluate differences between data sets. It can be used with any number of data sets, recorded from any process. The data sets need not be equal in size. Data sets suitable for an ANOVA can range from as few as three or four numbers to very large sets of numbers.

How to Complete an Excel ANOVA

Here is how you could use an Excel ANOVA to determine who is a better bowler. You can use an ANOVA to compare any scores. Lengths of stay, days in AR, the number of phone calls, readmission rates, stock prices and any other measure are all fair game for an ANOVA. Below are six game scores for three bowlers. Which bowler is best? If there is a best bowler, is the difference between bowlers statistically significant?
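For readers who want to see what a single-factor ANOVA computes, here is a minimal sketch in Python using only the standard library. It is an illustration, not the article's Excel procedure. The scores are made-up placeholders, not the article's data, and the function name `one_way_anova` is hypothetical. The critical value 3.68 is the standard F-table entry for a 95% confidence level with 2 and 15 degrees of freedom, which matches three bowlers with six games each.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F ratio, df_between, df_within) for a list of data sets.

    The data sets may differ in size. Assumes at least two groups and
    some variation within the groups (otherwise the ratio is undefined).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of each group's mean around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each score around its own group's mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical placeholder scores, NOT the scores from the article.
scores = {
    "Pat":   [150, 160, 155, 165, 158, 162],
    "Mark":  [180, 175, 190, 185, 178, 182],
    "Sheri": [160, 170, 158, 166, 172, 164],
}

f_ratio, df_b, df_w = one_way_anova(list(scores.values()))

# F-table critical value for alpha = 0.05 with (2, 15) degrees of freedom.
F_CRIT_2_15 = 3.68

print(f"F = {f_ratio:.2f} with ({df_b}, {df_w}) degrees of freedom")
print("Significant at 95%" if f_ratio > F_CRIT_2_15 else "Not significant at 95%")
```

If the F ratio exceeds the critical value, the difference between the bowlers' averages is larger than natural game-to-game variation would be expected to produce. With different group counts or sizes, the critical value must be looked up again for the new degrees of freedom.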
Step 1. Recreate the columns using Excel. Each bowler's name is the field title.

The Graphic ANOVA

As you advance in your Six Sigma learning you may want to learn to use a more advanced Six Sigma software program. One such program is called Stat-Ease Design-Expert. Stat-Ease calculates an ANOVA and graphically shows statistical differences between sets of data. It is all achieved with mouse clicks. You won't have to look at or calculate an equation.
Each I-Bar has a black square in its center. This square identifies the average score for each bowler. The top and bottom of each I-Bar extend two standard deviations, 2s, above and below the mean. Think of these as the Upper Control Limit (UCL) and Lower Control Limit (LCL) for each bowler's score. Each I-Bar covers about 95% of an imaginary, on-its-side bell curve for each bowler.

This Six Sigma data array of fields and records would tell us a little about each observation. The more fields, the richer our understanding can be. For example, in the same amount of space the following table has twice as much data. Rich data, meaning each column/field has a crystal clear operational definition, can yield rich information. Many times we collect dozens of fields for each recorded observation. Since data collection is time consuming and expensive, design your collection plan with care before you begin.

Note the overlapping values in red circle markers. On occasion both Pat and Sheri could have bowled a better game than Mark. But, when the data are viewed using the mean value (
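The I-Bar limits described above (the mean plus and minus two standard deviations) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores are made-up placeholders, the helper names `i_bar_limits` and `bars_overlap` are hypothetical, and the sample standard deviation is assumed because the article does not say which form the software uses.

```python
from statistics import mean, stdev

def i_bar_limits(scores, k=2.0):
    """Return (lower, center, upper) as mean -/+ k sample standard deviations."""
    center = mean(scores)
    spread = stdev(scores)  # sample standard deviation (an assumption)
    return center - k * spread, center, center + k * spread

def bars_overlap(a, b):
    """True when two (lower, center, upper) bars share any range of scores."""
    return a[0] <= b[2] and b[0] <= a[2]

# Hypothetical placeholder scores, NOT the scores from the article.
scores = {
    "Pat":   [150, 160, 155, 165, 158, 162],
    "Mark":  [180, 175, 190, 185, 178, 182],
    "Sheri": [160, 170, 158, 166, 172, 164],
}

bars = {name: i_bar_limits(games) for name, games in scores.items()}

for name, (lower, center, upper) in bars.items():
    print(f"{name}: LCL = {lower:.1f}, mean = {center:.1f}, UCL = {upper:.1f}")

print("Pat and Mark overlap:", bars_overlap(bars["Pat"], bars["Mark"]))
print("Sheri and Mark overlap:", bars_overlap(bars["Sheri"], bars["Mark"]))
```

Overlapping bars mean that one bowler could, on occasion, outscore another whose average is higher. The comparison of averages is what the ANOVA itself evaluates.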
©2000-2008 iSixSigma. All rights reserved.