Significance of E. Coli Evolution Experiments
Revision as of 17:02, March 15, 2009
Blount, Borland, and Lenski[1] claimed that a key evolutionary innovation was observed during a laboratory experiment. That claim is false. The claim was based on incorrect measurements of statistical significance. Rather than using a test from the statistics literature, the authors contrived a flawed test and used it to measure significance. The flawed test (“mean mutation generation”) produced artificially low p-values.
Experiment One Data
The data from experiment one of the paper is shown below (see Table 1 of the paper). The expected outcomes under the null hypothesis (no evolutionary innovation occurs) are also shown.
| Generation | Trials | Mutants | Statics | Expected Mutants | Expected Statics |
|---|---|---|---|---|---|
| 0 | 6 | 0 | 6 | 0.333 | 5.667 |
| 10000 | 6 | 0 | 6 | 0.333 | 5.667 |
| 20000 | 6 | 0 | 6 | 0.333 | 5.667 |
| 25000 | 6 | 0 | 6 | 0.333 | 5.667 |
| 27500 | 6 | 0 | 6 | 0.333 | 5.667 |
| 29000 | 6 | 0 | 6 | 0.333 | 5.667 |
| 30000 | 6 | 0 | 6 | 0.333 | 5.667 |
| 30500 | 6 | 1 | 5 | 0.333 | 5.667 |
| 31000 | 6 | 0 | 6 | 0.333 | 5.667 |
| 31500 | 6 | 1 | 5 | 0.333 | 5.667 |
| 32000 | 6 | 0 | 6 | 0.333 | 5.667 |
| 32500 | 6 | 2 | 4 | 0.333 | 5.667 |
| Total | 72 | 4 | 68 | 4 | 68 |
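The expected columns in the table follow from pooling all trials into a single mutation rate and applying it to each generation. The short sketch below shows that arithmetic; the variable names are illustrative and not taken from the paper.

```python
# Observed data for experiment one, in table order (one entry per generation).
trials = [6] * 12
mutants = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 2]

# Under the null hypothesis every generation shares one pooled mutation rate.
total_trials = sum(trials)            # 72
total_mutants = sum(mutants)          # 4
pooled_rate = total_mutants / total_trials

# Expected counts per generation: trials times the pooled rate, and the rest.
expected_mutants = [n * pooled_rate for n in trials]
expected_statics = [n - e for n, e in zip(trials, expected_mutants)]

print(round(expected_mutants[0], 3), round(expected_statics[0], 3))  # 0.333 5.667
```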
When the Monte Carlo test is used to compute the significance of this data, the p-value is 0.0085 (see Table 2 of the paper). This p-value is considered statistically significant. However, when the data is analyzed using the chi-square test, the p-value is 0.19. This p-value is much larger than the one from the paper and suggests that there is no reason to reject the null hypothesis. The chi-square p-value for experiment two is small (0.0004), but the chi-square p-value for the data from the third experiment is 0.22, which, as with experiment one, gives no reason to reject the null hypothesis.
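For readers who want to see what a Monte Carlo test of this kind looks like, here is a rough sketch. It rests on assumptions that are not spelled out in this article: that the test statistic is the mean generation of the trials that produced mutants, and that the null distribution is obtained by drawing the four mutant-producing trials at random, without replacement, from all 72 trials. The function name, seed, and number of draws are arbitrary choices.

```python
import random

generations = [0, 10000, 20000, 25000, 27500, 29000,
               30000, 30500, 31000, 31500, 32000, 32500]
mutants = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 2]
trials_per_generation = 6

# One entry per trial, labelled with the generation it was started from.
pool = [g for g in generations for _ in range(trials_per_generation)]

n_mutants = sum(mutants)
# Assumed test statistic: mean generation of the mutant-producing trials.
observed_mean = sum(g * m for g, m in zip(generations, mutants)) / n_mutants


def monte_carlo_p_value(n_draws=100_000, seed=0):
    """Estimate P(mean generation >= observed) when mutants fall at random."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        draw = rng.sample(pool, n_mutants)  # sample trials without replacement
        if sum(draw) / n_mutants >= observed_mean:
            hits += 1
    return hits / n_draws


p_value = monte_carlo_p_value()
print(observed_mean, p_value)
```

Under these assumptions the estimate should come out close to the 0.0085 quoted above, up to sampling noise; a different statistic or resampling scheme would give a different value.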
The chi-square test is a common statistical method, but it is considered valid only if the data used in the calculation satisfy a number of assumptions. The chi-square test always produces an approximate p-value, which approaches the 'true' p-value when the numbers used in the calculation are large. When the assumptions are violated, for example when the numbers in the 'expected' cells are below threshold levels, using the chi-square test is considered a violation of good statistical practice. For small cell counts, the chi-square test has an unacceptably high type II error rate; that is, it produces approximate p-values that are higher than they should be, and thereby supports the null hypothesis when it is false. The chi-square test also assumes that there is no relationship between the categories into which the experimental data are divided; therefore, in situations in which the categories are linked (for example, 'patients before treatment' and 'same patients after treatment', or 'IQ scores of the same study group at times A, B, and C'), the chi-square test should not be used. In this article, the categories entered into the chi-square test are not independent, so the results of a chi-square test are not valid. Furthermore, the numbers in the cells are below the threshold for chi-square acceptability.[2][3][4][5][6][7][8][9]
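The small-count condition can be checked mechanically. The paragraph above does not state a numeric threshold, so the sketch below assumes the commonly cited rule of thumb that every expected cell count should be at least 5. That cutoff, and the names used here, are assumptions rather than something taken from the cited sources.

```python
# Assumed rule of thumb (not stated in the article): expected counts >= 5.
MIN_EXPECTED = 5

# Expected cell counts per generation, copied from the two data tables.
expected_cells = {
    "experiment 1, mutants": 0.333,
    "experiment 1, statics": 5.667,
    "experiment 3, mutants": 0.571,
    "experiment 3, statics": 199.429,
}

# Collect every cell type whose expected count falls below the cutoff.
too_small = [name for name, value in expected_cells.items()
             if value < MIN_EXPECTED]

print(too_small)  # ['experiment 1, mutants', 'experiment 3, mutants']
```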
The chi-square test can be implemented in Microsoft Excel. If the numbers from the last four columns of the experiment one data table (excluding the “totals” row) are entered into Excel in rows 1-12 and columns A-D, then the p-value can be computed by entering “=CHITEST(A1:B12,C1:D12)” into any empty cell of the spreadsheet.
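The same calculation can be sketched outside of Excel. The example below uses only the Python standard library; the helper `chi_square_sf` is a hypothetical name for a small routine that evaluates the chi-square upper-tail probability from the series expansion of the regularized incomplete gamma function. It assumes the degrees of freedom are (rows − 1) × (columns − 1) = 11 for the 12 × 2 table, which is how CHITEST is documented to treat a range with more than one row and column.

```python
import math


def chi_square_sf(x, df):
    """Upper-tail probability of a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, x/2).
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= half_x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return 1.0 - lower


# Experiment one: observed (mutants, statics) for each of the 12 generations.
mutants = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 2]
observed = [(m, 6 - m) for m in mutants]

# Expected counts under the null hypothesis: 4 mutants spread over 72 trials.
expected_mutants = 6 * 4 / 72
expected = [(expected_mutants, 6 - expected_mutants)] * len(observed)

# Pearson statistic: sum of (observed - expected)^2 / expected over all cells.
chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))
df = (len(observed) - 1) * (2 - 1)

p_value = chi_square_sf(chi2, df)
print(round(chi2, 2), df, round(p_value, 2))
```

The result should agree with the chi-square p-value of about 0.19 reported for experiment one.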
Experiment Three Data
The experiment three data from Blount et al. is shown in the table below. The expected numbers of mutants under the null hypothesis (constant mutation rate) are also shown.
| Generation | Trials | Mutants | Statics | Expected Mutants | Expected Statics |
|---|---|---|---|---|---|
| 0 | 200 | 0 | 200 | 0.571 | 199.429 |
| 10000 | 200 | 0 | 200 | 0.571 | 199.429 |
| 20000 | 200 | 0 | 200 | 0.571 | 199.429 |
| 25000 | 200 | 0 | 200 | 0.571 | 199.429 |
| 27500 | 200 | 2 | 198 | 0.571 | 199.429 |
| 29000 | 200 | 0 | 200 | 0.571 | 199.429 |
| 30000 | 200 | 2 | 198 | 0.571 | 199.429 |
| 30500 | 200 | 0 | 200 | 0.571 | 199.429 |
| 31000 | 200 | 0 | 200 | 0.571 | 199.429 |
| 31500 | 200 | 0 | 200 | 0.571 | 199.429 |
| 32000 | 200 | 1 | 199 | 0.571 | 199.429 |
| 32500 | 200 | 1 | 199 | 0.571 | 199.429 |
| Total | 2800 | 8 | 2792 | 8 | 2792 |
Comparison of p-Values
The following table compares the p-values reported in Table 2 of Blount et al. to the chi-square p-values for the same experiments. For experiments one and three, the chi-square p-values are much larger than the "mean generation" test p-values from the paper.
| | Experiment 1 | Experiment 2 | Experiment 3 |
|---|---|---|---|
| p-Value from Paper | 0.0085 | 0.0007 | 0.082 |
| Chi-square p-value | 0.19 | 0.0004 | 0.22 |
References
- [1] http://www.pnas.org/content/105/23/7899.full.pdf
- [2] http://www.okstate.edu/ag/agedcm4h/academic/aged5980a/5980/newpage28.htm
- [3] http://faculty.chass.ncsu.edu/garson/PA765/chisq.htm
- [4] http://www.wellesley.edu/Psychology/Psych205/chisquareindep.html
- [5] http://www.minitab.com/support/docs/Answers/Chi-Square%20Test%20Assumptions.pdf
- [6] http://mysite.du.edu/~jcalvert/econ/chisquar.htm
- [7] http://books.google.com/books?id=yU15rUiLRI8C&pg=PA201&lpg=PA201&dq=chi-square+test+assumptions&source=bl&ots=FRY0LwQ3z_&sig=FyIvzJx3hjQ8nWlu2cpmZj3pwXY&hl=en&ei=fm-1SayaNI_MMKX5tO4E&sa=X&oi=book_result&ct=result#PPA185,M1
- [8] http://www.basic.northwestern.edu/statguidefiles/gf-dist_ass_viol.html
- [9] Mathematical Statistics with Applications by Wackerly, Mendenhall, and Scheaffer, Section 14.4.
See Also
http://www.sciencenews.org/index/feature/activity/view/id/40006/title/Molecular_Evolution
http://sciencenews.org/view/generic/id/40649/title/FOR_KIDS_Hitting_the_redo_button_on_evolution