Difference between revisions of "Flaws in Richard Lenski Study"

From Conservapedia
(Removed some repetition and cleaned up the science a little)
(Intelligent design theorists on the implications of Richard Lenski's experiment)
 
(16 intermediate revisions by 7 users not shown)
Line 1: Line 1:
[[Richard Lenski]] turned down an offer to ship frozen specimens of bacteria to an unspecified facility so that his bacteria mutation data could be made more readily available to the public,<ref>See [[Conservapedia:Lenski dialog]].</ref> but the following serious flaws are emerging about his work<ref>Blount et al., "Historical contingency and the [[evolution]] of a key innovation in an experimental population of ''Escherichia coli'', 105 PNAS 7899-7906 (June 10, 2008).</ref> even without a full disclosure of the data.  Note that the peer review on Lenski's paper took somewhere between zero (non-existent) and fourteen days (including administrative time), and Lenski himself does not have any obvious expertise in statistics.  In fact, Richard Lenski admits in his paper that he based his statistical conclusions on the use of Monte Carlo resampling techniques, using giftware from the website called "statistics101.net".
+
Regarding his [http://www.pnas.org/content/105/23/7899.full.pdf experiment on historical contingency in evolution], [[Richard Lenski]] rejected a request to fully release his bacteria mutation data to the public. An analysis of his work,<ref>Blount et al., "Historical contingency and the [[evolution]] of a key innovation in an experimental population of ''Escherichia coli''," [http://www.pnas.org/content/105/23/7899.full 105 PNAS 7899-7906] (June 10, 2008).</ref> however, reveals serious flaws even without a full disclosure of the data.  Note that the peer review on Lenski's paper took somewhere between 0 (non-existent) and at most 14 days (including administrative time), and Lenski himself does not have any obvious expertise in statistics.  In fact, Richard Lenski admits in his paper that he based his statistical conclusions on use of a website called "statistics101".
  
# Lenski's "historical contingency" hypothesis, as specifically depicted in Figure 3, is apparently contradicted by the data presented in the Third Experiment in Table 1 of his paper.  The historical contingency hypothesis in Figure 3 suggests that the rate at which bacteria mutated to Cit<sup>+</sup> increased at some point due to another mutation; the third (and largest) experiment in Table 1 shows Cit<sup>+</sup> arising only in generations revived from the 20,000th generation onwards, suggesting this other mutation took place in the colony around that time.
+
1.  Lenski's "historical contingency" hypothesis, as specifically depicted in Figure 3, is contradicted by the data presented in the Third Experiment in Table 1 of his paper.  Figure 3 proposes a step-up in mutation rate to Cit<sup>+</sup> due to a historical contingency (potentiating mutation) occurring at about the 31,000th generation, yet the Third (and largest) Experiment in Table 1 shows Cit<sup>+</sup> arising just as often before the 31,000th generation as after.  The abstract, in further contradiction with Figure 3, suggests that the historical contingency (potentiating mutation) occurred prior to the 20,000th generation.
# Lenski's rare mutation hypothesis suggests a fixed mutation rate, but the failure of the mutations in his experiments to increase based on scale (number of samples) tends to disprove this hypothesis and hence back up the historical contingency hypothesis.
+
 
# Richard Lenski included generations of the ''E. coli'' already known to contain Cit<sup>+</sup> variants in his experiments.<ref>Richard Lenski included generations 31,500, 32,000 and 32,500.</ref>  Once these generations are removed from the analysis, the data from the first (smallest) and third (largest) replay experiments still support Lenski's conclusions, whilst the second replay experiment fails to provide any evidence to that end.
+
2.  Lenski's two alternative hypotheses suggest a fixed mutation rate, but the failure of the number of mutations in his experiments to increase with scale (number of samples) tends to disprove both of Lenski's alternative hypotheses.  Yet Lenski's paper fails to adequately address this obvious flaw.
# The paper applies a Monte Carlo resampling test to test the data against the null hypothesis - that is the rare mutation hypothesis.
+
 
# Lenski's third experiment lent the least support to his hypothesis with statistical significance.
+
3.  Richard Lenski incorrectly included generations of the ''E. coli'' already known to contain Cit<sup>+</sup> variants in his experiments.<ref>Richard Lenski incorrectly included generations 31,500, 32,000 and 32,500.</ref>  Once these generations are removed from the analysis, the data disprove Lenski's hypothesis.
# Lenski claims that if the Third Experiment is erroneously combined with the other two experiments based on outcome rather than sample size, then a claim of overall statistical significance is achieved. He also points out that if the experiments are combined with respect to their relative sample sizes then this overall statistical significance is retained.
+
 
# Lenski's paper points out that the results of his largest experiment (Third Experiment) have the highest Monte Carlo ''P'' value, and hence the least statistical significance.  This is owing to the development of Cit<sup>+</sup> variants in the 20,000th generation in this experiment.  If the prior mutation necessary for the historical contingency occurred around the 20,000th generation then this result is the least significant in explaining it.  All works published in PNAS are clear in defining statistical significance in the traditional way.<ref>See, e.g., [http://www.pnas.org/cgi/content/full/0701990104 Cholera toxin induces malignant glioma cell differentiation]</ref>
+
4. The paper incorrectly applied a Monte Carlo resampling test to exclude the null hypothesis for rarely occurring events. The Third Experiment results are consistent with the null hypothesis, contrary to the paper's claim.
# Lenski's paper claims that "During [30,000 generations], each population experienced billions of mutations,<ref>Lenski cites one of his own prior articles for this.</ref> far more than the number of possible point mutations in the [approximately] 4.6-million-bp genome.  This ratio implies, to a first approximation, that each population tried every typical one-step mutation many times."  Lenski concludes that the mutation required to evolve the Cit<sup>+</sup> phenotype must be 'difficult' in some sense, the sense being made clear by the historical contingency hypothesis.
+
 
 +
5.  Lenski's largest experiment (Third Experiment) failed to support his hypothesis with statistical significance. Even though this largest experiment was nearly ten times the size of his other experiments, Richard Lenski did not weight this largest experiment correctly in combining his results.
 +
 
 +
6. It was an error to include generations of the ''E. coli'' already known to contain trace Cit<sup>+</sup> variants. The highly improbable occurrence of four Cit<sup>+</sup> variants from the 32,000th generation in the Second Experiment suggests an origin from undetected, pre-existing Cit<sup>+</sup> variants.
 +
 
 +
7. The Third Experiment was erroneously combined with the other two experiments based on outcome rather than sample size, thereby yielding a false claim of overall statistical significance. Lenski's paper applied the Whitlock Z-transformation incorrectly, perhaps intentionally so, in making a claim that Lenski's results were "extremely significant": "We also used the Z-transformation method to combine the probabilities from our three experiments, and '''the result is extremely significant (P < 0.0001) whether or not''' the experiments are weighted by the number of independent Cit+ mutants observed in each one."<ref>Lenski paper at 7902 (citation to Whitlock paper omitted, emphasis added).</ref>  Lenski's "whether or not" refers to two incorrect applications of the Whitlock technique, obscuring how the straightforward, correct weighting based on sample size was ''not'' used.  A reader could conclude that the Lenski paper deliberately conceals the misapplication.
 +
 
 +
8.  Lenski's paper is not clear in explaining how the results of his largest experiment (Third Experiment) failed to confirm his hypothesis with statistical significance, even with the incorrect inclusion of the Cit<sup>+</sup> variant generations.  Instead, his paper refers to his largest experiment as "marginally ... significant," which serves to obscure its statistical insignificance.  Other works published in PNAS are clear in defining statistical significance in the traditional way, which Lenski's Third Experiment (even with incorrect inclusion of the above-referenced generations) failed to satisfy.<ref>See, e.g., [http://www.pnas.org/cgi/content/full/0701990104 Cholera toxin induces malignant glioma cell differentiation]</ref>
 +
 
 +
9.  The long lag time (over 12,000 generations) between the historical contingency (potentiating mutation) and the appearance of the Cit<sup>+</sup> variant in the largest experiment disproves Lenski's implicit assumption that the potentiating mutation likely occurred in proximity with the occurrence of the Cit<sup>+</sup> variant, as well as the assumption that the first occurrence of the Cit<sup>+</sup> variant in the Third Experiment at the 20,000th generation somehow implies that a potentiating mutation occurred in its proximity.
 +
 
 +
10.  Lenski's paper claims that "During [30,000 generations], each population experienced billions of mutations,<ref>Lenski cites one of his own prior articles for this.</ref> far more than the number of possible point mutations in the [approximately] 4.6-million-bp genome.  This ratio implies, to a first approximation, that each population tried every typical one-step mutation many times."  Lenski's conclusion is nonsensical because it assumes that the mutations are completely random '''and''' that each mutation has a roughly equal probability.
 +
 
 +
11.  In Table 2 of [http://www.pnas.org/content/105/23/7899.full.pdf], the expected mean should be 26,382 generations, not 28,382.
 +
 
 +
12.  The p-value computed for experiment two was incorrectly listed as 0.0007 instead of 0.0006 in [http://www.pnas.org/content/105/23/7899.full.pdf]. These p-values are meaningless because the paper used a flawed test statistic (see: [[Significance of E. Coli Evolution Experiments#Test Statistics]]). However, the error illustrates the need to use enough random realizations when using Monte Carlo methods to estimate p-values.
 +
 
 +
== Intelligent design theorists on the implications of Richard Lenski's experiment ==
 +
 
 +
The biologist [[Michael Behe]] criticized Richard Lenski's claims concerning the significance of his experiment (see: [http://behe.uncommondescent.com/2008/06/multiple-mutations-needed-for-e-coli/ Multiple Mutations Needed for E. Coli] by Michael Behe).
 +
 
 +
Jonathan Witt of the [[Discovery Institute]] indicates that the biologist Dustin Van Hofwegen punctured the evolutionists' claims for Richard Lenski’s long-term evolution experiment (see: [https://evolutionnews.org/2021/06/biologist-dustin-van-hofwegen-punctures-claims-for-lenskis-long-term-evolution-experiment/ Biologist Dustin Van Hofwegen Punctures Claims for Lenski’s Long-Term Evolution Experiment]).
  
 
== References ==
 
Line 17: Line 38:
  
 
*[[Letter to PNAS]]
 
 +
*[[Significance of E. Coli Evolution Experiments]]
 
[[Category:Science]]
 

Latest revision as of 21:29, June 21, 2021

Regarding his experiment on historical contingency in evolution, Richard Lenski rejected a request to fully release his bacteria mutation data to the public. An analysis of his work,[1] however, reveals serious flaws even without a full disclosure of the data. Note that the peer review on Lenski's paper took somewhere between 0 (non-existent) and at most 14 days (including administrative time), and Lenski himself does not have any obvious expertise in statistics. In fact, Richard Lenski admits in his paper that he based his statistical conclusions on use of a website called "statistics101".

1. Lenski's "historical contingency" hypothesis, as specifically depicted in Figure 3, is contradicted by the data presented in the Third Experiment in Table 1 of his paper. Figure 3 proposes a step-up in mutation rate to Cit+ due to a historical contingency (potentiating mutation) occurring at about the 31,000th generation, yet the Third (and largest) Experiment in Table 1 shows Cit+ arising just as often before the 31,000th generation as after. The abstract, in further contradiction with Figure 3, suggests that the historical contingency (potentiating mutation) occurred prior to the 20,000th generation.

2. Lenski's two alternative hypotheses suggest a fixed mutation rate, but the failure of the number of mutations in his experiments to increase with scale (number of samples) tends to disprove both of Lenski's alternative hypotheses. Yet Lenski's paper fails to adequately address this obvious flaw.

3. Richard Lenski incorrectly included generations of the E. coli already known to contain Cit+ variants in his experiments.[2] Once these generations are removed from the analysis, the data disprove Lenski's hypothesis.

4. The paper incorrectly applied a Monte Carlo resampling test to exclude the null hypothesis for rarely occurring events. The Third Experiment results are consistent with the null hypothesis, contrary to the paper's claim.

5. Lenski's largest experiment (Third Experiment) failed to support his hypothesis with statistical significance. Even though this largest experiment was nearly ten times the size of his other experiments, Richard Lenski did not weight this largest experiment correctly in combining his results.

6. It was an error to include generations of the E. coli already known to contain trace Cit+ variants. The highly improbable occurrence of four Cit+ variants from the 32,000th generation in the Second Experiment suggests an origin from undetected, pre-existing Cit+ variants.

7. The Third Experiment was erroneously combined with the other two experiments based on outcome rather than sample size, thereby yielding a false claim of overall statistical significance. Lenski's paper applied the Whitlock Z-transformation incorrectly, perhaps intentionally so, in making a claim that Lenski's results were "extremely significant": "We also used the Z-transformation method to combine the probabilities from our three experiments, and the result is extremely significant (P < 0.0001) whether or not the experiments are weighted by the number of independent Cit+ mutants observed in each one."[3] Lenski's "whether or not" refers to two incorrect applications of the Whitlock technique, obscuring how the straightforward, correct weighting based on sample size was not used. A reader could conclude that the Lenski paper deliberately conceals the misapplication.

8. Lenski's paper is not clear in explaining how the results of his largest experiment (Third Experiment) failed to confirm his hypothesis with statistical significance, even with the incorrect inclusion of the Cit+ variant generations. Instead, his paper refers to his largest experiment as "marginally ... significant," which serves to obscure its statistical insignificance. Other works published in PNAS are clear in defining statistical significance in the traditional way, which Lenski's Third Experiment (even with incorrect inclusion of the above-referenced generations) failed to satisfy.[4]

9. The long lag time (over 12,000 generations) between the historical contingency (potentiating mutation) and the appearance of the Cit+ variant in the largest experiment disproves Lenski's implicit assumption that the potentiating mutation likely occurred in proximity with the occurrence of the Cit+ variant, as well as the assumption that the first occurrence of the Cit+ variant in the Third Experiment at the 20,000th generation somehow implies that a potentiating mutation occurred in its proximity.

10. Lenski's paper claims that "During [30,000 generations], each population experienced billions of mutations,[5] far more than the number of possible point mutations in the [approximately] 4.6-million-bp genome. This ratio implies, to a first approximation, that each population tried every typical one-step mutation many times." Lenski's conclusion is nonsensical because it assumes that the mutations are completely random and that each mutation has a roughly equal probability.

11. In Table 2 of [1], the expected mean should be 26,382 generations, not 28,382.

12. The p-value computed for experiment two was incorrectly listed as 0.0007 instead of 0.0006 in the paper.[1] These p-values are meaningless because the paper used a flawed test statistic (see: Significance of E. Coli Evolution Experiments#Test Statistics). However, the error illustrates the need to use enough random realizations when using Monte Carlo methods to estimate p-values.
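
Items 7 and 12 above turn on two calculations: how a Z-transformation combines p-values under a chosen set of weights, and how the precision of a Monte Carlo p-value depends on the number of random realizations. The following is a minimal Python sketch of both, not the paper's own code. It assumes the standard weighted Z (Stouffer) formula and the binomial standard error of an estimated proportion. The function names, the p-values and weights passed to `combine_weighted_z`, and the realization counts are all hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def combine_weighted_z(p_values, weights):
    """Combine one-sided p-values with the weighted Z (Stouffer) method.

    Each p-value is converted to Z_i = Phi^-1(1 - p_i); the combined score is
    sum(w_i * Z_i) / sqrt(sum(w_i ** 2)), which is converted back to a p-value.
    """
    if len(p_values) != len(weights):
        raise ValueError("need exactly one weight per p-value")
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    z_combined = sum(w * z for w, z in zip(weights, z_scores)) / sqrt(
        sum(w * w for w in weights)
    )
    return 1.0 - _STD_NORMAL.cdf(z_combined)


def mc_standard_error(p_hat, n_realizations):
    """Binomial standard error of a p-value estimated from n random realizations."""
    return sqrt(p_hat * (1.0 - p_hat) / n_realizations)


if __name__ == "__main__":
    # Hypothetical p-values and weights, NOT the values reported in the paper.
    p_values = [0.001, 0.5]
    for weights in ([1, 1], [1, 10], [10, 1]):
        print(weights, combine_weighted_z(p_values, weights))

    # Hypothetical realization counts: a difference of 0.0001 between two
    # estimates is only resolvable once the standard error falls below it.
    for n in (10_000, 1_000_000):
        print(n, mc_standard_error(0.0006, n))
```

With these illustrative inputs, the combined p-value changes markedly depending on which experiment carries the larger weight, which is why the choice of weighting matters. Under the binomial approximation, the standard error at a hypothetical 10,000 realizations exceeds 0.0001, so estimates of 0.0006 and 0.0007 could not be told apart at that count.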

Intelligent design theorists on the implications of Richard Lenski's experiment

The biologist Michael Behe criticized Richard Lenski's claims concerning the significance of his experiment (see: Multiple Mutations Needed for E. Coli by Michael Behe).

Jonathan Witt of the Discovery Institute indicates that the biologist Dustin Van Hofwegen punctured the evolutionists' claims for Richard Lenski’s long-term evolution experiment (see: Biologist Dustin Van Hofwegen Punctures Claims for Lenski’s Long-Term Evolution Experiment).

References

  1. Blount et al., "Historical contingency and the evolution of a key innovation in an experimental population of Escherichia coli," 105 PNAS 7899-7906 (June 10, 2008).
  2. Richard Lenski incorrectly included generations 31,500, 32,000 and 32,500.
  3. Lenski paper at 7902 (citation to Whitlock paper omitted, emphasis added).
  4. See, e.g., Cholera toxin induces malignant glioma cell differentiation
  5. Lenski cites one of his own prior articles for this.

See also