Difference between revisions of "Talk:Significance of E. Coli Evolution Experiments"

Revision as of 15:02, March 9, 2009

SJohnson, your assessment, while good in the utilization of the chi-squared test is unfortunately incorrect. The Monte Carlo resampling gives a more accurate p-value than the chi-squared. You may research the literature (i.e. publications in statistical mathematics, many pubs actualy compare Monte Carlo vs Chi Squared) to discover that this method is commonly used in advance statistical work and how it is more accurate than the chi-squared test.--Able806 17:00, 4 March 2009 (EST)

It doesn’t make sense to compare the chi-square test, which is a specific statistical hypothesis test, to Monte Carlo methods, which can be used for anything from fluid motion modeling to p-value computations. You can use Monte Carlo methods to compute the p-values of the chi-square test!

Monte Carlo methods involve the generation of random realizations. Your broad claim the Monte Carlo methods are “more accurate” than the chi-square test is obviously incorrect because the accuracy of Monte Carlo methods always depends on the number of random realizations generated. When p-values are small, Monte Carlo methods are notoriously inaccurate unless the number of realizations generated is enormous.

Which publications compare Monte Carlo to chi-square and show that the former is more accurate? Could you provide specific examples? Thanks. SJohnson 18:50, 4 March 2009 (EST)

In furtherance of SJohnson's remarks with respect to rarely occurring events, the use of the basic Monte Carlo method is plainly incorrect for modeling a rarely occurring event, as the Lenski paper did. This has long been pointed out in Flaws in Richard Lenski Study. I know evolutionists will never admit a flaw in anything promoting their pet theory, but this (and other) flaws in that paper is undeniable.

Watch how evolutionists defended obvious errors in the Lenski paper, and then realize why the Piltdown Man fraud was taught for 40 years without evolutionists admitting it was a hoax.--Andy Schlafly 09:55, 5 March 2009 (EST)

Andy, how exactly is the Monte Carlo method incorrect to use in this case? I have seen it used in publications with much smaller datasets.--Able806 10:29, 5 March 2009 (EST)

Able806, I'm interested in looking at the publications you mentioned that use Monte Carlo methods to analyze small data sets. Could you provide some examples? Thanks. SJohnson 16:41, 5 March 2009 (EST)

Able806, you still seem to miss the point about how inappropriate the Monte Carlo method (as used in the Lenski paper) is for evaluating rarely occurring events. You need to open your mind to be productive. If you simply cling to a view that Lenski (who I don't think has any meaningful education in statistics) must somehow be right, then you're not going to make any progress in understanding the flaws.--Andy Schlafly 17:07, 5 March 2009 (EST)

Sjohnson, I believe you just proved my point. In the literature of mean and covariance structure analysis, non-central chi-square distribution is commonly used to describe the behavior of the likelihood ratio statistic under alternative hypothesis; it is widely believed that the non-central chi-square distribution is justified by statistical theory. Actually, when the null hypothesis is not trivially violated, the non-central chi-square distribution cannot describe the LR statistic well even when data are normally distributed and the sample size is large. Monte Carlo results compare the strength of the normal distribution against that of the non-central chi-square distribution. In an association analysis comparing cases and controls with respect to allele frequencies at a highly polymorphic locus, a potential problem is that the conventional chi-squared test may not be valid for a large, sparse contingency table. Reliance on statistics with known asymptotic distribution is unnecessary, as Monte Carlo simulations can be performed to estimate the significance level of the test statistic.

Here is a link to a great page the provides an interactive example as to why the Chi Squared test would provide poor results compared to the Monte Carlo in relation to the Lenski data workup.

Something you may have overlooked was that the data set is actually too small to use the chi square method correctly. It is often accepted that is any of the analyzed data falls under 10 for a particular cell of the data set then the Yates correction needs to be applied; unfortunately the Yates correction can over correct thus skewing the p-value. Lenksi seemed to understand this by supporting his Monte Carlo p-value results with the Fisher z-transformation p-value.

I hope this helps.--Able806 10:27, 5 March 2009 (EST)

I’m still waiting to hear which literature says that “Monte Carlo resampling” is “more accurate than the chi-squared test”. The page mentioned above [1] is a discussion of why statisticians “fail to reject the null” rather than “accepting the null” when the p-value is above 0.05 or so. The page says nothing about superiority of Monte Carlo methods. Why were alternate hypothesis distributions mentioned? Only the null hypothesis distribution is used to calculate a p-value. Yates’s correction is for 2x2 contingency tables [2]. It doesn’t apply in this case. Finally, what the heck do “covariance structure analysis” and “allele frequencies at a highly polymorphic locus” have to do with this problem? SJohnson 16:38, 5 March 2009 (EST)

Regarding the "Fisher z-transformation p-value" from the paper, garbage in garbage out. If the p-values were bad to begin with, then why would a combination of them be meaningful? SJohnson 10:49, 9 March 2009 (EDT)

Quick question for SJohnson: How many degrees of freedom did you choose when calculating the p-value? I'd like to know upon what condition you base that number. Thanks.--Argon 11:05, 5 March 2009 (EST)

The degree of freedom for a contingency table is rows minus one times columns minus one. That is,

\text{[math]}

. Here’s a pretty good tutorial I came across: [3]. For the experiments from [4], the DOFs are 11, 11, and 13. For experiment one, the chi-square test statistic is

\text{[math]}

\text{[math]}

\text{[math]}

\text{[math]}

where

\text{[math]}

is the observed value and

\text{[math]}

is the expected null hypothesis value. So if you have MS Excel, another way to arrive at the p-value of 0.19 is to type “=CHIDIST(14.82,11)” into a cell. Cheers! SJohnson 16:38, 5 March 2009 (EST)

OK, thanks for the info. From what I'd calculated and looked up in tables, the numbers seemed close to a df=11 for a chi-square of ~14. (Aside: With terms having 17/3 in the denominator in the figures above, were you using the test of independence? I was using Pearson's test for fit of a distribution which returns a chi-squared value of 14 and roughly matched the p-values you reported, assuming the df was 11).

Also, the first sentence of the article reads: "Blount, Borland, and Lenski[1] claimed that a key evolutionary innovation was observed during a laboratory experiment. That claim is false." A small correction: There were several claims in the paper. The 'key evolutionary innovation' was acquiring the ability to utilize citrate as a food source. That claim was demonstrated multiple times. The claim, which pertains to this statistics discussion was that the Cit+ phenotype arose in a multi-step process, first requiring a rare, pre-adaptive mutation before additional mutation(s) lead to the subsequent development of citrate utilization.--Argon 20:46, 5 March 2009 (EST)

My biology-degreed wife assures me that mutation does not necessarily mean that evolution occurred. What the paper claimed is that evolution (a “key innovation”) occurred in the lab. The key innovation supposedly increased the mutation rate. In the experiments, the observed mutation rate increased after generation 31,000, but not enough to make a statistically significant claim that the rate is not constant. The analysis in the paper was similar to flipping a coin ten times, counting six heads and claiming that the coin must be biased against tails. In reality, there’s nothing surprising about a fair coin producing slightly more of one outcome than the other. Just like there's nothing surprising about there being slightly more mutations in later generations than early generations given the null hypothesis (constant mutation rate). SJohnson 10:46, 9 March 2009 (EDT)

Let’s go back to the beginning. There appears to be confusion about the difference between test statistics and methods for computing p-values. As is noted at the beginning of the page [5], the fundamental problem with the paper is that it used a flawed test statistic, not that it used Monte Carlo methods to find the p-value for that flawed statistic.

Every hypothesis test uses a test statistic to reduce the data to a single number. The p-value for the test statistic can be calculated analytically (as I’ve done for the chi-square test statistic) or by Monte Carlo methods. In the paper, Monte Carlo methods were used to compute the p-value of the “mutation generation” test statistic. The key problem with the analysis from the paper is that it doesn’t work to use a weighted average to test for variations in mutation rate. This is like trying to use the sample variance to test for an increase in the mean in Gaussian-distributed data. A statistic should be selected based on the null and alternate hypothesis distributions of the data. The chi-square test (unlike the weighted average from the paper) is a reasonable choice for data that mutates at a constant rate under the null hypothesis, but mutates at varying rates under the alternate hypothesis.

Able806, you made a good point about the contingency table cell frequencies being relatively low, but were wrong when you said ”the data set is actually too small to use the chi square method correctly”. In the low cell frequency case the chi-square test is still effective, but the null hypothesis distribution of the chi-square statistic starts to look less like the chi-square distribution. Thus, p-values calculated using the chi-square distribution may be a bit off. However, Monte Carlo p-values are always imperfect as well because it's impossible to generate an infinite number of random realizations. There are imperfections in p-values generated by analytic and Monte Carlo methods. However, low cell frequencies does not explain the >20x and >2.5x differences between chi-square p-values and p-values from the paper for experiments one and three. The reason for those huge differences was the use of the flawed test statistic (“mutation generation”) in the paper. SJohnson 16:38, 5 March 2009 (EST)

Misinterpretation of test

SJohnson, Your analysis misinterprets the test. You say the null hypothesis is that this mutation cannot happen. They saw a mutation (4 mutations, in fact, in the data set you show) so the null hypothesis (as you state is) is disproved. That's perfectly straightforward.

I don't know what the "mean mutation generation" test is but you're doing when you apply a chi-squared test to this dataset is to test if the mutations are evenly distributed throughout the generations. Your test says they are, so there's no strong evidence to suppose that mutations are likely to occur in one generation rather than another in the series of tests. Blount's test says thay aren't, so it's more likely that the mutation will occur later in the series of tests. I can't tell which test is right without knowing more about the test that Blount used.

But that point (the foregoing paragraph) has no bearing at all on the null hypothesis, as you describe it. The mutation appeared, so that means the hypothesis that the mutation can't happen is disproved. Very simple. FredFerguson 21:10, 8 March 2009 (EDT)

I never said that “the null hypothesis is that this mutation cannot happen”. The chi-square test statistic I'm using wouldn’t be defined if the null hypothesis mutation rate was zero because the

\text{[math]}

term in the denominator of the statistic (see above equation) would be zero.

The test statistic from the paper is the average of the generation numbers of observed mutations. For experiment one this number is

\text{[math]}

The same number is shown in Table 2 of the paper. SJohnson 10:46, 9 March 2009 (EDT)

SJohnson, the way you're calculating the chi-squared statistic implies that you're testing the null hypothesis of a constant mutation rate over time against an alternative hypothesis of a mutation rate which varies over time. FredFerguson 11:02, 9 March 2009 (EDT)

@@ Line 84: / Line 84: @@
 </math>
 :The same number is shown in Table 2 of the paper. [[User:SJohnson|SJohnson]] 10:46, 9 March 2009 (EDT)
+:: SJohnson, the way you're calculating the chi-squared statistic implies that you're testing the null hypothesis of a constant mutation rate over time against an alternative hypothesis of a mutation rate which varies over time. [[User:FredFerguson|FredFerguson]] 11:02, 9 March 2009 (EDT)

Difference between revisions of "Talk:Significance of E. Coli Evolution Experiments"

Revision as of 15:02, March 9, 2009

Misinterpretation of test

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Popular Links

donate

Edit Console