ScienceNet — blog posts tagged "test"

[Reposted] The Kolmogorov–Smirnov test (K–S test)
zdenglish211 2013-4-4 15:27
Original source: http://blog.sina.com.cn/s/blog_5ecfd9d90100cigp.html

In statistics, the Kolmogorov–Smirnov test is based on the cumulative distribution function and is used to test whether two empirical distributions differ, or whether an empirical distribution differs from an ideal (reference) distribution.

In statistics, the Kolmogorov–Smirnov test (K–S test) is a form of minimum distance estimation used as a nonparametric test of equality of one-dimensional probability distributions. It is used to compare a sample with a reference probability distribution (one-sample K–S test), or to compare two samples (two-sample K–S test). The Kolmogorov–Smirnov statistic quantifies a distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution, or between the empirical distribution functions of two samples. The null distribution of this statistic is calculated under the null hypothesis that the samples are drawn from the same distribution (in the two-sample case) or that the sample is drawn from the reference distribution (in the one-sample case). In each case, the distributions considered under the null hypothesis are continuous but otherwise unrestricted.

The two-sample K–S test is one of the most useful and general nonparametric methods for comparing two samples, as it is sensitive to differences in both the location and the shape of the empirical cumulative distribution functions of the two samples.

The Kolmogorov–Smirnov test can be modified to serve as a goodness-of-fit test. In the special case of testing for normality of the distribution, samples are standardized and compared with a standard normal distribution. This is equivalent to setting the mean and variance of the reference distribution equal to the sample estimates, and it is known that using the sample to modify the null hypothesis reduces the power of the test. Correcting for this bias leads to the Lilliefors test. However, even Lilliefors' modification is less powerful than the Shapiro–Wilk test or the Anderson–Darling test for testing normality.
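As an illustrative sketch (not part of the original post; all function names here are our own), the two-sample statistic can be computed directly from the two empirical CDFs — the supremum of their difference is always attained at one of the observed data points:

```python
import random

def ecdf(sample, x):
    """Empirical CDF F_n(x): fraction of observations <= x."""
    return sum(1 for s in sample if s <= x) / len(sample)

def ks_two_sample(a, b):
    """Two-sample K-S statistic: sup over x of |F_a(x) - F_b(x)|.
    The supremum is attained at one of the observed points, so it
    suffices to scan the pooled set of sample values."""
    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)

random.seed(0)
uniform_sample = [random.uniform(0, 1) for _ in range(200)]
normal_sample = [random.gauss(0.5, 0.15) for _ in range(200)]
d = ks_two_sample(uniform_sample, normal_sample)
print(f"two-sample D = {d:.3f}")
```

This O(n·m) scan is fine for small samples; production code would sort once and merge the two samples in a single pass.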
Kolmogorov–Smirnov statistic

The empirical distribution function F_n for n iid observations X_i is defined as

F_n(x) = \frac{1}{n}\sum_{i=1}^n I_{X_i \le x},

where I_{X_i \le x} is the indicator function, equal to 1 if X_i ≤ x and equal to 0 otherwise. The Kolmogorov–Smirnov statistic for a given cumulative distribution function F(x) is

D_n = \sup_x |F_n(x) - F(x)|,

where sup S is the supremum of set S. By the Glivenko–Cantelli theorem, if the sample comes from distribution F(x), then D_n converges to 0 almost surely. Kolmogorov strengthened this result by effectively providing the rate of this convergence (see below). The Donsker theorem provides a yet stronger result.

Kolmogorov distribution

The Kolmogorov distribution is the distribution of the random variable

K = \sup_{t\in[0,1]} |B(t)|,

where B(t) is the Brownian bridge. The cumulative distribution function of K is given by

\Pr(K\le x) = 1 - 2\sum_{i=1}^\infty (-1)^{i-1} e^{-2i^2 x^2} = \frac{\sqrt{2\pi}}{x}\sum_{i=1}^\infty e^{-(2i-1)^2\pi^2/(8x^2)}.

Kolmogorov–Smirnov test

Under the null hypothesis that the sample comes from the hypothesized distribution F(x),

\sqrt{n}\,D_n \xrightarrow{n\to\infty} \sup_t |B(F(t))|

in distribution, where B(t) is the Brownian bridge. If F is continuous, then under the null hypothesis \sqrt{n}\,D_n converges to the Kolmogorov distribution, which does not depend on F. This result may also be known as the Kolmogorov theorem; see Kolmogorov's theorem for disambiguation. The goodness-of-fit test, or the Kolmogorov–Smirnov test, is constructed by using the critical values of the Kolmogorov distribution. The null hypothesis is rejected at level α if

\sqrt{n}\,D_n > K_\alpha,

where K_α is found from

\Pr(K\le K_\alpha) = 1-\alpha.

The asymptotic power of this test is 1. If the form or parameters of F(x) are determined from the X_i, the inequality may not hold. In this case, Monte Carlo or other methods are required to determine the rejection level α.
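To make the rejection rule concrete, the alternating series for Pr(K ≤ x) can be truncated numerically and inverted by bisection to obtain the critical value K_α. This is an illustrative sketch, not part of the original post:

```python
import math

def kolmogorov_cdf(x, terms=100):
    """Pr(K <= x) = 1 - 2 * sum_{i>=1} (-1)^(i-1) * exp(-2 i^2 x^2).
    The terms shrink extremely fast, so a modest truncation suffices."""
    if x <= 0:
        return 0.0
    s = sum((-1) ** (i - 1) * math.exp(-2 * i * i * x * x)
            for i in range(1, terms + 1))
    return 1 - 2 * s

def k_alpha(alpha, lo=0.0, hi=3.0, tol=1e-10):
    """Critical value K_alpha with Pr(K <= K_alpha) = 1 - alpha,
    found by bisection on the monotone CDF."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if kolmogorov_cdf(mid) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Reject the null at level alpha when sqrt(n) * D_n > K_alpha:
print(f"K_0.05 = {k_alpha(0.05):.4f}")  # ~1.358, the familiar 5% critical value
```

The bisection search is justified because the CDF is strictly increasing on (0, ∞), so 1 − α is attained at exactly one point in the bracket.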
[Reposted] test
cai7net 2013-3-20 10:16
ODE test paper (Grade 11)
sobolev 2013-3-8 12:31
11ode试卷A.pdf
Differences between the approximate likelihood-ratio test (aLRT) and the standard bootstrap
zczhou 2013-3-7 00:46
On the difference between the aLRT (a parametric approach) and the standard (nonparametric) bootstrap: aLRT is an alternative method in PhyML for computing branch supports. The Chi2-based aLRT (approximate likelihood-ratio test) for branches tends to give looser (higher) support values, while SH-like supports are closer to bootstrap proportions.

-b (or --bootstrap) int
  int > 0 : int is the number of bootstrap replicates.
  int = 0 : neither approximate likelihood ratio test nor bootstrap values are computed.
  int = -1 : approximate likelihood ratio test returning aLRT statistics.
  int = -2 : approximate likelihood ratio test returning Chi2-based parametric branch supports.
  int = -3 : minimum of Chi2-based parametric and SH-like branch supports.
  int = -4 : SH-like branch supports alone.

aLRT is a statistical test to compute branch supports. It applies to every internal branch and is computed along the PhyML run on the original data set. Thus, aLRT is much faster than the standard bootstrap, which requires running PhyML 100–1,000 times with resampled data sets. As with any test, an aLRT branch support is significant when it is larger than 0.90–0.99. With good-quality data (enough signal and sites), the sets of branches with bootstrap proportion > 0.75 and aLRT > 0.9 (SH-like option) tend to be similar.

When there is only one data set, you can ask PhyML to generate resampled bootstrap data sets from the original data set. PhyML then returns the bootstrap tree with branch lengths and bootstrap values, using standard NEWICK format. The "Print pseudo trees" option gives the pseudo trees in a *_boot_trees.txt file.

References:
http://www.atgc-montpellier.fr/phyml/usersguide.php?type=command
http://www.atgc-montpellier.fr/phyml/alrt/
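As a sketch of how the `-b` values above are used in practice (the alignment filename is hypothetical, and flags may differ between PhyML versions — check `phyml --help` on your install):

```shell
# SH-like aLRT branch supports, computed in a single run (fast):
phyml --input align.phy --datatype nt -b -4

# Standard nonparametric bootstrap with 100 resampled data sets (slow):
phyml --input align.phy --datatype nt -b 100
```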