Applications of the normal distribution in data analysis

✔ Best answer

http://en.wikipedia.org/wiki/Normality_test
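I'm not well versed in statistics. All I know is that this sort of thing is usually handed to software such as SAS, SPSS or Minitab: you type a few commands and it runs all of these tests for you.
But how to analyse the data, and which test to use, depends on which parameters of your model are known, such as the population mean and/or the population variance. If no parameters are known and the sample size is below 30, you would have to use a non-parametric test... So you had better spell out your conditions clearly, otherwise nobody will be able to answer you.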

2008-03-10 12:04:05 Addendum:
Normality tests include D'Agostino's K-squared test, the Jarque-Bera test, the Anderson-Darling test, the Cramér-von Mises criterion, the Lilliefors test for normality (itself an adaptation of the Kolmogorov-Smirnov test), the Shapiro-Wilk test, Pearson's chi-square test, and the Shapiro-Francia test for normality.
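I'm going by what the wiki says.
Although I use MATLAB myself, I wouldn't recommend starting your data analysis in MATLAB. Users like us basically start out with no idea, so we don't know which test to run. If you are in a position to use MATLAB, you basically already understand the data and want to do some tailor-made manipulation.
So I think you should use other software, such as SAS, for a preliminary analysis to get some ideas. A basic SAS command such as univariate (PROC UNIVARIATE) runs several tests in one go, as follows: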
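To give a feel for what one of the listed tests actually computes, here is a minimal sketch of the Jarque-Bera statistic using only the Python standard library. The function name `jarque_bera` and the simulated sample are my own assumptions for illustration, not something from the wiki page, and the p-value relies on the asymptotic chi-square approximation, which is only rough for a sample as small as 30.

```python
import math
import random


def jarque_bera(sample):
    """Return (JB statistic, asymptotic p-value) for a list of numbers."""
    n = len(sample)
    mean = sum(sample) / n
    # Central moments with divisor n (the biased form used by the JB statistic).
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    # Under normality JB is asymptotically chi-square with 2 degrees of
    # freedom, whose survival function is exactly exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


if __name__ == "__main__":
    random.seed(0)  # hypothetical data, not the poster's sample
    data = [random.gauss(100, 3) for _ in range(30)]
    jb, p = jarque_bera(data)
    print(f"JB = {jb:.4f}, asymptotic p-value = {p:.4f}")
```

A small p-value would count as evidence against normality; a large one only means this particular test found nothing to object to.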
The UNIVARIATE Procedure
Variable: y

Moments
N                          30    Sum Weights                 30
Mean               100.838755    Sum Observations    3025.16265
Std Deviation      2.95744478    Variance            8.74647963
Skewness            -0.555295    Kurtosis            0.05011072
Uncorrected SS     305307.283    Corrected SS        253.647909
Coeff Variation    2.93284539    Std Error Mean      0.53995307

Tests for Normality (gt = greater than, lt = less than)
Test                  Statistic          p Value
Shapiro-Wilk          W      0.966698    Pr lt W       0.4532
Kolmogorov-Smirnov    D      0.076831    Pr gt D       gt 0.1500
Cramer-von Mises      W-Sq   0.037879    Pr gt W-Sq    gt 0.2500
Anderson-Darling      A-Sq   0.285016    Pr gt A-Sq    gt 0.2500
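As a sanity check on how the numbers in the Moments table relate to each other, the sketch below recomputes the derived figures from just N, the mean and the standard deviation. The function name `derived_moments` is my own, and the formulas are the standard textbook definitions, which I am assuming are the ones SAS uses here; the recomputed values agree with the printed ones up to rounding.

```python
import math

# Values copied from the UNIVARIATE output above.
N = 30
MEAN = 100.838755
STD_DEV = 2.95744478


def derived_moments(n, mean, std_dev):
    """Recompute the other Moments entries from n, the mean and the std deviation."""
    variance = std_dev ** 2
    corrected_ss = (n - 1) * variance  # sum of squared deviations from the mean
    return {
        "sum_observations": n * mean,
        "variance": variance,
        "std_error_mean": std_dev / math.sqrt(n),
        "coeff_variation": 100.0 * std_dev / mean,  # expressed in percent
        "corrected_ss": corrected_ss,
        "uncorrected_ss": corrected_ss + n * mean ** 2,  # raw sum of squares
    }


if __name__ == "__main__":
    for name, value in derived_moments(N, MEAN, STD_DEV).items():
        print(f"{name:>18}: {value:.6f}")
```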
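Having several tests to compare is crucial, because different tests differ in their sensitivity to the parameters, so their conclusions may not agree. This matters most in marginal cases such as:
significance level = 5% but p-value = 0.06 or 0.047
In such cases, either accepting or rejecting the hypothesis becomes rather forced, so it is best to have several more test results to compare before deciding. For example: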
Shapiro-Wilk p-value = 0.0510
Kolmogorov-Smirnov p-value = 0.1000
Cramer-von Mises p-value = 0.0418
Anderson-Darling p-value = 0.089
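Here, although one test has a p-value below the 0.05 significance level, we would still lean toward accepting the normality assumption. If you wanted to be really rigorous, we would say that in this example there is not enough information to decide, and would ask for more sample data for further analysis.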
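The comparison just described can be sketched in a few lines of Python. The p-values are the four from the example; the function name `summarize` and the `margin` that defines a "borderline" result are my own assumptions for illustration, not an established rule.

```python
ALPHA = 0.05

# p-values from the example above.
P_VALUES = {
    "Shapiro-Wilk": 0.0510,
    "Kolmogorov-Smirnov": 0.1000,
    "Cramer-von Mises": 0.0418,
    "Anderson-Darling": 0.089,
}


def summarize(p_values, alpha=ALPHA, margin=0.02):
    """Return (tests that reject normality, tests whose p-value is near alpha)."""
    rejecting = [name for name, p in p_values.items() if p < alpha]
    # "Borderline" is a hypothetical notion: within `margin` of alpha on either side.
    borderline = [name for name, p in p_values.items() if abs(p - alpha) <= margin]
    return rejecting, borderline


if __name__ == "__main__":
    rejecting, borderline = summarize(P_VALUES)
    print("Reject normality at 5%:", rejecting)
    print("Borderline results:    ", borderline)
    if rejecting and len(rejecting) < len(P_VALUES):
        print("Tests disagree; consider collecting more data.")
```

With these numbers only one of the four tests rejects and two sit close to the threshold, which is exactly the kind of mixed picture where asking for more data is the honest answer.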
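Going by the wiki and my own experience, we start by checking normality with the A-D test, then look at the skewness and kurtosis to understand the structure of the data (beyond this step I'm completely out of my depth and only repeating hearsay, so you had better ask someone more expert about what follows), and then decide which test to trust more. I think your sample probably calls for non-parametric testing, which means trusting the Kolmogorov-Smirnov test more; and the Lilliefors test you mentioned is a variant of the K-S test (per the wiki), so it should be a good choice.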
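For what it's worth, here is a rough sketch of the idea behind the Lilliefors test: take the K-S distance against a normal curve fitted to the sample itself, then judge it against simulated normal samples of the same size instead of the ordinary K-S table. The function names, the number of simulations and the sample data are all my own assumptions, and a statistics package will normally use tabulated or approximated p-values instead of this Monte Carlo estimate.

```python
import random
from statistics import NormalDist, mean, stdev


def ks_distance(sample):
    """K-S distance between the empirical CDF and a normal CDF fitted to the sample."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def lilliefors_p_value(sample, n_sim=2000, seed=0):
    """Return (D, Monte Carlo p-value) under the null hypothesis of normality."""
    rng = random.Random(seed)
    n = len(sample)
    observed = ks_distance(sample)
    exceed = 0
    for _ in range(n_sim):
        # D is unchanged by shifting or rescaling, so N(0, 1) samples suffice.
        simulated = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if ks_distance(simulated) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_sim + 1)


if __name__ == "__main__":
    rng = random.Random(1)  # hypothetical data, not the asker's sample
    data = [rng.gauss(100, 3) for _ in range(30)]
    d, p = lilliefors_p_value(data)
    print(f"D = {d:.4f}, simulated p-value = {p:.4f}")
```

The point of the simulation is that estimating the mean and standard deviation from the same data makes the plain K-S p-value too generous, which is the problem the Lilliefors variant was designed to address.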

2008-03-10 12:06:12 Addendum:
The inequality signs all came out garbled, so I suppose in Y!'s world there is no greater or lesser, haha.
