What does "p-value" mean in statistics?

2007-03-05 4:04 am
I would like to ask: what does the p-value mean in statistics?

Answers (3)

2007-03-05 10:53 am
✔ Best answer
I see that the other two answers are both in English, so let me try giving you an answer in Chinese!

The p-value is a concept from statistical hypothesis testing (perhaps rendered in Chinese as 「統計假說檢定」?). When we test a hypothesis, four situations can arise:

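1. The hypothesis is true, and it passes the test
2. The hypothesis is false, and it fails the test
3. The hypothesis is true, but it fails the test
4. The hypothesis is false, but it passes the test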

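In the first two cases the outcome of the test is correct; in the last two cases it is wrong.
"The hypothesis is true, but it fails the test": this kind of error is called a type I error.
"The hypothesis is false, but it passes the test": this kind of error is called a type II error.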

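To answer your question first: put simply, the p-value is closely tied to case 3, that is, to Pr(type I error). It marks how large a Pr(type I error) we would have to tolerate before the observed result makes the hypothesis fail the test. Strictly speaking, it is not the probability that a type I error has occurred; the more precise description comes later in this answer.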

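If a hypothesis is actually false, we are in case 2 or case 4, and we naturally want to make case 4 less likely. The most conservative (extreme) way to do that is never to let the test pass. In practice, however, we do not know whether the hypothesis is true or false, so doing this raises the chance of case 3. In the same way, one can see that the chances of a type I error and of a type II error usually trade off against each other.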
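
The trade-off can be seen in a small exact calculation. The sketch below is only an illustration under assumptions that are not in the answer: the hypothesis under test is "the coin is fair", the test fails it when 20 flips show at least `cutoff` heads, and the "false" case is taken to be a coin with a 70% chance of heads. All names and numbers are made up for the example.

```python
from math import comb


def tail_prob(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_FLIPS = 20        # assumed sample size
P_TRUE_CASE = 0.5   # hypothesis under test: the coin is fair
P_FALSE_CASE = 0.7  # assumed alternative, needed only to compute the type II error


def error_rates(cutoff):
    """Return (Pr(type I error), Pr(type II error)) for the rule
    'the hypothesis fails the test if heads >= cutoff'."""
    type_1 = tail_prob(N_FLIPS, P_TRUE_CASE, cutoff)       # true hypothesis fails
    type_2 = 1 - tail_prob(N_FLIPS, P_FALSE_CASE, cutoff)  # false hypothesis passes
    return type_1, type_2


if __name__ == "__main__":
    for cutoff in range(11, 19):
        t1, t2 = error_rates(cutoff)
        print(f"cutoff={cutoff:2d}  type I={t1:.4f}  type II={t2:.4f}")
```

Raising the cutoff makes the test harder to fail, so the type I column shrinks while the type II column grows.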

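Yet when we design a statistical test, we want to reduce both kinds of error. What we can actually control, however, is only the type I error. (Because we do not know the facts, we can only reason in this form: if the hypothesis is true, then such-and-such results follow... Statistically it is hard to say: if the hypothesis is false, then something else is true, and in that case such-and-such results follow..., unless we know the whole distribution of what we are studying. But if we knew that, why would we run a statistical test at all?)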

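So we try to infer from the statistical results whether the hypothesis is true. Because statistics involves random deviation, the form of our reasoning is really: if the hypothesis is true, then the result will follow such-and-such a distribution...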

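If the observed result deviates too much from what we would expect (were the hypothesis true), we tend to believe that the hypothesis is not true.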

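However, since randomness exists, a large deviation may still be due to randomness rather than to the hypothesis being false. So once we have a statistical result, we can first assume that the hypothesis is true and then state the chance of getting that result. In general, though, what we state is the probability of "getting a result at least as extreme as this one" (the probability of obtaining a statistic which is at least as extreme as the statistic that we have obtained). This probability is the p-value.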
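
The recipe in this paragraph (assume the hypothesis is true, then ask how often a result at least as extreme would turn up) can be mimicked by simulation. A minimal sketch, assuming a made-up example that is not in the answer: the hypothesis is that a die is fair, and 17 sixes were seen in 60 rolls. The function name and all numbers are illustrative.

```python
import random


def simulated_p_value(n_rolls, observed_sixes, n_trials=10_000, seed=0):
    """Estimate P(at least `observed_sixes` sixes in `n_rolls` rolls) for a fair die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    at_least_as_extreme = 0
    for _trial in range(n_trials):
        # Generate one data set under the assumption that the hypothesis is true.
        # randrange(6) yields 0..5; the value 5 stands for the face "six".
        sixes = sum(1 for _roll in range(n_rolls) if rng.randrange(6) == 5)
        if sixes >= observed_sixes:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_trials


if __name__ == "__main__":
    print(simulated_p_value(n_rolls=60, observed_sixes=17))
```

The returned fraction is only an estimate of the p-value; it gets closer to the exact tail probability as `n_trials` grows.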

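If the p-value is very small, the result looks like one that should hardly ever happen if the hypothesis were true, so we lean towards failing the test.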

That is the explanation of the p-value. Below I add a little on the relationship between the p-value and the significance level.

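The significance level α is understood as the largest Pr(type I error) we are willing to accept. If the p-value is smaller than α, the hypothesis fails the test (we then say that the statistic is significant).
The larger α is, the more readily the test fails. Conversely, the smaller α is, the more readily the test passes.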
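
As a rough sketch of this rule in code (the function name `fails_test` and the p-value 0.03 are made up for illustration; the strict `<` follows the wording above, while some texts use `<=`):

```python
def fails_test(p_value, alpha):
    """True when the hypothesis does not pass the test at level alpha (p-value < alpha)."""
    if not (0.0 <= p_value <= 1.0 and 0.0 < alpha < 1.0):
        raise ValueError("p_value must be in [0, 1] and alpha in (0, 1)")
    return p_value < alpha


if __name__ == "__main__":
    p_value = 0.03  # made-up p-value for illustration
    for alpha in (0.01, 0.05, 0.10):
        verdict = "does not pass (significant)" if fails_test(p_value, alpha) else "passes"
        print(f"alpha={alpha:.2f}: {verdict}")
```

The same p-value passes at α = 0.01 but not at α = 0.05 or α = 0.10, which is the "larger α, easier to fail" behaviour described above.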

So one can say:
The p-value is the observed significance of a statistic. It tells HOW significant the statistic is.
α is a required upper bound on the p-value. It tells whether the statistic IS significant enough.

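This is written from my own understanding of statistical hypothesis testing.
To understand the p-value, it is best to understand its relationship with the significance level.
Reference: my own understanding of statistical hypothesis testing; further references can be found on Wikipedia.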
2007-03-05 6:37 am
Simply speaking, the p-value is the probability, assuming chance alone is at work, that a random variable takes a value at least as extreme as the particular value observed. That is, the lower the p-value, the less plausible it is that the observed value arose by chance alone (and the more it looks like something systematic is going on).
2007-03-05 6:02 am
In statistical hypothesis testing, the p-value is the probability of obtaining a result at least as extreme as a given data point, assuming the data point was the result of chance alone. The fact that p-values are based on this assumption is crucial to their correct interpretation.
More technically, the p-value of an observed value t_observed of some random variable T used as a test statistic is the probability that, given that the null hypothesis is true, T will assume a value at least as unfavorable to the null hypothesis as the observed value t_observed. "More unfavorable to the null hypothesis" can in some cases mean greater than, in some cases less than, and in some cases further away from a specified center.
Example
For example, say an experiment is performed to determine whether a coin is fair (50% chance of landing heads or tails), or unfairly biased toward heads (> 50% chance of landing heads). If we choose a two-tailed test (which also counts an excess of tails as evidence against fairness), then the null hypothesis is that the coin is fair, and that any deviations from the 50% rate can be ascribed to chance alone. Suppose that the experimental results show the coin turning up heads 14 times out of 20 total flips. The p-value of this result would be the chance of a fair coin landing on heads at least 14 times out of 20 flips (as larger values in this case are also less favorable to the null hypothesis of a fair coin) or at most 6 times out of 20 flips. In this case the random variable T has a binomial distribution. The probability that 20 flips of a fair coin would result in 14 or more heads is 0.0577. Since this is a two-tailed test, the probability that 20 flips of the coin would result in 14 or more heads or 6 or fewer heads is 0.115.
Generally, the smaller the p-value, the more people there are who would be willing to say that the results came from a biased coin.
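
The two figures in the example can be reproduced from the binomial distribution directly. A short sketch using only the Python standard library (the helper name `binom_pmf` is made up here):

```python
from math import comb


def binom_pmf(n, k, p=0.5):
    """P(X = k) for X ~ Binomial(n, p); p = 0.5 is the fair coin of the null hypothesis."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, heads = 20, 14
upper_tail = sum(binom_pmf(n, k) for k in range(heads, n + 1))      # 14 or more heads
lower_tail = sum(binom_pmf(n, k) for k in range(0, n - heads + 1))  # 6 or fewer heads
two_tailed = upper_tail + lower_tail

print(f"one tail:  {upper_tail:.4f}")  # 0.0577
print(f"two tails: {two_tailed:.3f}")  # 0.115
```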

Interpretation
Generally, one rejects the null hypothesis if the p-value is smaller than or equal to the significance level, often represented by the Greek letter α (alpha). If the level is 0.05, the null hypothesis is rejected only when a result at least as extraordinary as the one just seen would occur no more than 5% of the time, given that the null hypothesis is true.
In the above example, the calculated p-value exceeds 0.05, and thus the null hypothesis - that the observed result of 14 heads out of 20 flips can be ascribed to chance alone - is not rejected. Such a finding is often stated as being "not statistically significant at the 5% level".
However, had a single extra head been obtained, the resulting p-value would be about 0.041 (two-tailed; about 0.021 one-tailed). This time the null hypothesis - that the observed result of 15 heads out of 20 flips can be ascribed to chance alone - is rejected. Such a finding would be described as being "statistically significant at the 5% level".
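
The comparison of 14 heads and 15 heads against the 5% level can be checked with a few lines. This is a sketch, not part of the original answer; it uses the two-tailed p-value to match the example (for 15 heads the one-tailed value is about 0.021), and the function name `two_tailed_p` is made up.

```python
from math import comb


def two_tailed_p(n, heads):
    """Two-tailed p-value for a fair coin; this sketch assumes heads > n / 2."""
    if not n / 2 < heads <= n:
        raise ValueError("this sketch only handles heads > n / 2")
    upper = sum(comb(n, k) for k in range(heads, n + 1)) / 2**n
    return 2 * upper  # the fair-coin distribution is symmetric, so both tails are equal


ALPHA = 0.05
for heads in (14, 15):
    p = two_tailed_p(20, heads)
    print(f"{heads} heads: p = {p:.3f}, rejected at 5%: {p <= ALPHA}")
```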
There is often an alternative hypothesis, but the construction of the test does not allow for 'supporting' a specific alternative.
Critics of p-values point out that the criterion used to decide "statistical significance" is based on the somewhat arbitrary choice of level (often set at 0.05). A proposed replacement for the p-value is p-rep, which is the probability that an effect can be replicated.


Archived: 2021-04-16 23:24:55
Original link [permanently dead]:
https://hk.answers.yahoo.com/question/index?qid=20070304000051KK04701
