Power and Sample Size
Theory
Power is the probability of detecting a given effect size under given sample conditions (sample size and variability). The larger the true difference, the more likely it is to be detected; the smaller the difference, the more data are needed to detect it.
Example)
Suppose the probability of distinguishing a .330 hitter from a .200 hitter in 25 at-bats is 0.75. → An experiment with n = 25 then has a power of 0.75 (75%) for an effect size of 0.130.
Dictionary definition of power
The probability of concluding that the alternative hypothesis is true when it is in fact true.
If the power is 90%, the probability of accepting the null hypothesis even though the alternative hypothesis is true (type II error, β error) is 10%.
Power = 1 - β
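The relationship between power and β can be checked by simulation. The following is a minimal sketch, assuming a one-sample, two-sided t-test at the default significance level of 0.05; the means, standard deviation, sample size, and number of simulated experiments are illustrative values that do not come from the text above. It requires the Statistics and Machine Learning Toolbox for ttest.

```matlab
% Monte Carlo sketch of power and beta for a one-sample, two-sided t-test.
% All numeric settings below are illustrative assumptions.
rng(1);                       % fix the seed so the run is repeatable
mu0   = 0;                    % mean under the null hypothesis
mu1   = 0.5;                  % true mean, so the alternative is true
sigma = 1;                    % true standard deviation
n     = 30;                   % observations per simulated experiment
nSim  = 10000;                % number of simulated experiments

X = mu1 + sigma * randn(n, nSim);   % each column is one experiment
h = ttest(X, mu0);                  % 1 where H0 is rejected at alpha = 0.05

powerHat = mean(h);           % fraction of experiments that reject H0
betaHat  = 1 - powerHat;      % fraction that miss the effect (type II error)
fprintf('Estimated power = %.3f, estimated beta = %.3f\n', powerHat, betaHat);
```

Because every simulated data set is drawn with the alternative hypothesis true, the rejection rate estimates the power, and the remainder estimates β.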
Why do we need to know power?
The main use of a power calculation is to estimate how large a sample is needed.
The effect size drives the sample size: the smaller the expected effect, the larger the sample must be.
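To make the dependence on effect size concrete, the sketch below asks sampsizepwr for the required sample size at several differences. The null mean of 100, standard deviation of 5, and target power of 0.80 are assumed for illustration, and the list of differences is hypothetical.

```matlab
% Required sample size as the effect to detect shrinks.
% Assumed setup: one-sample t-test, null mean 100, standard deviation 5,
% target power 0.80, default two-sided alpha of 0.05.
mu0    = 100;
sigma  = 5;
deltas = [4 2 1 0.5];         % hypothetical differences to detect

% Solve for n separately at each alternative mean mu0 + delta
nReq = arrayfun(@(d) sampsizepwr('t', [mu0 sigma], mu0 + d, 0.80), deltas);

disp(table(deltas(:), nReq(:), 'VariableNames', {'Difference', 'RequiredN'}));
```

Each halving of the difference should raise the required sample size by roughly a factor of four, since the sample size scales approximately with the inverse square of the effect size.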
The four elements of a power / sample size calculation
Sample size
Effect size to be detected
Significance level for the hypothesis test
Power
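These four quantities are linked: once three are fixed, the fourth is determined. As a small illustration of the significance level's role, the sketch below holds the effect size and power fixed and varies only alpha. The numbers (null mean 100, standard deviation 5, alternative mean 102, power 0.80) and the list of alpha values are assumptions chosen for illustration.

```matlab
% Vary the significance level while holding effect size and power fixed.
% Assumed setup: one-sample, two-sided t-test, mean 100 vs 102, sd 5.
alphas = [0.10 0.05 0.01];    % hypothetical significance levels

nByAlpha = arrayfun(@(a) ...
    sampsizepwr('t', [100 5], 102, 0.80, [], 'Alpha', a), alphas);

for k = 1:numel(alphas)
    fprintf('alpha = %.2f -> required n = %d\n', alphas(k), nByAlpha(k));
end
```

A stricter significance level makes the null hypothesis harder to reject, so more observations are needed to keep the same power.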
Practice (MATLAB)
Ref) MATLAB sampsizepwr
sampsizepwr computes the sample size, power, or alternative parameter value for a hypothesis test, given the other two values. For example, you can compute the sample size required to obtain a particular power for a hypothesis test, given the parameter value of the alternative hypothesis.
Example1
A company runs a manufacturing process that fills empty bottles with 100 mL of liquid. To monitor quality, the company randomly selects several bottles and measures the volume of liquid inside. Determine the sample size the company must use for a t-test to detect a difference between 100 mL and 102 mL with a power of 0.80, assuming a standard deviation of 5 mL (the second entry of [100 5] in the call below).
nout = sampsizepwr('t', [100 5], 102, 0.80)

The company must test 52 bottles to detect the difference between a mean volume of 100 mL and 102 mL with a power of 0.80.
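As a cross-check, the power at n = 52 can be computed by hand from the noncentral t distribution. This is a sketch of the standard textbook calculation for a two-sided one-sample t-test, not a statement about how sampsizepwr is implemented internally; the variable names are illustrative.

```matlab
% Power of the two-sided one-sample t-test from the noncentral t
% distribution, compared with sampsizepwr for the bottle example.
mu0 = 100; mu1 = 102; sigma = 5; alpha = 0.05;
n   = 52;

df    = n - 1;                               % degrees of freedom
ncp   = (mu1 - mu0) / (sigma / sqrt(n));     % noncentrality parameter
tcrit = tinv(1 - alpha/2, df);               % two-sided critical value

% P(reject H0) = P(T > tcrit) + P(T < -tcrit) when the true mean is mu1
pwrManual  = (1 - nctcdf(tcrit, df, ncp)) + nctcdf(-tcrit, df, ncp);
pwrBuiltin = sampsizepwr('t', [mu0 sigma], mu1, [], n);

fprintf('manual = %.4f, sampsizepwr = %.4f\n', pwrManual, pwrBuiltin);
```

The two values should agree closely, and both should sit at or just above the 0.80 target.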
Generate a power curve to visualize how the sample size affects the power of the test.
nn = 1:100;
pwrout = sampsizepwr('t', [100 5], 102, [], nn);
figure;
plot(nn, pwrout, 'b-', nout, 0.8, 'ro')
title('Power versus Sample Size')
xlabel('Sample Size')
ylabel('Power')

Example2
An employee wants to buy a house near her office. She decides to eliminate from consideration any house that has a mean morning commute time greater than 20 minutes. The null hypothesis for this right-sided test is H0: μ = 20, and the alternative hypothesis is HA: μ > 20. The selected significance level is 0.05.
To determine the mean commute time, the employee takes a test drive from the house to her office during rush hour every morning for one week, so her total sample size is 5. She assumes that the standard deviation, σ, is equal to 5.
The employee decides that a true mean commute time of 25 minutes is too different from her targeted 20-minute limit, so she wants to detect a significant departure if the true mean is 25 minutes. Find the probability of incorrectly concluding that the mean commute time is no greater than 20 minutes.
Compute the power of the test, and then subtract the power from 1 to obtain β.
power = sampsizepwr('t',[20 5],25,[],5,'Alpha',0.05,'Tail','right')

beta = 1 - power

The employee decides that this risk is too high, and she wants no more than a 0.01 probability of reaching an incorrect conclusion. Calculate the number of test drives the employee must take to obtain a power of 0.99.
nout = sampsizepwr('t',[20 5],25,0.99,[],'Tail','right')
The results indicate that she must take 18 test drives from a candidate house to achieve this power level.
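The same answer can be reached by brute force: raise n one drive at a time until the computed power reaches 0.99. This is a simple search written for illustration, with an assumed starting value of n = 2; it is not meant to describe how sampsizepwr solves for the sample size.

```matlab
% Find the smallest n whose power reaches the target by direct search.
targetPower = 0.99;
n = 2;                                        % assumed starting sample size
while sampsizepwr('t', [20 5], 25, [], n, 'Tail', 'right') < targetPower
    n = n + 1;                                % add one more test drive
end

pwrAtN = sampsizepwr('t', [20 5], 25, [], n, 'Tail', 'right');
fprintf('Smallest n = %d (power = %.4f)\n', n, pwrAtN);
```

Stepping through the loop also shows how quickly the power climbs from the n = 5 case toward the 0.99 target as drives are added.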
The employee decides that she only has time to take 10 test drives. She also accepts a 0.05 probability of making an incorrect conclusion. Calculate the smallest true parameter value that produces a detectable difference in mean commute time.
p1out = sampsizepwr('t',[20 5],[],0.95,10,'Tail','right')

Given the employee's target power level and sample size, her test detects a significant difference from a mean commute time of at least 25.6532 minutes.