Power and Sample Size
Theory
Power is the probability of detecting a given effect size under given sample conditions (size and variability). The larger the true difference, the more likely it is to be detected; the smaller the difference, the more data are needed.
Example)
The probability of distinguishing a .330 hitter from a .200 hitter in 25 at-bats is 0.75. → An experiment with n = 25 can be said to have a power of 0.75 (75%) for an effect size of 0.130.
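Dictionary definition of power
The probability of concluding that the alternative hypothesis is true when it is in fact true.
If the power is 90%, the probability of failing to reject the null hypothesis even though the alternative hypothesis is true (Type II error, β error) is 10%.
Power = 1 - β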
Why do we need to know the power?
The main use of a power calculation is to estimate how large a sample is needed.
Effect size drives sample size: the smaller the expected effect size, the larger the sample must be.
Four elements of a power/sample size calculation
Sample size
Effect size to be detected
Significance level of the hypothesis test
Power
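To see how the four elements constrain one another, the sketch below fixes three of them and solves for the sample size using the textbook normal approximation for a two-sided one-sample test, n ≈ ((z_(1-α/2) + z_power) · σ / δ)². This is an approximation chosen for illustration, not a description of what sampsizepwr computes internally. The variable names are hypothetical, and the values (σ = 5, δ = 2, α = 0.05, power = 0.80) are assumptions borrowed from Example1 below. It relies on norminv from the Statistics and Machine Learning Toolbox.

```matlab
% Illustrative values (assumptions, chosen to match Example1)
sigma       = 5;      % standard deviation
delta       = 2;      % effect size to detect (102 - 100)
alpha       = 0.05;   % two-sided significance level
targetPower = 0.80;   % desired power

% Normal-approximation sample size for a two-sided one-sample test
zAlpha  = norminv(1 - alpha/2);    % critical value of the test
zPower  = norminv(targetPower);    % quantile matching the desired power
nApprox = ceil(((zAlpha + zPower) * sigma / delta)^2)
```

With these values the approximation gives 50, slightly below the 52 obtained from the t-based calculation in Example1, since the t distribution has heavier tails than the normal. Halving delta roughly quadruples the required sample size, which is the sense in which effect size drives sample size.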
Practice (MATLAB)
Ref) MATLAB sampsizepwr
sampsizepwr computes the sample size, power, or alternative parameter value for a hypothesis test, given the other two values. For example, you can compute the sample size required to obtain a particular power for a hypothesis test, given the parameter value of the alternative hypothesis.
Example1
A company runs a manufacturing process that fills empty bottles with 100 mL of liquid. To monitor quality, the company randomly selects several bottles and measures the volume of liquid inside. Determine the sample size the company must use for a t-test to detect a difference between 100 mL and 102 mL with a power of 0.80, assuming a standard deviation of 5 mL (the second element of [100 5] in the call).
nout = sampsizepwr('t', [100 5], 102, 0.80)

The company must test 52 bottles to detect the difference between a mean volume of 100 mL and 102 mL with a power of 0.80.
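One way to sanity-check this result is a small Monte Carlo simulation: draw many samples of 52 bottles from the alternative (mean 102, standard deviation 5), run a two-sided t-test against 100 on each, and count how often the null hypothesis is rejected. This is a sketch under stated assumptions, not part of the referenced MATLAB example; the number of simulations, the seed, and the variable names are arbitrary choices, and normally distributed fill volumes are assumed.

```matlab
rng(0);                          % fix the seed for reproducibility
nBottles = 52;                   % sample size reported by sampsizepwr
nSim     = 20000;                % number of simulated experiments (assumption)

% Each column is one experiment drawn from the alternative: mean 102, sd 5
x = 102 + 5 * randn(nBottles, nSim);

% ttest works column-wise: h(j) = 1 if experiment j rejects H0: mean = 100
h = ttest(x, 100);

empiricalPower = mean(h)         % fraction of experiments that reject H0
```

The empirical rejection rate should land close to the target power of 0.80, up to simulation noise.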
Generate a power curve to visualize how the sample size affects the power of the test.
nn = 1:100;
pwrout = sampsizepwr('t', [100 5], 102, [], nn);
figure;
plot(nn, pwrout, 'b-', nout, 0.8, 'ro')
title('Power versus Sample Size')
xlabel('Sample Size')
ylabel('Power')

Example2
An employee wants to buy a house near her office. She decides to eliminate from consideration any house that has a mean morning commute time greater than 20 minutes. The null hypothesis for this right-sided test is H0: μ = 20, and the alternative hypothesis is HA: μ > 20. The selected significance level is 0.05.
To determine the mean commute time, the employee takes a test drive from the house to her office during rush hour every morning for one week, so her total sample size is 5. She assumes that the standard deviation, σ, is equal to 5.
The employee decides that a true mean commute time of 25 minutes is too different from her targeted 20-minute limit, so she wants to detect a significant departure if the true mean is 25 minutes. Find the probability of incorrectly concluding that the mean commute time is no greater than 20 minutes.
Compute the power of the test, and then subtract the power from 1 to obtain β.
power = sampsizepwr('t',[20 5],25,[],5,'Alpha',0.05,'Tail','right')

beta = 1 - power
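For a right-tailed one-sample t-test, the same power can be worked out by hand from the noncentral t distribution, which makes the roles of the four elements explicit. The following is a minimal sketch assuming normally distributed commute times; the variable names are hypothetical, and tinv and nctcdf come from the Statistics and Machine Learning Toolbox.

```matlab
mu0   = 20;   sigma = 5;    % null mean and assumed standard deviation
mu1   = 25;   n     = 5;    % alternative mean and sample size
alpha = 0.05;               % significance level

df    = n - 1;                            % degrees of freedom
ncp   = (mu1 - mu0) / (sigma / sqrt(n));  % noncentrality parameter
tCrit = tinv(1 - alpha, df);              % right-tailed critical value

% Power = P(T > tCrit) when T follows a noncentral t distribution
powerManual = 1 - nctcdf(tCrit, df, ncp)
betaManual  = 1 - powerManual
```

The manual value should agree with the sampsizepwr result above to within numerical precision.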

The employee decides that this risk is too high, and she wants no more than a 0.01 probability of reaching an incorrect conclusion. Calculate the number of test drives the employee must take to obtain a power of 0.99.
nout = sampsizepwr('t',[20 5],25,0.99,[],'Tail','right')
The results indicate that she must take 18 test drives from a candidate house to achieve this power level.
The employee decides that she only has time to take 10 test drives. She also accepts a 0.05 probability of making an incorrect conclusion. Calculate the smallest true parameter value that produces a detectable difference in mean commute time.
p1out = sampsizepwr('t',[20 5],[],0.95,10,'Tail','right')

Given the employee's target power level and sample size, her test detects a significant difference from a mean commute time of at least 25.6532 minutes.
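As a consistency check, the detectable mean can be fed back into sampsizepwr to solve for power instead; the result should recover the 0.95 target. The variable name pwrCheck is hypothetical, and the call pattern simply mirrors the ones used above.

```matlab
% Smallest detectable mean for n = 10 and power 0.95 (as in the example)
p1out = sampsizepwr('t', [20 5], [], 0.95, 10, 'Tail', 'right');

% Feed that mean back in and solve for power instead
pwrCheck = sampsizepwr('t', [20 5], p1out, [], 10, 'Tail', 'right')
```

This round trip illustrates that sample size, power, and the alternative parameter are interchangeable unknowns: fixing any two (together with the significance level) determines the third.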