1. Introduction
In engineering-geological surveys, it is sometimes necessary to compare field and laboratory data for the same soils in order to confirm that specimens were transported correctly from the survey site to the laboratory, that is, that they were not deformed or destroyed in transit.
Framed this way, the problem can be treated with A/B testing, using the following parameters.
The metric is the mean dry density of the soil skeleton (ρd, g/cm³), which characterizes how the specimen is packed. This quantity follows a normal distribution.
The criterion for testing the hypotheses is Student's t-test: for two independent samples if the field data (before transport) and the laboratory data (after transport) were obtained on different soil specimens, and for two dependent samples if both studies were carried out on the same specimens.
In this article we generate two random samples, compare them, formulate statistical hypotheses, test them, and draw conclusions.
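2. Generating the samples
2.1 Estimating the sample size
As part of planning the experiment, before generating the density samples, we estimate the sample size required for a given effect size (ES), power, and acceptable Type I error rate (α). These terms are defined below. The calculation uses the statsmodels package.
The (standardized) effect size characterizes the difference to be detected. It equals the ratio of the difference between the sample means to the pooled standard deviation. In our case: ES = (X̄1 - X̄2) / S.
For samples of equal size, the pooled standard deviation S can be calculated as S = sqrt((S1^2 + S2^2) / 2).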
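The two definitions above can be sketched in a few lines of standard-library Python. The helper names `pooled_std` and `cohens_d` and the density readings are hypothetical, chosen only for illustration; the sketch assumes samples of equal size.

```python
import math
import statistics

def pooled_std(s1, s2):
    """Pooled standard deviation for two samples of equal size."""
    return math.sqrt((s1 ** 2 + s2 ** 2) / 2)

def cohens_d(sample_a, sample_b):
    """Standardized effect size: difference of means over the pooled SD."""
    s = pooled_std(statistics.stdev(sample_a), statistics.stdev(sample_b))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / s

# Hypothetical dry-density readings, g/cm^3 (illustrative values only)
field = [1.66, 1.71, 1.58, 1.69, 1.62, 1.73]
lab = [1.60, 1.64, 1.55, 1.66, 1.57, 1.63]
print(f"ES = {cohens_d(field, lab):.2f}")
```

`statistics.stdev` uses the n - 1 denominator, which is the usual choice when S1 and S2 are estimated from data.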
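By convention (Cohen, 1988), ES = 0.2 is a small effect, 0.5 a medium effect, and 0.8 a large effect.
Power is the probability of not committing a Type II error (usually taken as 80%).
Type I and Type II errors are defined as follows:
| | H0 is true | H1 is true |
|---|---|---|
| H0 is accepted | H0 correctly accepted | Type II error (β) |
| H0 is rejected | Type I error (α) | H0 correctly rejected (power = 1-β) |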
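The meaning of α and of power can be illustrated with a small Monte Carlo sketch: simulate many experiments and count how often the t-test rejects H0. The function `rejection_rate`, the means, the group size of 64, and the number of trials are all assumptions made for this illustration, not values from the study.

```python
import numpy as np
from scipy import stats

def rejection_rate(mean_1, mean_2, sd, n, alpha=0.05, n_trials=2000, seed=0):
    """Fraction of simulated experiments in which the two-sample t-test rejects H0."""
    rng = np.random.default_rng(seed)
    a = rng.normal(mean_1, sd, size=(n_trials, n))
    b = rng.normal(mean_2, sd, size=(n_trials, n))
    _, p_values = stats.ttest_ind(a, b, axis=1)  # one test per simulated experiment
    return float(np.mean(p_values < alpha))

# H0 true (equal means): the rejection rate estimates the Type I error rate
type_1_rate = rejection_rate(1.60, 1.60, sd=0.15, n=64)
# H1 true with ES = 0.075 / 0.15 = 0.5: the rejection rate estimates the power
power_estimate = rejection_rate(1.675, 1.60, sd=0.15, n=64)
print(f"Type I error rate ~ {type_1_rate:.3f}, power ~ {power_estimate:.3f}")
```

With equal means the rejection rate should land near α, and with a true medium effect it should land near the chosen power, up to simulation noise.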
Set the parameters:
α = 0.05
ES = 0.5 (a medium effect)
Power = 0.8
Calculate the required sample size:
# Import the libraries
import numpy as np
import scipy
from scipy import stats
import matplotlib.pyplot as plt
from statsmodels.stats.power import TTestIndPower
from statsmodels.stats.weightstats import CompareMeans, DescrStatsW
# Set the parameters
effect = 0.5
alpha = 0.05
power = 0.8
analysis = TTestIndPower()
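# Solve for the required size of each sample; round up, because truncating understates it
size = analysis.solve_power(effect, power=power, alpha=alpha)
print(f'Sample size, count: {int(np.ceil(size))}')
Sample size, count: 64
So at least 64 specimens are needed in each sample. We will use 65.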
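As a quick check on the chosen size, the power calculation can be run in the other direction: fix the number of specimens and ask what power results. This is a minimal sketch using `TTestIndPower.power` from statsmodels; the variable name `achieved` is only illustrative.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
# Power of a two-sided, two-sample t-test with 65 specimens per sample
achieved = analysis.power(effect_size=0.5, nobs1=65, alpha=0.05, ratio=1.0)
print(f"Power at n = 65: {achieved:.3f}")
```

The result should sit slightly above the 0.8 target, since 65 is just over the computed minimum.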
Plot the required sample size as a function of effect size.
plt.figure(figsize=(10, 7), dpi=80)
# Required sample size for effect sizes from 0.2 to 1.5 in steps of 0.1
results = {i / 10: analysis.solve_power(i / 10, power=power, alpha=alpha)
           for i in range(2, 16)}
plt.plot(list(results.keys()), list(results.values()), 'bo-')
plt.grid()
plt.title('Required sample size\nversus effect size')
plt.ylabel('Sample size n, count')
plt.xlabel('Effect size ES')
# Label each point with its sample size
for x, y in results.items():
    label = "{:.0f}".format(y)
    plt.annotate(label,
                 (x, y),
                 textcoords="offset points",
                 xytext=(0, 10),
                 ha='center')
plt.show()
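The plot shows how the required sample size depends on ES. For example, to detect a difference in means of 0.03 g/cm³ when the standard deviation is 0.1 g/cm³ (ES = 0.03 g/cm³ / 0.1 g/cm³ = 0.3), about 175 specimens are needed in each sample (power = 0.80, α = 0.05).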
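The worked example above generalizes to any target difference and spread. The sketch below wraps the calculation in a hypothetical helper, `required_n`, which rounds up so the result is a whole number of specimens; the list of differences is illustrative.

```python
import math
import numpy as np
from statsmodels.stats.power import TTestIndPower

def required_n(delta, sd, power=0.8, alpha=0.05):
    """Specimens per sample needed to detect a mean difference `delta` given spread `sd`."""
    effect_size = delta / sd
    n = TTestIndPower().solve_power(effect_size=effect_size, power=power, alpha=alpha)
    # Round up: a fractional specimen cannot be tested
    return math.ceil(float(np.squeeze(n)))

for delta in (0.03, 0.05, 0.08):
    print(f"delta = {delta:.2f} g/cm3 -> n = {required_n(delta, sd=0.1)} per sample")
```

Because the helper rounds up rather than to the nearest integer, its result can be one higher than the labels on the plot.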
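2.2 Generating the data
The samples are generated with numpy.
Both samples are drawn from normal distributions, each defined by a mean (X̄) and a standard deviation (S):
- X̄1 = 1.65 g/cm³, S1 = 0.15 g/cm³;
- X̄2 = 1.60 g/cm³, S2 = 0.15 g/cm³.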
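It may help to see which effect size these generating parameters imply. The sketch below computes it and the corresponding power for 65 specimens per sample. The variable names are illustrative, and the calculation assumes the two-sided, two-sample t-test used for the sample-size estimate.

```python
from statsmodels.stats.power import TTestIndPower

mean_1, mean_2, sd = 1.65, 1.60, 0.15  # generating parameters from the text
n_per_sample = 65

# The pooled SD equals sd here because S1 = S2
implied_es = (mean_1 - mean_2) / sd
power_at_n = TTestIndPower().power(effect_size=implied_es,
                                   nobs1=n_per_sample, alpha=0.05)
print(f"implied ES = {implied_es:.2f}, power at n = {n_per_sample}: {power_at_n:.2f}")
```

The implied ES is about 0.33, smaller than the 0.5 used for planning, so the power for this true difference falls below the 0.8 target. Whether a particular random draw yields a significant result therefore depends on the draw.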
loc_1 = 1.65
sigma_1 = 0.15
loc_2 = 1.60
sigma_2 = 0.15
sample_size = 65
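# Draw the two samples from normal distributions
sample_1 = np.random.normal(loc=loc_1, scale=sigma_1, size=sample_size)
sample_2 = np.random.normal(loc=loc_2, scale=sigma_2, size=sample_size)
Plot histograms of both samples with the normal density curve overlaid.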
fig, axes = plt.subplots(ncols=2, figsize=(18, 5))
# Histogram of sample 1 with the normal density curve
count_1, bins_1, ignored_1 = axes[0].hist(sample_1, 10, density=True,
                                          label="Sample 1", edgecolor='black',
                                          linewidth=1.2)
axes[0].plot(bins_1, 1/(sigma_1 * np.sqrt(2 * np.pi)) *
             np.exp( - (bins_1 - loc_1)**2 / (2 * sigma_1**2)),
             linewidth=2, color='r', label='Normal density')
axes[0].legend()
axes[0].set_xlabel('Soil skeleton density, g/cm³')
axes[0].set_ylabel('Probability density')
axes[0].set_ylim([0, 5])
axes[0].set_xlim([1.1, 2.2])
# Histogram of sample 2 with the normal density curve
count_2, bins_2, ignored_2 = axes[1].hist(sample_2, 10, density=True,
                                          label="Sample 2", edgecolor='black',
                                          linewidth=1.2, color="green")
axes[1].plot(bins_2, 1/(sigma_2 * np.sqrt(2 * np.pi)) *
             np.exp( - (bins_2 - loc_2)**2 / (2 * sigma_2**2)),
             linewidth=2, color='r', label='Normal density')
axes[1].legend()
axes[1].set_xlabel('Soil skeleton density, g/cm³')
axes[1].set_ylabel('Probability density')
axes[1].set_ylim([0, 5])
axes[1].set_xlim([1.1, 2.2])
plt.show()
# Box plots of both samples, annotated with mean and standard deviation
fig, ax = plt.subplots(figsize=(8, 8))
axis = ax.boxplot([sample_1, sample_2], labels=['Sample 1', 'Sample 2'])
data = np.array([sample_1, sample_2])
means = np.mean(data, axis = 1)
stds = np.std(data, axis = 1)
for i, line in enumerate(axis['medians']):
    x, y = line.get_xydata()[1]
    text = ' μ={:.2f}\n σ={:.2f}'.format(means[i], stds[i])
    ax.annotate(text, xy=(x, y))
plt.ylabel('Soil skeleton density, g/cm³')
plt.show()
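The plots give a visual impression of normality and of similar spread. With real measurements, these assumptions could also be checked numerically before running a t-test. The sketch below is one possible check and is not part of the original workflow: it uses the Shapiro-Wilk and Levene tests from scipy on freshly generated data with an arbitrary seed.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)  # arbitrary seed, for reproducibility only
sample_1 = rng.normal(loc=1.65, scale=0.15, size=65)
sample_2 = rng.normal(loc=1.60, scale=0.15, size=65)

# Shapiro-Wilk: H0 is that the sample comes from a normal distribution
for name, sample in (("sample 1", sample_1), ("sample 2", sample_2)):
    w_stat, p_norm = stats.shapiro(sample)
    print(f"{name}: W = {w_stat:.3f}, p-value = {p_norm:.3f}")

# Levene: H0 is that both samples have equal variances
lev_stat, p_var = stats.levene(sample_1, sample_2)
print(f"Levene: statistic = {lev_stat:.3f}, p-value = {p_var:.3f}")
```

A small p-value in either test would be a reason to look more closely at the data before trusting the t-test results.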
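3. Testing the hypotheses
We consider two cases:
1. independent samples, two-sample t-test;
2. dependent (paired) samples, paired t-test.
Case 1
The field and laboratory measurements were made on different soil specimens, so the samples are independent.
H0: μ1 = μ2.
H1: μ1 ≠ μ2.
Statistic: T = (X̄1 - X̄2) / sqrt(S1^2/n1 + S2^2/n2)
Null distribution: T(X1^n1, X2^n2) ≈ St(ν), where the number of degrees of freedom ν is given by the Welch-Satterthwaite approximation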
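We use the ttest_ind function from the scipy stats module.
t_st, p_val = scipy.stats.ttest_ind(sample_1, sample_2, equal_var = False)
print(f't-statistic: {round(t_st, 2)}')
print(f'Achieved significance level of the t-test '
      f'(p-value): {round(p_val, 3)}')
t-statistic: 2.92
Achieved significance level of the t-test (p-value): 0.004
Conclusion, case 1
H0 is rejected: the achieved significance level (p-value 0.004) is below 0.05, so the sample means differ.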
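To see what `ttest_ind` with `equal_var=False` computes, the statistic can be reproduced by hand. The sketch below implements Welch's t statistic and the Welch-Satterthwaite degrees of freedom. The helper `welch_t` and the seeded data are assumptions for illustration, so the numbers will not match the output above.

```python
import numpy as np
from scipy import stats

def welch_t(x, y):
    """Welch's t statistic, degrees of freedom, and two-sided p-value."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    vx, vy = x.var(ddof=1) / x.size, y.var(ddof=1) / y.size
    t_stat = (x.mean() - y.mean()) / np.sqrt(vx + vy)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (x.size - 1) + vy ** 2 / (y.size - 1))
    p_value = 2 * stats.t.sf(abs(t_stat), df)
    return t_stat, df, p_value

rng = np.random.default_rng(seed=7)  # arbitrary seed
a = rng.normal(1.65, 0.15, 65)
b = rng.normal(1.60, 0.15, 65)
t_manual, df, p_manual = welch_t(a, b)
t_scipy, p_scipy = stats.ttest_ind(a, b, equal_var=False)
print(f"manual: t = {t_manual:.4f}, df = {df:.1f}, p = {p_manual:.4f}")
print(f"scipy:  t = {t_scipy:.4f}, p = {p_scipy:.4f}")
```

The two lines should agree to within floating-point precision.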
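Estimate the confidence interval for the difference of the means.
c_m = CompareMeans(DescrStatsW(sample_1), DescrStatsW(sample_2))
print("95%% confidence interval for the difference of means: "
      "[%.4f, %.4f]" % c_m.tconfint_diff(usevar='unequal'))
95% confidence interval for the difference of means: [0.0235, 0.1228]
Zero lies outside the 95% confidence interval, so the means differ at the 5% significance level.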
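The link between the confidence interval and the test can be made explicit: for the same unequal-variance setup, zero falls outside the 95% interval exactly when the two-sided p-value is below 0.05. The sketch below builds the interval by hand. The helper `welch_confint` and the seeded data are hypothetical.

```python
import numpy as np
from scipy import stats

def welch_confint(x, y, level=0.95):
    """Confidence interval for mean(x) - mean(y) without assuming equal variances."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    vx, vy = x.var(ddof=1) / x.size, y.var(ddof=1) / y.size
    se = np.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (x.size - 1) + vy ** 2 / (y.size - 1))
    t_crit = stats.t.ppf(0.5 + level / 2, df)  # two-sided critical value
    diff = x.mean() - y.mean()
    return diff - t_crit * se, diff + t_crit * se

rng = np.random.default_rng(seed=11)  # arbitrary seed
a = rng.normal(1.65, 0.15, 65)
b = rng.normal(1.60, 0.15, 65)
low, high = welch_confint(a, b)
_, p_value = stats.ttest_ind(a, b, equal_var=False)
excludes_zero = low > 0 or high < 0
print(f"95% CI: [{low:.4f}, {high:.4f}], p-value = {p_value:.4f}")
print(f"interval excludes zero: {excludes_zero}, p < 0.05: {p_value < 0.05}")
```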
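Case 2
Now suppose the field measurements (before transport) and the laboratory measurements (after transport) were made on the same specimens. The samples are then dependent, and a paired test applies.
H0: μ1 = μ2.
H1: μ1 ≠ μ2.
Statistic: T = X̄d / (Sd / sqrt(n)), where X̄d and Sd are the mean and standard deviation of the pairwise differences
Null distribution: T(X1^n, X2^n) ~ St(n-1)
We use the ttest_rel function from the scipy stats module.
t_st, p_val = scipy.stats.ttest_rel(sample_1, sample_2)
print(f't-statistic: {round(t_st, 2)}')
print(f'Achieved significance level of the t-test '
      f'(p-value): {round(p_val, 3)}')
t-statistic: 2.79
Achieved significance level of the t-test (p-value): 0.007
Conclusion, case 2
H0 is rejected: the achieved significance level (p-value 0.007) is below 0.05, so the sample means differ.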
To make this clearer, estimate the confidence interval for the difference between the means of these samples.
print("95%% confidence interval: [%.4f, %.4f]"
      % DescrStatsW(sample_1 - sample_2).tconfint_mean())
95% confidence interval: [0.0208, 0.1255]
Zero lies outside the 95% confidence interval, so we can conclude that the sample means differ.
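The paired test is a one-sample t-test on the pairwise differences, which is why the interval above is built from `sample_1 - sample_2`. The sketch below demonstrates this equivalence on hypothetical paired data, where each laboratory value is the field value of the same specimen plus a small shift and noise. The shift of 0.05 and the noise level of 0.05 are assumptions for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=3)  # arbitrary seed
n = 65
# Hypothetical paired data: the lab value depends on the field value of the same specimen
field = rng.normal(1.65, 0.15, n)
lab = field - 0.05 + rng.normal(0.0, 0.05, n)

t_rel, p_rel = stats.ttest_rel(field, lab)
t_one, p_one = stats.ttest_1samp(field - lab, popmean=0.0)
t_ind, p_ind = stats.ttest_ind(field, lab, equal_var=False)
print(f"paired:      t = {t_rel:.2f}, p = {p_rel:.2g}")
print(f"one-sample:  t = {t_one:.2f}, p = {p_one:.2g}")
print(f"independent: t = {t_ind:.2f}, p = {p_ind:.2g}")
```

When the pairs really are correlated, the paired test removes specimen-to-specimen variation and tends to be more sensitive than the independent test on the same data. For two samples generated independently, as in this article, the pairing is arbitrary and gives no such gain.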
5. Conclusion
This article looked at how Python can be used to solve a practical problem in engineering geology, and also examined the sample size required to test the hypotheses.