
We implemented and compared four common neural network training optimizers: the momentum optimizer, root mean square propagation, mini-batch gradient descent, and adaptive moment estimation (Adam). Under the cut: a repository, plenty of Python code with its output, visualizations, and the formulas behind it all.
A model is the result of running a machine learning algorithm on some data; it represents what the algorithm has learned. It is the "thing" that persists after the algorithm has been run on the training data: the rules, numbers, and other algorithm-specific data structures needed to make predictions.
What is an optimizer?
Before going further, you need to know what a loss function is. A loss function is a measure of how well a predictive model predicts the expected outcome (or value). Loss functions are also called cost functions (more details here).
During training, we update the parameters to minimize the loss function and improve accuracy. The parameters of a neural network are usually the weights of its connections; they are tuned during the training stage. So the algorithm itself (and the input data) adjusts these parameters. More details here.
An optimizer, then, is a method for achieving better results and speeding up learning: an algorithm used to fine-tune quantities such as the weights and the learning rate so that the model trains correctly and quickly. What follows is a basic overview of the optimizers used in deep learning, together with a simple model for understanding how they are implemented. I strongly recommend cloning the repository, observing the behavior, and making your own changes.
A few commonly used terms:
- Backpropagation
The goal of backpropagation is simple: adjust each weight in the network in proportion to its contribution to the overall error. By iteratively reducing each weight's error, we eventually get a set of weights that makes good predictions. We find the gradient of the loss function with respect to each parameter and update the parameter by subtracting the gradient (more details here).

- Gradient descent
Gradient descent is an optimization algorithm used to minimize a function by iteratively moving in the direction of steepest descent, defined by the negative of the gradient. In deep learning, we use gradient descent to update the parameters of the model (more details here).
- Hyperparameters
Model hyperparameters are configuration values external to the model, and their values cannot be estimated from the data: for example, the number of hidden neurons or the learning rate. You cannot estimate the learning rate from the data (more details here).
- Learning rate
The learning rate (α) is a tuning parameter of the optimization algorithm that determines the step size at each iteration while moving toward the minimum of the loss function; a short sketch follows this list (more details here).
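To make these terms concrete, here is a minimal sketch (the names here are illustrative, not from the article's repository) of one vanilla gradient-descent step: each parameter moves against its gradient, scaled by the learning rate α.
import numpy as np

def gradient_descent_step(w, dw, learning_rate=0.01):
    # one vanilla update: w <- w - alpha * dw
    return w - learning_rate * dw

w = np.array([0.5, -0.3])            # parameters
dw = np.array([0.08, -0.12])         # gradient of the loss with respect to w
w = gradient_descent_step(w, dw)     # w moves a small step toward lower loss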
Popular optimizers

Below are some of the most popular optimizers:
- Stochastic gradient descent (SGD).
- Momentum optimizer.
- Root mean square propagation (RMSProp).
- Adaptive moment estimation (Adam).
Let's look at each of them in detail.
1. Stochastic gradient descent (in particular, mini-batch)
With pure SGD, we use a single example at a time to update the parameters while training the model. But that requires yet another loop over the examples, which takes time; so we use mini-batch SGD instead.
Mini-batch gradient descent strikes a balance between the robustness of stochastic gradient descent and the efficiency of batch gradient descent, and it is the most common implementation of gradient descent used in deep learning. In mini-batch SGD, we split the examples into groups when training the model (for example, 32 or 64 examples per batch). This approach works well because it needs only one update per mini-batch rather than one per example. The mini-batches are selected randomly at each iteration. Why? When the mini-batches are chosen randomly, a model stuck at a local minimum can be knocked out of it by the noisy steps. Why do we need this optimizer?
- The parameter update rate is higher than in plain batch gradient descent, which allows more reliable convergence by helping to avoid local minima.
- Batched updates make the process more computationally efficient than stochastic gradient descent.
- If you have little RAM, mini-batches are the best option: batch processing is efficient because the algorithm's implementation does not need all of the training data in memory at once.
How do we generate random mini-batches?
import numpy as np

def RandomMiniBatches(X, Y, MiniBatchSize):
    m = X.shape[0]
    miniBatches = []
    # shuffle the examples before slicing them into batches
    permutation = list(np.random.permutation(m))
    shuffled_X = X[permutation, :]
    shuffled_Y = Y[permutation, :].reshape((m, 1))  # ensure the output shape is (m, 1)
    num_minibatches = m // MiniBatchSize
    for k in range(0, num_minibatches):
        miniBatch_X = shuffled_X[k * MiniBatchSize:(k + 1) * MiniBatchSize, :]
        miniBatch_Y = shuffled_Y[k * MiniBatchSize:(k + 1) * MiniBatchSize, :]
        miniBatch = (miniBatch_X, miniBatch_Y)
        miniBatches.append(miniBatch)
    # handling the last (smaller) batch when m is not divisible by MiniBatchSize
    if m % MiniBatchSize != 0:
        miniBatch_X = shuffled_X[num_minibatches * MiniBatchSize:, :]
        miniBatch_Y = shuffled_Y[num_minibatches * MiniBatchSize:, :]
        miniBatch = (miniBatch_X, miniBatch_Y)
        miniBatches.append(miniBatch)
    return miniBatches
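As a quick sanity check, here is how the function behaves on toy data (the shapes below are illustrative, not from the article):
X_toy = np.random.randn(200, 5)
Y_toy = np.random.randint(0, 2, size=(200, 1))
batches = RandomMiniBatches(X_toy, Y_toy, MiniBatchSize=64)
print(len(batches))           # 4: three full batches plus one batch of 8 examples
print(batches[0][0].shape)    # (64, 5)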
What does the model look like?
For those new to deep learning, here is an outline of the model; it looks like this:
def model(X, Y, learning_rate, num_iter, hidden_size, keep_prob, optimizer):
    L = len(hidden_size)
    params = initilization(X.shape[1], hidden_size)
    for i in range(1, num_iter):
        MiniBatches = RandomMiniBatches(X, Y, 64)    # get random mini-batches
        for MiniBatch in MiniBatches:                # loop over mini-batches
            (MiniBatch_X, MiniBatch_Y) = MiniBatch
            cache, A = model_forward(MiniBatch_X, params, L, keep_prob)              # forward propagation
            cost = cost_f(A, MiniBatch_Y)                                            # cost function
            grad = backward(MiniBatch_X, MiniBatch_Y, params, cache, L, keep_prob)   # backward propagation
            params = update_params(params, grad, learning_rate=learning_rate)
    return params
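update_params itself is not shown in the article. A minimal sketch of the plain mini-batch SGD update it presumably performs (my assumption, matching the key names produced by the backward function shown later):
def update_params(params, grads, learning_rate):
    # assumed plain SGD update: W <- W - alpha * dW, b <- b - alpha * db
    for l in range(len(params) // 2):
        params['W' + str(l)] = params['W' + str(l)] - learning_rate * grads['dW' + str(l)]
        params['b' + str(l)] = params['b' + str(l)] - learning_rate * grads['db' + str(l)]
    return params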
In the following figure you can see that SGD oscillates a lot. The vertical movement is unnecessary: we only want the horizontal movement. If we reduce the vertical movement and increase the horizontal movement, the model will learn faster, won't it?

How do we minimize the unnecessary oscillations? The next optimizers do exactly that and help speed up learning.
2. Momentum optimizer
SGD and gradient descent oscillate a lot; we need to move forward, not up and down. The model has to push its learning in the right direction, and that is what the momentum optimizer does.

As you can see in the picture above, the green line of the momentum optimizer is faster than the others. The importance of learning quickly is easy to see when you have large datasets and many iterations. How do we implement this optimizer?

The usual value of β is around 0.9. From the backpropagation parameters we build two new quantities, vdW and vdb. Taking the value β = 0.9, the equations take the form:
vdW = 0.9 * vdW + 0.1 * dW
vdb = 0.9 * vdb + 0.1 * db
As you can see, vdW depends on its own previous value and not only on dW. Rendered as a graph, this shows that the momentum optimizer takes past gradients into account in order to smooth out the updates. That is why it can minimize the oscillations: with SGD, the path followed by mini-batch gradient descent oscillates on its way to convergence, and the momentum optimizer helps reduce these oscillations.
def update_params_with_momentum(params, grads, v, beta, learning_rate):
    # grads holds the dW and db values from backprop;
    # params holds the W and b parameters we have to update
    for l in range(len(params) // 2):
        # compute the velocities (exponential moving averages of the gradients)
        v["dW" + str(l)] = beta * v["dW" + str(l)] + (1 - beta) * grads['dW' + str(l)]
        v["db" + str(l)] = beta * v["db" + str(l)] + (1 - beta) * grads['db' + str(l)]
        # updating parameters W and b using the velocities
        params["W" + str(l)] = params["W" + str(l)] - learning_rate * v["dW" + str(l)]
        params["b" + str(l)] = params["b" + str(l)] - learning_rate * v["db" + str(l)]
    return params
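The final model function below also calls initilization_moment to create the velocity dictionary v, which the article does not show. A minimal sketch, assuming it mirrors initilization_RMS (zeros with the same shapes as the parameters):
def initilization_moment(params):
    # assumed initializer: velocities start at zero, one per W and b
    v = {}
    for i in range(len(params) // 2):
        v["dW" + str(i)] = np.zeros(params["W" + str(i)].shape)
        v["db" + str(i)] = np.zeros(params["b" + str(i)].shape)
    return v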
The repository is here.
3. Root mean square propagation (RMSProp)
Root mean square propagation (RMSprop) is built on an exponentially decaying average. The essential property of RMSprop is that it is not limited to the accumulated sum of all past gradients: it effectively restricts itself to the gradients of the most recent time steps, contributing an exponentially decaying average of the past "squared gradients". With RMSProp we try to reduce the vertical movement using an average, because the oscillating steps sum to roughly zero when averaged; RMSprop supplies that average for the update.
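Written in the same style as the momentum equations, the update that the code below implements is (sdW is the running average of the squared gradients; the small epsilon, 10^-4 in the code, prevents division by zero):
sdW = beta * sdW + (1 - beta) * dW^2
W = W - learning_rate * dW / (sqrt(sdW) + epsilon)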

Source

Take a look at the code below; it gives a basic understanding of how to implement this optimizer. Everything is the same as in SGD; we only need to change the update function.
def initilization_RMS(params):
    # running averages of the squared gradients start at zero
    s = {}
    for i in range(len(params) // 2):
        s["dW" + str(i)] = np.zeros(params["W" + str(i)].shape)
        s["db" + str(i)] = np.zeros(params["b" + str(i)].shape)
    return s

def update_params_with_RMS(params, grads, s, beta, learning_rate):
    # grads holds the dW and db values from backprop;
    # params holds the W and b parameters we have to update
    for l in range(len(params) // 2):
        # compute the running averages of the squared gradients
        s["dW" + str(l)] = beta * s["dW" + str(l)] + (1 - beta) * np.square(grads['dW' + str(l)])
        s["db" + str(l)] = beta * s["db" + str(l)] + (1 - beta) * np.square(grads['db' + str(l)])
        # updating parameters W and b, scaling each gradient by the root of its running average
        params["W" + str(l)] = params["W" + str(l)] - learning_rate * grads['dW' + str(l)] / (np.sqrt(s["dW" + str(l)]) + pow(10, -4))
        params["b" + str(l)] = params["b" + str(l)] - learning_rate * grads['db' + str(l)] / (np.sqrt(s["db" + str(l)]) + pow(10, -4))
    return params
4. Adam optimizer
Adam is one of the most efficient optimization algorithms for training neural networks, combining the ideas of RMSProp and the momentum optimizer. Instead of adapting the parameter learning rates based only on the average first moment (the mean), as RMSProp does, Adam also makes use of the average of the second moments of the gradients. Specifically, the algorithm computes exponential moving averages of the gradient and of the squared gradient, and the parameters beta1 and beta2 control the decay rates of these moving averages. How?
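In the same plain style as before, the update implemented by the code below is (vdW and sdW are the first- and second-moment averages, and t is the iteration counter used for bias correction):
vdW = beta1 * vdW + (1 - beta1) * dW
sdW = beta2 * sdW + (1 - beta2) * dW^2
vdW_corrected = vdW / (1 - beta1^t)
sdW_corrected = sdW / (1 - beta2^t)
W = W - learning_rate * vdW_corrected / sqrt(sdW_corrected + epsilon)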
def initilization_Adam(params):
    # first- and second-moment averages both start at zero
    s = {}
    v = {}
    for i in range(len(params) // 2):
        v["dW" + str(i)] = np.zeros(params["W" + str(i)].shape)
        v["db" + str(i)] = np.zeros(params["b" + str(i)].shape)
        s["dW" + str(i)] = np.zeros(params["W" + str(i)].shape)
        s["db" + str(i)] = np.zeros(params["b" + str(i)].shape)
    return v, s

def update_params_with_Adam(params, grads, v, s, beta1, beta2, learning_rate, t):
    epsilon = pow(10, -8)
    v_corrected = {}
    s_corrected = {}
    # grads holds the dW and db values from backprop;
    # params holds the W and b parameters we have to update
    for l in range(len(params) // 2):
        # first moment (momentum-like moving average of the gradients)
        v["dW" + str(l)] = beta1 * v["dW" + str(l)] + (1 - beta1) * grads['dW' + str(l)]
        v["db" + str(l)] = beta1 * v["db" + str(l)] + (1 - beta1) * grads['db' + str(l)]
        # bias correction for the first moment
        v_corrected["dW" + str(l)] = v["dW" + str(l)] / (1 - np.power(beta1, t))
        v_corrected["db" + str(l)] = v["db" + str(l)] / (1 - np.power(beta1, t))
        # second moment (RMSProp-like moving average of the squared gradients)
        s["dW" + str(l)] = beta2 * s["dW" + str(l)] + (1 - beta2) * np.power(grads['dW' + str(l)], 2)
        s["db" + str(l)] = beta2 * s["db" + str(l)] + (1 - beta2) * np.power(grads['db' + str(l)], 2)
        # bias correction for the second moment
        s_corrected["dW" + str(l)] = s["dW" + str(l)] / (1 - np.power(beta2, t))
        s_corrected["db" + str(l)] = s["db" + str(l)] / (1 - np.power(beta2, t))
        # updating parameters W and b
        params["W" + str(l)] = params["W" + str(l)] - learning_rate * v_corrected["dW" + str(l)] / np.sqrt(s_corrected["dW" + str(l)] + epsilon)
        params["b" + str(l)] = params["b" + str(l)] - learning_rate * v_corrected["db" + str(l)] / np.sqrt(s_corrected["db" + str(l)] + epsilon)
    return params
Hyperparameters
- β1 (beta1): almost always 0.9.
- β2 (beta2): almost always 0.999.
- ε: prevents division by zero (10^-8); it barely affects learning.
Why this optimizer?
Its advantages:
- Straightforward to implement.
- Computationally efficient.
- Low memory requirements.
- Invariant to diagonal rescaling of the gradients.
- Well suited to tasks that are large in terms of data and parameters.
- Suitable for non-stationary objectives.
- Suitable for tasks with very noisy or sparse gradients.
- The hyperparameters are intuitive and usually need almost no tuning.
Let's build a model and give a practical demonstration of how these optimizers speed up learning. In this article we won't explain everything else in depth (initialization, forward_prop, back_prop, gradient descent, and so on): the functions needed for training are already written in NumPy in the repository. If you would like to take a look at them, here is the link!
Let's get started!
We create a universal model function that works with all of the optimizers explained here.
1. Initialization:
We use an initialization function that takes the input features_size (12288 in our case) and the sizes of the hidden layers (we use [100, 1]), and we use its output as the initial parameters. There are other initialization methods as well; I recommend reading this article.
def initilization(input_size, layer_size):
    params = {}
    np.random.seed(0)
    # He-style scaled random initialization for the weights, zeros for the biases
    params['W' + str(0)] = np.random.randn(layer_size[0], input_size) * np.sqrt(2 / input_size)
    params['b' + str(0)] = np.zeros((layer_size[0], 1))
    for l in range(1, len(layer_size)):
        params['W' + str(l)] = np.random.randn(layer_size[l], layer_size[l - 1]) * np.sqrt(2 / layer_size[l])
        params['b' + str(l)] = np.zeros((layer_size[l], 1))
    return params
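For the sizes used in this article (12288 input features and hidden sizes [100, 1]), the parameter shapes come out as follows:
params = initilization(12288, [100, 1])
print(params['W0'].shape)   # (100, 12288)
print(params['b0'].shape)   # (100, 1)
print(params['W1'].shape)   # (1, 100)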
2. Forward propagation:
In this function, the inputs are X, the parameters, the number of hidden layers L, and keep_prob, which is used by the dropout technique. We set keep_prob to 1 so that dropout has no effect; you can set a different value if the model is overfitting. Note that we apply dropout only to the even-numbered layers. We compute the activation values of each layer with the forward_activation function.
# activations -----------------------------------------------
def forward_activation(A_prev, w, b, activation):
    z = np.dot(A_prev, w.T) + b.T
    if activation == 'relu':
        A = np.maximum(0, z)
    elif activation == 'sigmoid':
        A = 1 / (1 + np.exp(-z))
    else:
        A = np.tanh(z)
    return A

# ________ model forward ____________________________________
def model_forward(X, params, L, keep_prob):
    cache = {}
    A = X
    for l in range(L - 1):
        w = params['W' + str(l)]
        b = params['b' + str(l)]
        A = forward_activation(A, w, b, 'relu')
        if l % 2 == 0:
            # dropout mask: uniform random values kept with probability keep_prob
            # (np.random.rand, not randn, so that keep_prob=1 keeps everything)
            cache['D' + str(l)] = np.random.rand(A.shape[0], A.shape[1]) < keep_prob
            A = A * cache['D' + str(l)] / keep_prob
        cache['A' + str(l)] = A
    w = params['W' + str(L - 1)]
    b = params['b' + str(L - 1)]
    A = forward_activation(A, w, b, 'sigmoid')
    cache['A' + str(L - 1)] = A
    return cache, A
3. Backpropagation:
Here we write the backpropagation function. It returns grad (the slope), which we use when updating the parameters. If backpropagation is new to you, I recommend reading this article.
def backward(X, Y, params, cach, L, keep_prob):
    grad = {}
    m = Y.shape[0]
    cach['A' + str(-1)] = X          # treat the input as the "activation" of layer -1
    grad['dz' + str(L - 1)] = cach['A' + str(L - 1)] - Y
    cach['D' + str(-1)] = 0
    for l in reversed(range(L)):
        grad['dW' + str(l)] = (1 / m) * np.dot(grad['dz' + str(l)].T, cach['A' + str(l - 1)])
        grad['db' + str(l)] = 1 / m * np.sum(grad['dz' + str(l)].T, axis=1, keepdims=True)
        if l % 2 != 0:
            # layer l-1 had dropout applied in the forward pass, so mask its gradient too
            grad['dz' + str(l - 1)] = ((np.dot(grad['dz' + str(l)], params['W' + str(l)]) * cach['D' + str(l - 1)] / keep_prob) *
                                       np.int64(cach['A' + str(l - 1)] > 0))
        else:
            grad['dz' + str(l - 1)] = (np.dot(grad['dz' + str(l)], params['W' + str(l)]) *
                                       np.int64(cach['A' + str(l - 1)] > 0))
    return grad
We have already seen each optimizer's update function, so we use them here. Let's make a few small changes to the model function from the SGD section.
def model(X, Y, learning_rate, num_iter, hidden_size, keep_prob, optimizer):
    L = len(hidden_size)
    params = initilization(X.shape[1], hidden_size)
    costs = []
    itr = []

    if optimizer == 'momentum':
        v = initilization_moment(params)
    elif optimizer == 'rmsprop':
        s = initilization_RMS(params)
    elif optimizer == 'adam':
        v, s = initilization_Adam(params)

    for i in range(1, num_iter):
        MiniBatches = RandomMiniBatches(X, Y, 32)    # get random mini-batches
        for MiniBatch in MiniBatches:                # loop over mini-batches
            (MiniBatch_X, MiniBatch_Y) = MiniBatch
            cache, A = model_forward(MiniBatch_X, params, L, keep_prob)              # forward propagation
            cost = cost_f(A, MiniBatch_Y)                                            # cost function
            grad = backward(MiniBatch_X, MiniBatch_Y, params, cache, L, keep_prob)   # backward propagation
            if optimizer == 'momentum':
                params = update_params_with_momentum(params, grad, v, beta=0.9, learning_rate=learning_rate)
            elif optimizer == 'rmsprop':
                params = update_params_with_RMS(params, grad, s, beta=0.9, learning_rate=learning_rate)
            elif optimizer == 'adam':
                params = update_params_with_Adam(params, grad, v, s, beta1=0.9, beta2=0.999, learning_rate=learning_rate, t=i)  # update parameters
            elif optimizer == "minibatch":
                params = update_params(params, grad, learning_rate=learning_rate)
        if i % 5 == 0:
            costs.append(cost)
            itr.append(i)
        if i % 100 == 0:
            print('cost of iteration______{}______{}'.format(i, cost))
    return params, costs, itr
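The training snippets below also use predict and accuracy_score. accuracy_score comes from sklearn.metrics; predict is not shown in the article, but a minimal sketch consistent with model_forward (my assumption: dropout disabled and the sigmoid output thresholded at 0.5) might look like this:
from sklearn.metrics import accuracy_score

def predict(X, params, L):
    # assumed helper: forward pass with dropout off, then threshold at 0.5
    _, A = model_forward(X, params, L, keep_prob=1)
    return (A > 0.5).astype(int)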
Training with mini-batch
params, cost_sgd, itr = model(X_train, Y_train, learning_rate=0.01,
                              num_iter=500, hidden_size=[100, 1], keep_prob=1, optimizer='minibatch')
Y_train_pre = predict(X_train, params, 2)
print('train_accuracy------------', accuracy_score(Y_train_pre, Y_train))
Mini-batch output:
cost of iteration______100______0.35302967575683797
cost of iteration______200______0.472914548745098
cost of iteration______300______0.4884728238471557
cost of iteration______400______0.21551100063345618
train_accuracy------------ 0.8494208494208494
Training with the momentum optimizer
params, cost_momentum, itr = model(X_train, Y_train, learning_rate=0.01,
                                   num_iter=500, hidden_size=[100, 1], keep_prob=1, optimizer='momentum')
Y_train_pre = predict(X_train, params, 2)
print('train_accuracy------------', accuracy_score(Y_train_pre, Y_train))
Momentum optimizer output:
cost of iteration______100______0.36278494129038086
cost of iteration______200______0.4681552335189021
cost of iteration______300______0.382226159384529
cost of iteration______400______0.18219310793752702
train_accuracy------------ 0.8725868725868726
Training with RMSprop
params, cost_rms, itr = model(X_train, Y_train, learning_rate=0.01,
                              num_iter=500, hidden_size=[100, 1], keep_prob=1, optimizer='rmsprop')
Y_train_pre = predict(X_train, params, 2)
print('train_accuracy------------', accuracy_score(Y_train_pre, Y_train))
RMSprop output:
cost of iteration______100______0.2983858963793841
cost of iteration______200______0.004245700579927428
cost of iteration______300______0.2629426607580565
cost of iteration______400______0.31944824707807556
train_accuracy------------ 0.9613899613899614
Training with Adam
params, cost_adam, itr = model(X_train, Y_train, learning_rate=0.01,
                               num_iter=500, hidden_size=[100, 1], keep_prob=1, optimizer='adam')
Y_train_pre = predict(X_train, params, 2)
print('train_accuracy------------', accuracy_score(Y_train_pre, Y_train))
Adam output:
cost of iteration______100______0.3266223660473619
cost of iteration______200______0.08214547683157716
cost of iteration______300______0.0025645257286439583
cost of iteration______400______0.058015188756586206
train_accuracy------------ 0.9845559845559846
Do you see the difference between the accuracies? We used the same initialization parameters, the same learning rate, and the same number of iterations; only the optimizer changed. Look at the results!
Mini-batch accuracy : 0.8494208494208494
Momentum accuracy   : 0.8725868725868726
RMSprop accuracy    : 0.9613899613899614
Adam accuracy       : 0.9845559845559846
Graphical visualization of the models

If you want to dig deeper into the code, you can check the repository.
Conclusion

Source
As we have seen, the Adam optimizer gives good accuracy compared to the others. The figure above shows how the models learn over the iterations. Momentum gives SGD its speed, while RMSProp gives an exponential average of the weights for the updated parameters. The model above used little data, but the benefit of these optimizers grows when you work with large datasets and many iterations. We have discussed the basic idea behind optimizers, and I hope this motivates you to learn more about them and to use them!
Resources
- Coursera