Development Study Notes

gradient descent(3)

Batch Normalization
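Batch Normalization. Definition: a method that makes artificial neural networks train faster and more stably by normalizing each layer's input through re-centering and re-scaling. Motivation: internal covariate shift. Covariate shift: the input distribution of the current layer changes because the parameters of the previous layer change. Internal covariate shift: covariate shift occurs at every layer the signal passes through, so the input distribution shifts slightly each time; as the network gets deeper, small changes have a large effect on later layers. How to reduce covariate shift: whiten the layer's input (input mean: 0, variance: 1). Whitening backpropagatio..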
2021.10.27
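The re-centering and re-scaling step described in the Batch Normalization excerpt above can be sketched as follows. This is a minimal illustration, not the post's own code: the function name `normalize_batch` and the stability constant `eps` are assumptions, and it only standardizes each feature to mean 0 and variance 1 over the mini-batch.

```python
import math


def normalize_batch(batch, eps=1e-5):
    """Re-center and re-scale each feature of a mini-batch.

    batch: list of samples, each a list of floats of equal length.
    eps:   small constant added to the variance for numerical
           stability (an assumption, not taken from the excerpt).
    """
    n = len(batch)
    num_features = len(batch[0])
    normalized = [[0.0] * num_features for _ in range(n)]
    for j in range(num_features):
        column = [sample[j] for sample in batch]
        mean = sum(column) / n
        # Biased variance computed over the mini-batch.
        var = sum((v - mean) ** 2 for v in column) / n
        std = math.sqrt(var + eps)
        for i in range(n):
            normalized[i][j] = (column[i] - mean) / std
    return normalized


# Hypothetical mini-batch: 3 samples, 2 features on very different scales.
batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
out = normalize_batch(batch)
```

After normalization both features have mean 0 and variance close to 1, regardless of their original scale.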
Adam Optimizer
Adam Optimizer. Optimizer: finds the model parameters that minimize the value of the loss function; an optimization algorithm that helps the network learn quickly and accurately. Background: Batch Gradient Descent. The parameter $\theta$ of the objective function $f(\theta)$ is updated based on the gradient of $f$ with respect to $\theta$ over the entire training data: $g_t=\nabla_{\theta_{t-1}}f(\theta_{t-1})$, $\theta_t=\theta_{t-1}-\alpha g_t$, where $\alpha$ is the learning rate and $t$ is the $t$-th iteration. Main problems: getting trapped in a local minimum or saddle point of the objective function; poor convergence caused by learn..
2021.10.26
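The batch gradient descent update quoted in the Adam Optimizer excerpt above, $g_t=\nabla_{\theta_{t-1}}f(\theta_{t-1})$ followed by $\theta_t=\theta_{t-1}-\alpha g_t$, might look like this in code. The objective (mean squared distance to a small data set), the data values, and the names `batch_gradient_descent` and `grad_mse` are all assumptions chosen for illustration.

```python
def batch_gradient_descent(grad_f, theta0, alpha=0.1, num_steps=100):
    """Run theta_t = theta_{t-1} - alpha * g_t for a fixed number of steps."""
    theta = theta0
    for _ in range(num_steps):
        g_t = grad_f(theta)          # g_t = gradient of f at theta_{t-1}
        theta = theta - alpha * g_t  # theta_t = theta_{t-1} - alpha * g_t
    return theta


# Hypothetical training data; the minimizer of the objective is its mean (3.0).
data = [1.0, 2.0, 3.0, 6.0]


def grad_mse(theta):
    # Gradient of f(theta) = mean((theta - x)^2), using the entire data set
    # on every step, which is what makes this "batch" gradient descent.
    return sum(2.0 * (theta - x) for x in data) / len(data)


theta_hat = batch_gradient_descent(grad_mse, theta0=0.0)
```

This objective is convex, so the local-minimum and saddle-point problems the excerpt mentions do not appear here; the sketch only shows the update rule itself.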
Gradient Descent
Gradient Descent. A first-order optimization algorithm for finding an approximate solution. It computes the gradient (slope) of a function and keeps moving toward where the absolute value of the slope is lower, repeating until an extremum is reached. For a function $f(\mathbf{x})$ to be optimized, first choose a starting point $\mathbf{x}_{0}$. Given the current point $\mathbf{x}_{i}$, the next point $\mathbf{x}_{i+1}$ is computed as follows: $\mathbf{x}_{i+1}=\mathbf{x}_{i}-\gamma_{i}\nabla f(\mathbf{x}_{i})$, where $\gamma_{i}$ is a parameter that controls the distance moved. Whether this algorithm converges depends on the properties of $f$ and $\gam..
2021.10.19
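The iteration $\mathbf{x}_{i+1}=\mathbf{x}_{i}-\gamma_{i}\nabla f(\mathbf{x}_{i})$ from the Gradient Descent excerpt above can be sketched for a vector-valued $\mathbf{x}$ with a per-step step size $\gamma_i$. The stopping tolerance `tol`, the iteration cap `max_iter`, the constant step schedule, and the example quadratic are assumptions for illustration only.

```python
import math


def gradient_descent(grad_f, x0, gamma, tol=1e-8, max_iter=10_000):
    """Iterate x_{i+1} = x_i - gamma(i) * grad_f(x_i).

    gamma: function mapping the step index i to the step size gamma_i.
    Stops early once the gradient norm drops below tol, i.e. when the
    current point is (numerically) a stationary point.
    """
    x = list(x0)
    for i in range(max_iter):
        g = grad_f(x)
        if math.sqrt(sum(gk * gk for gk in g)) < tol:
            break
        step = gamma(i)
        x = [xk - step * gk for xk, gk in zip(x, g)]
    return x


def grad_quadratic(x):
    # Gradient of the hypothetical f(x, y) = (x - 1)^2 + 2 * (y + 3)^2,
    # whose unique minimum is at (1, -3).
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 3.0)]


x_min = gradient_descent(grad_quadratic, x0=[0.0, 0.0], gamma=lambda i: 0.1)
```

Passing `gamma` as a function keeps the index on $\gamma_i$ explicit; a constant schedule is used here only because it is the simplest choice that converges for this particular quadratic.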
