Softmax Classification

Softmax
Cross Entropy
Low-level Implementation
High-level Implementation
Training Example

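Discrete Probability Distribution

A discrete probability distribution is a probability distribution over a countable set of separate outcomes.
ex) the probability distribution of the number that comes up when rolling a die

Each discrete value has an exact probability assigned to it, and unlike a continuous probability distribution, the x values are separated from one another (integers, for example).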
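
As a small illustration of the die example, here is a minimal sketch that stores the distribution of a fair six-sided die as a tensor. The variable names (`faces`, `pmf`, `p_even`) are only illustrative, and the die is assumed to be fair.

```python
import torch

# Probability mass function of a fair six-sided die (fairness is an assumption).
faces = torch.arange(1, 7)           # possible outcomes: 1, 2, ..., 6
pmf = torch.full((6,), 1.0 / 6.0)    # each face gets probability 1/6

total = pmf.sum()                    # a valid distribution sums to 1
p_even = pmf[faces % 2 == 0].sum()   # probability of rolling 2, 4, or 6
print(total, p_even)
```

Because the outcomes are discrete, the probability of an event is just the sum of the probabilities of the individual outcomes it contains.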

Softmax

Convert numbers to probabilities with softmax

$${P(class = i) = \frac{e^{z_i}}{\sum_{j} e^{z_j}}}$$

PyTorch can compute the softmax values for us.

Unlike max, softmax picks out the maximum softly: instead of a hard one-hot choice, it returns probabilities that sum to 1.

import torch
import torch.nn.functional as F

z = torch.FloatTensor([1, 2, 3])
# hard max (one-hot of argmax): [0, 0, 1]
# softmax:                      [0.0900, 0.2447, 0.6652]

hypothesis = F.softmax(z, dim = 0)
print(hypothesis)

>>>
tensor([0.0900, 0.2447, 0.6652])

hypothesis.sum() ## the probabilities sum to 1

>>>
tensor(1.)
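
To see what `F.softmax` does internally, the sketch below recomputes it by hand: exponentiate every input, then divide by the sum of the exponentials. The second variant subtracts the maximum first, a common trick for numerical stability that leaves the result unchanged. The names `manual`, `stable`, and `builtin` are illustrative only.

```python
import torch
import torch.nn.functional as F

z = torch.FloatTensor([1, 2, 3])

# Direct version: exponentiate each input, then normalize by the sum.
exp_z = torch.exp(z)
manual = exp_z / exp_z.sum()

# Shifted version: subtracting the max avoids overflow for large inputs
# and gives the same probabilities.
shifted = torch.exp(z - z.max())
stable = shifted / shifted.sum()

builtin = F.softmax(z, dim=0)
print(manual, stable, builtin)
```

All three tensors should agree up to floating-point error.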

Cross Entropy

Cross entropy is a number that expresses how similar two given probability distributions are; the closer Q is to P, the smaller it is.

$${H(P, Q) = - \mathbb{E}_{X \sim P(x)}[\log Q(x)] = - \sum_{x \in X}P(x) \log Q(x)}$$

Training aims to minimize the cross entropy, which pushes the predicted distribution Q toward the true distribution P.
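
The following sketch evaluates the cross entropy of discrete distributions directly from its definition, the negative sum of P(x) log Q(x). The helper `cross_entropy` and the three example distributions are made up for illustration; they are not from the lecture.

```python
import torch

def cross_entropy(p, q):
    """H(P, Q) = -sum_x P(x) * log Q(x) for discrete distributions."""
    return -(p * torch.log(q)).sum()

# Made-up example distributions over three outcomes.
p = torch.tensor([0.7, 0.2, 0.1])        # reference distribution P
q_close = torch.tensor([0.6, 0.3, 0.1])  # similar to P
q_far = torch.tensor([0.1, 0.2, 0.7])    # very different from P

h_pp = cross_entropy(p, p)           # Q == P gives the entropy of P
h_close = cross_entropy(p, q_close)
h_far = cross_entropy(p, q_far)
print(h_pp, h_close, h_far)
```

The value is smallest when Q equals P and grows as Q moves away from P, which is why it works as a loss.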

Cross Entropy Loss (Low-level)

In multi-class classification, the cross entropy loss is computed as follows.

$${L = \frac{1}{N} \sum - y\log(\hat{y})}$$

y : the true label (one-hot)
$\hat{y}$ : the predicted probability

z = torch.rand(3, 5, requires_grad = True)
hypothesis = F.softmax(z, dim = 1) # softmax across each row (dim = 1), i.e. y hat
print(hypothesis)

>>>
tensor([[0.2645, 0.1639, 0.1855, 0.2585, 0.1277],
        [0.2430, 0.1624, 0.2322, 0.1930, 0.1694],
        [0.2226, 0.1986, 0.2326, 0.1594, 0.1868]], grad_fn=<SoftmaxBackward>)

y = torch.randint(5, (3,)).long() # 3 random class indices, each in [0, 5)
print(y)

>>>
tensor([0, 2, 1])

y_one_hot = torch.zeros_like(hypothesis)
y_one_hot.scatter_(1, y.unsqueeze(1), 1) # along dim = 1, write 1 at each row's class index
# the trailing _ means the operation is in-place

>>>
tensor([[1., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 1., 0., 0., 0.]])

## cross entropy loss
cost = (y_one_hot * -torch.log(hypothesis)).sum(dim = 1).mean()
print(cost)

>>>
tensor(1.4689, grad_fn=<MeanBackward0>)
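
Since the one-hot target is zero everywhere except at the true class, the loss only ever uses the predicted probability of the true class. The sketch below shows that picking that probability with `gather` gives the same cost as the one-hot multiplication. The seed and the names `p_true`, `cost_one_hot`, and `cost_gather` are arbitrary choices for this example.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed, only so the run is repeatable
z = torch.rand(3, 5, requires_grad=True)
y = torch.randint(5, (3,))

hypothesis = F.softmax(z, dim=1)

# One-hot version: multiply by the one-hot target and sum over classes.
y_one_hot = torch.zeros_like(hypothesis)
y_one_hot.scatter_(1, y.unsqueeze(1), 1)
cost_one_hot = (y_one_hot * -torch.log(hypothesis)).sum(dim=1).mean()

# Gather version: select the probability of the true class in each row.
p_true = hypothesis.gather(1, y.unsqueeze(1)).squeeze(1)
cost_gather = -torch.log(p_true).mean()

print(cost_one_hot, cost_gather)
```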

Cross-entropy Loss with torch.nn.functional

## low level
F.nll_loss(F.log_softmax(z, dim = 1), y) # negative log likelihood loss
## == (y_one_hot * -torch.log(F.softmax(z, dim = 1))).sum(dim = 1).mean()
## == (y_one_hot * -F.log_softmax(z, dim = 1)).sum(dim = 1).mean()

# high level
F.cross_entropy(z, y)
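
As a quick check that these forms agree, the sketch below computes the loss three ways on the same random input and compares the results. The seed is arbitrary, and `F.one_hot` is used here as a shorter alternative to the `scatter_` construction.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(1)  # arbitrary seed, only so the run is repeatable
z = torch.rand(3, 5, requires_grad=True)
y = torch.randint(5, (3,))

y_one_hot = F.one_hot(y, num_classes=5).float()

# 1) fully manual: one-hot target times -log(softmax)
manual = (y_one_hot * -torch.log(F.softmax(z, dim=1))).sum(dim=1).mean()
# 2) low level: log_softmax followed by negative log likelihood
low_level = F.nll_loss(F.log_softmax(z, dim=1), y)
# 3) high level: cross_entropy takes raw scores and applies log_softmax itself
high_level = F.cross_entropy(z, y)

print(manual.item(), low_level.item(), high_level.item())
```

Note that `F.cross_entropy` expects the raw scores `z`, not the output of softmax; passing probabilities would apply softmax twice.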


[Deep Learning Basics with PyTorch] 06_Softmax Classification