[파이토치로 시작하는 딥러닝 기초]10.4_Advance CNN(VGG)

2021. 1. 7. 09:30·AI Study/DL_Basic
반응형

이번 강의에 VGG의 이론적인 설명은 많이 들어있지 않았다.(모두의 딥러닝 시즌 1에 있는 내용이라 했지만, 사실 충분치 않은 내용이었다)

별도로 공부해서 위해 구글링을 한 후 포스팅 해서 아래에 업로드 할 예정이다. 

10.5 Advance CNN

VGG- net이란?

  • 전부 3x3 convolution, stride = 1, padding 1으로만 구성되어 있음

torchvision.models.vgg

  • vgg11 ~ vgg19까지 만들 수 있도록 되어있음
  • 3x224x224입력을 기준으로 만들도록 되어있음
  • input size가 다른 경우 VGG를 적용하려면 어떻게 해야할까?

VGG net 실습 - documentation 그대로 따라써보기

import torch.nn as nn
import torch.utils.model_zoo as model_zoo
__all__ = [
    'VGG', 'vgg11', 'vgg11_bn', 'vgg13', 'vgg13_bn', 'vgg16', 'vgg16_bn',
    'vgg19_bn', 'vgg19',
]


model_urls = {
    'vgg11': 'https://download.pytorch.org/models/vgg11-bbd30ac9.pth',
    'vgg13': 'https://download.pytorch.org/models/vgg13-c768596a.pth',
    'vgg16': 'https://download.pytorch.org/models/vgg16-397923af.pth',
    'vgg19': 'https://download.pytorch.org/models/vgg19-dcbb9e9d.pth',
    'vgg11_bn': 'https://download.pytorch.org/models/vgg11_bn-6002323d.pth',
    'vgg13_bn': 'https://download.pytorch.org/models/vgg13_bn-abd245e5.pth',
    'vgg16_bn': 'https://download.pytorch.org/models/vgg16_bn-6c64b313.pth',
    'vgg19_bn': 'https://download.pytorch.org/models/vgg19_bn-c79401a0.pth',
}
class VGG(nn.Module):
    def __init__(self, features, num_classes=1000, init_weights=True):
        super(VGG, self).__init__()
        
        self.features = features #convolution layer 구조(추후 make_layer로 만들어지는 nn.Sequential)
        
        self.avgpool = nn.AdaptiveAvgPool2d((7, 7)) ## average pooling
        
        self.classifier = nn.Sequential(
            nn.Linear(512 * 7 * 7, 4096),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(4096, num_classes),
        ) # FC layer
        
        if init_weights:
            self._initialize_weights()

    def forward(self, x):
        x = self.features(x) # Convolution 
        x = self.avgpool(x) # avgpool
        x = x.view(x.size(0), -1) 
        x = self.classifier(x) #FC layer
        return x

    def _initialize_weights(self): ## vgg는 bias를 0으로 한다. 
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight,
                                        mode='fan_out',
                                        nonlinearity='relu') ## 초기화 방법
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0) # bias 값을 0으로
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)
def make_layers(cfg, batch_norm=False):
    layers = []
    in_channels = 3
    
    for v in cfg:
        if v == 'M': ## 'M'인 경우는 Max Pooling
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        else: ## 'M'이 아닌 경우 Convlution layer 추가
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            if batch_norm:
                layers += [conv2d, nn.BatchNorm2d(v), nn.ReLU(inplace=True)]
            else:
                layers += [conv2d, nn.ReLU(inplace=True)] ## layer 쌓기
            in_channels = v ## input size 최종
                     
    return nn.Sequential(*layers) ## Sequential 시키기

example) [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'] 로 VGGnet을 구성해본다면, 

conv = make_layers(cfg['custom'], batch_norm=True)

conv

>>>

Sequential(
  (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))     ## layer 1 : 3 -> 64
  (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (2): ReLU(inplace=True)
  (3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))    ## layer 2 : 64 -> 64
  (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (5): ReLU(inplace=True)
  (6): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))    ## layer 3 : 64 -> 64
  (7): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (8): ReLU(inplace=True)
  (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)  ## maxpooling
  (10): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))  ## layer 4 : 64 -> 128
  (11): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (12): ReLU(inplace=True)
  (13): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) ## layer 5 : 128 -> 128
  (14): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (15): ReLU(inplace=True)
  (16): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) ## layer 6 : 128 -> 128
  (17): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (18): ReLU(inplace=True)
  (19): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) ## maxpooling
  (20): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) ## layer 7 : 128 -> 256
  (21): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (22): ReLU(inplace=True)
  (23): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) ## layer 8 : 256 -> 256
  (24): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (25): ReLU(inplace=True)
  (26): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) ## layer 9 : 256 -> 256
  (27): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (28): ReLU(inplace=True)
  (29): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) ## maxpooling
)
cfg = {
    'A': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'], # 8 + 3 =11 == vgg11
    'B': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'], # 10 + 3 = vgg 13
    'D': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'], #13 + 3 = vgg 16
    'E': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'], # 16 +3 =vgg 19
    'custom' : [64,64,64,'M',128,128,128,'M',256,256,256,'M']
}
CNN = VGG(make_layers(cfg['custom']), num_classes = 10, init_weights = True)
CNN

>>>

VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (5): ReLU(inplace=True)
    (6): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (7): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (10): ReLU(inplace=True)
    (11): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (12): ReLU(inplace=True)
    (13): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (14): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (17): ReLU(inplace=True)
    (18): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (19): ReLU(inplace=True)
    (20): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
  (classifier): Sequential(
    (0): Linear(in_features=25088, out_features=4096, bias=True)
    (1): ReLU(inplace=True)
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU(inplace=True)
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=4096, out_features=10, bias=True)
  )
)

VGG for cifar10

  • VGG net에 cifar10 적용해보기
import torch
import torch.nn as nn

import torch.optim as optim

import torchvision
import torchvision.transforms as transforms
## 사전에 visdom server를 열어줘야 함
# !python -m visdom.server

import visdom

vis = visdom.Visdom()
vis.close(env="main")

 

def loss_tracker(loss_plot, loss_value, num):
    '''num, loss_value, are Tensor'''
    vis.line(X=num,
             Y=loss_value,
             win = loss_plot,
             update='append'
             )
device = 'cuda' if torch.cuda.is_available() else 'cpu'

torch.manual_seed(777)
if device =='cuda':
    torch.cuda.manual_seed_all(77
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./cifar10', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=512,
                                          shuffle=True, num_workers=0)

testset = torchvision.datasets.CIFAR10(root='./cifar10', train=False,
                                       download=True, transform=transform)

testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=0)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
# functions to show an image


def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()


# get some random training images
dataiter = iter(trainloader)
images, labels = dataiter.next()
vis.images(images/2 + 0.5)

# show images
#imshow(torchvision.utils.make_grid(images))

# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))

>>>
truck   dog horse truck

make VGG16 using vgg.py

# from py import vgg
import torchvision.models.vgg as vgg
cfg = [32,32,'M', 64,64,128,128,128,'M',256,256,256,512,512,512,'M'] #13 + 3 =vgg16
class VGG(nn.Module):

    def __init__(self, features, num_classes=1000, init_weights=True):
        super(VGG, self).__init__()
        self.features = features
        #self.avgpool = nn.AdaptiveAvgPool2d((7, 7))
        self.classifier = nn.Sequential(
            nn.Linear(512 * 4 * 4, 4096),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(4096, num_classes),
        )
        if init_weights:
            self._initialize_weights()

    def forward(self, x):
        x = self.features(x)
        #x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)
## 모델 생성
vgg16= VGG(vgg.make_layers(cfg),10,True).to(device)
criterion = nn.CrossEntropyLoss().to(device)
optimizer = torch.optim.SGD(vgg16.parameters(), lr = 0.005,momentum=0.9)

lr_sche = optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.9)

Make plot

loss_plt = vis.line(Y=torch.Tensor(1).zero_(),
                    opts=dict(title='loss_tracker', 
                              legend=['loss'], 
                              showlegend=True)
                   )

training

 

print(len(trainloader))
epochs = 20

for epoch in range(epochs):  # loop over the dataset multiple times
    running_loss = 0.0
    lr_sche.step()
    for i, data in enumerate(trainloader, 0):
        # get the inputs
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = vgg16(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 30 == 29:    # print every 30 mini-batches
            loss_tracker(loss_plt,
                         torch.Tensor([running_loss/30]),
                         torch.Tensor([i + epoch*len(trainloader) ]))
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 30))
            running_loss = 0.0
        

print('Finished Training')

>>>

[1,    30] loss: 2.302
[1,    60] loss: 2.299
[1,    90] loss: 2.286
[2,    30] loss: 2.210
[2,    60] loss: 2.136
[2,    90] loss: 2.077
...
[13,    90] loss: 0.941
[14,    30] loss: 0.905
[14,    60] loss: 0.907
[14,    90] loss: 0.907

dataiter = iter(testloader)
images, labels = dataiter.next()

# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))

outputs = vgg16(images.to(device))
_, predicted = torch.max(outputs, 1)

print('Predicted: ', ' '.join('%5s' % classes[predicted[j]]
                              for j in range(4)))
correct = 0
total = 0

with torch.no_grad():
    for data in testloader:
        images, labels = data
        images = images.to(device)
        labels = labels.to(device)
        outputs = vgg16(images)
        
        _, predicted = torch.max(outputs.data, 1)
        
        total += labels.size(0)
        
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))
    
>>>
    
Accuracy of the network on the 10000 test images: 71 %

시간이 오래걸려 epoch을 많이 사용하지 않았더나 성능이 높지는 않았다. 

반응형

'AI Study > DL_Basic' 카테고리의 다른 글

[Deep Learning] Activation function(활성화함수)이란?  (0) 2021.04.22
[딥러닝] CNN 구조 - VGG  (0) 2021.01.08
[파이토치로 시작하는 딥러닝 기초]10.3 ImageFolder / 모델 저장 / 모델 불러오기  (0) 2021.01.04
[파이토치로 시작하는 딥러닝 기초]10.2_visdom  (0) 2020.12.31
[파이토치로 시작하는 딥러닝 기초]10.1_Convolutional Neural Network  (0) 2020.12.30
'AI Study/DL_Basic' 카테고리의 다른 글
  • [Deep Learning] Activation function(활성화함수)이란?
  • [딥러닝] CNN 구조 - VGG
  • [파이토치로 시작하는 딥러닝 기초]10.3 ImageFolder / 모델 저장 / 모델 불러오기
  • [파이토치로 시작하는 딥러닝 기초]10.2_visdom
자동화먹
자동화먹
많은 사람들에게 도움이 되는 생산적인 기록하기
    반응형
  • 자동화먹
    자동화먹의 생산적인 기록
    자동화먹
  • 전체
    오늘
    어제
    • 분류 전체보기 (144)
      • 생산성 & 자동화 툴 (30)
        • Notion (24)
        • Obsidian (0)
        • Make.com (1)
        • tips (5)
      • Programming (37)
        • Python (18)
        • Oracle (6)
        • Git (13)
      • AI Study (65)
        • DL_Basic (14)
        • ML_Basic (14)
        • NLP (21)
        • Marketing&Recommend (4)
        • chatGPT (0)
        • etc (12)
      • 주인장의 생각서랍 (10)
        • 생각정리 (4)
        • 독서기록 (6)
  • 블로그 메뉴

    • 홈
    • 태그
    • 방명록
  • 링크

  • 공지사항

  • 인기 글

  • 태그

    자연어처리
    dl
    GPT
    딥러닝
    nlp
    노션첫걸음
    seq2seq
    Python
    Transformer
    빅데이터분석
    Jupyter notebook
    머신러닝
    pytorch
    LSTM
    데이터베이스
    gcp
    기초
    python기초
    파이토치
    Google Cloud Platform
    notion
    git
    파이토치로 시작하는 딥러닝 기초
    cnn
    빅데이터
    git commit
    노션
    데이터분석
    Github
    ML
  • 최근 댓글

  • 최근 글

  • hELLO· Designed By정상우.v4.10.3
자동화먹
[파이토치로 시작하는 딥러닝 기초]10.4_Advance CNN(VGG)
상단으로

티스토리툴바