Source

Paper: Deep Residual Learning for Image Recognition

Authors: Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

ImageNet top-5 error rate: 3.57% (ensemble, ImageNet test set)

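Main idea

The core idea is in the name: residual. Rather than learning an absolute mapping, the network learns a difference. A block does not have to learn the complete mapping H(x) from scratch; it only learns the deviation of that mapping from its input, F(x) = H(x) - x, i.e., the difference from the identity mapping. Turning an absolute target into a relative one makes the problem much easier. In the forward pass, the residual is easier to fit; in the backward pass, the shortcut gives gradients a direct path, which makes training easier and mitigates vanishing gradients.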
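The "direct path for gradients" claim can be illustrated with a toy scalar model. The sketch below is an illustration, not the paper's analysis: it assumes every layer is a scalar linear map F(x) = w * x with a small, hypothetical weight w = 0.01. A plain layer then has local derivative w, while a residual layer y = x + F(x) has local derivative 1 + w, and the end-to-end gradient is the product of the local derivatives.

```python
# Toy model (an assumption for illustration): each layer is F(x) = w * x.

def plain_gradient(w: float, depth: int) -> float:
    """d(output)/d(input) for `depth` stacked plain layers y = w * x."""
    grad = 1.0
    for _ in range(depth):
        grad *= w  # local derivative of a plain layer
    return grad


def residual_gradient(w: float, depth: int) -> float:
    """d(output)/d(input) for `depth` stacked residual layers y = x + w * x."""
    grad = 1.0
    for _ in range(depth):
        grad *= 1.0 + w  # the identity shortcut contributes the constant 1
    return grad


# With 50 layers the plain gradient collapses toward zero,
# while the residual gradient stays on the order of 1.
print(plain_gradient(0.01, 50))
print(residual_gradient(0.01, 50))
```

Real networks are not scalar and not linear, but the same term appears there: the identity shortcut adds 1 to each block's local derivative, so the product cannot shrink solely because F has small derivatives.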

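Residual block (BasicBlock as an example)

In a plain network, the input X_l passes directly through two convolutional layers to produce the output X_{l+1}. A residual block instead adds the block's input X_l to the output of the two convolutional layers; this extra path is often called a skip connection.

A skip connection does not always add X_l to the convolution result unchanged. In some cases X_l and X_{l+1} have different dimensions, so a 1x1 convolution is applied to the input to project it to the dimensions of X_{l+1}; only then can the two be added.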
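At a single spatial position, a 1x1 convolution is just a matrix-vector product over the channel axis, which is why it can fix a channel mismatch. The following is a minimal pure-Python sketch of that idea; the helper names, the 2-channel input, the 4-channel branch output, and the weight matrix are all made up for illustration.

```python
def conv1x1_pixel(weights, pixel):
    """Apply a 1x1 convolution at one spatial position.

    `weights` has one row per output channel; `pixel` is the list of
    input-channel values at that position.
    """
    return [sum(w * c for w, c in zip(row, pixel)) for row in weights]


def residual_add(shortcut, residual):
    """Element-wise addition; both operands must have the same channel count."""
    if len(shortcut) != len(residual):
        raise ValueError("channel mismatch: cannot add shortcut and residual")
    return [s + r for s, r in zip(shortcut, residual)]


x = [1.0, 2.0]                # input pixel with 2 channels
f_x = [0.5, -1.0, 0.0, 2.0]   # hypothetical conv-branch output with 4 channels
proj = [                      # hypothetical 4x2 weights of the 1x1 convolution
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [1.0, -1.0],
]

shortcut = conv1x1_pixel(proj, x)  # [1.0, 2.0, 3.0, -1.0]
out = residual_add(shortcut, f_x)  # [1.5, 1.0, 3.0, 1.0]
print(out)
```

Calling residual_add(x, f_x) directly raises ValueError, which is the situation the projection shortcut exists to avoid.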

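Network structure

On the left is BasicBlock; ResNet18 and ResNet34 are built by stacking it.
On the right is BottleNeck, which has one more layer and uses 1x1 convolutions to first reduce and then restore the number of channels: reduce the dimension, apply the 3x3 convolution, then raise the dimension again. This greatly reduces computation and is intended for deeper networks; ResNet-50 and deeper (ResNet50, ResNet101, ResNet152) are built by stacking this block. When the spatial resolution has to be reduced, the 3x3 convolution uses stride 2. The shortcut next to it then also needs a 1x1 convolution with stride 2 instead of the plain identity, so that the sizes agree at the addition; input and output dimensions can differ between stages even though the block structure is the same. Otherwise the convolutions use stride 1.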
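Why the shortcut needs its own stride-2 convolution follows from the standard convolution output-size formula. The sketch below assumes a 56x56 feature map as an example, padding 1 for the 3x3 convolution, and no padding for the 1x1 convolution; the function name is illustrative.

```python
def conv_out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


in_size = 56  # example input resolution (assumed for illustration)

# Main branch: 3x3 convolution with stride 2 and padding 1
main = conv_out_size(in_size, kernel=3, stride=2, padding=1)       # 28

# Projection shortcut: 1x1 convolution with stride 2 and no padding
shortcut = conv_out_size(in_size, kernel=1, stride=2, padding=0)   # 28

# A plain identity shortcut would keep the input resolution
identity = in_size                                                 # 56

print(main, shortcut, identity)
```

The main branch and the strided shortcut both produce 28x28 maps and can be added, whereas the untouched 56x56 identity could not.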

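Using 1x1 convolutions compresses sparse information across channels and makes better use of the available compute, which is why the block is more efficient.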
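The saving can be made concrete by counting convolution weights. The comparison below is a rough sketch: it uses 256 input and output channels with a 64-channel bottleneck, ignores biases and batch-norm parameters, and the non-bottleneck baseline (two 3x3 convolutions at 256 channels) is a hypothetical reference point rather than a block from the paper.

```python
def conv_weights(kernel: int, in_ch: int, out_ch: int) -> int:
    """Number of weights in a kernel x kernel convolution (no bias)."""
    return kernel * kernel * in_ch * out_ch


# Hypothetical baseline: two 3x3 convolutions operating directly on 256 channels
two_3x3 = 2 * conv_weights(3, 256, 256)

# Bottleneck: 1x1 (256 -> 64), 3x3 (64 -> 64), 1x1 (64 -> 256)
bottleneck = (
    conv_weights(1, 256, 64)
    + conv_weights(3, 64, 64)
    + conv_weights(1, 64, 256)
)

print(two_3x3)     # 1179648
print(bottleneck)  # 69632
print(round(two_3x3 / bottleneck, 1))
```

Under these assumptions the bottleneck needs roughly 17 times fewer weights, and since every weight is applied at every spatial position, the multiply-accumulate count shrinks by the same factor.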

Code implementation

BasicBlock code

from tensorflow import keras
from tensorflow.keras.layers import Activation, BatchNormalization, Conv2D

def res_block_v1(x, input_filter, output_filter):
    # Residual branch: conv -> BN -> ReLU -> conv -> BN
    res_x = Conv2D(kernel_size=(3,3), filters=output_filter, strides=1, padding='same')(x)
    res_x = BatchNormalization()(res_x)
    res_x = Activation('relu')(res_x)
    res_x = Conv2D(kernel_size=(3,3), filters=output_filter, strides=1, padding='same')(res_x)
    res_x = BatchNormalization()(res_x)
    if input_filter == output_filter:
        identity = x
    else:  # channel counts differ: project the input with a 1x1 convolution
        identity = Conv2D(kernel_size=(1,1), filters=output_filter, strides=1, padding='same')(x)
    # Add the shortcut to the residual branch, then apply the final ReLU
    x = keras.layers.add([identity, res_x])
    output = Activation('relu')(x)
    return output

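BottleNeck code

The PyTorch code follows. Note that in the figure above, to reduce computation, the authors shrink the 256-dimensional input by a factor of 4 to 64 channels before the 3x3 convolution and then raise it back to 256 channels; this corresponds to the expansion parameter in the code: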

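import torch.nn as nn

# Helper constructors used by the block below (as defined in torchvision's resnet.py)
def conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=dilation, groups=groups, bias=False, dilation=dilation)

def conv1x1(in_planes, out_planes, stride=1):
    """1x1 convolution"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride, bias=False)

class Bottleneck(nn.Module):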
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
                 base_width=64, dilation=1, norm_layer=None):
        super(Bottleneck, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        width = int(planes * (base_width / 64.)) * groups
        # Both self.conv2 and self.downsample layers downsample the input when stride != 1
        self.conv1 = conv1x1(inplanes, width)
        self.bn1 = norm_layer(width)
        self.conv2 = conv3x3(width, width, stride, groups, dilation)
        self.bn2 = norm_layer(width)
        self.conv3 = conv1x1(width, planes * self.expansion)
        self.bn3 = norm_layer(planes * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)  # downsampling, if any, happens here: conv2 is then a stride-2 convolution
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

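        if self.downsample is not None:  # the shortcut must raise channels and/or reduce resolution
            identity = self.downsample(x)  # in practice a 1x1 convolution (stride 2 when downsampling) followed by BN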

        out += identity  # add the shortcut to the residual branch
        out = self.relu(out)

        return out
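The channel arithmetic in the constructor can be checked in isolation. The helper below is a sketch that only mirrors the `width` and `planes * self.expansion` expressions from the class above; the function name is made up, and the example settings are, to the best of my knowledge, the ones torchvision uses for its ResNeXt-50 32x4d (groups=32, base_width=4) and Wide ResNet-50-2 (base_width=128) variants.

```python
def bottleneck_channels(planes, base_width=64, groups=1, expansion=4):
    """Return (inner width, output channels) of a Bottleneck block.

    Mirrors the expressions in Bottleneck.__init__:
      width        = int(planes * (base_width / 64.)) * groups
      out_channels = planes * expansion
    """
    width = int(planes * (base_width / 64.0)) * groups
    return width, planes * expansion


# Standard ResNet-50 first stage: 64 inner channels, 256 output channels
print(bottleneck_channels(64))                            # (64, 256)

# Grouped variant (assumed ResNeXt-style setting): wider inner layer, same output
print(bottleneck_channels(64, base_width=4, groups=32))   # (128, 256)

# Wide variant (assumed Wide-ResNet-style setting): doubled inner width
print(bottleneck_channels(64, base_width=128))            # (128, 256)
```

In all three cases the block's output has planes * 4 channels, which is why the shortcut needs a projection whenever the incoming channel count differs from planes * expansion.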

Usage in PyTorch

Using ResNet in PyTorch takes only 4 lines of code:

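from torch import nn
# torchvision is PyTorch's package for computer vision
import torchvision

# pretrained: load a model pretrained on the ImageNet dataset
# (torchvision 0.13+ deprecates this flag in favor of weights=torchvision.models.ResNet18_Weights.DEFAULT)
model = torchvision.models.resnet18(pretrained=True)
# Replace the fully connected layer so that it outputs the number of classes you need, here 10.
# The pretrained model outputs 1000 classes, so the fully connected layer has to be replaced.
# Without pretrained weights, you can pass num_classes=10 when creating the model instead.
model.fc = nn.Linear(model.fc.in_features, 10)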

Reference 1: https://zhuanlan.zhihu.com/p/104657484

Reference 2: https://zhuanlan.zhihu.com/p/74230238

Reference 3: https://zhuanlan.zhihu.com/p/32781577
