The ResNet Network

article/2025/8/22 22:55:54

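1.1. The Introduction of ResNet

The residual network (ResNet) is a convolutional neural network proposed by four researchers at Microsoft Research. It won the image classification and object recognition tasks at the 2015 ImageNet Large Scale Visual Recognition Challenge (ILSVRC).

The network comes from the paper "Deep Residual Learning for Image Recognition".

Structures of resnet18, resnet34, resnet50, resnet101, and resnet152

I often see people work through a network by hand and rather envy them, so I decided to do the same.

Let's take resnet18 as the example.

The figure above is a simplified diagram. For classification tasks, the input image is typically 224*224.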

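Output size formula for convolution and pooling (the division is rounded down):

o_w = floor((i_w + 2*p - k) / s) + 1

o_h = floor((i_h + 2*p - k) / s) + 1

where:

o_w, o_h: width and height of the output;

i_w, i_h: width and height of the input;

k: kernel size of the convolution or pooling;

p: padding size;

s: stride.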

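Labels in the figure above:

1. conv1 actually stands for the Convolution, BatchNorm, and ReLU operations; only the convolution affects the size.

Output size = floor((224 + 2 * 3 - 7) / 2) + 1 = 112

2. maxpool

Output size = floor((112 + 2 * 1 - 3) / 2) + 1 = 56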
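The two calculations above can be checked with a small sketch. The helper name `conv_output_size` is hypothetical (it is not part of PyTorch); it only applies the formula with integer floor division. The 3x3, stride-2, padding-1 settings used for the later stages are taken from the conv3x3 helper in the official code further down.

```python
def conv_output_size(in_size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output width/height of a conv or pooling layer (floor division)."""
    return (in_size + 2 * padding - kernel) // stride + 1


# Stem of ResNet: 7x7 conv (stride 2, padding 3), then 3x3 max pool (stride 2, padding 1).
conv1_out = conv_output_size(224, kernel=7, stride=2, padding=3)
maxpool_out = conv_output_size(conv1_out, kernel=3, stride=2, padding=1)

# layer1 keeps the size (stride 1); layer2..layer4 each start with a
# 3x3 conv of stride 2 and padding 1, halving the size every time.
stage_sizes = [maxpool_out]
for _ in range(3):
    stage_sizes.append(conv_output_size(stage_sizes[-1], kernel=3, stride=2, padding=1))

print(conv1_out, maxpool_out, stage_sizes)
```

Under these assumptions, a 224*224 input shrinks to 7*7 by the end of layer4, which is why a global average pool follows.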

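**The BasicBlock class (compare resnet18 and resnet34 in the structure diagram) sets expansion = 1. This value is the ratio of the output channels of the block's last convolution to the output channels of its first convolution, that is:**

 expansion = last_block_channel / first_block_channel

Next comes the ResNet class. Like most models we define, it consists of an __init__() plus a forward(). The code is a bit long, so let's go through it step by step: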

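- As the structure diagram shows, the first conv layer is the same in every resnet, with output channel=64.
- Then comes self.layer1 = self._make_layer(block, 64, layers[0]), where layers[0]=2. Inside the _make_layer function, stride=1 and the current input channel count equals the output channel count of the previous block (true for BasicBlock, where expansion=1), so the two can be added directly.
- self.layer2 = self._make_layer(block, 128, layers[1], stride=2): here planes=128 while self.inplanes=64 (the output channel of the previous box_block). The channels do not match, so x must be expanded to more channels before the addition, and this is exactly what downsample does (ps: only the first block in each box_block needs downsample. Why? See Figure 4).
- self.layer3 = self._make_layer(block, 256, layers[2], stride=2): here planes=256 while self.inplanes=128, so the channels must again be expanded before the addition. layer4 works the same way.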
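The walk-through above can be traced with a small sketch that mirrors only the downsample condition in _make_layer (`stride != 1 or self.inplanes != planes * block.expansion`). The function `trace_stages` and its tuple layout are hypothetical names introduced for illustration; no real layers are built.

```python
def trace_stages(expansion, stage_planes=(64, 128, 256, 512),
                 stage_strides=(1, 2, 2, 2), stem_channels=64):
    """Return (inplanes, out_channels, stride, needs_downsample) for the
    first block of each stage, following the check in _make_layer."""
    inplanes = stem_channels  # output channels of conv1
    report = []
    for planes, stride in zip(stage_planes, stage_strides):
        out_channels = planes * expansion
        needs_downsample = stride != 1 or inplanes != out_channels
        report.append((inplanes, out_channels, stride, needs_downsample))
        inplanes = out_channels  # later blocks in the stage see this width
    return report


basic = trace_stages(expansion=1)       # BasicBlock: resnet18 / resnet34
bottleneck = trace_stages(expansion=4)  # Bottleneck: resnet50 and deeper

for name, rows in (("BasicBlock", basic), ("Bottleneck", bottleneck)):
    for i, row in enumerate(rows, start=1):
        print(name, f"layer{i}", row)
```

With BasicBlock, layer1 needs no downsample while layer2 to layer4 do. With Bottleneck (expansion=4), even layer1 needs one, because 64 input channels do not match the 256 output channels. After the first block of a stage, the remaining blocks use stride 1 with matching channels, so they take the identity shortcut.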


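1.2. Characteristics of ResNet

ResNet is easy to optimize and can gain accuracy from considerably increased depth. Its residual blocks use skip connections, which alleviate the vanishing-gradient problem that comes with adding depth to a deep neural network.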
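A short derivation may help show why the skip connection eases gradient flow. This is a simplified scalar sketch, not taken from the original post: $F$ denotes the residual branch (the stacked conv/BN/ReLU layers), $x$ and $y$ the block input and output, and $\mathcal{L}$ the loss; the final ReLU after the addition is ignored for simplicity.

```latex
% Residual block (final ReLU omitted): output = input + residual branch
y = x + F(x)

% Chain rule through the block
\frac{\partial \mathcal{L}}{\partial x}
  = \frac{\partial \mathcal{L}}{\partial y}\,\frac{\partial y}{\partial x}
  = \frac{\partial \mathcal{L}}{\partial y}\left(1 + \frac{\partial F}{\partial x}\right)
```

The constant term 1 comes from the identity shortcut, so part of the upstream gradient reaches $x$ unchanged even when $\partial F / \partial x$ is very small.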

The official PyTorch (torchvision) ResNet implementation:

import torch
import torch.nn as nn
from .utils import load_state_dict_from_url


__all__ = ['ResNet', 'resnet18', 'resnet34', 'resnet50', 'resnet101',
           'resnet152', 'resnext50_32x4d', 'resnext101_32x8d',
           'wide_resnet50_2', 'wide_resnet101_2']


model_urls = {
    'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
    'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth',
    'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
    'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
    'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
    'resnext50_32x4d': 'https://download.pytorch.org/models/resnext50_32x4d-7cdf4587.pth',
    'resnext101_32x8d': 'https://download.pytorch.org/models/resnext101_32x8d-8ba56ff5.pth',
    'wide_resnet50_2': 'https://download.pytorch.org/models/wide_resnet50_2-95faca4d.pth',
    'wide_resnet101_2': 'https://download.pytorch.org/models/wide_resnet101_2-32ee1156.pth',
}


def conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=dilation, groups=groups, bias=False, dilation=dilation)


def conv1x1(in_planes, out_planes, stride=1):
    """1x1 convolution"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride, bias=False)


class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
                 base_width=64, dilation=1, norm_layer=None):
        super(BasicBlock, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        if groups != 1 or base_width != 64:
            raise ValueError('BasicBlock only supports groups=1 and base_width=64')
        if dilation > 1:
            raise NotImplementedError("Dilation > 1 not supported in BasicBlock")
        # Both self.conv1 and self.downsample layers downsample the input when stride != 1
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = norm_layer(planes)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = norm_layer(planes)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out


class Bottleneck(nn.Module):
    # Bottleneck in torchvision places the stride for downsampling at 3x3 convolution(self.conv2)
    # while original implementation places the stride at the first 1x1 convolution(self.conv1)
    # according to "Deep residual learning for image recognition"https://arxiv.org/abs/1512.03385.
    # This variant is also known as ResNet V1.5 and improves accuracy according to
    # https://ngc.nvidia.com/catalog/model-scripts/nvidia:resnet_50_v1_5_for_pytorch.

    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
                 base_width=64, dilation=1, norm_layer=None):
        super(Bottleneck, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        width = int(planes * (base_width / 64.)) * groups
        # Both self.conv2 and self.downsample layers downsample the input when stride != 1
        self.conv1 = conv1x1(inplanes, width)
        self.bn1 = norm_layer(width)
        self.conv2 = conv3x3(width, width, stride, groups, dilation)
        self.bn2 = norm_layer(width)
        self.conv3 = conv1x1(width, planes * self.expansion)
        self.bn3 = norm_layer(planes * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out


class ResNet(nn.Module):

    def __init__(self, block, layers, num_classes=1000, zero_init_residual=False,
                 groups=1, width_per_group=64, replace_stride_with_dilation=None,
                 norm_layer=None):
        super(ResNet, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        self._norm_layer = norm_layer

        self.inplanes = 64
        self.dilation = 1
        if replace_stride_with_dilation is None:
            # each element in the tuple indicates if we should replace
            # the 2x2 stride with a dilated convolution instead
            replace_stride_with_dilation = [False, False, False]
        if len(replace_stride_with_dilation) != 3:
            raise ValueError("replace_stride_with_dilation should be None "
                             "or a 3-element tuple, got {}".format(replace_stride_with_dilation))
        self.groups = groups
        self.base_width = width_per_group
        self.conv1 = nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.bn1 = norm_layer(self.inplanes)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2,
                                       dilate=replace_stride_with_dilation[0])
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2,
                                       dilate=replace_stride_with_dilation[1])
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2,
                                       dilate=replace_stride_with_dilation[2])
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512 * block.expansion, num_classes)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)

        # Zero-initialize the last BN in each residual branch,
        # so that the residual branch starts with zeros, and each residual block behaves like an identity.
        # This improves the model by 0.2~0.3% according to https://arxiv.org/abs/1706.02677
        if zero_init_residual:
            for m in self.modules():
                if isinstance(m, Bottleneck):
                    nn.init.constant_(m.bn3.weight, 0)
                elif isinstance(m, BasicBlock):
                    nn.init.constant_(m.bn2.weight, 0)

    def _make_layer(self, block, planes, blocks, stride=1, dilate=False):
        norm_layer = self._norm_layer
        downsample = None
        previous_dilation = self.dilation
        if dilate:
            self.dilation *= stride
            stride = 1
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                conv1x1(self.inplanes, planes * block.expansion, stride),
                norm_layer(planes * block.expansion),
            )

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample, self.groups,
                            self.base_width, previous_dilation, norm_layer))
        self.inplanes = planes * block.expansion
        for _ in range(1, blocks):
            layers.append(block(self.inplanes, planes, groups=self.groups,
                                base_width=self.base_width, dilation=self.dilation,
                                norm_layer=norm_layer))

        return nn.Sequential(*layers)

    def _forward_impl(self, x):
        # See note [TorchScript super()]
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)

        return x

    def forward(self, x):
        return self._forward_impl(x)


def _resnet(arch, block, layers, pretrained, progress, **kwargs):
    model = ResNet(block, layers, **kwargs)
    if pretrained:
        state_dict = load_state_dict_from_url(model_urls[arch],
                                              progress=progress)
        model.load_state_dict(state_dict)
    return model


def resnet18(pretrained=False, progress=True, **kwargs):
    r"""ResNet-18 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/pdf/1512.03385.pdf>`_

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _resnet('resnet18', BasicBlock, [2, 2, 2, 2], pretrained, progress,
                   **kwargs)


def resnet34(pretrained=False, progress=True, **kwargs):
    r"""ResNet-34 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/pdf/1512.03385.pdf>`_

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _resnet('resnet34', BasicBlock, [3, 4, 6, 3], pretrained, progress,
                   **kwargs)


def resnet50(pretrained=False, progress=True, **kwargs):
    r"""ResNet-50 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/pdf/1512.03385.pdf>`_

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _resnet('resnet50', Bottleneck, [3, 4, 6, 3], pretrained, progress,
                   **kwargs)


def resnet101(pretrained=False, progress=True, **kwargs):
    r"""ResNet-101 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/pdf/1512.03385.pdf>`_

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _resnet('resnet101', Bottleneck, [3, 4, 23, 3], pretrained, progress,
                   **kwargs)


def resnet152(pretrained=False, progress=True, **kwargs):
    r"""ResNet-152 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/pdf/1512.03385.pdf>`_

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _resnet('resnet152', Bottleneck, [3, 8, 36, 3], pretrained, progress,
                   **kwargs)


def resnext50_32x4d(pretrained=False, progress=True, **kwargs):
    r"""ResNeXt-50 32x4d model from
    `"Aggregated Residual Transformation for Deep Neural Networks" <https://arxiv.org/pdf/1611.05431.pdf>`_

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    kwargs['groups'] = 32
    kwargs['width_per_group'] = 4
    return _resnet('resnext50_32x4d', Bottleneck, [3, 4, 6, 3],
                   pretrained, progress, **kwargs)


def resnext101_32x8d(pretrained=False, progress=True, **kwargs):
    r"""ResNeXt-101 32x8d model from
    `"Aggregated Residual Transformation for Deep Neural Networks" <https://arxiv.org/pdf/1611.05431.pdf>`_

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    kwargs['groups'] = 32
    kwargs['width_per_group'] = 8
    return _resnet('resnext101_32x8d', Bottleneck, [3, 4, 23, 3],
                   pretrained, progress, **kwargs)


def wide_resnet50_2(pretrained=False, progress=True, **kwargs):
    r"""Wide ResNet-50-2 model from
    `"Wide Residual Networks" <https://arxiv.org/pdf/1605.07146.pdf>`_

    The model is the same as ResNet except for the bottleneck number of channels
    which is twice larger in every block. The number of channels in outer 1x1
    convolutions is the same, e.g. last block in ResNet-50 has 2048-512-2048
    channels, and in Wide ResNet-50-2 has 2048-1024-2048.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    kwargs['width_per_group'] = 64 * 2
    return _resnet('wide_resnet50_2', Bottleneck, [3, 4, 6, 3],
                   pretrained, progress, **kwargs)


def wide_resnet101_2(pretrained=False, progress=True, **kwargs):
    r"""Wide ResNet-101-2 model from
    `"Wide Residual Networks" <https://arxiv.org/pdf/1605.07146.pdf>`_

    The model is the same as ResNet except for the bottleneck number of channels
    which is twice larger in every block. The number of channels in outer 1x1
    convolutions is the same, e.g. last block in ResNet-50 has 2048-512-2048
    channels, and in Wide ResNet-50-2 has 2048-1024-2048.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    kwargs['width_per_group'] = 64 * 2
    return _resnet('wide_resnet101_2', Bottleneck, [3, 4, 23, 3],
                   pretrained, progress, **kwargs)
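As a sanity check on the `layers` lists passed to _resnet above, the following sketch recovers the number in each model name from its block configuration. It assumes the usual counting convention (not stated in the original post): one stem convolution, two convolutions per BasicBlock or three per Bottleneck, and one final fully connected layer, with the 1x1 downsample convolutions not counted. The names `CONVS_PER_BLOCK`, `CONFIGS`, and `nominal_depth` are hypothetical.

```python
# Weighted layers on the main path of each block type.
CONVS_PER_BLOCK = {"BasicBlock": 2, "Bottleneck": 3}

# (block type, blocks per stage), copied from the constructor functions above.
CONFIGS = {
    "resnet18": ("BasicBlock", [2, 2, 2, 2]),
    "resnet34": ("BasicBlock", [3, 4, 6, 3]),
    "resnet50": ("Bottleneck", [3, 4, 6, 3]),
    "resnet101": ("Bottleneck", [3, 4, 23, 3]),
    "resnet152": ("Bottleneck", [3, 8, 36, 3]),
}


def nominal_depth(block: str, layers: list) -> int:
    """conv1 + convs inside all residual blocks + the fc layer."""
    return 1 + CONVS_PER_BLOCK[block] * sum(layers) + 1


depths = {name: nominal_depth(block, layers) for name, (block, layers) in CONFIGS.items()}
print(depths)
```

Note that resnet34 and resnet50 share the same [3, 4, 6, 3] layout; the difference in depth comes only from swapping BasicBlock for Bottleneck.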

Reference: ResNet学习笔记 (ResNet study notes)

