Training Faster-RCNN-TensorFlow-Python3-master with a ResNet101 Pretrained Model


  For the detailed steps on training Faster-RCNN-TensorFlow-Python3-master with VGG16 as the pretrained model, see: Windows10 + Faster-RCNN-TensorFlow-Python3-master + VOC2007 dataset.

  To train Faster-RCNN-TensorFlow-Python3-master with ResNet101 as the pretrained model instead, a few places need to be changed on top of the earlier VGG16-based training steps.

  • First, in step 6 of the earlier guide, download the ResNet101 pretrained model instead: create a folder named imagenet_weights under ./data, extract the downloaded resnet_v1_101_2016_08_28.tar.gz into ./data/imagenet_weights, and rename resnet_v1_101.ckpt to resnet101.ckpt.
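The extract-and-rename step can be scripted. The sketch below is self-contained: it builds a stand-in archive in a temporary directory instead of downloading the real resnet_v1_101_2016_08_28.tar.gz, so only the extraction and renaming logic is demonstrated.

```python
import os
import tarfile
import tempfile

# Build a stand-in archive so the sketch is self-contained; in practice you
# would download resnet_v1_101_2016_08_28.tar.gz from the TF-Slim model zoo.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "resnet_v1_101_2016_08_28.tar.gz")
dummy_ckpt = os.path.join(workdir, "resnet_v1_101.ckpt")
with open(dummy_ckpt, "wb") as f:
    f.write(b"placeholder weights")
with tarfile.open(archive, "w:gz") as tar:
    tar.add(dummy_ckpt, arcname="resnet_v1_101.ckpt")

# Step 6 (ResNet101 variant): extract into ./data/imagenet_weights
# (simulated under the temp dir here) and rename the checkpoint.
weights_dir = os.path.join(workdir, "data", "imagenet_weights")
os.makedirs(weights_dir)
with tarfile.open(archive, "r:gz") as tar:
    tar.extractall(weights_dir)
os.rename(os.path.join(weights_dir, "resnet_v1_101.ckpt"),
          os.path.join(weights_dir, "resnet101.ckpt"))

print(sorted(os.listdir(weights_dir)))  # ['resnet101.ckpt']
```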
  • Second, in step 7, besides changing the maximum number of iterations (max_iters) and the snapshot interval (snap_iterations), also modify the following parameters.
    ① Change the network parameter from vgg16 to resnet101.
    ② Change the pretrained_model parameter from ./data/imagenet_weights/vgg16.ckpt to ./data/imagenet_weights/resnet101.ckpt.
    ③ Add the four parameters pooling_mode, FIXED_BLOCKS, POOLING_SIZE, and MAX_POOL.
tf.app.flags.DEFINE_string('network', "resnet101", "The network to be used as backbone")
tf.app.flags.DEFINE_string('pretrained_model', "./data/imagenet_weights/resnet101.ckpt", "Pretrained network weights")
# ResNet options
tf.app.flags.DEFINE_string('pooling_mode', "crop", "Default pooling mode")
tf.app.flags.DEFINE_integer('FIXED_BLOCKS', 1, "Number of fixed blocks during training")
tf.app.flags.DEFINE_integer('POOLING_SIZE', 7, "Size of the pooled region after RoI pooling")
tf.app.flags.DEFINE_boolean('MAX_POOL', False, "Whether to append max-pooling after crop_and_resize")
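tf.app.flags is essentially a command-line flag registry with typed defaults. To make the types and defaults of the four new ResNet options explicit without requiring TensorFlow 1.x, here is a hypothetical argparse equivalent (illustrative only, not code from the repo):

```python
import argparse

# Hypothetical argparse rendering of the tf.app.flags definitions above,
# shown only to make each option's type and default explicit.
parser = argparse.ArgumentParser()
parser.add_argument("--network", default="resnet101",
                    help="The network to be used as backbone")
parser.add_argument("--pretrained_model",
                    default="./data/imagenet_weights/resnet101.ckpt",
                    help="Pretrained network weights")
# ResNet options
parser.add_argument("--pooling_mode", default="crop",
                    help="Default pooling mode")
parser.add_argument("--FIXED_BLOCKS", type=int, default=1,
                    help="Number of fixed blocks during training")
parser.add_argument("--POOLING_SIZE", type=int, default=7,
                    help="Size of the pooled region after RoI pooling")
parser.add_argument("--MAX_POOL", action="store_true",
                    help="Append max-pooling after crop_and_resize")

FLAGS = parser.parse_args([])  # parse with defaults only
print(FLAGS.network, FLAGS.POOLING_SIZE)  # resnet101 7
```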
  • Third, modify resnet_v1.py: replace the contents of the original file with the code below.
# --------------------------------------------------------
# Tensorflow Faster R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Zheqi He and Xinlei Chen
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import tensorflow.contrib.slim as slim
from tensorflow.contrib.slim import losses
from tensorflow.contrib.slim import arg_scope
from tensorflow.contrib.slim.python.slim.nets import resnet_utils
from tensorflow.contrib.slim.python.slim.nets import resnet_v1
import numpy as np

from lib.nets.network import Network
from tensorflow.python.framework import ops
from tensorflow.contrib.layers.python.layers import regularizers
from tensorflow.python.ops import nn_ops
from tensorflow.contrib.layers.python.layers import initializers
from tensorflow.contrib.layers.python.layers import layers
from lib.config import config as cfg


def resnet_arg_scope(is_training=True,
                     weight_decay=cfg.FLAGS.weight_decay,
                     batch_norm_decay=0.997,
                     batch_norm_epsilon=1e-5,
                     batch_norm_scale=True):
    batch_norm_params = {
        # NOTE 'is_training' here does not work because inside resnet it gets reset:
        # https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v1.py#L187
        'is_training': False,
        'decay': batch_norm_decay,
        'epsilon': batch_norm_epsilon,
        'scale': batch_norm_scale,
        'trainable': False,
        'updates_collections': ops.GraphKeys.UPDATE_OPS
    }

    with arg_scope(
            [slim.conv2d],
            weights_regularizer=regularizers.l2_regularizer(weight_decay),
            weights_initializer=initializers.variance_scaling_initializer(),
            trainable=is_training,
            activation_fn=nn_ops.relu,
            normalizer_fn=layers.batch_norm,
            normalizer_params=batch_norm_params):
        with arg_scope([layers.batch_norm], **batch_norm_params) as arg_sc:
            return arg_sc


class resnetv1(Network):
    def __init__(self, batch_size=1, num_layers=101):
        Network.__init__(self, batch_size=batch_size)
        self._num_layers = num_layers
        self._resnet_scope = 'resnet_v1_%d' % num_layers

    def _crop_pool_layer(self, bottom, rois, name):
        with tf.variable_scope(name) as scope:
            batch_ids = tf.squeeze(tf.slice(rois, [0, 0], [-1, 1], name="batch_id"), [1])
            # Get the normalized coordinates of bboxes
            bottom_shape = tf.shape(bottom)
            height = (tf.to_float(bottom_shape[1]) - 1.) * np.float32(self._feat_stride[0])
            width = (tf.to_float(bottom_shape[2]) - 1.) * np.float32(self._feat_stride[0])
            x1 = tf.slice(rois, [0, 1], [-1, 1], name="x1") / width
            y1 = tf.slice(rois, [0, 2], [-1, 1], name="y1") / height
            x2 = tf.slice(rois, [0, 3], [-1, 1], name="x2") / width
            y2 = tf.slice(rois, [0, 4], [-1, 1], name="y2") / height
            # Won't be backpropagated to rois anyway, but to save time
            bboxes = tf.stop_gradient(tf.concat([y1, x1, y2, x2], 1))
            if cfg.FLAGS.MAX_POOL:
                pre_pool_size = cfg.FLAGS.POOLING_SIZE * 2
                crops = tf.image.crop_and_resize(bottom, bboxes, tf.to_int32(batch_ids),
                                                 [pre_pool_size, pre_pool_size], name="crops")
                crops = slim.max_pool2d(crops, [2, 2], padding='SAME')
            else:
                crops = tf.image.crop_and_resize(bottom, bboxes, tf.to_int32(batch_ids),
                                                 [cfg.FLAGS.POOLING_SIZE, cfg.FLAGS.POOLING_SIZE],
                                                 name="crops")
        return crops

    # Do the first few layers manually, because 'SAME' padding can behave inconsistently
    # for images of different sizes: sometimes 0, sometimes 1
    def build_base(self):
        with tf.variable_scope(self._resnet_scope, self._resnet_scope):
            net = resnet_utils.conv2d_same(self._image, 64, 7, stride=2, scope='conv1')
            net = tf.pad(net, [[0, 0], [1, 1], [1, 1], [0, 0]])
            net = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID', scope='pool1')
        return net

    def build_network(self, sess, is_training=True):
        # select initializers
        if cfg.FLAGS.initializer == "truncated":
            initializer = tf.truncated_normal_initializer(mean=0.0, stddev=0.01)
            initializer_bbox = tf.truncated_normal_initializer(mean=0.0, stddev=0.001)
        else:
            initializer = tf.random_normal_initializer(mean=0.0, stddev=0.01)
            initializer_bbox = tf.random_normal_initializer(mean=0.0, stddev=0.001)
        bottleneck = resnet_v1.bottleneck
        # choose different blocks for different number of layers
        if self._num_layers == 50:
            blocks = [
                resnet_utils.Block('block1', bottleneck,
                                   [(256, 64, 1)] * 2 + [(256, 64, 2)]),
                resnet_utils.Block('block2', bottleneck,
                                   [(512, 128, 1)] * 3 + [(512, 128, 2)]),
                # Use stride-1 for the last conv4 layer
                resnet_utils.Block('block3', bottleneck,
                                   [(1024, 256, 1)] * 5 + [(1024, 256, 1)]),
                resnet_utils.Block('block4', bottleneck, [(2048, 512, 1)] * 3)
            ]
        elif self._num_layers == 101:
            blocks = [
                resnet_v1.resnet_v1_block('block1', base_depth=64, num_units=3, stride=2),
                resnet_v1.resnet_v1_block('block2', base_depth=128, num_units=4, stride=2),
                resnet_v1.resnet_v1_block('block3', base_depth=256, num_units=23, stride=1),
                resnet_v1.resnet_v1_block('block4', base_depth=512, num_units=3, stride=1),
            ]
        elif self._num_layers == 152:
            blocks = [
                resnet_utils.Block('block1', bottleneck,
                                   [(256, 64, 1)] * 2 + [(256, 64, 2)]),
                resnet_utils.Block('block2', bottleneck,
                                   [(512, 128, 1)] * 7 + [(512, 128, 2)]),
                # Use stride-1 for the last conv4 layer
                resnet_utils.Block('block3', bottleneck,
                                   [(1024, 256, 1)] * 35 + [(1024, 256, 1)]),
                resnet_utils.Block('block4', bottleneck, [(2048, 512, 1)] * 3)
            ]
        else:
            # other numbers are not supported
            raise NotImplementedError

        assert (0 <= cfg.FLAGS.FIXED_BLOCKS < 4)
        if cfg.FLAGS.FIXED_BLOCKS == 3:
            with slim.arg_scope(resnet_arg_scope(is_training=False)):
                net = self.build_base()
                net_conv4, _ = resnet_v1.resnet_v1(net,
                                                   blocks[0:cfg.FLAGS.FIXED_BLOCKS],
                                                   global_pool=False,
                                                   include_root_block=False,
                                                   scope=self._resnet_scope)
        elif cfg.FLAGS.FIXED_BLOCKS > 0:
            with slim.arg_scope(resnet_arg_scope(is_training=False)):
                net = self.build_base()
                net, _ = resnet_v1.resnet_v1(net,
                                             blocks[0:cfg.FLAGS.FIXED_BLOCKS],
                                             global_pool=False,
                                             include_root_block=False,
                                             scope=self._resnet_scope)
            with slim.arg_scope(resnet_arg_scope(is_training=is_training)):
                net_conv4, _ = resnet_v1.resnet_v1(net,
                                                   blocks[cfg.FLAGS.FIXED_BLOCKS:-1],
                                                   global_pool=False,
                                                   include_root_block=False,
                                                   scope=self._resnet_scope)
        else:  # cfg.FLAGS.FIXED_BLOCKS == 0
            with slim.arg_scope(resnet_arg_scope(is_training=is_training)):
                net = self.build_base()
                net_conv4, _ = resnet_v1.resnet_v1(net,
                                                   blocks[0:-1],
                                                   global_pool=False,
                                                   include_root_block=False,
                                                   scope=self._resnet_scope)

        self._act_summaries.append(net_conv4)
        self._layers['head'] = net_conv4
        with tf.variable_scope(self._resnet_scope, self._resnet_scope):
            # build the anchors for the image
            self._anchor_component()

            # rpn
            rpn = slim.conv2d(net_conv4, 512, [3, 3], trainable=is_training,
                              weights_initializer=initializer, scope="rpn_conv/3x3")
            self._act_summaries.append(rpn)
            rpn_cls_score = slim.conv2d(rpn, self._num_anchors * 2, [1, 1], trainable=is_training,
                                        weights_initializer=initializer,
                                        padding='VALID', activation_fn=None, scope='rpn_cls_score')
            # change it so that the score has 2 as its channel size
            rpn_cls_score_reshape = self._reshape_layer(rpn_cls_score, 2, 'rpn_cls_score_reshape')
            rpn_cls_prob_reshape = self._softmax_layer(rpn_cls_score_reshape, "rpn_cls_prob_reshape")
            rpn_cls_prob = self._reshape_layer(rpn_cls_prob_reshape, self._num_anchors * 2, "rpn_cls_prob")
            rpn_bbox_pred = slim.conv2d(rpn, self._num_anchors * 4, [1, 1], trainable=is_training,
                                        weights_initializer=initializer,
                                        padding='VALID', activation_fn=None, scope='rpn_bbox_pred')
            if is_training:
                rois, roi_scores = self._proposal_layer(rpn_cls_prob, rpn_bbox_pred, "rois")
                rpn_labels = self._anchor_target_layer(rpn_cls_score, "anchor")
                # Try to have a deterministic order for the computing graph, for reproducibility
                with tf.control_dependencies([rpn_labels]):
                    rois, _ = self._proposal_target_layer(rois, roi_scores, "rpn_rois")
            else:
                if cfg.FLAGS.test_mode == "nms":
                    rois, _ = self._proposal_layer(rpn_cls_prob, rpn_bbox_pred, "rois")
                elif cfg.FLAGS.test_mode == "top":
                    rois, _ = self._proposal_top_layer(rpn_cls_prob, rpn_bbox_pred, "rois")
                else:
                    raise NotImplementedError

            # rcnn
            if cfg.FLAGS.pooling_mode == 'crop':
                pool5 = self._crop_pool_layer(net_conv4, rois, "pool5")
            else:
                raise NotImplementedError

        with slim.arg_scope(resnet_arg_scope(is_training=is_training)):
            fc7, _ = resnet_v1.resnet_v1(pool5,
                                         blocks[-1:],
                                         global_pool=False,
                                         include_root_block=False,
                                         scope=self._resnet_scope)

        with tf.variable_scope(self._resnet_scope, self._resnet_scope):
            # Average pooling done by reduce_mean
            fc7 = tf.reduce_mean(fc7, axis=[1, 2])
            cls_score = slim.fully_connected(fc7, self._num_classes,
                                             weights_initializer=initializer,
                                             trainable=is_training,
                                             activation_fn=None, scope='cls_score')
            cls_prob = self._softmax_layer(cls_score, "cls_prob")
            bbox_pred = slim.fully_connected(fc7, self._num_classes * 4,
                                             weights_initializer=initializer_bbox,
                                             trainable=is_training,
                                             activation_fn=None, scope='bbox_pred')

        self._predictions["rpn_cls_score"] = rpn_cls_score
        self._predictions["rpn_cls_score_reshape"] = rpn_cls_score_reshape
        self._predictions["rpn_cls_prob"] = rpn_cls_prob
        self._predictions["rpn_bbox_pred"] = rpn_bbox_pred
        self._predictions["cls_score"] = cls_score
        self._predictions["cls_prob"] = cls_prob
        self._predictions["bbox_pred"] = bbox_pred
        self._predictions["rois"] = rois

        self._score_summaries.update(self._predictions)

        return rois, cls_prob, bbox_pred

    def get_variables_to_restore(self, variables, var_keep_dic):
        variables_to_restore = []
        for v in variables:
            # exclude the first conv layer to swap RGB to BGR
            if v.name == (self._resnet_scope + '/conv1/weights:0'):
                self._variables_to_fix[v.name] = v
                continue
            if v.name.split(':')[0] in var_keep_dic:
                print('Variables restored: %s' % v.name)
                variables_to_restore.append(v)
        return variables_to_restore

    def fix_variables(self, sess, pretrained_model):
        print('Fix Resnet V1 layers..')
        with tf.variable_scope('Fix_Resnet_V1') as scope:
            with tf.device("/cpu:0"):
                # fix RGB to BGR
                conv1_rgb = tf.get_variable("conv1_rgb", [7, 7, 3, 64], trainable=False)
                restorer_fc = tf.train.Saver({self._resnet_scope + "/conv1/weights": conv1_rgb})
                restorer_fc.restore(sess, pretrained_model)
                sess.run(tf.assign(self._variables_to_fix[self._resnet_scope + '/conv1/weights:0'],
                                   tf.reverse(conv1_rgb, [2])))
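The _crop_pool_layer method above normalizes each RoI into the [0, 1] coordinate range that tf.image.crop_and_resize expects, dividing by the image extent implied by the feature-map size and the feature stride. A minimal NumPy sketch of that arithmetic (assuming a feature stride of 16, the value used for this backbone):

```python
import numpy as np

def normalize_rois(rois, feat_h, feat_w, feat_stride=16):
    """Mirror of the coordinate math in _crop_pool_layer.

    rois: (N, 5) array of [batch_id, x1, y1, x2, y2] in image pixels.
    Returns (N, 4) boxes as [y1, x1, y2, x2] normalized to [0, 1].
    """
    height = (feat_h - 1.0) * feat_stride
    width = (feat_w - 1.0) * feat_stride
    x1 = rois[:, 1] / width
    y1 = rois[:, 2] / height
    x2 = rois[:, 3] / width
    y2 = rois[:, 4] / height
    # crop_and_resize takes boxes in [y1, x1, y2, x2] order
    return np.stack([y1, x1, y2, x2], axis=1)

# One RoI covering a full image that maps to a 38x50 feature map:
rois = np.array([[0.0, 0.0, 0.0, (50 - 1) * 16.0, (38 - 1) * 16.0]])
print(normalize_rois(rois, feat_h=38, feat_w=50))  # [[0. 0. 1. 1.]]
```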
  • Fourth, in step 9, before clicking Run 'train' to start training, modify train.py in the following places.
# Added code (use resnet101 as the pretrained model)
from lib.nets.resnet_v1 import resnetv1
# End of added code
        # Added code (use resnet101)
        if cfg.FLAGS.network == 'resnet101':
            self.net = resnetv1(batch_size=cfg.FLAGS.ims_per_batch)
        # End of added code

        # Store the model snapshot
        filename = 'resnet101_faster_rcnn_iter_{:d}'.format(iter) + '.ckpt'
        filename = os.path.join(self.output_dir, filename)
        self.saver.save(sess, filename)
        print('Wrote snapshot to: {:s}'.format(filename))

        # Also store some meta information, random state, etc.
        nfilename = 'resnet101_faster_rcnn_iter_{:d}'.format(iter) + '.pkl'
        nfilename = os.path.join(self.output_dir, nfilename)
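The snapshot change above only alters the filename prefix from vgg16 to resnet101; the naming scheme itself can be sketched in isolation (illustrative helper, not repo code; the parameter name `iteration` is used here to avoid shadowing the `iter` builtin):

```python
import os

def snapshot_names(output_dir, network, iteration):
    # Mirrors the naming in train.py: one .ckpt name for the weights,
    # one .pkl name for meta information (random state etc.).
    base = '{:s}_faster_rcnn_iter_{:d}'.format(network, iteration)
    return (os.path.join(output_dir, base + '.ckpt'),
            os.path.join(output_dir, base + '.pkl'))

ckpt, pkl = snapshot_names('./default/voc_2007_trainval/default', 'resnet101', 5000)
print(ckpt)  # ./default/voc_2007_trainval/default/resnet101_faster_rcnn_iter_5000.ckpt
```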

  After the changes above, you can run train.py to start training the model.
  During training, snapshots are saved under ./default/voc_2007_trainval/default; each snapshot consists of four files.


  Correspondingly, a few places also need to be changed for testing.

  • First, in step 12 of the earlier guide, create the folder ./output/resnet101/voc_2007_trainval/default instead, copy one group of snapshot files from ./default/voc_2007_trainval/default into the new folder, and rename every file to resnet101.<original suffix>.
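The copy-and-rename step can also be scripted. A self-contained sketch using temporary directories as stand-ins for the two folders; the iteration number 40000 and the exact suffix set are assumptions to adjust for your own run:

```python
import os
import shutil
import tempfile

src_dir = tempfile.mkdtemp()  # stands in for ./default/voc_2007_trainval/default
dst_dir = tempfile.mkdtemp()  # stands in for ./output/resnet101/voc_2007_trainval/default
prefix = 'resnet101_faster_rcnn_iter_40000'  # hypothetical final snapshot

# Create stand-in snapshot files; a TF1 snapshot is saved as a group like this.
suffixes = ['.ckpt.data-00000-of-00001', '.ckpt.index', '.ckpt.meta', '.pkl']
for suffix in suffixes:
    open(os.path.join(src_dir, prefix + suffix), 'w').close()

# Copy each file of the chosen group, renaming it to resnet101.<original suffix>.
for name in os.listdir(src_dir):
    if name.startswith(prefix):
        suffix = name[len(prefix):]
        shutil.copy(os.path.join(src_dir, name),
                    os.path.join(dst_dir, 'resnet101' + suffix))

print(sorted(os.listdir(dst_dir)))
```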
  • Second, in step 13, make the following additional changes to demo.py.

  After these changes, you can run demo.py to test the model.
  To plot the PR curve and compute the AP value, a few places in test_net.py also need to be changed, as follows.


# Added code (use resnet101)
from lib.nets.resnet_v1 import resnetv1
# End of added code

# NETS = {'vgg16': ('vgg16.ckpt',)}  # change this to your trained output model
NETS = {'resnet101': ('resnet101.ckpt',)}  # change this to your trained output model
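demo.py and test_net.py resolve the checkpoint path from the NETS entry together with the dataset name. The lookup can be sketched as follows (a sketch; the exact join in the repo may differ slightly, and the DATASETS dict shown is an assumption based on the folder layout described above):

```python
import os

# How the test scripts resolve the checkpoint from the NETS entry.
NETS = {'resnet101': ('resnet101.ckpt',)}
DATASETS = {'pascal_voc': ('voc_2007_trainval',)}

def ckpt_path(net='resnet101', dataset='pascal_voc'):
    # e.g. output/resnet101/voc_2007_trainval/default/resnet101.ckpt
    return os.path.join('output', net, DATASETS[dataset][0], 'default',
                        NETS[net][0])

print(ckpt_path())
```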

  After the changes above, you can run test_net.py to plot the PR curve and compute the AP value.


