Deep Learning (12): A First Try with MatConvNet and Feature Extraction

article/2025/9/11 13:39:54

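This section briefly shows how to use MatConvNet's pre-trained models to classify an image and to extract the features of a chosen layer.

Let's first look at how to predict an image's class with a trained ImageNet network model. The official English tutorial is here: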

http://www.vlfeat.org/matconvnet/quick/

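Before classifying an image, you first need a trained model. The official site provides a wide range of network models, which can be downloaded here: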

http://www.vlfeat.org/matconvnet/pretrained/

The page lists a lot of models. Let's download "imagenet-vgg-f" to try it out, and place it just outside the MatConvNet installation folder.

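Now add the whole matconvnet folder to the MATLAB path (assuming it has already been compiled).

Create an .m script in this directory, enter the code below, and run it: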


% Load the downloaded model and upgrade it to the current MatConvNet version.
net = load('imagenet-vgg-f.mat') ;
% Tidy it into a SimpleNN network. MatConvNet has two kinds of network,
% SimpleNN and DagNN. They differ in how the network is represented;
% the DagNN form is the more intuitive of the two.
net = vl_simplenn_tidy(net) ;

% Obtain and preprocess an image.
% peppers.png ships with MATLAB.
im = imread('peppers.png') ;
im_ = single(im) ; % note: 0-255 range
% Resize to the input size the network expects.
im_ = imresize(im_, net.meta.normalization.imageSize(1:2)) ;
% Subtract the image mean, a preprocessing step every input needs.
% bsxfun also covers the case where averageImage is a per-channel mean.
im_ = bsxfun(@minus, im_, net.meta.normalization.averageImage) ;

% Run the CNN on the preprocessed image.
res = vl_simplenn(net, im_) ;

% Show the classification result.
scores = squeeze(gather(res(end).x)) ;
[bestScore, best] = max(scores) ;
figure(1) ; clf ; imagesc(im) ;
title(sprintf('%s (%d), score %.3f', ...
  net.meta.classes.description{best}, best, bestScore)) ;


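The top score is 0.704, and the predicted class is pepper. Not bad.

Now let's look at the other kind of model, DagNN.

Create a new .m script with the following code: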
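The script above reports only the single best class. Since `scores` holds one value per class, sorting it also gives the runners-up. The sketch below ranks the top five; the `scores` and `descriptions` values are made-up stand-ins (assumptions for illustration), to be replaced with `squeeze(gather(res(end).x))` and `net.meta.classes.description` in the real script.

```matlab
% Hypothetical stand-ins so the snippet runs on its own.
% In the real script use:
%   scores       = squeeze(gather(res(end).x)) ;
%   descriptions = net.meta.classes.description ;
scores = single([0.06; 0.70; 0.10; 0.02; 0.08; 0.04]) ;
descriptions = {'apple', 'bell pepper', 'cucumber', ...
                'orange', 'zucchini', 'lemon'} ;

% Sort the scores from highest to lowest and keep the first k indices.
k = min(5, numel(scores)) ;
[sortedScores, order] = sort(scores, 'descend') ;
topIdx = order(1:k) ;

% Print rank, class description, class index, and score.
for i = 1:k
  fprintf('%d. %s (%d), score %.3f\n', ...
    i, descriptions{topIdx(i)}, topIdx(i), sortedScores(i)) ;
end
```

Looking at several top classes rather than one is often more informative when the best score is not very high.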

% Load the pre-trained CNN.
net = dagnn.DagNN.loadobj(load('imagenet-vgg-f.mat')) ;
% Switch to test mode.
net.mode = 'test' ;

% Load and preprocess an image.
im = imread('peppers.png') ;
im_ = single(im) ; % note: 0-255 range
im_ = imresize(im_, net.meta.normalization.imageSize(1:2)) ;
im_ = bsxfun(@minus, im_, net.meta.normalization.averageImage) ;

% Run the CNN. Note that the call differs from the SimpleNN version:
% the input is passed by variable name, here x0.
net.eval({'x0', im_}) ;

% Obtain the CNN output, which is stored in the variable x21.
scores = net.vars(net.getVarIndex('x21')).value ;
scores = squeeze(gather(scores)) ;

% Show the classification results.
[bestScore, best] = max(scores) ;
figure(1) ; clf ; imagesc(im) ;
title(sprintf('%s (%d), score %.3f', ...
  net.meta.classes.description{best}, best, bestScore)) ;

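The result is the same as before, so I won't paste it again. Here are a few points that need attention and are easy to get wrong:

(1) The input name 'x0'
Why x0, you might ask? The name really does differ from one network model to another. Inspecting the net variable in MATLAB shows the names this network uses.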

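This network names its input 'x0' and its final output 'x21', which is why the script is written the way it is. What about the variables in between? They are the outputs of the various convolutional and fully connected layers. These names are clearly not very readable, so normally they would be renamed; because this VGG network was originally built for SimpleNN, they were left unchanged.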
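Rather than reading the names off the workspace viewer, they can also be queried in code. The following is a minimal sketch, assuming MatConvNet is on the path and the model loads as in the script above; `getInputs` and `getOutputs` are DagNN methods, and the printed names depend on the model file.

```matlab
% Assumes MatConvNet is on the path and the model file is in the
% current folder, loaded the same way as in the script above.
net = dagnn.DagNN.loadobj(load('imagenet-vgg-f.mat')) ;

inputNames  = net.getInputs() ;   % variables that no layer produces
outputNames = net.getOutputs() ;  % variables that no layer consumes
allNames    = {net.vars.name} ;   % every variable in the network

fprintf('inputs : %s\n', strjoin(inputNames, ', ')) ;
fprintf('outputs: %s\n', strjoin(outputNames, ', ')) ;
fprintf('%d variables in total\n', numel(allNames)) ;

% Show which layer writes which variable, to make sense of names
% such as x1, x2, ...
for i = 1:numel(net.layers)
  fprintf('%-10s -> %s\n', net.layers(i).name, ...
    strjoin(net.layers(i).outputs, ', ')) ;
end
```

Checking the names this way avoids hard-coding 'x0' and 'x21' when switching to a model that uses different variable names.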

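After running the script, inspecting net.vars shows that only the final scores for the 1000 classes are kept. The values of the intermediate convolutional and fully connected layers are all empty ([]). This is the default behavior, set deliberately to save memory. Sometimes, however, we want more than the final class probabilities: we also want to pull out intermediate activations and use them as features for other classification tasks. In that case every variable's value must be retained, and the relevant option is conserveMemory, which defaults to true (note that this applies to DagNN mode).

In other words, when you test a sample with eval, the result by default does not keep the convolutional outputs, only the final prob values, because conserveMemory defaults to 1. To obtain the outputs of every layer, including the convolutional layers, simply set conserveMemory to 0.

Taking the vgg-f network above as the example again: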
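A quick way to see which variables still hold a value is to loop over them and print the size of each non-empty one. The sketch below runs on a small made-up struct array that mimics the name and value fields of `net.vars` (the names and sizes are assumptions for illustration); in a real session, replace `vars` with `net.vars` after calling `net.eval`.

```matlab
% Hypothetical stand-in for net.vars after net.eval(...).
% In a real session use:  vars = net.vars ;
vars = struct( ...
  'name',  {'x19', 'x20', 'x21'}, ...
  'value', {zeros(1, 1, 4096, 'single'), [], zeros(1, 1, 1000, 'single')}) ;

kept = {} ;  % names of the variables that still hold a value
for i = 1:numel(vars)
  if isempty(vars(i).value)
    continue ;  % this variable was cleared to save memory
  end
  fprintf('%s: %s\n', vars(i).name, mat2str(size(vars(i).value))) ;
  kept{end+1} = vars(i).name ;
end
```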

% Load the pre-trained CNN.
net = dagnn.DagNN.loadobj(load('imagenet-vgg-f.mat')) ;
% Switch to test mode.
net.mode = 'test' ;

% Load and preprocess an image.
im = imread('peppers.png') ;
im_ = single(im) ; % note: 0-255 range
im_ = imresize(im_, net.meta.normalization.imageSize(1:2)) ;
im_ = bsxfun(@minus, im_, net.meta.normalization.averageImage) ;

% Keep the values of all intermediate variables (features).
net.conserveMemory = false ;

% Run the CNN. The input variable is x0.
net.eval({'x0', im_}) ;

% Obtain the CNN output, which is stored in the variable x21.
scores = net.vars(net.getVarIndex('x21')).value ;
scores = squeeze(gather(scores)) ;

% Show the classification results.
[bestScore, best] = max(scores) ;
figure(1) ; clf ; imagesc(im) ;
title(sprintf('%s (%d), score %.3f', ...
  net.meta.classes.description{best}, best, bestScore)) ;
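
Keeping every intermediate value costs memory. If only one layer's output is needed as a feature, DagNN also allows a single variable to be marked `precious`, so that it is retained even with conserveMemory left at its default. The following is a minimal sketch under the same assumptions as the script above (model file, input name 'x0'); the variable name 'x19' is chosen purely for illustration and should be checked against the layers of the model actually loaded.

```matlab
% Assumes MatConvNet is on the path and the model loads as above.
net = dagnn.DagNN.loadobj(load('imagenet-vgg-f.mat')) ;
net.mode = 'test' ;

% Same preprocessing as in the scripts above.
im = imread('peppers.png') ;
im_ = imresize(single(im), net.meta.normalization.imageSize(1:2)) ;
im_ = bsxfun(@minus, im_, net.meta.normalization.averageImage) ;

% 'x19' is an illustrative choice, not a fixed name; pick the variable
% written by the layer whose output you want.
featName = 'x19' ;
featIdx = net.getVarIndex(featName) ;
assert(~isnan(featIdx), 'Variable %s not found in the network.', featName) ;

% Mark only this variable as precious; conserveMemory stays at its
% default, so all other intermediate values are still cleared.
net.vars(featIdx).precious = true ;

net.eval({'x0', im_}) ;

% Read the retained activation and flatten it into a column vector.
feat = squeeze(gather(net.vars(featIdx).value)) ;
feat = feat(:) ;
fprintf('%s feature length: %d\n', featName, numel(feat)) ;
```

The resulting `feat` vector can then be passed to a separate classifier, which is the use case described above for intermediate features.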