pytorch(11) -- End-to-end license plate recognition with CRNN

article/2025/8/23 23:45:19

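End-to-end license plate image recognition

  • 1. Introduction
  • 2. Dataset preparation
  • 3. The CRNN model file
  • 4. Training and validation code
  • 5. Test code
  • 6. Code source

1. Introduction

  This post records how to use CRNN for end-to-end recognition of license plate images, that is, without segmenting the plate and recognizing each character separately. The first character of a plate is a Chinese character, with 31 possibilities. The second character is an uppercase letter excluding "O" and "I", which leaves 24 letters. Each of the remaining 5 positions is one of those 24 uppercase letters or one of the 10 Arabic digits. Because the CRNN is trained with CTC, a blank symbol is also needed; it is placed at index 65 (counting from 0), so there are 66 classes in total: 31 + 24 + 10 + 1.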
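The class count above can be checked with a few lines of Python. This is a minimal sketch; the names PROVINCES, LETTERS, DIGITS and CHARSET are illustrative and do not appear in the original scripts, which call the combined table ch_3 or char_set.

```python
# Class table as described above (names are illustrative assumptions).
PROVINCES = ["京", "津", "冀", "晋", "蒙", "辽", "吉", "黑", "沪", "苏",
             "浙", "皖", "闽", "赣", "鲁", "豫", "鄂", "湘", "粤", "桂",
             "琼", "渝", "川", "贵", "云", "藏", "陕", "甘", "青", "宁", "新"]
# Uppercase letters without "I" and "O".
LETTERS = [c for c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ" if c not in "IO"]
DIGITS = [str(i) for i in range(10)]
BLANK = " "  # CTC blank, stored last

CHARSET = PROVINCES + LETTERS + DIGITS + [BLANK]

print(len(PROVINCES), len(LETTERS), len(DIGITS))  # 31 24 10
print(len(CHARSET), CHARSET.index(BLANK))         # 66 65
```

Keeping the blank last means the 65 real characters keep their natural indices 0..64, and the blank index is always the number of classes minus 1.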

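2. Dataset preparation

   The dataset used here is synthetically generated; many generators for this are available online.
   Each image has a resolution (h, w) of (72, 272). The training set contains 50,000 images and the test set contains 10,000 images.
   train_label.txt and test_label.txt hold the plate text labels, including the Chinese character.
For example, if 03.jpg in the train_plate folder shows the plate 鄂A80065, then line 4 of train_label.txt is 鄂A80065 (image numbering starts at 00.jpg). getTxt.py is therefore needed to convert these text labels into numeric class labels, producing train.txt and test.txt.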

import os

root = "."
ch_1 = ["京","津","冀","晋","蒙","辽","吉","黑","沪","苏","浙","皖","闽","赣","鲁","豫","鄂","湘","粤","桂","琼","渝","川","贵","云","藏","陕","甘","青","宁","新"]
ch_2 = ["A","B","C","D","E","F","G","H","J","K","L","M","N","P","Q","R","S","T","U","V","W","X","Y","Z"]
ch_3 = ch_1 + ch_2 + [str(i) for i in range(10)] + [" "]  # the blank goes last, at index 65

# getTrainTxt appends to its output file, so start from clean files
if os.path.exists("test.txt"):
    os.remove("test.txt")
if os.path.exists("train.txt"):
    os.remove("train.txt")


def getTrainTxt(train_label, train_txt, f_path):
    train_info = []
    i = 0
    with open(train_label, "r", encoding="UTF-8") as f:
        for line in f:
            if len(line) < 2:
                continue
            line = line.rstrip("\n").strip(" ")  # the 7 plate characters
            # images are named 00.jpg ... 09.jpg, 10.jpg, ...
            if i < 10:
                jpg = "0{}.jpg".format(i)
            else:
                jpg = "{}.jpg".format(i)
            i += 1
            # every position is looked up in the same table ch_3
            pad_info = [jpg] + [ch_3.index(e) for e in line]
            train_info.append(pad_info)

    with open(train_txt, "a") as ftxt:
        for e in train_info:
            s = f_path
            for d in e:
                s += str(d) + " "
            ftxt.write(s[:-1] + "\n")


getTrainTxt("train_label.txt", "train.txt", "train_plate/")
getTrainTxt("test_label.txt", "test.txt", "test_plate/")
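As a sanity check on the generated files, one label line can be built and mapped back to the plate text. The sketch below assumes the line layout that getTxt.py writes (image path followed by space-separated class indices); the helper names encode_plate and decode_line are hypothetical and not part of the original code.

```python
PROVINCES = ["京", "津", "冀", "晋", "蒙", "辽", "吉", "黑", "沪", "苏",
             "浙", "皖", "闽", "赣", "鲁", "豫", "鄂", "湘", "粤", "桂",
             "琼", "渝", "川", "贵", "云", "藏", "陕", "甘", "青", "宁", "新"]
LETTERS = list("ABCDEFGHJKLMNPQRSTUVWXYZ")  # no I, no O
CHARSET = PROVINCES + LETTERS + [str(i) for i in range(10)] + [" "]


def encode_plate(path, plate):
    """Build one label line: image path, then the class index of each character."""
    return " ".join([path] + [str(CHARSET.index(ch)) for ch in plate])


def decode_line(line):
    """Split a label line back into (image path, plate text)."""
    parts = line.split(" ")
    return parts[0], "".join(CHARSET[int(p)] for p in parts[1:])


line = encode_plate("train_plate/03.jpg", "鄂A80065")
print(line)               # train_plate/03.jpg 16 31 63 55 55 61 60
print(decode_line(line))  # ('train_plate/03.jpg', '鄂A80065')
```

Digits start at index 55 (31 province characters plus 24 letters), which is why "8" becomes 63 and "0" becomes 55.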


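3. The CRNN model file

   The original images have (h, w) = (72, 272). Before entering the CRNN they are scaled so that the height becomes 32, which gives (32, 120), and converted to single-channel grayscale, since text recognition does not need color information. The height does not have to be 32 in principle (48 or 64 are also possible); 32 is simply the value chosen for this experiment. Note, however, that the CNN in crnn.py as written asserts that the height is a multiple of 32 and requires the feature-map height to collapse to 1, so other heights would need changes to the pooling layers. The model output is reshaped to [seq, batch, number of classes], where seq is the maximum number of characters the network can output for an image in the batch, batch is the batch size, and the number of classes is 66.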
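The sizes quoted above can be reproduced with plain arithmetic, without running the network. The sketch below traces only the layers of the CNN in crnn.py that change the spatial size (the 3x3 convolutions with padding 1 and the 1x1 convolutions keep it unchanged). The helper names are illustrative and not from the original code.

```python
def resized_width(h, w, target_h=32):
    """Width after scaling an (h, w) image to height target_h, truncated like the scripts do."""
    return int(target_h / float(h) * w)


def out_size(size, kernel, stride, padding=0):
    """Output length along one axis for a pooling or conv layer (floor mode, dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def crnn_feature_shape(h, w):
    """Trace (height, width) through the size-changing layers of the CNN in crnn.py."""
    h, w = out_size(h, 2, 2), out_size(w, 2, 2)     # pool0: 2x2, stride 2
    h, w = out_size(h, 2, 2), out_size(w, 2, 2)     # pool1: 2x2, stride 2
    h, w = out_size(h, 2, 2), out_size(w, 2, 1, 1)  # pool3: stride (2,1), padding (0,1)
    h, w = out_size(h, 2, 2), out_size(w, 2, 1, 1)  # pool5: stride (2,1), padding (0,1)
    h, w = out_size(h, 2, 1), out_size(w, 2, 1)     # depth_conv6: 2x2 kernel, no padding
    return h, w


width = resized_width(72, 272)
print(width)                          # 120
print(crnn_feature_shape(32, width))  # (1, 31)
print(crnn_feature_shape(64, 240))    # (3, 61)
```

For a 32x120 input the feature map has height 1 and width 31, so seq is 31 time steps for a 7-character plate. The last line shows why a height of 64 does not work with these layers unchanged: the height ends at 3 instead of 1.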

# crnn.py
import torch
import torch.nn as nn


class BidirectionalLSTM(nn.Module):
    def __init__(self, nInput_size, nHidden, nOut):
        super(BidirectionalLSTM, self).__init__()
        self.lstm = nn.LSTM(nInput_size, nHidden, bidirectional=True)
        self.linear = nn.Linear(nHidden * 2, nOut)

    def forward(self, input):
        recurrent, (hidden, cell) = self.lstm(input)
        T, b, h = recurrent.size()
        t_rec = recurrent.view(T * b, h)
        output = self.linear(t_rec)  # [T * b, nOut]
        output = output.view(T, b, -1)  # reshape to [seq, batch, number of classes]
        return output


class CNN(nn.Module):
    def __init__(self, imageHeight, nChannel):
        super(CNN, self).__init__()
        assert imageHeight % 32 == 0, 'image Height has to be a multiple of 32'

        # each block is a depthwise 3x3 conv followed by a pointwise 1x1 conv
        self.depth_conv0 = nn.Conv2d(in_channels=nChannel, out_channels=nChannel, kernel_size=3, stride=1, padding=1, groups=nChannel)
        self.point_conv0 = nn.Conv2d(in_channels=nChannel, out_channels=64, kernel_size=1, stride=1, padding=0, groups=1)
        self.relu0 = nn.ReLU(inplace=True)
        self.pool0 = nn.MaxPool2d(kernel_size=2, stride=2)

        self.depth_conv1 = nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3, stride=1, padding=1, groups=64)
        self.point_conv1 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=1, stride=1, padding=0, groups=1)
        self.relu1 = nn.ReLU(inplace=True)
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)

        self.depth_conv2 = nn.Conv2d(in_channels=128, out_channels=128, kernel_size=3, stride=1, padding=1, groups=128)
        self.point_conv2 = nn.Conv2d(in_channels=128, out_channels=256, kernel_size=1, stride=1, padding=0, groups=1)
        self.batchNorm2 = nn.BatchNorm2d(256)
        self.relu2 = nn.ReLU(inplace=True)

        self.depth_conv3 = nn.Conv2d(in_channels=256, out_channels=256, kernel_size=3, stride=1, padding=1, groups=256)
        self.point_conv3 = nn.Conv2d(in_channels=256, out_channels=256, kernel_size=1, stride=1, padding=0, groups=1)
        self.relu3 = nn.ReLU(inplace=True)
        self.pool3 = nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 1), padding=(0, 1))

        self.depth_conv4 = nn.Conv2d(in_channels=256, out_channels=256, kernel_size=3, stride=1, padding=1, groups=256)
        self.point_conv4 = nn.Conv2d(in_channels=256, out_channels=512, kernel_size=1, stride=1, padding=0, groups=1)
        self.batchNorm4 = nn.BatchNorm2d(512)
        self.relu4 = nn.ReLU(inplace=True)

        self.depth_conv5 = nn.Conv2d(in_channels=512, out_channels=512, kernel_size=3, stride=1, padding=1, groups=512)
        self.point_conv5 = nn.Conv2d(in_channels=512, out_channels=512, kernel_size=1, stride=1, padding=0, groups=1)
        self.relu5 = nn.ReLU(inplace=True)
        self.pool5 = nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 1), padding=(0, 1))

        # self.conv6 = nn.Conv2d(in_channels=512, out_channels=512, kernel_size=2, stride=1, padding=0)
        self.depth_conv6 = nn.Conv2d(in_channels=512, out_channels=512, kernel_size=2, stride=1, padding=0, groups=512)
        self.point_conv6 = nn.Conv2d(in_channels=512, out_channels=512, kernel_size=1, stride=1, padding=0, groups=1)
        self.batchNorm6 = nn.BatchNorm2d(512)
        self.relu6 = nn.ReLU(inplace=True)

    def forward(self, input):
        depth0 = self.depth_conv0(input)
        point0 = self.point_conv0(depth0)
        relu0 = self.relu0(point0)
        pool0 = self.pool0(relu0)
        # print(pool0.size())

        depth1 = self.depth_conv1(pool0)
        point1 = self.point_conv1(depth1)
        relu1 = self.relu1(point1)
        pool1 = self.pool1(relu1)
        # print(pool1.size())

        depth2 = self.depth_conv2(pool1)
        point2 = self.point_conv2(depth2)
        batchNormal2 = self.batchNorm2(point2)
        relu2 = self.relu2(batchNormal2)
        # print(relu2.size())

        depth3 = self.depth_conv3(relu2)
        point3 = self.point_conv3(depth3)
        relu3 = self.relu3(point3)
        pool3 = self.pool3(relu3)
        # print(pool3.size())

        depth4 = self.depth_conv4(pool3)
        point4 = self.point_conv4(depth4)
        batchNormal4 = self.batchNorm4(point4)
        relu4 = self.relu4(batchNormal4)
        # print(relu4.size())

        depth5 = self.depth_conv5(relu4)
        point5 = self.point_conv5(depth5)
        relu5 = self.relu5(point5)
        pool5 = self.pool5(relu5)
        # print(pool5.size())

        depth6 = self.depth_conv6(pool5)
        point6 = self.point_conv6(depth6)
        batchNormal6 = self.batchNorm6(point6)
        relu6 = self.relu6(batchNormal6)
        # print(relu6.size())
        return relu6


class CRNN(nn.Module):
    def __init__(self, imgHeight, nChannel, nClass, nHidden):
        super(CRNN, self).__init__()
        self.cnn = nn.Sequential(CNN(imgHeight, nChannel))
        self.lstm = nn.Sequential(
            BidirectionalLSTM(512, nHidden, nHidden),
            BidirectionalLSTM(nHidden, nHidden, nClass),
        )

    def forward(self, input):
        conv = self.cnn(input)
        # PyTorch feature maps are laid out as (B, C, H, W)
        batch, channel, height, width = conv.size()
        assert height == 1, "the output height must be 1."
        # drop the height dimension (which is 1): (B, C, W)
        conv = conv.squeeze(dim=2)
        # reorder (B, C, W) -> (W, B, C) to match the LSTM input (seq, batch, input_size)
        conv = conv.permute(2, 0, 1)
        output = self.lstm(conv)
        return output


if __name__ == "__main__":
    x = torch.rand(1, 1, 32, 120)
    model = CRNN(imgHeight=32, nChannel=1, nClass=66, nHidden=256)
    y = model(x)
    print(y.shape)

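4. Training and validation code

   Note that the total number of classes has to be set in the code:
In train(): n_class = 66
In train(): loss_func = torch.nn.CTCLoss(blank=n_class-1)
In decode(preds): if preds[i] != 65 and (i == 0 or preds[i] != preds[i-1]):
Only the blank index (65) depends on the number of classes. The i == 0 test refers to the first position of the sequence and must stay 0; writing it as 65 too makes position 0 get compared against preds[-1], the last element.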
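The decode condition combines two different tests: a class test against the blank index, and a position test that lets the first frame through without looking at a previous frame. The sketch below compares writing the position test as i == 0 with writing it as i == 65, on plain Python lists instead of tensors. The function names are illustrative.

```python
BLANK = 65  # index of the CTC blank, i.e. number of classes - 1


def decode_position_zero(preds, blank=BLANK):
    """Greedy CTC collapse: keep a frame if it is not blank and differs from the previous frame."""
    out = []
    for i in range(len(preds)):
        if preds[i] != blank and (i == 0 or preds[i] != preds[i - 1]):
            out.append(int(preds[i]))
    return out


def decode_position_65(preds):
    """Same loop, but with the position test written as i == 65."""
    out = []
    for i in range(len(preds)):
        if preds[i] != 65 and ((i == 65) or (i != 65 and preds[i] != preds[i - 1])):
            out.append(int(preds[i]))
    return out


# First and last frame carry the same class (60), so preds[0] == preds[-1].
frames = [60, 60, 65, 31, 65, 60]
print(decode_position_zero(frames))  # [60, 31, 60]
print(decode_position_65(frames))    # [31, 60]

# A doubled character survives only if a blank separates the two runs.
print(decode_position_zero([55, 55]))      # [55]
print(decode_position_zero([55, 65, 55]))  # [55, 55]
```

With i == 65, frame 0 is compared against preds[-1] through Python's negative indexing, so the first character is silently dropped whenever the first and last frames agree.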

import os
import time

import cv2
import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

from crnn import CRNN


# resize the image and normalize it
class resizeAndNormalize():
    def __init__(self, size, interpolation=cv2.INTER_LINEAR):
        # note: OpenCV expects size as (w, h)
        self.size = size
        self.interpolation = interpolation
        # ToTensor converts a PIL Image or numpy.ndarray to a tensor
        self.toTensor = transforms.ToTensor()

    def __call__(self, image):
        # in OpenCV the image width is the x axis and the height is the y axis
        image = cv2.resize(image, self.size, interpolation=self.interpolation)
        # convert to a tensor
        image = self.toTensor(image)
        # normalize to [-1, 1]
        image = image.sub_(0.5).div_(0.5)
        return image


class CRNNDataSet(Dataset):
    def __init__(self, imageRoot, labelRoot):
        self.image_root = imageRoot
        self.image_dict = self.readfile(labelRoot)
        self.image_name = [fileName for fileName, _ in self.image_dict.items()]

    def __getitem__(self, index):
        image_path = os.path.join(self.image_root, self.image_name[index])
        keys = self.image_dict.get(self.image_name[index])
        label = [int(x) for x in keys]
        image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        # if image is None:
        #     return None,None
        (height, width) = image.shape
        size_height = 32  # the CRNN takes input height 32, so rescale to that height
        ratio = 32 / float(height)
        size_width = int(ratio * width)
        transform = resizeAndNormalize((size_width, size_height))
        # preprocess the image
        image = transform(image)
        # convert the label to an IntTensor
        label = torch.IntTensor(label)
        return image, label

    def __len__(self):
        return len(self.image_name)

    def readfile(self, fileName):
        res = []
        with open(fileName, 'r') as f:
            lines = f.readlines()
            for line in lines:
                res.append(line.strip())
        dic = {}
        total = 0
        for line in res:
            part = line.split(' ')
            # a missing image would raise during training, so check for it up front
            if not os.path.exists(os.path.join(self.image_root, part[0])):
                print(os.path.join(self.image_root, part[0]))
                total += 1
            else:
                dic[part[0]] = part[1:]
        print(total)  # number of missing images
        return dic


# forward slashes avoid invalid escape sequences such as "\o" in Windows paths
trainData = CRNNDataSet(imageRoot="D:/other/carPad/data/", labelRoot="D:/other/carPad/data/train.txt")
trainLoader = DataLoader(dataset=trainData, batch_size=32, shuffle=True, num_workers=0)
valData = CRNNDataSet(imageRoot="D:/other/carPad/data/", labelRoot="D:/other/carPad/data/test.txt")
valLoader = DataLoader(dataset=valData, batch_size=100, shuffle=True, num_workers=1)


def decode(preds):
    pred = []
    for i in range(len(preds)):
        # 65 is the blank index: total number of classes (blank included) - 1 = 66 - 1
        if preds[i] != 65 and (i == 0 or preds[i] != preds[i - 1]):
            pred.append(int(preds[i]))
    return pred


def val(model, loss_function, max_iteration, use_gpu=True):
    # switch to evaluation mode
    model.eval()
    k = 0
    totalloss = 0
    correct_num = 0
    total_num = 0
    val_iter = iter(valLoader)
    max_iter = min(max_iteration, len(valLoader))
    for i in range(max_iter):
        k = k + 1
        data, label = next(val_iter)
        labels = torch.IntTensor([])
        for j in range(label.size(0)):
            labels = torch.cat((labels, label[j]), 0)
        if torch.cuda.is_available() and use_gpu:
            data = data.cuda()
        with torch.no_grad():
            output = model(data)
        input_lengths = torch.IntTensor([output.size(0)] * int(output.size(1)))
        target_lengths = torch.IntTensor([label.size(1)] * int(label.size(0)))
        # CTCLoss expects log-probabilities
        loss = loss_function(output.log_softmax(2), labels, input_lengths, target_lengths) / label.size(0)
        totalloss += float(loss)
        pred_label = output.max(2)[1]  # [seq, batch]
        # decode each sample separately so repeats are not merged across sample boundaries
        pred = []
        for row in pred_label.transpose(1, 0):
            pred.extend(decode(row))
        total_num += len(pred)
        for x, y in zip(pred, labels):
            if int(x) == int(y):
                correct_num += 1
    accuracy = correct_num / float(total_num) * 100
    test_loss = totalloss / k
    print('Test loss : %.3f , accuracy : %.3f%%' % (test_loss, accuracy))


def train():
    use_gpu = False  # True
    learning_rate = 0.001
    weight_decay = 1e-4
    max_epoch = 10
    modelpath = './pytorch-crnn.pth'
    # char_set = open('../train/char_std_5990.txt','r',encoding='utf-8').readlines()
    # char_set = ''.join([ch.strip('\n') for ch in char_set[1:]] +['卍'])
    n_class = 66  # len(char_set); must be changed to the total number of classes
    model = CRNN(imgHeight=32, nChannel=1, nClass=n_class, nHidden=256)
    if torch.cuda.is_available() and use_gpu:
        model.cuda()
    # blank is the index of the blank symbol; here it is 65, the last class
    loss_func = torch.nn.CTCLoss(blank=n_class - 1)
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate, weight_decay=weight_decay)
    if os.path.exists(modelpath):
        print("load model from %s" % modelpath)
        model.load_state_dict(torch.load(modelpath))
        print("done!")
    lossTotal = 0.0
    k = 0
    printInterval = 100  # print the training loss every this many steps
    valinterval = 1000   # evaluate on the test set every this many steps
    start_time = time.time()
    for epoch in range(max_epoch):
        for i, (data, label) in enumerate(trainLoader):
            k = k + 1
            # switch to training mode
            model.train()
            labels = torch.IntTensor([])
            for j in range(label.size(0)):
                labels = torch.cat((labels, label[j]), 0)
            if torch.cuda.is_available() and use_gpu:
                data = data.cuda()
                loss_func = loss_func.cuda()
                labels = labels.cuda()
            output = model(data)
            # CTCLoss expects log-probabilities over the class axis; do not detach(),
            # otherwise no gradient reaches the model
            log_probs = output.log_softmax(2)
            targets = labels
            input_lengths = torch.IntTensor([output.size(0)] * int(output.size(1)))
            target_lengths = torch.IntTensor([label.size(1)] * int(label.size(0)))
            # forward(self, log_probs, targets, input_lengths, target_lengths)
            loss = loss_func(log_probs, targets, input_lengths, target_lengths) / label.size(0)
            lossTotal += float(loss)
            if k % printInterval == 0:
                print("[%d/%d] [%d/%d] loss:%f" % (epoch, max_epoch, i + 1, len(trainLoader), lossTotal / printInterval))
                lossTotal = 0.0
                torch.save(model.state_dict(), './pytorch-crnn.pth')
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            if k % valinterval == 0:
                # pass use_gpu through so data and model stay on the same device
                val(model, loss_func, 10000, use_gpu)
    end_time = time.time()
    print("takes {}s".format((end_time - start_time)))


if __name__ == '__main__':
    train()
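The training loop hands CTCLoss all labels of a batch concatenated into one 1-D sequence, together with one input length and one target length per sample. The sketch below shows that layout with plain Python lists instead of tensors. The helper name flatten_targets is hypothetical, and seq_len=31 assumes the 32x120 input used in this post.

```python
def flatten_targets(batch_labels, seq_len):
    """Return (targets, input_lengths, target_lengths) in the layout the training loop builds."""
    # all label sequences back to back, as in the torch.cat loop
    targets = [idx for label in batch_labels for idx in label]
    # every sample uses the full output sequence of the network
    input_lengths = [seq_len] * len(batch_labels)
    # one entry per sample; every plate has 7 characters
    target_lengths = [len(label) for label in batch_labels]
    return targets, input_lengths, target_lengths


batch = [
    [16, 31, 63, 55, 55, 61, 60],  # 鄂A80065
    [0, 31, 56, 57, 58, 59, 60],   # 京A12345
]
targets, input_lengths, target_lengths = flatten_targets(batch, seq_len=31)
print(len(targets))    # 14
print(input_lengths)   # [31, 31]
print(target_lengths)  # [7, 7]
```

The target lengths are what lets the loss split the concatenated sequence back into per-sample labels, so their sum has to equal the length of targets.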

5. Test code

The following needs adjusting to the blank index:
in decode(preds, char_set):
if preds[i] != 65 and (i == 0 or preds[i] != preds[i-1]):
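At test time the network output is reduced to one class per time step and then turned into text. The sketch below walks through that pipeline on toy data with plain lists: argmax per frame (the role of preds.max(2) in the script), then collapsing and character lookup. The score vectors and all helper names are illustrative assumptions, not output of the trained model.

```python
PROVINCES = ["京", "津", "冀", "晋", "蒙", "辽", "吉", "黑", "沪", "苏",
             "浙", "皖", "闽", "赣", "鲁", "豫", "鄂", "湘", "粤", "桂",
             "琼", "渝", "川", "贵", "云", "藏", "陕", "甘", "青", "宁", "新"]
LETTERS = list("ABCDEFGHJKLMNPQRSTUVWXYZ")  # no I, no O
CHAR_SET = PROVINCES + LETTERS + [str(d) for d in range(10)] + [" "]
BLANK = len(CHAR_SET) - 1  # 65


def argmax_per_frame(scores):
    """Pick the best class for every time step; scores is a list of per-class score lists."""
    return [max(range(len(frame)), key=frame.__getitem__) for frame in scores]


def decode_text(preds, char_set, blank=BLANK):
    """Drop repeated frames and blanks, then map class indices to characters."""
    text = ""
    for i, p in enumerate(preds):
        if p != blank and (i == 0 or p != preds[i - 1]):
            text += char_set[p]
    return text


def one_hot(index, size=len(CHAR_SET)):
    """Toy score vector that puts all weight on one class."""
    return [1.0 if j == index else 0.0 for j in range(size)]


# Toy output for 14 time steps; a 32x120 input to this model yields 31.
frame_classes = [65, 16, 16, 65, 31, 63, 65, 55, 65, 55, 61, 61, 60, 65]
scores = [one_hot(c) for c in frame_classes]

plate = decode_text(argmax_per_frame(scores), CHAR_SET)
print(plate)  # 鄂A80065
```

No index offset is needed in the lookup because the blank sits at the end of the table rather than at position 0.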


import os
# os.environ['CUDA_VISIBLE_DEVICES'] = '7'
import torch
# from config import opt
from crnn import CRNN
from PIL import Image
from torchvision import transforms


class resizeNormalize(object):
    def __init__(self, size, interpolation=Image.BILINEAR):
        self.size = size
        self.interpolation = interpolation
        self.toTensor = transforms.ToTensor()

    def __call__(self, img):
        img = img.resize(self.size, self.interpolation)
        img = self.toTensor(img)
        img.sub_(0.5).div_(0.5)
        return img


def decode(preds, char_set):
    pred_text = ''
    for i in range(len(preds)):
        # 65 = total number of characters including the blank, minus 1
        if preds[i] != 65 and (i == 0 or preds[i] != preds[i - 1]):
            pred_text += char_set[int(preds[i])]  # no -1 offset needed: the blank is last
    return pred_text


# test if crnn work
if __name__ == '__main__':
    imagepath = '../data/test_plate/06.jpg'
    img_h = 32  # opt.img_h; input height is fixed at 32 here but can be changed
    use_gpu = False  # opt.use_gpu; whether to use the GPU
    modelpath = './pytorch-crnn.pth'
    # modelpath = '../train/models/pytorch-crnn.pth'
    # modelpath = opt.modelpath
    # char_set = open('char_std_5990.txt', 'r', encoding='utf-8').readlines()
    # char_set = ''.join([ch.strip('\n') for ch in char_set[1:]] + ['卍'])
    ch_1 = ["京","津","冀","晋","蒙","辽","吉","黑","沪","苏","浙","皖","闽","赣","鲁","豫","鄂","湘","粤","桂","琼","渝","川","贵","云","藏","陕","甘","青","宁","新"]
    ch_2 = ["A","B","C","D","E","F","G","H","J","K","L","M","N","P","Q","R","S","T","U","V","W","X","Y","Z"]
    # the blank goes last: 66 characters including the blank, which sits at index 65
    char_set = ch_1 + ch_2 + [str(i) for i in range(10)] + [" "]
    n_class = len(char_set)
    print(n_class)

    # from crnn_new import crnn
    model = CRNN(img_h, 1, n_class, 256)
    if os.path.exists(modelpath):
        print('Load model from "%s" ...' % modelpath)
        model.load_state_dict(torch.load(modelpath))
        print('Done!')
    if torch.cuda.is_available() and use_gpu:
        model.cuda()

    image = Image.open(imagepath).convert('L')
    (w, h) = image.size
    size_h = 32
    ratio = size_h / float(h)
    size_w = int(w * ratio)
    # keep the ratio
    transform = resizeNormalize((size_w, size_h))
    image = transform(image)
    image = image.unsqueeze(0)
    if torch.cuda.is_available() and use_gpu:
        image = image.cuda()

    model.eval()
    preds = model(image)
    preds = preds.max(2)
    preds = preds[1]
    preds = preds.squeeze()
    pred_text = decode(preds, char_set)
    print('predict == >', pred_text)
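Because the plate layout is fixed (one province character, one letter other than I and O, then five letters or digits), a decoded string can be checked against that layout before it is trusted. This is an optional sketch, not part of the original code; the helper is_valid_plate and the pattern are assumptions derived from the character rules stated in the introduction.

```python
import re

PROVINCES = "京津冀晋蒙辽吉黑沪苏浙皖闽赣鲁豫鄂湘粤桂琼渝川贵云藏陕甘青宁新"
LETTERS = "ABCDEFGHJKLMNPQRSTUVWXYZ"  # no I, no O

# one province character, one letter, then five letters or digits
PLATE_RE = re.compile("[{p}][{l}][{l}0-9]{{5}}".format(p=PROVINCES, l=LETTERS))


def is_valid_plate(text):
    """True if text has exactly the 7-character layout used in this dataset."""
    return PLATE_RE.fullmatch(text) is not None


print(is_valid_plate("鄂A80065"))  # True
print(is_valid_plate("鄂A8006"))   # False: only 6 characters
print(is_valid_plate("鄂O80065"))  # False: O is not a valid letter
```

A failed check usually means CTC decoding dropped or duplicated a character, which makes it a cheap filter for spotting bad predictions.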

Test result:
The prediction is quite accurate.

6. Code source

Link: click here


