ORL Character Recognition


Contents

  • ORL Character Recognition
  • 0 Abstract
  • 1 Introduction
  • 2 Related Work
    • 2.1 Character recognition
    • 2.2 Text detection
  • 3 Connection Text Proposal Network
    • 3.1 Anchor
    • 3.2 Bi-Directional LSTM
    • 3.3 RPN layer
    • 3.4 Text line constructor
    • 3.5 Loss function
    • 3.6 Total system
    • 3.7 Algorithm
  • 4 Experiment
    • 4.1 Dataset
    • 4.2 Training model
    • 4.3 Algorithm analysis
  • 5 Conclusion and future work

ORL Character Recognition

0 Abstract

This paper uses the Connectionist Text Proposal Network (CTPN) to perform character recognition in natural images. The CTPN detects a text line as a sequence of fine-scale text proposals in convolutional feature maps. It consists of an anchor mechanism, a Bi-Directional LSTM, an RPN layer, and a text line constructor. The CTPN jointly predicts the location and the text/non-text score of each proposal to obtain an accurate location, and it can exploit the rich context information of the image to detect ambiguous text. In the experiment, we trained the method on a dataset of 2,800 images. After 1,200 training iterations we obtained good results, as the method detects text reliably.

Keywords: CTPN, character recognition, text detection

1 Introduction

For character recognition in complex scenes, text detection is a research hotspot. Text detection aims to predict bounding boxes: a detection is counted as correct when the overlap rate between the predicted bounding box and the ground truth is greater than 0.5.

We adopt the CTPN to detect text. The CTPN has a vertical anchor regression mechanism that detects small-scale text candidate boxes: it decomposes a text bounding box into small boxes of the same width, regresses them in the y direction, and then concatenates the small boxes into one large bounding box. The Bi-Directional LSTM learns sequence features; it is equivalent to two LSTMs connected in opposite directions and performs better than a single-direction LSTM. The RPN layer outputs 2k vertical coordinates, 2k scores, and k side-refinement values, which give, respectively, the location of each anchor, the text/non-text scores of foreground and background, and the side-refinement ratio. The text line constructor links anchors into an integral whole and produces one large non-overlapping bounding box.

In the experiment, we first rescale the pre-normalized images and crop them to 800×600. We then generate the training and validation datasets, and finally train and validate the model. Testing shows that CTPN yields good results.

2 Related Work

2.1 Character recognition

Character recognition translates different types of documents into editable and searchable data, and the Convolutional Neural Network (CNN) is a deep learning architecture with a strong ability to learn, so many researchers have studied character recognition with deep neural networks. Vaidya, Rohan, et al. proposed offline handwritten character detection via deep neural networks; the method uses OpenCV for image processing and TensorFlow, under Python, to train the neural network. Although the technique can recognize offline handwriting, it cannot recognize cursive handwritten text [1]. Narayan, Adith, et al. presented a CNN-based method that detects and recognizes handwritten text images with high accuracy and applied it to English handwritten characters; however, the limitations of contour-based techniques restrict recognition of natural scenes and handwriting [2]. Khan, Mohammad Meraj, et al. presented a deep CNN model based on SE-ResNeXt, using SE blocks to improve performance; decreasing the hyper-parameters and complexity increases the capability of learning very complex features [3]. Khandokar, I., et al. used a CNN to advance handwritten character recognition by learning discriminative features from raw data; the CNN recognizes characters by considering their forms and contrasting their features [4]. Balaha, Hossam Magdy, et al. presented a deep learning system with two convolutional architectures, HMB1 and HMB2, where HMB1 has the more complex architecture and more parameters, and concluded that more parameters lead to higher accuracy [5]. Narang et al. used deep-network character recognition to recognize ancient characters, a powerful application direction in character recognition, pattern recognition, and related areas [6].

2.2 Text detection

Text detection plays a vital role in recognition. Liao, Minghui, et al. proposed a Differentiable Binarization module that improves the accuracy of text detection with a simple pipeline [7]. Shivakumara et al. presented a Laplacian method for multi-oriented text detection in video; using Fourier-Laplacian filtering and k-means, it handles text of arbitrary orientation and detects both graphics text and scene text in horizontal and non-horizontal orientations [8]. Zhang, Zheng, et al. proposed a coarse-to-fine text line localization method: it predicts a salient map of text regions with a fully convolutional network (FCN) and evaluates text line hypotheses with another FCN classifier that predicts the centroid of each character [9]. Zhu, Yiqin, et al. worked in the Fourier domain and proposed a Fourier Contour Embedding (FCE) method to represent arbitrarily shaped text contours as compact signatures; experiments show the method is accurate and robust [10].

3 Connection Text Proposal Network

3.1 Anchor

The method adopts a vertical anchor that predicts the vertical extent of the text rather than the horizontal one: the technique only detects a narrow text fragment of fixed width in the horizontal direction and predicts the height of that fragment. Finally, all the fragments are linked together, as shown in Fig 1.

We fix the anchor width at sixteen pixels and use a group of ten heights at that width. Because CTPN adopts VGG16 to extract features, the width and height of the conv5 feature map are always 1/16 of the input image, and the following FC layer keeps the same width and height. Therefore, the anchors cover every point of the original image without overlap, and the group of heights accommodates texts of different heights.
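As a concrete illustration, here is a minimal Python sketch (not the authors' code) of this anchor layout. The feature-map size and the ten heights are assumptions: the heights below are a common choice in CTPN implementations, covering roughly the 11–273 pixel range named in the CTPN paper.

```python
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16,
                     heights=(11, 16, 23, 33, 48, 68, 97, 139, 198, 283)):
    """Return (feat_h * feat_w * len(heights), 4) anchors as (x1, y1, x2, y2).

    Every anchor is `stride` (16) pixels wide; one group of heights is
    placed at each feature-map cell, i.e. every 16 pixels of the input.
    """
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            cx = col * stride + stride / 2.0  # anchor center on the input image
            cy = row * stride + stride / 2.0
            for h in heights:
                anchors.append([cx - stride / 2.0, cy - h / 2.0,
                                cx + stride / 2.0, cy + h / 2.0])
    return np.array(anchors, dtype=np.float32)

anchors = generate_anchors(feat_h=37, feat_w=50)  # conv5 map of a 600x800 input
print(anchors.shape)                               # (18500, 4) = 37 * 50 * 10
```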

After getting the anchors, softmax determines whether each anchor contains text, and the anchor with the maximum positive softmax score is chosen. Bounding box regression then amends the y center coordinate and the height of the anchor.

$v_c = \frac{c_y - c_y^a}{h^a}$, $v_h = \log\left(\frac{h}{h^a}\right)$, $v_c^* = \frac{c_y^* - c_y^a}{h^a}$, $v_h^* = \log\left(\frac{h^*}{h^a}\right)$

$c_y$ is the detected center y coordinate, $h$ is the detected height, $c_y^a$ is the center y coordinate of the anchor, $h^a$ is the height of the anchor, $c_y^*$ is the center y coordinate of the ground truth, and $h^*$ is the height of the ground truth. $v_c$ and $v_h$ are the coordinate transforms between the regression prediction and the anchor; $v_c^*$ and $v_h^*$ are the coordinate transforms between the ground truth and the anchor.
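A small numpy sketch of these transforms, with illustrative numbers: encoding maps a ground truth box to the regression targets, and decoding maps predictions back to a center and height.

```python
import numpy as np

def encode_vertical(cy_star, h_star, cy_a, h_a):
    """Ground truth (cy*, h*) -> regression targets (v_c*, v_h*)."""
    return (cy_star - cy_a) / h_a, np.log(h_star / h_a)

def decode_vertical(v_c, v_h, cy_a, h_a):
    """Predicted (v_c, v_h) -> detected center y and height."""
    return v_c * h_a + cy_a, np.exp(v_h) * h_a

v_c, v_h = encode_vertical(cy_star=120.0, h_star=40.0, cy_a=112.0, h_a=48.0)
print(decode_vertical(v_c, v_h, cy_a=112.0, h_a=48.0))  # recovers (120.0, 40.0)
```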
Fig 1. The layout of the anchors in the horizontal and vertical directions.

3.2 Bi-Directional LSTM

The Bi-Directional LSTM can be pictured as left-running and right-running accumulating maps zipped together, making predictions over a sequence with both past and future context. It has a forward computation $A$ and a backward computation $A'$ that are combined into the output $y$; at each time step it processes one element of the input sequence in both the forward and backward directions, as shown in Fig 2.
Fig 2. Structure of the Bi-Directional LSTM [11].
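A minimal tf.keras sketch of this idea (the shapes are illustrative, not the paper's exact configuration): two 128-unit LSTMs run over the sequence in opposite directions, and concatenating their outputs gives every step both past and future context.

```python
import tensorflow as tf

# A sequence of 50 steps with 512 features each, e.g. one feature-map row.
seq = tf.keras.Input(shape=(50, 512))
out = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(128, return_sequences=True))(seq)  # forward + backward
model = tf.keras.Model(seq, out)
print(model.output_shape)  # (None, 50, 256): 128 forward + 128 backward units
```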


3.3 RPN layer

The RPN of CTPN is similar to that of Faster R-CNN. The first branch outputs the 2k vertical coordinates, i.e. the location $(v_c, v_h)$ of each anchor. The second branch outputs the 2k scores, i.e. the text score of the foreground and the non-text score of the background, computed with softmax. The third branch outputs the k side-refinement ratios $o$, which correct the text boxes because the horizontal boundaries may be inaccurate, as shown in Fig 3.

$o = \frac{x_{side} - c_x^a}{w^a}$, $o^* = \frac{x_{side}^* - c_x^a}{w^a}$

$o^*$ is the ground truth offset, $x_{side}$ is the left or right edge of the text box, $c_x^a$ is the horizontal center coordinate of the anchor, and $w^a$ is the fixed width of the anchor (16 pixels).
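The same encode/decode pattern applies to this offset; a short sketch with illustrative numbers:

```python
W_A = 16.0  # fixed anchor width in pixels

def encode_side(x_side, cx_a):
    """True box edge -> normalized side offset o*."""
    return (x_side - cx_a) / W_A

def decode_side(o, cx_a):
    """Predicted offset o -> refined box edge."""
    return o * W_A + cx_a

print(encode_side(100.0, 104.0))  # -0.25: the edge is 4 px left of the anchor center
```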

Fig 3. The RPN layer applies an exact stretch to adjust the text boxes.

3.4 Text line constructor

The method first sorts the anchors by horizontal coordinate and computes the pair $pair(box_j)$ of each anchor $box_i$ to obtain $pair(x_i, x_j)$; finally, it builds a connection graph from $pair(x_i, x_j)$ and obtains the text detection box, as shown in Fig 4.

Fig 4. The text line constructor.

The text line constructor works in two passes. Forward, it looks for candidate anchors of $box_i$ along the positive horizontal direction, keeps those whose vertical overlap satisfies $overlap_v > 0.7$, and selects the candidate with the largest softmax score as $box_j$. Backward, it looks for candidate anchors of $box_j$ along the negative horizontal direction, again keeps those with $overlap_v > 0.7$, and selects the candidate with the largest softmax score as $box_k$. Comparing $score_i$ with $score_k$: when $score_i \ge score_k$, the pair $(i, j)$ is a longest link and $graph(i, j) = True$ is set; when $score_i < score_k$, the pair is not a longest link.
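The pairing rule can be sketched as follows. This is a simplified Python sketch, not the authors' code: the 50-pixel horizontal search range follows the CTPN paper, and `vertical_overlap` here is a plain vertical IoU.

```python
import numpy as np

MAX_GAP = 50  # horizontal search range in pixels (as in the CTPN paper)

def vertical_overlap(a, b):
    """Vertical IoU of two (x1, y1, x2, y2) boxes."""
    inter = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    union = max(a[3], b[3]) - min(a[1], b[1])
    return inter / union if union > 0 else 0.0

def neighbors(i, boxes, direction):
    """Boxes within MAX_GAP of box i (direction=+1 forward, -1 backward)
    whose vertical overlap with box i exceeds 0.7."""
    return [j for j in range(len(boxes))
            if 0 < direction * (boxes[j][0] - boxes[i][0]) <= MAX_GAP
            and vertical_overlap(boxes[i], boxes[j]) > 0.7]

def build_graph(boxes, scores):
    """boxes: (n, 4) proposals as (x1, y1, x2, y2); scores: softmax text scores."""
    n = len(boxes)
    graph = np.zeros((n, n), dtype=bool)
    for i in range(n):
        fw = neighbors(i, boxes, direction=+1)
        if not fw:
            continue
        j = max(fw, key=lambda t: scores[t])   # best forward partner of box_i
        bw = neighbors(j, boxes, direction=-1)
        k = max(bw, key=lambda t: scores[t])   # best backward partner of box_j
        if scores[i] >= scores[k]:             # (i, j) is a longest link
            graph[i, j] = True
    return graph
```

Connected components of `graph` then give the groups of anchors that merge into one text line.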

3.5 Loss function

The method adopts an end-to-end network that predicts the three outputs jointly. The loss function is expressed as:

$$L(s_i, v_j, o_k) = \frac{1}{N_s}\sum_i L_s^{cl}(s_i, s_i^*) + \frac{\lambda_1}{N_v}\sum_j L_v^{re}(v_j, v_j^*) + \frac{\lambda_2}{N_o}\sum_k L_o^{re}(o_k, o_k^*)$$

$i$ is the index of an anchor in a minibatch, $s_i$ is the predicted probability that anchor $i$ contains text, and $s_i^*$ is the ground truth label. $j$ is the index of an anchor in the set of valid anchors for the $y$-coordinate regression, for which $s_j^* = 1$. $v_j$ and $v_j^*$ are the predicted and ground truth $y$-coordinates associated with the $j$-th anchor. $k$ is the index of a side-anchor, and $o_k$ and $o_k^*$ are the predicted and ground truth offsets. $L_s^{cl}$ is the classification loss; $L_v^{re}$ and $L_o^{re}$ are the regression losses. $\lambda_1$ and $\lambda_2$ are loss weights, and $N_s$, $N_v$, and $N_o$ are normalization parameters.
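A hedged tf.keras sketch of this multi-task loss, using cross-entropy for the classification term and a smooth-L1-style Huber loss for the two regression terms; the defaults $\lambda_1 = 1.0$ and $\lambda_2 = 2.0$ follow the CTPN paper, and the averaging over $N_s$, $N_v$, $N_o$ is handled by the loss objects.

```python
import tensorflow as tf

cls_loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
reg_loss = tf.keras.losses.Huber(delta=1.0)  # smooth-L1-style regression loss

def ctpn_loss(s_pred, s_true, v_pred, v_true, o_pred, o_true,
              lambda_1=1.0, lambda_2=2.0):
    L_cls = cls_loss(s_true, s_pred)  # averaged over the N_s sampled anchors
    L_v = reg_loss(v_true, v_pred)    # averaged over the N_v valid anchors
    L_o = reg_loss(o_true, o_pred)    # averaged over the N_o side-anchors
    return L_cls + lambda_1 * L_v + lambda_2 * L_o
```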

3.6 Total system

CTPN first uses a VGG16 backbone to extract spatial features. The method takes the output of VGG16 conv5 (Batch size × Width × Height × Channel), where each point in the output feature map corresponds to 16 pixels of the original picture, as shown in Fig 5.

Then, the feature map is decoded with a 3×3 sliding window (im2col) to extract spatial context, producing a new feature map (Batch size × Width × Height × Channel) in which each pixel merges the information of its surrounding 3×3 neighborhood.

Next, the feature map is reshaped to (NH) × W × C and fed to the Bi-Directional LSTM to capture the sequence characteristics of each row. The Bi-Directional LSTM produces an output of (NH) × W × 256, which is reshaped back to N × 256 × H × W.

The output of the BLSTM passes through a convolution layer to obtain N × H × W × 512, and the result passes through an RPN-like network with three branches. The first branch is the 2k vertical coordinates (N × H × W × 2k): each pixel corresponds to k anchors, and 2k covers the prediction $v = [v_c, v_h]$ for each anchor. The second branch is the 2k scores (N × H × W × 2k), the foreground/background scores $s = [text, non\text{-}text]$ for each anchor. The third branch is the k side-refinement outputs (N × H × W × k), which predict the side offset of each anchor.

Fig 5. CTPN overall framework and predicted result [12]: (a) the overall framework; (b) a predicted result.
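An illustrative tf.keras skeleton of this pipeline, tracking shapes only (channels-last here, assuming a 600×800 input and k = 10); it is a sketch of the description above, not the authors' implementation.

```python
import tensorflow as tf

k = 10
img = tf.keras.Input(shape=(600, 800, 3))

# VGG16 backbone: take conv5_3 (stride 16), giving a (N, 37, 50, 512) feature map.
vgg = tf.keras.applications.VGG16(include_top=False, weights=None, input_tensor=img)
feat = vgg.get_layer("block5_conv3").output

# 3x3 sliding window (the im2col step) expressed as a 3x3 convolution.
x = tf.keras.layers.Conv2D(512, 3, padding="same", activation="relu")(feat)

# Fold rows into the batch so the BLSTM runs along each row: (N*H, W, 512).
x = tf.keras.layers.Lambda(lambda t: tf.reshape(t, (-1, 50, 512)))(x)
x = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(128, return_sequences=True))(x)        # (N*H, W, 256)
x = tf.keras.layers.Lambda(lambda t: tf.reshape(t, (-1, 37, 50, 256)))(x)

# FC layer as a 1x1 convolution: (N, H, W, 512).
x = tf.keras.layers.Conv2D(512, 1, activation="relu")(x)

# The three RPN-style branches.
vertical = tf.keras.layers.Conv2D(2 * k, 1, name="vertical_coords")(x)  # (N, H, W, 2k)
scores = tf.keras.layers.Conv2D(2 * k, 1, name="text_scores")(x)        # (N, H, W, 2k)
side = tf.keras.layers.Conv2D(k, 1, name="side_refinement")(x)          # (N, H, W, k)

model = tf.keras.Model(img, [vertical, scores, side])
model.summary()
```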

3.7 Algorithm

To normalize the pre-normalized background images:

data_base_normalize.py
Normalizes width and height. The pre-normalized images are first rescaled if they are not of size 800x600, and 800x600 rects are then cropped from the rescaled images. The 800x600 images are stored in a newly created directory, ./images_base.
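A hedged OpenCV sketch of what this step does; the exact rescaling and cropping policy of data_base_normalize.py is assumed, and only the 800×600 output size and the ./images_base directory come from the description above.

```python
import os
import cv2

def normalize_image(path, out_dir="./images_base", w=800, h=600):
    """Rescale an image so an 800x600 rect fits, then crop and save it."""
    img = cv2.imread(path)
    ih, iw = img.shape[:2]
    if iw < w or ih < h:                 # upscale so an 800x600 crop is possible
        scale = max(w / iw, h / ih)
        img = cv2.resize(img, (int(iw * scale) + 1, int(ih * scale) + 1))
    crop = img[:h, :w]                   # one top-left 800x600 crop (assumed policy)
    os.makedirs(out_dir, exist_ok=True)
    cv2.imwrite(os.path.join(out_dir, os.path.basename(path)), crop)
```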

To generate validation data and training data:

data_generator.py
Generates the validation data and the training data. These are stored in the newly created directories ./data_valid and ./data_train, respectively.

To train and validate:

script_detect.py
Trains and validates the model. The validation results are stored in ./data_valid/results. The ckpt files are stored in a newly created directory, ./model_detect.

4 Experiment

4.1 Dataset

We adopted a dataset of 2,800 images to train the model; the dataset contains the images and the text annotations of the images. In addition, we used 42 images and their annotations for validation. Each image in the dataset has characters in different locations, and each character has its own coordinates; in the annotation text, the coordinates of the characters appear on the left. We resize the images to 800×600, as shown in Fig 6.

Fig 6. An example from the training dataset: the training image (left) and the training text corresponding to image A (right).

4.2 Training model

We set the learning rate to 0.001 and trained the model for 1,200 iterations. The model begins to converge after about 200 iterations, as shown in Fig 7.

Fig 7. Training for 1,200 iterations with a learning rate of 0.001.

An example of the predicted results.

According to the experiment, the method detects the characters well. However, some bounding boxes overlap the characters, some do not enclose the complete character, and some are incorrectly placed.

Because of the text line constructor and the RPN layer, many bounding boxes appear over non-character areas, and the limited number of training iterations also contributes to the problem.

4.3 Algorithm analysis

The algorithm consists of normalizing the pre-normalized background images, generating the validation and training data, and training and validating the model. The functional model is transparent, but the code is redundant and takes a long time to run.

5 Conclusion and future work

This paper adopts CTPN to recognize characters; it is an efficient text detector that is end-to-end trainable. The experiment shows that CTPN locates the upper, lower, left, and right sides of the detected boxes accurately. However, CTPN only detects text in the horizontal direction and breaks off word by word in the vertical direction. Therefore, we can further improve the algorithm in the vertical direction and increase the number of training iterations.

References

[1]Vaidya, Rohan, et al. “Handwritten character recognition using deep-learning.” 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT). IEEE, 2018.

[2]Narayan, Adith, and Raja Muthalagu. “Image Character Recognition using Convolutional Neural Networks.” 2021 Seventh International conference on Bio Signals, Images, and Instrumentation (ICBSII). IEEE, 2021.

[3]Khan, Mohammad Meraj, et al. “A squeeze and excitation resnext-based deep learning model for bangla handwritten compound character recognition.” Journal of King Saud University-Computer and Information Sciences (2021).

[4]Khandokar, I., et al. “Handwritten character recognition using convolutional neural network.” Journal of Physics: Conference Series. Vol. 1918. No. 4. IOP Publishing, 2021.

[5]Balaha, Hossam Magdy, et al. “A new Arabic handwritten character recognition deep learning system (AHCR-DLS).” Neural Computing and Applications 33.11 (2021): 6325-6367.

[6]Narang, Sonika Rani, Munish Kumar, and Manish Kumar Jindal. “DeepNetDevanagari: a deep learning model for Devanagari ancient character recognition.” Multimedia Tools and Applications 80.13 (2021): 20671-20686.

[7]Liao, Minghui, et al. “Real-time scene text detection with differentiable binarization and adaptive scale fusion.” IEEE Transactions on Pattern Analysis and Machine Intelligence (2022).

[8]Shivakumara, Palaiahnakote, Trung Quy Phan, and Chew Lim Tan. “A laplacian approach to multi-oriented text detection in video.” IEEE transactions on pattern analysis and machine intelligence 33.2 (2010): 412-419.

[9]Zhang, Zheng, et al. “Multi-oriented text detection with fully convolutional networks.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.

[10]Zhu, Yiqin, et al. “Fourier contour embedding for arbitrary-shaped text detection.” Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2021.

[11]Olah, Christopher. “Neural networks, types, and functional programming.” (2015).

[12]Tian, Zhi, et al. “Detecting text in natural image with connectionist text proposal network.” European conference on computer vision. Springer, Cham, 2016.

