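A Comprehensive Look at the PMD Static Code Analysis Tool

This is my own write-up giving a full introduction to PMD, a static code analysis tool. My department recently needed to evaluate static analysis tools to improve code quality. After comparing several dozen tools and dropping the commercial ones and those that had not been updated for years, we settled on PMD, SonarQube and Facebook Infer.

PMD uses a JavaCC-generated parser to parse source code and build an AST (abstract syntax tree). Over the past couple of days I researched and experimented with PMD and its custom rules. Some of the descriptions below come from the official documentation; I have described and tested most of the parameters in detail. There are a few parameters whose meaning I still do not understand, and anyone who knows them is welcome to discuss.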

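1 Subject of the evaluation

pmd-bin-6.4.0 [PMD binary distribution]

· bin

· designer.bat [GUI tool that converts Java source code into an AST (abstract syntax tree); this is the one I recommend]

· bgastviewer.bat [GUI tool, similar in function to designer.bat]

· cpd.bat [tool for finding duplicated code, command-line version]

· cpdgui.bat [tool for finding duplicated code, GUI version]

· pmd.bat [the file used to run PMD on Windows]

· run.sh [the file used to run PMD on Linux]

· lib [holds the jars PMD depends on at run time, including third-party jars and the module jars for each supported language]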

2 Basic usage

Format: pmd -d [filename|jar or zip file containing source code|directory] -f [report format] -R [ruleset file]

Example: E:\pmd-bin-6.4.0>bin\pmd.bat -d E:\name -f html -R java-basic,java-design

To save the generated result, redirect the command's output to the path you want:

E:\pmd-bin-6.4.0>bin\pmd.bat -d E:\name -f html -R java-basic,java-design > report.html
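
When PMD is started from a build script instead of by hand, it can help to assemble the argument list in code. The following is a minimal sketch, assuming only the three required switches shown above (-d, -f, -R); the class name PmdCommand and the method build are hypothetical helpers of mine and are not part of PMD.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: assembles the argument list for the PMD 6.x launcher script. */
final class PmdCommand {
    private PmdCommand() {}

    /**
     * Builds: launcher -d sourceDir -f format -R ruleset1,ruleset2,...
     * The launcher is whatever script starts PMD, e.g. bin\pmd.bat on Windows.
     */
    static List<String> build(String launcher, String sourceDir,
                              String format, List<String> rulesets) {
        if (rulesets.isEmpty()) {
            // -R is marked as required in the option table.
            throw new IllegalArgumentException("at least one ruleset is required");
        }
        List<String> args = new ArrayList<>();
        args.add(launcher);
        args.add("-d");
        args.add(sourceDir);
        args.add("-f");
        args.add(format);
        args.add("-R");
        args.add(String.join(",", rulesets)); // rulesets are comma separated
        return args;
    }

    public static void main(String[] argv) {
        List<String> cmd = build("bin\\pmd.bat", "E:\\name", "html",
                Arrays.asList("java-basic", "java-design"));
        // Prints the same command line as the manual example.
        System.out.println(String.join(" ", cmd));
    }
}
```

The resulting list can be handed to java.lang.ProcessBuilder, whose redirectOutput(File) method plays the role of the "> report.html" redirection.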

Option	Description	Required	Applies for language
-rulesets / -R	Comma separated list of ruleset names to use	yes
-dir / -d	Root directory for sources	yes
-format / -f	Report format type. Default format is `text`.	no
-auxclasspath	Specifies the classpath for libraries used by the source code. This is used by the type resolution. Alternatively a `file://` URL to a text file containing path elements on consecutive lines can be specified.	no
-uri / -u	Database URI for sources. If this is given, then you don't need to provide `-dir`.	no	plsql
-filelist	Path to file containing a comma delimited list of files to analyze. If this is given, then you don't need to provide `-dir`.	no
-debug / -verbose / -D / -V	Debug mode. Prints more log output.	no
-help / -h / -H	Display help on usage.	no
-encoding / -e	Specifies the character set encoding of the source code files PMD is reading (i.e. UTF-8). Default is `UTF-8`.	no
-threads / -t	Sets the number of threads used by PMD. Default is `1`. Set threads to '0' to disable multi-threading processing.	no
-benchmark / -b	Benchmark mode - output a benchmark report upon completion; defaults to System.err	no
-stress / -S	Performs a stress test.	no
-shortnames	Prints shortened filenames in the report.	no
-showsuppressed	Report should show suppressed rule violations.	no
-suppressmarker	Specifies the string that marks the line which PMD should ignore; default is `NOPMD`.	no
-minimumpriority / -min	Rule priority threshold; rules with lower priority than configured here won't be used. Default is `5` - which is the lowest priority.	no
-property / -P	`{name}={value}`: Define a property for a report format.	no
-reportfile / -r	Send report output to a file; default to System.out	no
-version / -v	Specify version of a language PMD should use.	no
-language / -l	Specify a language PMD should use.	no
-failOnViolation {true|false}	By default PMD exits with status 4 if violations are found. Disable this option with '-failOnViolation false' to exit with 0 instead and just write the report.	no
-cache	Specify a location for the analysis cache file to use. This can greatly improve analysis performance and is highly recommended.	no
-no-cache	Explicitly disable incremental analysis. This switch turns off suggestions to use Incremental Analysis, and causes the -cache option to be discarded if it is provided.	no

2 Rules

2.1 Rule sets

Rule categories

PMD ships with many rules, grouped by category into separate ruleset files, for example:

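Basic: best practices that everyone should follow, e.g. EmptyCatchBlock

Braces: rules about braces around conditional branches, e.g. IfStmtsMustUseBraces

Code Size: rules about code size, such as method length, parameter list length and number of fields

Clone: rules for clone implementations, e.g. whether super.clone() is called

Controversial: rules that are open to debate, e.g. UnnecessaryConstructor (a constructor that is not needed)

Coupling: rules about coupling between objects

Design: detects questionable design, e.g. SwitchStmtsShouldHaveDefault

Finalizers: rules to follow when using finalizers, e.g. FinalizeOnlyCallsSuperFinalize

Import Statements: rules about imports, e.g. DuplicateImports

J2EE: for example UseProperClassLoader: class.getClassLoader() may not return the right loader, so use Thread.currentThread().getContextClassLoader() instead

Javabeans: rules about the JavaBeans conventions, including BeanMembersShouldSerialize (bean members must be serializable) and MissingSerialVersionUID (no serialVersionUID defined)

JUnit Tests: rules about JUnit tests, e.g. JUnitSpelling (spelling checks)

Logging (Java): checks for incorrect Logger usage, e.g. MoreThanOneLogger

Logging (Jakarta): rules for the Jakarta logger, including UseCorrectExceptionLogging (exceptions logged incorrectly) and ProperLogger (whether the Logger is declared properly)

Migrating: rules for moving between JDK versions, e.g. ReplaceVectorWithList (use List instead of Vector)

Naming: rules about naming, such as names that are too short or too long, and naming conventions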
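
To make a few of these rule names concrete, here is a small, deliberately flawed class. It is a sketch of my own, not taken from the PMD documentation: the class name RuleSamples and its methods are hypothetical, and whether each rule actually fires depends on the ruleset version and configuration in use.

```java
/** Hypothetical sample: each method contains the construct that the named rule targets. */
final class RuleSamples {
    private RuleSamples() {}

    // EmptyCatchBlock (Basic): the catch block below swallows the exception.
    static int parseOrZero(String text) {
        int value = 0;
        try {
            value = Integer.parseInt(text);
        } catch (NumberFormatException e) {
        }
        return value;
    }

    // IfStmtsMustUseBraces (Braces): the if body is not wrapped in braces.
    static String sign(int n) {
        if (n < 0)
            return "negative";
        return "non-negative";
    }

    // SwitchStmtsShouldHaveDefault (Design): the switch has no default label.
    static String dayType(int day) {
        String type = "weekday";
        switch (day) {
            case 6:
            case 7:
                type = "weekend";
                break;
        }
        return type;
    }

    public static void main(String[] args) {
        // The code compiles and runs; the problems are stylistic, which is what PMD reports.
        System.out.println(parseOrZero("abc") + " " + sign(-1) + " " + dayType(6));
    }
}
```

Saving a file like this into the scanned directory is a quick way to check that a chosen ruleset is being picked up.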

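2.2 Parameter details

● -dir / -d: the directory to scan

● -format / -f: the report format, such as xml, xslt, html or text; the default is text

● -rulesets / -R: the rule sets to use

● -auxclasspath: the classpath for libraries used by the source code, needed for type resolution

● -uri / -u: the database URI of the sources; when it is given, -dir is not needed

● -filelist: a file containing a comma-delimited list of paths; when it is given, -dir is not needed

Example:

Command: E:\pmd-bin-6.4.0>bin\pmd.bat -filelist E:\team-goblin\demo1\list.txt -f html -R myRule.xml

Result: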

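● -debug / -verbose / -D / -V: prints more detailed log output

● -help / -h / -H: shows usage help

● -encoding / -e: character set encoding of the source files; the default is UTF-8

● -threads / -t: sets the number of threads PMD uses. The default is "1". Set it to "0" to disable multi-threaded processing.

● -benchmark / -b: benchmark mode; prints a benchmark report when the run completes

● -stress / -S: performs a stress test

● -shortnames: prints shortened file names in the report; it only takes effect when the sources come from a single directory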

Example:

After using -shortnames:

● -suppressmarker: specifies the string that makes PMD ignore a line; the default is "NOPMD".

Example 1:

Example 2: pmd.bat -d E:\team-goblin\demo1 -f html -R myRule.xml -showsuppressed -suppressmarker allowint

● -showsuppressed: the report also shows the violations that were suppressed

Example: bin\pmd.bat -d E:\team-goblin\demo1 -f html -R myRule.xml -showsuppressed > report.html

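● -minimumpriority / -min

Minimum rule priority threshold. The default is 5, the lowest priority; rules with a priority lower than the configured value are not used.

Example:

pmd.bat -d E:\team-goblin\demo1 -f html -R myRule.xml -min 4

● -property / -P: defines a property for the report format, in the form {name}={value}

● -language / -l: specifies the language PMD should use

[Had no effect on the scan results]

● -version / -v: specifies the version of the language PMD should use

● -failOnViolation {true|false}: by default PMD exits with status 4 when violations are found. Use "-failOnViolation false" to disable this, so that PMD exits with 0 and only writes the report.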

0	Everything is fine, no violations found
1	Couldn't understand command line parameters or PMD exited with an exception
4	At least one violation has been detected

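● -cache: specifies the location of the analysis cache file to use. This can greatly improve analysis performance and is highly recommended.

● -no-cache: explicitly disables incremental analysis. This switch turns off the suggestions to use incremental analysis and causes the -cache option to be ignored if it is also given.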
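
The exit codes listed under -failOnViolation (0, 1 and 4) are what a build script usually acts on. The following is a minimal sketch of such handling; the class PmdExitCode and both of its methods are hypothetical names of mine, and treating any undocumented code as a failure is my own assumption.

```java
/** Hypothetical helper: interprets the process exit status returned by a PMD run. */
final class PmdExitCode {
    private PmdExitCode() {}

    /** Maps the three documented exit codes to a short description. */
    static String describe(int code) {
        switch (code) {
            case 0:
                return "no violations found";
            case 1:
                return "bad command line parameters or PMD exited with an exception";
            case 4:
                return "at least one violation detected";
            default:
                return "undocumented exit code " + code;
        }
    }

    /**
     * Decides whether the surrounding build should fail.
     * Violations (4) fail the build only when the caller asks for it;
     * 1 and any unexpected code always fail (assumption: the analysis did not finish normally).
     */
    static boolean shouldFailBuild(int code, boolean failOnViolations) {
        if (code == 0) {
            return false;
        }
        if (code == 4) {
            return failOnViolations;
        }
        return true;
    }

    public static void main(String[] args) {
        int code = 4; // e.g. the value returned by Process.waitFor() after running pmd.bat
        System.out.println(describe(code) + ", fail build: " + shouldFailBuild(code, true));
    }
}
```

With "-failOnViolation false" PMD itself returns 0 when violations exist, so the decision then has to be made from the report instead of the exit status.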

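2.3 Custom rules

There are two ways to write a PMD rule:

(1) With XPath: write the rule XML by referring to the generated AST

(2) With Java code: this requires a deeper understanding of the PMD API and is meant for more complex rules

2.3.1 Custom XPath rules

The designer.bat tool that ships with PMD can quickly generate the XML for an XPath rule.
(1) Open the designer GUI, enter the source code, enter the XPath expression, click the Go button, and confirm that the result shown at the bottom right is correct.

(2) At the top left, click File -> Export XPath to rule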
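
To illustrate what an XPath expression selects in step (1), the sketch below evaluates one against a tiny hand-written XML document with the JDK's own javax.xml.xpath. This is only a stand-in: the XML is my simplified imitation of the tree the designer displays, not output produced by PMD, and the class XPathRuleSketch is hypothetical. The expression //WhileStatement[not(Statement/Block)] (a while loop whose body is not a block) follows the classic example from PMD's rule-writing tutorial.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch: runs an XPath query over a mock AST written as XML. */
final class XPathRuleSketch {
    private XPathRuleSketch() {}

    // Simplified imitation of an AST with two while loops:
    // the first has a braced body (Block), the second does not.
    static final String MOCK_AST =
          "<CompilationUnit>"
        + "<WhileStatement><Expression/><Statement><Block/></Statement></WhileStatement>"
        + "<WhileStatement><Expression/><Statement><StatementExpression/></Statement></WhileStatement>"
        + "</CompilationUnit>";

    /** Returns how many nodes in the XML document the expression selects. */
    static int countMatches(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList hits = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return hits.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("invalid XPath or XML: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Only the loop without a Block body is selected.
        System.out.println(countMatches(MOCK_AST, "//WhileStatement[not(Statement/Block)]"));
    }
}
```

In the real designer the node names come from PMD's Java grammar, so an expression should always be confirmed against the tree shown there before it is exported as a rule.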