CSDN link to this article: https://blog.csdn.net/Jinyindao243052/article/details/107544145
Zhihu link: https://zhuanlan.zhihu.com/p/163218826
Algorithm
The OPTICS (Ordering Points To Identify the Clustering Structure) algorithm shares many similarities with the DBSCAN algorithm, and can be considered a generalization of DBSCAN that relaxes the eps requirement from a single value to a value range.
The key difference between DBSCAN and OPTICS is that the OPTICS algorithm builds a reachability graph, which assigns each sample both a reachability distance (the reachability_ attribute) and a position in the cluster ordering (the ordering_ attribute). These two attributes are assigned when the model is fitted, and are used to determine cluster membership.
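To make the two fitted attributes concrete, here is a minimal sketch that fits sklearn.cluster.OPTICS and reads back ordering_ and reachability_. The synthetic two-blob data, the random seed, and min_samples=5 are assumptions chosen only for illustration; they do not come from the original article.

```python
import numpy as np
from sklearn.cluster import OPTICS

# Synthetic data (an assumption for illustration): two tight 2-D blobs.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.3, size=(50, 2)),
    rng.normal(loc=(5.0, 5.0), scale=0.3, size=(50, 2)),
])

# Fitting the model assigns both attributes discussed above.
model = OPTICS(min_samples=5).fit(X)

# ordering_ is a permutation of the sample indices: the order in which
# the algorithm visited the points.
ordering = model.ordering_

# reachability_ is indexed by sample; re-indexing it by ordering_ gives
# the sequence of values usually drawn as the reachability plot.
reachability = model.reachability_[ordering]

# Cluster membership derived from the two attributes.
labels = model.labels_

print(ordering[:10])
print(reachability[:10])
```

Plotting reachability against its position in the ordering should show low "valleys" for dense regions separated by high values where the ordering jumps from one dense region to another.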
If OPTICS is run with max_eps left at its default value of inf, then DBSCAN-style cluster extraction can be performed repeatedly in linear time for any given eps value using the cluster_optics_dbscan method. Setting max_eps to a lower value will result in shorter run times; it can be thought of as the maximum neighborhood radius around each point within which other potentially reachable points are searched for.
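The repeated extraction described above can be sketched as follows: the model is fitted once with max_eps=inf, and sklearn.cluster.cluster_optics_dbscan is then called for several eps values without refitting. The data, the seed, min_samples=5, and the particular eps values are assumptions made for this example only.

```python
import numpy as np
from sklearn.cluster import OPTICS, cluster_optics_dbscan

# Synthetic data (an assumption for illustration): two tight 2-D blobs.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.3, size=(50, 2)),
    rng.normal(loc=(5.0, 5.0), scale=0.3, size=(50, 2)),
])

# Fit once. max_eps=np.inf is the default and is written out here
# only to make the setting explicit.
model = OPTICS(min_samples=5, max_eps=np.inf).fit(X)

# Reuse the fitted reachability graph for several eps values.
# The eps values below are arbitrary choices for this sketch.
labels_by_eps = {}
for eps in (0.5, 2.0, 10.0):
    labels_by_eps[eps] = cluster_optics_dbscan(
        reachability=model.reachability_,
        core_distances=model.core_distances_,
        ordering=model.ordering_,
        eps=eps,
    )

for eps, labels in labels_by_eps.items():
    n_clusters = len(set(labels.tolist()) - {-1})  # -1 marks noise
    print(eps, n_clusters)
```

Each call only scans the stored reachability and core distances, which is why it is cheap compared with refitting. On data like this, a mid-range eps would be expected to separate the two blobs, while a very large eps merges everything into one cluster.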
How the algorithm works
Implementation
sklearn.cluster.OPTICS
Demo
Example from the official scikit-learn documentation
References
[1] OPTICS: Ordering Points To Identify the Clustering Structure
[2] 皮果提's blog: A First Look at Clustering Algorithms (6): OPTICS
[3] Official sklearn documentation
[4] 科学摆渡人's blog: The OPTICS Clustering Algorithm