PCL 的欧式距离聚类

article/2025/9/15 4:49:31

PCL 的欧式距离聚类

（感谢前辈）转自：https://zhuanlan.zhihu.com/p/75117664

聚类代码如下：

from paper_1_v0.my_ransac import my_ransac_v5
import numpy as npimg_id = 1  #  这里读入你的kitti 雷达数据即可
path = r'D:\KITTI\Object\training\velodyne\%06d.bin' % img_id  ## Path ## need to be changed
points = np.fromfile(path, dtype=np.float32).reshape(-1, 4)index_ground, index_obj, best_model, alpha = my_ransac_v5(points)   # 去除地面，用pcl 的代码也可以，不做也可以print('points  shape', points[index_obj, :3].shape)s = time.time()
cloud = pcl.PointCloud(points[index_obj, :3])tree = cloud.make_kdtree()ec = cloud.make_EuclideanClusterExtraction()
# ec.set_ClusterTolerance(0.02)
ec.set_ClusterTolerance(0.62)  # 三个参数设置
ec.set_MinClusterSize(100)
ec.set_MaxClusterSize(15000)
ec.set_SearchMethod(tree)
cluster_indices = ec.Extract()print('time:', time.time() - s)print('cluster_indices : ' + str( len(cluster_indices)) + "  .")cluster_points = points[index_obj,:].copy()
cluster_points[:,3] = 1000  # 基础颜色for j, indices in enumerate(cluster_indices):print('indices = ' + str(len(indices)))cluster_points[indices,3] = (j+10) * 4000   # 设置每个类的颜色color_cloud = pcl.PointCloud_PointXYZRGB(cluster_points)
visual = pcl.pcl_visualization.CloudViewing()
visual.ShowColorCloud(color_cloud, b'cloud')flag = True
while flag:flag != visual.WasStopped()

能设置的3个参数含义是什么？

tolerance 是设置 kdtree 的近邻搜索的搜索半径，从实验结果来看，tolerance 越大，检测到的范围也就越大；同时如果搜索半径取一个非常小的值，那么一个实际的对象就会被分割为多个聚类；如果将值设置得太高，那么多个对象就会被分割为一个聚类;

setMinClusterSize()来限制一个聚类最少需要的点数目；

setMaXClusterSize()来限制最多需要的点数目；这两个不要解释；

这个方法底层又是怎么聚类的？

欧几里德算法具体的实现方法大致是：

1 找到空间中某点p10，有kdTree找到离他最近的n个点，判断这n个点到p的距离。将距离小于阈值r的点p12,p13,p14…放在类Q里

2 在 Q(p10) 里找到一点p12,重复1

3 在 Q(p10,p12) 找到一点，重复1，找到p22,p23,p24…全部放进Q里

4 当 Q 再也不能有新点加入了，则完成搜索了；

看 PCL的代码，过程很简单，也能看出这个过程：

 pcl::extractEuclideanClusters (const PointCloud<PointT> &cloud,const typename search::Search<PointT>::Ptr &tree,float tolerance, std::vector<PointIndices> &clusters,unsigned int min_pts_per_cluster,unsigned int max_pts_per_cluster)
{if (tree->getInputCloud ()->points.size () != cloud.points.size ())   // 点数量检查{PCL_ERROR ("[pcl::extractEuclideanClusters] Tree built for a different point cloud dataset (%lu) than the input cloud (%lu)!\n", tree->getInputCloud ()->points.size (), cloud.points.size ());return;}// Check if the tree is sorted -- if it is we don't need to check the first elementint nn_start_idx = tree->getSortedResults () ? 1 : 0;// Create a bool vector of processed point indices, and initialize it to falsestd::vector<bool> processed (cloud.points.size (), false);std::vector<int> nn_indices;std::vector<float> nn_distances;   // 定义需要的变量// Process all points in the indices vectorfor (int i = 0; i < static_cast<int> (cloud.points.size ()); ++i)   //遍历点云中的每一个点{if (processed[i])   //如果该点已经处理则跳过continue;std::vector<int> seed_queue;   //定义一个种子队列int sq_idx = 0;seed_queue.push_back (i);  //加入一个种子processed[i] = true;while (sq_idx < static_cast<int> (seed_queue.size ()))    //遍历每一个种子{// Search for sq_idx  kdtree 树的近邻搜索if (!tree->radiusSearch (seed_queue[sq_idx], tolerance, nn_indices, nn_distances))  {sq_idx++;continue;   //没找到近邻点就继续}for (size_t j = nn_start_idx; j < nn_indices.size (); ++j)             // can't assume sorted (default isn't!){if (nn_indices[j] == -1 || processed[nn_indices[j]])        // Has this point been processed before ?continue;   // 种子点的近邻点中如果已经处理就跳出此次循环继续// Perform a simple Euclidean clusteringseed_queue.push_back (nn_indices[j]);   //将此种子点的临近点作为新的种子点。入队操作processed[nn_indices[j]] = true;  // 该点已经处理，打标签}sq_idx++;}// If this queue is satisfactory, add to the clusters    最大点数和最小点数的类过滤if (seed_queue.size () >= min_pts_per_cluster && seed_queue.size () <= max_pts_per_cluster){pcl::PointIndices r;r.indices.resize (seed_queue.size ());for (size_t j = 0; j < seed_queue.size (); ++j)r.indices[j] = seed_queue[j];// These two lines should not be needed: (can anyone confirm?) -FFstd::sort (r.indices.begin (), r.indices.end ());r.indices.erase (std::unique (r.indices.begin (), r.indices.end ()), r.indices.end ());r.header = cloud.header;clusters.push_back (r);   // We could avoid a copy by working directly in the vector}}
}