A First Look at Scientific Computing in Python: Cosine Similarity

article/2025/10/1 1:08:17

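SciPy is a well-known open-source scientific computing library for Python, built on top of NumPy. It adds numerical integration, optimization, statistics, and a number of special functions, along with modules for linear algebra, numerical solution of ordinary differential equations, signal processing, image processing, sparse matrices, and more.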

Installing the SciPy scientific computing package

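Installing SciPy on Windows with pip install reportedly fails (according to material found online), so you need a third-party build (Unofficial Windows Binaries for Python Extension Packages) and must install from a ".whl" package (make sure the wheel library is installed in addition to pip). The packages are available at https://www.lfd.uci.edu/~gohlke/pythonlibs/ . Note that SciPy depends on numpy+mkl, so numpy+mkl must be installed before scipy. Even if you have already installed numpy, find the numpy+mkl whl on that page, download it locally, and uninstall the previously installed NumPy.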

The steps are recorded below.
  
List the installed packages to check whether numpy is already installed.

D:\Python\Python36\Tools>pip list
cycler (0.10.0)
kiwisolver (1.0.1)
matplotlib (2.2.2)
numpy (1.14.2)
pip (9.0.1)
pyparsing (2.2.0)
python-dateutil (2.7.2)
pytz (2018.4)
setuptools (28.8.0)
six (1.11.0)

Uninstall the existing numpy.

D:\Python\Python36\Tools>pip uninstall numpy
# Install numpy+mkl
D:\Python\Python36\Tools>pip install d:\Python\numpy-1.14.2+mkl-cp36-cp36m-win_amd64.whl

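Install numpy+mkl; this build is required on Windows. Then install the scipy package. Note: before installing, rename the file so that "cp36m" in the file name is replaced with "none".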
  

D:\Python\Python36\Tools>pip install d:\Python\numpy-1.14.2+mkl-cp36-cp36m-win_amd64.whl
Processing d:\python\numpy-1.14.2+mkl-cp36-cp36m-win_amd64.whl
Installing collected packages: numpy
Successfully installed numpy-1.14.2+mkl

D:\Python\Python36\Tools>pip install d:\Python\scipy-1.0.1-cp36-none-win_amd64.whl
Processing d:\python\scipy-1.0.1-cp36-none-win_amd64.whl
Requirement already satisfied: numpy>=1.8.2 in d:\python\python36\lib\site-packages (from scipy==1.0.1)
Installing collected packages: scipy
Successfully installed scipy-1.0.1
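
One way to confirm the result of the steps above is to query the installed versions from Python. This is only a sketch: it assumes Python 3.8 or later (for `importlib.metadata`), which is newer than the Python 3.6 used in this article, and the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report the two packages installed in the steps above.
    for name in ("numpy", "scipy"):
        version = installed_version(name)
        print(name, version if version is not None else "not installed")
```

On Python 3.6 itself, `pip list` (as used above) gives the same information.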

Computing the cosine of the angle between two vectors

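By definition, for any two points A(x1,y1) and B(x2,y2) in the plane, the vector AB = (x2-x1, y2-y1). In other words, the coordinates of a vector equal the coordinates of the end point of its directed segment minus the coordinates of its start point.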
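
For reference, the cosine of the angle between two plane vectors follows from the dot product. The symbols a = (a_1, a_2), b = (b_1, b_2), and θ are notation introduced here for illustration and do not come from the original text:

```latex
\cos\theta
  = \frac{\mathbf{a}\cdot\mathbf{b}}{\lVert\mathbf{a}\rVert\,\lVert\mathbf{b}\rVert}
  = \frac{a_1 b_1 + a_2 b_2}{\sqrt{a_1^2 + a_2^2}\,\sqrt{b_1^2 + b_2^2}}
```

The formula is undefined when either vector has zero length.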

Computing vector cosine similarity

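In geometry, the cosine of the angle between two vectors measures how much their directions differ. Machine learning borrows this concept to measure the difference between sample vectors.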

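The cosine takes values in [-1, 1]. Find the angle between two vectors and take its cosine; that value characterizes how similar the two vectors are. The smaller the angle (approaching 0 degrees), the closer the cosine is to 1, the more closely the directions agree, and the more similar the vectors are. When the two vectors point in exactly opposite directions, the cosine takes its minimum value of -1. When the cosine is 0, the two vectors are orthogonal and the angle is 90 degrees. It follows that cosine similarity is independent of the magnitudes of the vectors and depends only on their directions.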
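
The three cases described above can be checked with a few lines of Python. This is a minimal sketch for 2-D vectors only; the function name `cosine` and the sample vectors are made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two non-zero 2-D vectors u and v."""
    dot = u[0] * v[0] + u[1] * v[1]
    return dot / (math.hypot(u[0], u[1]) * math.hypot(v[0], v[1]))


if __name__ == "__main__":
    print(cosine((1, 2), (3, 6)))    # same direction, different length: close to 1
    print(cosine((1, 2), (-1, -2)))  # opposite direction: close to -1
    print(cosine((1, 0), (0, 5)))    # orthogonal: 0
```

The first call scales one vector by 3 without changing the result, which is the independence from magnitude mentioned above.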

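Because the slope of the line joining consecutive discrete points can be infinite (a vertical segment), angles and slopes are converted to the cosine of the angle between consecutive segment vectors, which makes similarity easier to compare.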
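
To make the motivation concrete, the sketch below contrasts the two quantities on a vertical segment. The helper names `slope` and `turning_cosine` are hypothetical, and the code assumes that consecutive points are distinct.

```python
import math


def slope(p, q):
    """Slope of the segment from p to q; infinite for a vertical segment."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    if dx == 0:
        return math.copysign(math.inf, dy)
    return dy / dx


def turning_cosine(p, q, r):
    """Cosine of the angle between segment p->q and segment q->r."""
    ax, ay = q[0] - p[0], q[1] - p[1]
    bx, by = r[0] - q[0], r[1] - q[1]
    return (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))


if __name__ == "__main__":
    p, q, r = (0, 0), (0, 1), (1, 1)
    print(slope(p, q))              # vertical segment: the slope is infinite
    print(turning_cosine(p, q, r))  # the cosine stays finite (a right-angle turn)
```

The cosine stays within [-1, 1] for any pair of non-degenerate segments, so sequences of these values can be compared directly.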

Reference code:

import math

import matplotlib.pyplot as plt
import numpy as np
from scipy.spatial.distance import pdist


def VectorCosine(x, y):
    ''' Compute the cosine of the angle between consecutive segment vectors '''
    vc = []
    # Note: as in the original, this range stops one interior point before the end.
    for i in range(1, len(x) - 2):
        # Segment entering point i and segment leaving point i
        xc1 = x[i] - x[i-1]
        xc2 = x[i+1] - x[i]
        yc1 = y[i] - y[i-1]
        yc2 = y[i+1] - y[i]
        vc.append((xc1*xc2 + yc1*yc2) / (math.sqrt(xc1**2 + yc1**2) * math.sqrt(xc2**2 + yc2**2)))
    return vc


def main():
    x2 = [0.00,0.00,0.01,0.01,0.02,0.04,0.05,0.07,0.10,0.12,0.15,0.18,0.21,0.24,0.28,0.32,0.37,0.42,0.46,0.52,0.57,0.62,0.68,0.74,0.80,0.86,0.92,0.99,1.06,1.12,1.19,1.26,1.33,1.40,1.48,1.68,1.75,1.82,1.88,1.95,2.01,2.08,2.15,2.21,2.28,2.35,2.41,2.48,2.55,2.61,2.68,2.75,2.81,2.88,2.95,3.01,3.08,3.15,3.21,3.27,3.34,3.39,3.46,3.51,3.58,3.64,3.69,3.75,3.81,3.86,3.92,3.97,4.02,4.08,4.13,4.17,4.22,4.27,4.31,4.36,4.41,4.44,4.49,4.52,4.56,4.60,4.64,4.67,4.71,4.74,4.77,4.80,4.82,4.85,4.87,4.89,4.91,4.93,4.94,4.96,4.97,4.98,4.99,4.99,4.99,4.99,4.99,4.99,4.98,4.97,4.96,4.94,4.93,4.91,4.88,4.86,4.83,4.80,4.77,4.73,4.70,4.66,4.62,4.57,4.52,4.46,4.42,4.36,4.29,4.24,4.18,4.11,4.06,3.99,3.92,3.85,3.78,3.70,3.63,3.55,3.48,3.41,3.33,3.26,3.18,3.09,3.02,2.94,2.85,2.78,2.69,2.61,2.54,2.45,2.37,2.30,2.21,2.13,2.06,1.98,1.89,1.82,1.74,1.67,1.59,1.52,1.45,1.37,1.30,1.23,1.16,1.09,1.03,0.96,0.90,0.84,0.78,0.72,0.67,0.61,0.55,0.51,0.45,0.41,0.36,0.32,0.28,0.24,0.21,0.18,0.14,0.12,0.09,0.07,0.05,0.04,0.02,0.01,0.01,0.00,0.00]
    y2 = [35.01,35.30,35.32,35.22,37.23,38.91,40.61,41.66,43.01,45.78,49.20,51.85,53.81,56.15,58.65,57.61,55.97,54.22,52.13,50.91,51.01,51.65,52.28,53.65,54.56,54.53,54.43,53.75,52.45,51.85,51.76,51.75,51.80,52.42,52.42,52.47,52.60,52.75,52.83,52.55,52.35,52.25,52.01,51.82,51.82,51.81,51.85,51.88,51.88,51.81,51.80,51.75,51.53,51.49,51.54,51.51,51.51,51.52,51.51,51.48,51.52,51.26,51.09,51.05,50.92,50.93,50.97,50.97,50.95,51.02,50.99,51.04,51.04,50.92,50.65,50.64,50.61,50.61,50.66,50.67,50.64,50.67,50.58,50.47,50.45,50.24,50.07,50.10,50.07,50.05,50.11,50.10,50.07,49.97,49.70,49.67,49.68,49.50,49.50,49.49,49.47,49.50,49.46,49.48,49.21,48.11,47.81,47.37,47.32,46.85,45.77,44.54,43.09,41.66,40.29,38.49,36.54,33.99,31.23,28.23,25.26,23.25,24.20,26.10,29.01,31.74,33.24,33.20,32.61,30.41,27.65,26.16,25.95,25.98,27.61,29.39,31.12,31.89,31.97,30.75,29.65,28.33,27.31,27.00,27.47,28.33,29.30,30.26,30.96,30.99,30.31,29.17,28.83,28.18,28.16,28.18,28.94,29.49,30.08,30.34,30.43,30.24,29.58,29.15,29.08,29.08,29.41,29.76,30.36,30.48,30.55,30.48,30.47,30.14,29.80,29.80,30.17,30.39,30.85,31.42,31.55,31.53,31.54,31.48,31.43,31.40,31.41,31.57,32.01,32.66,33.24,33.25,33.24,33.24,32.80,32.25,32.25,32.40,32.61,33.04,35.01]
    x1 = [0.00,0.00,0.01,0.01,0.02,0.03,0.05,0.07,0.09,0.11,0.13,0.16,0.19,0.22,0.25,0.28,0.32,0.35,0.39,0.43,0.48,0.51,0.56,0.60,0.66,0.71,0.76,0.82,0.87,0.93,0.99,1.03,1.09,1.15,1.21,1.27,1.33,1.39,1.45,1.51,1.58,1.62,1.69,1.75,1.81,1.87,1.93,1.99,2.05,2.11,2.16,2.21,2.27,2.32,2.38,2.44,2.49,2.54,2.60,2.65,2.74,2.78,2.83,2.88,2.93,2.98,3.02,3.07,3.12,3.16,3.21,3.24,3.29,3.33,3.37,3.41,3.45,3.49,3.53,3.56,3.60,3.63,3.66,3.70,3.73,3.76,3.79,3.82,3.85,3.88,3.91,3.93,3.95,3.98,4.00,4.02,4.04,4.06,4.07,4.09,4.10,4.11,4.12,4.13,4.14,4.14,4.15,4.15,4.15,4.14,4.14,4.13,4.12,4.11,4.09,4.08,4.05,4.03,4.00,3.98,3.94,3.92,3.88,3.84,3.80,3.76,3.72,3.67,3.62,3.57,3.52,3.48,3.43,3.37,3.31,3.25,3.19,3.12,3.06,2.99,2.92,2.87,2.80,2.74,2.67,2.61,2.54,2.47,2.40,2.33,2.26,2.21,2.14,2.07,2.00,1.93,1.86,1.79,1.73,1.66,1.60,1.54,1.48,1.42,1.35,1.29,1.22,1.16,1.10,1.04,0.98,0.94,0.88,0.83,0.77,0.72,0.67,0.62,0.57,0.52,0.48,0.44,0.40,0.36,0.32,0.28,0.25,0.21,0.18,0.15,0.13,0.11,0.09,0.07,0.05,0.04,0.02,0.01,0.01,0.00,0.00]
    y1 = [22.60,23.39,24.27,25.45,26.78,28.30,29.75,30.86,32.34,34.06,36.00,38.69,41.29,46.88,50.25,53.15,55.22,57.65,61.04,63.47,68.09,71.36,71.69,69.49,67.67,65.42,61.75,58.15,55.43,53.57,54.53,54.76,56.02,57.72,59.22,60.26,60.82,60.00,59.18,57.25,55.58,54.47,53.71,53.30,53.27,54.15,55.09,56.36,57.19,57.52,57.62,57.55,56.40,55.63,54.44,53.81,53.57,53.14,53.34,54.25,54.13,54.84,55.31,55.41,55.62,56.00,55.63,55.16,54.39,53.98,53.85,53.56,53.28,53.40,53.78,54.29,54.53,54.63,54.81,55.10,54.95,54.54,54.05,53.78,53.58,53.52,53.06,53.17,53.52,53.64,53.81,53.73,53.64,53.94,53.59,53.15,52.70,52.60,52.28,51.99,51.62,51.64,51.61,51.81,51.52,51.43,50.73,50.12,49.80,49.12,48.41,48.07,47.69,47.27,47.45,47.12,46.66,46.21,45.64,44.68,43.32,41.93,40.07,38.38,36.20,33.33,30.39,27.32,23.77,19.61,15.33,13.88,15.64,17.82,20.16,23.61,26.95,30.24,32.15,31.35,30.97,29.86,27.51,24.47,22.41,20.55,20.44,20.44,21.27,22.56,25.36,26.92,28.51,29.10,29.56,29.47,28.16,26.54,25.53,23.89,22.90,22.52,22.15,23.17,24.55,25.62,26.61,26.85,26.91,26.95,26.52,25.38,24.46,23.52,23.12,22.87,22.10,21.70,23.16,23.97,24.92,25.58,26.50,26.95,27.12,25.98,24.50,23.94,22.91,21.73,20.86,20.67,21.14,22.83,23.84,24.29,25.08,24.86,24.47,23.15,22.60]
    x = [0.00,0.00,0.01,0.01,0.02,0.03,0.05,0.07,0.09,0.11,0.13,0.16,0.19,0.22,0.25,0.28,0.32,0.35,0.39,0.43,0.48,0.51,0.56,0.60,0.66,0.71,0.76,0.82,0.87,0.93,0.99,1.03,1.09,1.15,1.21,1.27,1.33,1.39,1.45,1.51,1.58,1.62,1.69,1.75,1.81,1.87,1.93,1.99,2.05,2.11,2.16,2.21,2.27,2.32,2.38,2.44,2.49,2.54,2.60,2.65,2.74,2.78,2.83,2.88,2.93,2.98,3.02,3.07,3.12,3.16,3.21,3.24,3.29,3.33,3.37,3.41,3.45,3.49,3.53,3.56,3.60,3.63,3.66,3.70,3.73,3.76,3.79,3.82,3.85,3.88,3.91,3.93,3.95,3.98,4.00,4.02,4.04,4.06,4.07,4.09,4.10,4.11,4.12,4.13,4.14,4.14,4.15,4.15,4.15,4.14,4.14,4.13,4.12,4.11,4.09,4.08,4.05,4.03,4.00,3.98,3.94,3.92,3.88,3.84,3.80,3.76,3.72,3.67,3.62,3.57,3.52,3.48,3.43,3.37,3.31,3.25,3.19,3.12,3.06,2.99,2.92,2.87,2.80,2.74,2.67,2.61,2.54,2.47,2.40,2.33,2.26,2.21,2.14,2.07,2.00,1.93,1.86,1.79,1.73,1.66,1.60,1.54,1.48,1.42,1.35,1.29,1.22,1.16,1.10,1.04,0.98,0.94,0.88,0.83,0.77,0.72,0.67,0.62,0.57,0.52,0.48,0.44,0.40,0.36,0.32,0.28,0.25,0.21,0.18,0.15,0.13,0.11,0.09,0.07,0.05,0.04,0.02,0.01,0.01,0.00,0.00]
    y = [22.6,23.39,24.27,25.45,26.78,28.3,29.75,30.86,32.34,34.06,36.0,38.69,39.29,26.88,30.25,33.15,35.22,37.65,31.04,33.47,38.09,40.36,40.69,39.48,37.67,35.42,31.75,38.15,35.43,33.57,34.53,34.76,36.02,37.72,39.22,30.25,30.82,40.0,39.18,37.25,35.58,34.47,33.71,33.3,33.27,34.15,35.09,36.36,37.19,37.52,37.62,37.55,36.4,35.63,34.44,33.81,33.57,33.14,33.34,34.25,34.13,34.84,35.31,35.41,35.62,36.0,35.63,35.16,34.39,33.98,33.85,33.56,33.28,33.4,33.78,34.29,34.53,34.63,34.81,35.1,34.95,34.54,34.05,33.78,33.58,33.52,33.06,33.17,33.52,33.64,33.81,33.73,33.64,33.94,33.59,33.15,32.7,32.6,32.28,31.99,31.61,31.64,31.61,31.81,31.52,31.43,30.72,30.11,29.79,29.11,28.40,28.07,27.68,27.27,27.45,27.11,26.65,26.21,25.64,25.68,25.32,25.93,25.07,24.38,24.2,23.33,25.39,27.32,23.77,21.61,21.33,21.88,21.64,21.82,20.16,23.61,26.95,30.24,30.15,30.35,30.97,29.86,27.51,24.47,22.41,20.55,20.24,20.24,21.27,22.56,25.36,26.92,28.51,26.1,26.56,26.47,26.16,26.54,25.53,23.89,22.9,22.52,22.15,23.17,24.55,25.62,26.61,26.85,26.91,26.95,26.52,25.38,24.46,23.52,23.12,22.87,22.1,21.7,23.16,23.97,24.92,25.58,26.5,26.95,27.12,25.98,24.5,23.94,22.91,21.73,20.86,20.67,21.14,22.83,23.84,24.29,25.08,24.86,24.47,23.15,22.6]

    v = VectorCosine(x2, y2)
    vv = VectorCosine(x1, y1)
    vvv = VectorCosine(x, y)

    # Compute the cosine similarity of the vectors
    cos1 = np.vstack([v, vv])
    p1 = 1 - pdist(cos1, 'cosine')
    print(p1)

    cos2 = np.vstack([v, vvv])
    p2 = 1 - pdist(cos2, 'cosine')
    print(p2)

    plt.figure(1)
    plt.plot(x, y)
    plt.figure(2)
    plt.plot(x2, y2)
    plt.figure(3)
    plt.plot(x1, y1)
    plt.show()


if __name__ == '__main__':
    main()

The similarity between the second figure and the first is [0.62020321], and the similarity between the third figure and the first is [0.3941908].
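
The results are printed in brackets because `pdist` returns an array of pairwise distances, here with a single entry for the one pair of rows. The sketch below suggests that `1 - pdist(..., 'cosine')`, as used in the code above, agrees with the dot-product definition of cosine similarity. The helper name `cosine_similarity` and the sample values are made up for illustration and are not the article's data.

```python
import numpy as np
from scipy.spatial.distance import pdist


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length 1-D sequences, from the dot product."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


if __name__ == "__main__":
    a = [1.0, 0.5, -0.2, 0.9]  # made-up values, for illustration only
    b = [0.8, 0.4, 0.1, 1.0]
    via_pdist = 1 - pdist(np.vstack([a, b]), "cosine")  # array with one element
    print(via_pdist, cosine_similarity(a, b))
```

Both printed values should agree up to floating-point rounding.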


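Building on this method, take a segment of a given data set as the baseline for comparison (as shown in the original figure), then compare test data against it: the larger the value, the higher the similarity.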
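
A baseline comparison of this kind might be sketched as follows. This is an illustrative variant rather than the article's code: it is vectorized with NumPy, it covers every interior point (the reference code above stops one point earlier), and the names `turning_cosines` and `similarity_to_baseline` are hypothetical. It assumes both curves have the same number of points and no repeated consecutive points.

```python
import numpy as np


def turning_cosines(x, y):
    """Cosines of the angles between consecutive segments of the polyline (x, y)."""
    dx = np.diff(np.asarray(x, dtype=float))
    dy = np.diff(np.asarray(y, dtype=float))
    dot = dx[:-1] * dx[1:] + dy[:-1] * dy[1:]
    norms = np.hypot(dx[:-1], dy[:-1]) * np.hypot(dx[1:], dy[1:])
    return dot / norms


def similarity_to_baseline(baseline_xy, test_xy):
    """Cosine similarity between the turning-cosine sequences of two curves."""
    b = turning_cosines(*baseline_xy)
    t = turning_cosines(*test_xy)
    return float(np.dot(b, t) / (np.linalg.norm(b) * np.linalg.norm(t)))


if __name__ == "__main__":
    # Made-up baseline segment and two made-up test curves.
    bx, by = [0, 1, 2, 3, 4], [0, 1, 0, 2, 1]
    scaled = ([2 * v for v in bx], [2 * v for v in by])  # same shape, twice the size
    other = (bx, [0, 1, 2, 1, 0])                        # a different shape
    print(similarity_to_baseline((bx, by), scaled))
    print(similarity_to_baseline((bx, by), other))
```

A uniformly scaled copy of the baseline keeps the same turning angles, so it should score close to 1, while a curve with a different shape scores lower.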

Feedback from readers is welcome.
  
Download locations for the Python scientific computing packages:
1. SciPy: third-party SciPy for Python 3.6
2. NumPy+MKL: numpy+mkl for Python 3.6

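References:

1. "[Python] Steps for Installing the scipy Library on Windows", CSDN blog, 阿秀的工作室, 2017.1
2. "Distance Metrics and Their Python Implementation (Part 2)", denny的学习专栏, 徐其华, 2017.6
3. "Plotting with Python Matplotlib and Writing the Image to a File in Practice", CSDN blog, 肖永威, 2018.4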

