NLTK 3.5 documentation
The official documentation describes several installation methods, including instructions for command-line installation: Command line installation The downloader will search for an existing nltk_data directory to install NLTK data. If one does not exist it will atte…
FreqDist
NLTK's FreqDist class counts how many times each word occurs in a list.
text = ['hadoop', 'spark', 'hive', 'hadoop', 'hadoop', 'spark', 'lucene', 'hadoop', 'spark', 'hive', 'hadoop', 'hadoop', 'spark', 'pig', 'zookeeper', 'flume', 'stream', 'hadoop', 'hadoop', 'spark', 'pig', 'zookeeper', 'flume', 'stream', 'hadoop', 'hadoop', 'spark', 'pig', 'zookeep…
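As a minimal sketch of the counting described above, the snippet below builds a FreqDist from a shortened word list (only the first seven words of the list, since the full list is truncated here). The variable names `words`, `freq`, `top_two` and `hadoop_count` are illustrative, and the fallback to `collections.Counter` is an assumption made only so the sketch still runs where NLTK is not installed; FreqDist subclasses Counter, so the counting behaves the same way.

```python
from collections import Counter

try:
    from nltk import FreqDist
except ImportError:
    # Assumption for this sketch only: FreqDist subclasses Counter,
    # so Counter is used as a stand-in when NLTK is unavailable.
    FreqDist = Counter

# Shortened sample: the first seven words of the list above.
words = ["hadoop", "spark", "hive", "hadoop", "hadoop", "spark", "lucene"]

freq = FreqDist(words)           # maps each word to its number of occurrences
top_two = freq.most_common(2)    # the two most frequent words with their counts
hadoop_count = freq["hadoop"]    # count for a single word

print(top_two)
print(hadoop_count)
```

Because the result behaves like a dictionary, a word that never occurs simply has a count of 0 instead of raising a KeyError.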
First, read in the data
import pandas as pd
data = pd.read_excel(r'D:\python\zxzy\amazon_asin\review.xlsx')
title = data['review_revs']
data.head(1)
Split each review into sentences
# Split into sentences
import nltk
from nltk.tokenize import sent_tokenize
sent = []
for i in title:sent.append(sent_toke…
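The loop above is cut off, so the following is only a sketch of the general pattern: apply a sentence tokenizer to each review and collect the results. The helper `split_reviews` and the regex-based `naive_sent_tokenize` are hypothetical names introduced for illustration; the naive splitter is a rough stand-in so the example runs without NLTK's tokenizer data. With NLTK installed and its Punkt data downloaded, `nltk.tokenize.sent_tokenize` could be passed in instead.

```python
import re

def naive_sent_tokenize(text):
    # Hypothetical stand-in for nltk.tokenize.sent_tokenize:
    # split after '.', '!' or '?' when followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def split_reviews(reviews, tokenize):
    """Return one list of sentences per review."""
    sentences = []
    for review in reviews:
        if not isinstance(review, str):
            # Empty Excel cells are read by pandas as NaN (a float), not a string.
            sentences.append([])
            continue
        sentences.append(tokenize(review))
    return sentences

# Made-up sample reviews; in the original, these come from data['review_revs'].
sample_reviews = ["Great product. Works well!", float("nan")]
result = split_reviews(sample_reviews, naive_sent_tokenize)
print(result)
```

Passing the tokenizer as an argument keeps the loop independent of which splitter is used, and checking for non-string values avoids an error on rows where the review cell is empty.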