【Tools Series 7】spaCy installation and usage tutorial, come and get it!
-
spaCy is a natural language processing library that covers tokenization, part-of-speech tagging, lemmatization, named entity recognition, noun phrase extraction, and more~
Next, we'll walk you through how to quickly install and use it on the platform~
【Installation】
# Install spaCy 3 for CUDA 11.2; change the version inside [] to match your image's CUDA version
pip install spacy[cuda112]==3.0.6

# Install spaCy 2 for CUDA 11.2; change the version inside [] to match your image's CUDA version
pip install spacy[cuda112]==2.3.5

# Downloading models through the spacy module may fail behind the firewall;
# if it does, install them with the pip commands below instead
python -m spacy download en_core_web_sm

# Install en_core_web_sm 3.0.0
pip install https://ghproxy.com/https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.0.0/en_core_web_sm-3.0.0-py3-none-any.whl --no-cache

# Install en_core_web_sm 2.3.1
pip install https://ghproxy.com/https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.3.1/en_core_web_sm-2.3.1.tar.gz --no-cache
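After installing, it's worth confirming which version actually took effect and whether the GPU build is active. A minimal sanity-check sketch, assuming one of the GPU installs above succeeded (spacy.prefer_gpu() and spacy.require_gpu() are standard spaCy calls):

import spacy

# Confirm which of the version lines above took effect
print(spacy.__version__)

# prefer_gpu() returns True if a GPU was activated and falls back to CPU otherwise;
# require_gpu() raises an error instead of falling back.
# Call it right after importing, before spacy.load(), so the pipeline runs on the GPU.
print("GPU:", spacy.prefer_gpu())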
【Usage】
import spacy

# Load English tokenizer, tagger, parser and NER
nlp = spacy.load("en_core_web_sm")

# Process whole documents
text = ("When Sebastian Thrun started working on self-driving cars at "
        "Google in 2007, few people outside of the company took him "
        "seriously. “I can tell you very senior CEOs of major American "
        "car companies would shake my hand and turn away because I wasn’t "
        "worth talking to,” said Thrun, in an interview with Recode earlier "
        "this week.")
doc = nlp(text)

# Analyze syntax
print("Noun phrases:", [chunk.text for chunk in doc.noun_chunks])
print("Verbs:", [token.lemma_ for token in doc if token.pos_ == "VERB"])

# Find named entities, phrases and concepts
for entity in doc.ents:
    print(entity.text, entity.label_)
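Besides noun chunks and entities, every token in a Doc carries its own annotations (this is the tokenization, lemmatization, and POS tagging mentioned at the top). A minimal follow-up sketch reusing the nlp pipeline loaded above; the sentence is just an illustrative example:

# Inspect per-token annotations
doc = nlp("Apple is looking at buying U.K. startup for $1 billion")
for token in doc:
    # token.text: original text; token.lemma_: base form;
    # token.pos_: coarse part-of-speech tag; token.dep_: dependency relation
    print(token.text, token.lemma_, token.pos_, token.dep_)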
-
This tutorial draws on documentation written by our engineers; if you have any questions, please leave a comment under this post~