Research on Named Entity Recognition in the Aerospace Field Based on BERT+Bi-LSTM+CRF

    Abstract:

    To address the vague wording and unclear entity boundaries of text in open Internet data, this paper constructs an aerospace corpus, Space-Corpus, and proposes a named entity recognition model for the aerospace domain based on BERT+Bi-LSTM+CRF. A fine-tuned BERT (bidirectional encoder representations from transformers) model, a multi-layer bidirectional Transformer encoder, generates vector representations of the input corpus. A bi-directional long short-term memory (Bi-LSTM) network then captures contextual features, and a conditional random field (CRF) layer decodes the sequence and outputs the highest-scoring predicted labels. Experimental results show that, on Space-Corpus, the proposed model improves accuracy, recall, and F1 score over the BERT, BERT+Bi-LSTM, and CNN+Bi-LSTM+CRF recognition models.
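
The CRF decoding step mentioned in the abstract (choosing the label sequence with the highest total score) is commonly done with the Viterbi algorithm. The following is a minimal sketch of that idea only, not the paper's implementation. The label names (O, B-ENT, I-ENT), the emission scores (standing in for Bi-LSTM outputs), and the transition scores are all hypothetical values chosen for illustration.

```python
from typing import Dict, List, Sequence


def viterbi_decode(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    labels: Sequence[str],
) -> List[str]:
    """Return the label sequence with the highest total score.

    emissions[t][y]   : score of label y at position t (assumed encoder output)
    transitions[a][b] : score of moving from label a to label b
    """
    if not emissions:
        return []
    # best[y] = best score of any path that ends in label y at the current position
    best = {y: emissions[0][y] for y in labels}
    backpointers: List[Dict[str, str]] = []
    for scores in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            # Pick the previous label that maximises the path score into y.
            prev = max(labels, key=lambda p: best[p] + transitions[p][y])
            new_best[y] = best[prev] + transitions[prev][y] + scores[y]
            pointers[y] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final label to recover the full path.
    path = [max(labels, key=lambda y: best[y])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical three-token example with BIO-style labels.
labels = ["O", "B-ENT", "I-ENT"]
# All transitions score 0 except O -> I-ENT, which is heavily penalised
# because an entity cannot continue without having begun.
transitions = {a: {b: 0.0 for b in labels} for a in labels}
transitions["O"]["I-ENT"] = -10.0
emissions = [
    {"O": 2.0, "B-ENT": 1.5, "I-ENT": 0.0},
    {"O": 0.0, "B-ENT": 0.5, "I-ENT": 2.0},
    {"O": 2.0, "B-ENT": 0.0, "I-ENT": 0.0},
]

# Picking the top label per token would give O, I-ENT, O (an invalid sequence).
print(viterbi_decode(emissions, transitions, labels))  # ['B-ENT', 'I-ENT', 'O']
```

In this toy case the transition penalty makes the decoder prefer a well-formed entity span over the per-token maxima, which is the kind of boundary consistency a CRF layer adds on top of token-level scores.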

Cite this article

夏旭东 (Xia Xudong). Research on Named Entity Recognition in the Aerospace Field Based on BERT+Bi-LSTM+CRF[J].,2024,43(02).

History
  • Received: 2023-10-21
  • Last revised: 2023-11-25
  • Published online: 2024-03-07