A Survey of NLP Text Classification Methods

btikc 2024-08-30 13:26:41 Technical Articles

This article surveys existing deep-learning approaches to text classification in NLP.

The models covered are:

1.fastText

Uses bi-gram/tri-gram features with an NCE (noise-contrastive estimation) loss. It runs very fast.
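
To make the bi-gram/tri-gram features concrete, here is a minimal sketch in plain Python of extracting word n-grams from a tokenized sentence. The function names `word_ngrams` and `ngram_features` and the sample sentence are illustrative assumptions, not part of fastText or of the code this article refers to. Hashing the n-grams into buckets, which practical implementations tend to do, is left out.

```python
def word_ngrams(tokens, n):
    """Return the contiguous word n-grams of `tokens`, each joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(tokens, max_n=3):
    """Collect unigrams plus every n-gram up to `max_n` (hypothetical helper)."""
    features = []
    for n in range(1, max_n + 1):
        features.extend(word_ngrams(tokens, n))
    return features


tokens = "the movie was great".split()
bigrams = word_ngrams(tokens, 2)    # pairs of adjacent words
trigrams = word_ngrams(tokens, 3)   # triples of adjacent words
features = ngram_features(tokens)   # unigrams + bigrams + trigrams
```

The n-grams give a bag-of-words style model some sensitivity to local word order without needing a recurrent or convolutional layer.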

2.TextCNN

The pipeline is embedding--->conv--->max pooling--->fully connected layer--->softmax
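
The pipeline above can be sketched as a forward pass in plain Python. This is only an illustration under stated assumptions: the weights are random instead of trained, the kernel widths (2, 3, 4), the ReLU activation, the absence of bias terms, and all names such as `text_cnn_forward` are choices made here, and the actual TensorFlow implementation may differ.

```python
import math
import random

random.seed(0)


def rand_matrix(rows, cols):
    """Random weights standing in for trained parameters (assumption)."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def conv1d_relu(embedded, kernel):
    """Slide a (width x dim) kernel along the token axis; one value per window."""
    width = len(kernel)
    out = []
    for start in range(len(embedded) - width + 1):
        total = 0.0
        for k in range(width):
            total += sum(w * x for w, x in zip(kernel[k], embedded[start + k]))
        out.append(max(0.0, total))  # ReLU is an assumed activation
    return out


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def text_cnn_forward(token_ids, embedding, kernels, fc_weights):
    embedded = [embedding[t] for t in token_ids]                # embedding
    pooled = [max(conv1d_relu(embedded, k)) for k in kernels]   # conv + max pooling
    logits = [sum(w * p for w, p in zip(row, pooled)) for row in fc_weights]  # FC
    return softmax(logits)                                      # softmax


VOCAB, DIM, NUM_CLASSES = 10, 4, 3
embedding = rand_matrix(VOCAB, DIM)
kernels = [rand_matrix(width, DIM) for width in (2, 3, 4)]
fc_weights = rand_matrix(NUM_CLASSES, len(kernels))
probs = text_cnn_forward([1, 5, 2, 7, 3], embedding, kernels, fc_weights)
```

Max pooling keeps only the strongest response of each kernel, so the feature vector has a fixed length regardless of sentence length.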

3.TextRNN

The pipeline is embedding--->bi-directional LSTM--->concat output--->average--->softmax layer
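
The concat-then-average step is easier to see in code. The sketch below is plain Python, and to stay short it uses a plain tanh recurrent cell as a stand-in for the LSTM, so it is not the model itself. The weights are random, and names such as `run_rnn` and `text_rnn_forward` are hypothetical.

```python
import math
import random

random.seed(0)


def rand_matrix(rows, cols):
    """Random weights standing in for trained parameters (assumption)."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def run_rnn(inputs, w_in, w_rec):
    """Plain tanh recurrent cell (stand-in for an LSTM); one state per step."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
        states.append(hidden)
    return states


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def text_rnn_forward(token_ids, embedding, fwd, bwd, out_weights):
    embedded = [embedding[t] for t in token_ids]             # embedding
    forward = run_rnn(embedded, *fwd)                        # left-to-right pass
    backward = run_rnn(embedded[::-1], *bwd)[::-1]           # right-to-left, re-aligned
    concat = [f + b for f, b in zip(forward, backward)]      # concat output per token
    averaged = [sum(col) / len(concat) for col in zip(*concat)]  # average over tokens
    return softmax(matvec(out_weights, averaged))            # softmax layer


VOCAB, DIM, HIDDEN, NUM_CLASSES = 10, 4, 3, 2
embedding = rand_matrix(VOCAB, DIM)
fwd = (rand_matrix(HIDDEN, DIM), rand_matrix(HIDDEN, HIDDEN))
bwd = (rand_matrix(HIDDEN, DIM), rand_matrix(HIDDEN, HIDDEN))
out_weights = rand_matrix(NUM_CLASSES, 2 * HIDDEN)
probs = text_rnn_forward([1, 5, 2, 7], embedding, fwd, bwd, out_weights)
```

Reversing the backward states before concatenation matters: each token's vector should combine what was read before it with what was read after it.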

4.RCNN

Recurrent convolutional neural network.

1) recurrent structure (convolutional layer) 2) max pooling 3) fully connected layer + softmax

5.Hierarchical Attention Network

The structure is embedding--->word encoder--->word attention--->sentence encoder--->sentence attention--->FC+softmax
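
The two attention levels can be sketched in plain Python as the same pooling routine applied twice: first over the words of each sentence, then over the resulting sentence vectors. This is a simplification under stated assumptions: both encoders are skipped (the toy vectors are treated as if they were encoder outputs), the score is a dot product between a context vector and `tanh` of each input with no learned projection, and `attention_pool` and the context vectors are hypothetical names and values.

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Weight each vector by softmax(context . tanh(vector)) and sum them."""
    scores = [sum(c * math.tanh(v) for c, v in zip(context, vec)) for vec in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


# A toy document of two sentences; each word is a 2-d vector that plays
# the role of a word-encoder output (assumption: encoders are omitted).
document = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, -0.5], [2.0, 0.0]],
]
word_context = [1.0, 0.0]      # would be learned in a real model
sentence_context = [0.0, 1.0]  # would be learned in a real model

# Word attention: one vector per sentence.
sentence_vectors = [attention_pool(words, word_context)[0] for words in document]
# Sentence attention: one vector for the whole document, fed to FC+softmax.
doc_vector, sentence_weights = attention_pool(sentence_vectors, sentence_context)
```

The attention weights at each level sum to 1, so they can be read as how much each word or sentence contributes to the document vector.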

6.seq2seq with attention

The structure is 1) embedding 2) bi-GRU to get a rich representation from source sentences (forward & backward) 3) decoder with attention

7.Transformer("Attention Is All You Need")

8.Dynamic Memory Network

9.EntityNetwork:tracking state of the world

10.Ensemble models

11.Boosting

12.BiLstmTextRelation

Same structure as TextRNN, but the input is specially designed.

13.twoCNNTextRelation

Uses two different convolutional layers to extract features.

14.BiLstmTextRelationTwoRNN

Two bi-directional LSTMs + softmax

These implementations depend on Python 2.7 and TensorFlow 1.1.

Of these, the TextCNN model has already been migrated to Python 3.6.
