Sentiment-Analysis

A practice project for sentiment analysis.

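Dataset

The dataset consists of hotel review texts, 1,000 positive and 1,000 negative. This experiment performs sentiment classification on them. Dataset source: http://www.datatang.com/data/11936.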

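Preprocessing

Preprocessing consists of the following steps:

  • Replace irrelevant symbols
  • Remove punctuation
  • Segment the text into words
  • Remove stop words (see the stop-word list)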
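
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `preprocess`, the tiny `STOPWORDS` set, and the choice of which symbols count as "irrelevant" (URLs and control whitespace here) are all assumptions. The tokenizer is passed in as `segment` because the README does not say which segmenter is used.

```python
import re

# Hypothetical stop-word set for illustration; the project loads its own stop-word list.
STOPWORDS = {"的", "了", "很"}

def preprocess(text, segment, stopwords=STOPWORDS):
    """Clean one review and return its tokens."""
    # 1. Replace irrelevant symbols (assumed here: URLs and control whitespace) with spaces.
    text = re.sub(r"https?://\S+|[\r\n\t]+", " ", text)
    # 2. Drop punctuation: anything that is neither a word character nor whitespace.
    #    In Python 3, \w also matches CJK characters, so Chinese text survives.
    text = re.sub(r"[^\w\s]", " ", text)
    # 3. Segment into words with the supplied tokenizer.
    tokens = segment(text)
    # 4. Remove stop words and empty tokens.
    return [t for t in tokens if t.strip() and t not in stopwords]

# The sample text is already space-separated, so str.split serves as the tokenizer.
tokens = preprocess("房间 的 环境 很 好!http://example.com", str.split)
print(tokens)  # ['房间', '环境', '好']
```

For unsegmented Chinese text, a word segmenter such as `jieba.lcut` could be passed as `segment` instead of `str.split`. Whether the project uses jieba is an assumption.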

Classification models

LSTM

Parameter settings

embedding_unit = 200
lstm_unit = 120
hidden_units = [80, ]

Structure


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_4 (Embedding)      (None, None, 200)         2381000   
_________________________________________________________________
lstm_1 (LSTM)                (None, 120)               154080    
_________________________________________________________________
dense_6 (Dense)              (None, 80)                9680      
_________________________________________________________________
dense_7 (Dense)              (None, 1)                 81        
=================================================================
Total params: 2,544,841
Trainable params: 2,544,841
Non-trainable params: 0
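
As a sanity check, the parameter counts in the summary above can be reproduced by hand from the settings. The sketch below uses the standard formulas for embedding, LSTM, and dense layers. The helper names are made up for illustration, and the vocabulary size of 11905 is not stated in the README; it is inferred from the embedding's parameter count (2381000 / 200).

```python
def embedding_params(vocab_size, dim):
    # One dim-sized vector per vocabulary entry.
    return vocab_size * dim

def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * ((input_dim + units + 1) * units)

def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit.
    return input_dim * units + units

vocab_size = 2381000 // 200  # inferred from the summary, not stated in the README

lstm_counts = [
    embedding_params(vocab_size, 200),  # embedding_unit = 200
    lstm_params(200, 120),              # lstm_unit = 120
    dense_params(120, 80),              # hidden_units = [80, ]
    dense_params(80, 1),                # single output unit
]
print(lstm_counts, sum(lstm_counts))
# [2381000, 154080, 9680, 81] 2544841
```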

Results

Epoch 30/40
1757/1757 [==============================] - 7s 4ms/step - loss: 4.2043e-05 - acc: 1.0000 - val_loss: 4.0901e-05 - val_acc: 1.0000

Epoch 31/40
1757/1757 [==============================] - 7s 4ms/step - loss: 4.0512e-05 - acc: 1.0000 - val_loss: 3.9406e-05 - val_acc: 1.0000

Epoch 32/40
1757/1757 [==============================] - 7s 4ms/step - loss: 3.9030e-05 - acc: 1.0000 - val_loss: 3.7960e-05 - val_acc: 1.0000

Epoch 33/40
1757/1757 [==============================] - 7s 4ms/step - loss: 3.7592e-05 - acc: 1.0000 - val_loss: 3.6552e-05 - val_acc: 1.0000

Epoch 34/40
1757/1757 [==============================] - 7s 4ms/step - loss: 3.6195e-05 - acc: 1.0000 - val_loss: 3.5186e-05 - val_acc: 1.0000

Epoch 35/40
1757/1757 [==============================] - 8s 5ms/step - loss: 3.4838e-05 - acc: 1.0000 - val_loss: 3.3860e-05 - val_acc: 1.0000

Epoch 36/40
1757/1757 [==============================] - 7s 4ms/step - loss: 3.3522e-05 - acc: 1.0000 - val_loss: 3.2575e-05 - val_acc: 1.0000

Epoch 37/40
1757/1757 [==============================] - 7s 4ms/step - loss: 3.2247e-05 - acc: 1.0000 - val_loss: 3.1331e-05 - val_acc: 1.0000

Epoch 38/40
1757/1757 [==============================] - 8s 5ms/step - loss: 3.1013e-05 - acc: 1.0000 - val_loss: 3.0130e-05 - val_acc: 1.0000

Epoch 39/40
1757/1757 [==============================] - 7s 4ms/step - loss: 2.9836e-05 - acc: 1.0000 - val_loss: 2.9014e-05 - val_acc: 1.0000

Epoch 40/40
1757/1757 [==============================] - 7s 4ms/step - loss: 2.8728e-05 - acc: 1.0000 - val_loss: 2.7935e-05 - val_acc: 1.0000

CNN

Hyperparameters

embedding_size = 200
conv_size = 5
filters = 4
hidden_units = [80, ]

Structure

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_10 (Embedding)     (None, None, 200)         2381000   
_________________________________________________________________
conv1d_9 (Conv1D)            (None, None, 4)           4004      
_________________________________________________________________
global_max_pooling1d_5 (Glob (None, 4)                 0         
_________________________________________________________________
dense_17 (Dense)             (None, 80)                400       
_________________________________________________________________
dense_18 (Dense)             (None, 1)                 81        
=================================================================
Total params: 2,385,485
Trainable params: 2,385,485
Non-trainable params: 0
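
The CNN summary above can be checked the same way. The sketch below applies the usual parameter formulas for a 1D convolution and for dense layers. The helper names are illustrative only, and the vocabulary size of 11905 is again inferred from the embedding row rather than stated in the README.

```python
def conv1d_params(in_channels, kernel_size, filters):
    # Each filter spans kernel_size positions across all input channels, plus one bias.
    return kernel_size * in_channels * filters + filters

def dense_layer_params(input_dim, units):
    # Weight matrix plus one bias per output unit.
    return input_dim * units + units

cnn_counts = [
    11905 * 200,                # embedding: inferred vocabulary size x embedding_size
    conv1d_params(200, 5, 4),   # conv_size = 5, filters = 4
    dense_layer_params(4, 80),  # global max pooling leaves 4 features; hidden_units = [80, ]
    dense_layer_params(80, 1),  # single output unit
]
print(cnn_counts, sum(cnn_counts))
# [2381000, 4004, 400, 81] 2385485
```

Global max pooling has no weights, which is why it contributes 0 parameters and only reduces the sequence to one value per filter.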

Results

Train on 1757 samples, validate on 196 samples
2s 1ms/step - loss: 0.0012 - acc: 1.0000 - val_loss: 0.0012 - val_acc: 1.0000

Epoch 21/30
1757/1757 [==============================] - 2s 1ms/step - loss: 0.0012 - acc: 1.0000 - val_loss: 0.0012 - val_acc: 1.0000

Epoch 22/30
1757/1757 [==============================] - 2s 1ms/step - loss: 0.0011 - acc: 1.0000 - val_loss: 0.0011 - val_acc: 1.0000

Epoch 23/30
1757/1757 [==============================] - 2s 1ms/step - loss: 0.0011 - acc: 1.0000 - val_loss: 0.0011 - val_acc: 1.0000

Epoch 24/30
1757/1757 [==============================] - 2s 1ms/step - loss: 0.0010 - acc: 1.0000 - val_loss: 0.0010 - val_acc: 1.0000

Epoch 25/30
1757/1757 [==============================] - 2s 1ms/step - loss: 9.8992e-04 - acc: 1.0000 - val_loss: 9.6657e-04 - val_acc: 1.0000

Epoch 26/30
1757/1757 [==============================] - 2s 1ms/step - loss: 9.4789e-04 - acc: 1.0000 - val_loss: 9.2600e-04 - val_acc: 1.0000

Epoch 27/30
1757/1757 [==============================] - 2s 1ms/step - loss: 9.0847e-04 - acc: 1.0000 - val_loss: 8.8791e-04 - val_acc: 1.0000

Epoch 28/30
1757/1757 [==============================] - 2s 1ms/step - loss: 8.7143e-04 - acc: 1.0000 - val_loss: 8.5210e-04 - val_acc: 1.0000

Epoch 29/30
1757/1757 [==============================] - 2s 994us/step - loss: 8.3659e-04 - acc: 1.0000 - val_loss: 8.1839e-04 - val_acc: 1.0000

Epoch 30/30
1757/1757 [==============================] - 2s 995us/step - loss: 8.0378e-04 - acc: 1.0000 - val_loss: 7.8663e-04 - val_acc: 1.0000
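
The training log reports 1757 training samples and 196 validation samples, 1953 in total. The README does not say how the split was made. The arithmetic below only shows that these sizes are what a 10% hold-out would give when the training count is truncated to an integer. The function `split_sizes` and the 0.1 fraction are assumptions for illustration, not the project's confirmed settings.

```python
def split_sizes(n_samples, validation_split):
    # Truncate the training share to an integer; the remainder is held out for validation.
    n_train = int(n_samples * (1 - validation_split))
    return n_train, n_samples - n_train

sizes = split_sizes(1757 + 196, 0.1)
print(sizes)  # (1757, 196)
```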