This code belongs to the paper "Clickbait Convolutional Neural Network".

Requirements

  • Python 3
  • TensorFlow > 0.12
  • NumPy

Training

Print parameters:

./train.py --help
optional arguments:
  -h, --help            show this help message and exit
  --embedding_dim EMBEDDING_DIM
                        Dimensionality of character embedding (default: 128)
  --filter_sizes FILTER_SIZES
                        Comma-separated filter sizes (default: '3,4,5')
  --num_filters NUM_FILTERS
                        Number of filters per filter size (default: 128)
  --l2_reg_lambda L2_REG_LAMBDA
                        L2 regularization lambda (default: 0.0)
  --embedding_reg_lambda EMBEDDING_REG_LAMBDA
                        Embedding regularization lambda (default: 0.01)
  --dropout_keep_prob DROPOUT_KEEP_PROB
                        Dropout keep probability (default: 0.5)
  --batch_size BATCH_SIZE
                        Batch Size (default: 64)
  --num_epochs NUM_EPOCHS
                        Number of training epochs (default: 100)
  --evaluate_every EVALUATE_EVERY
                        Evaluate model on dev set after this many steps
                        (default: 100)
  --checkpoint_every CHECKPOINT_EVERY
                        Save model after this many steps (default: 100)
  --allow_soft_placement ALLOW_SOFT_PLACEMENT
                        Allow soft device placement
  --noallow_soft_placement
  --log_device_placement LOG_DEVICE_PLACEMENT
                        Log placement of ops on devices
  --nolog_device_placement
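
As a rough illustration of how the --filter_sizes and --num_filters values relate, the sketch below parses the comma-separated default and computes the size of the pooled feature vector. It assumes the common text-CNN design in which the max-pooled outputs of every filter size are concatenated; parse_filter_sizes is a hypothetical helper name and is not taken from train.py.

```python
def parse_filter_sizes(spec):
    """Turn a comma-separated string such as '3,4,5' into a list of ints."""
    return [int(part) for part in spec.split(",") if part.strip()]


# Defaults copied from the help text above.
filter_sizes = parse_filter_sizes("3,4,5")
num_filters = 128

# Assumption: one max-pooled value per filter, concatenated across sizes.
total_features = num_filters * len(filter_sizes)
print(filter_sizes, total_features)
```

Under that assumption, adding another filter size grows the feature vector by num_filters entries.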

Train:

./train.py

Evaluating

./eval.py --eval_train --checkpoint_dir="./runs/1459637919/checkpoints/"

Replace the checkpoint directory with the one produced by your training run. To use your own data, modify eval.py to load it.
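
If several training runs exist, a small helper can pick the checkpoint directory to pass to eval.py. The sketch below assumes that each run is written to ./runs/<numeric id>/checkpoints, as the example path above suggests, and that a larger id means a newer run; latest_checkpoint_dir is a hypothetical name, not part of this repository.

```python
import os


def latest_checkpoint_dir(runs_root="./runs"):
    """Return the checkpoints path of the run with the largest numeric id, or None."""
    if not os.path.isdir(runs_root):
        return None
    # Assumption: run directories are named with digits only (e.g. a timestamp).
    run_ids = [
        name for name in os.listdir(runs_root)
        if name.isdigit() and os.path.isdir(os.path.join(runs_root, name))
    ]
    if not run_ids:
        return None
    newest = max(run_ids, key=int)
    return os.path.join(runs_root, newest, "checkpoints")


if __name__ == "__main__":
    print(latest_checkpoint_dir())
```

The printed path can then be supplied as the --checkpoint_dir value.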

Dataset

The data folder contains the training and evaluation data: train.txt is the training set and test.txt is the evaluation set.
The first column is the label of the headline: 1 for clickbait and 0 for non-clickbait.
The second column is the source type of the headline: 0 for News, 1 for Blog, 2 for BBS, 3 for WeiXin.
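
The column layout described above could be read with something like the following sketch. The README does not state the delimiter or what follows the second column, so the tab separator, the third column holding the headline text, and the names Headline, SOURCE_NAMES and parse_line are all assumptions made for illustration.

```python
from collections import namedtuple

Headline = namedtuple("Headline", ["label", "source", "text"])

# Source-type codes as listed in the Dataset section.
SOURCE_NAMES = {0: "News", 1: "Blog", 2: "BBS", 3: "WeiXin"}


def parse_line(line, sep="\t"):
    """Split one dataset line into label, source type and headline text.

    Assumption: columns are separated by `sep` and the headline is the
    remainder of the line after the first two columns.
    """
    label, source, text = line.rstrip("\n").split(sep, 2)
    return Headline(int(label), SOURCE_NAMES[int(source)], text)


# Made-up line for illustration only; it is not taken from train.txt.
example = parse_line("1\t3\tYou will not believe what happened next\n")
print(example)
```

If the files use a different separator, pass it through the sep argument.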
