
Latent Dirichlet Allocation train with Supervised Data


EdmundHee/LDA-Script

Latent Dirichlet Allocation (Python)

To understand what Latent Dirichlet Allocation is: read here

Scripts created for supervised learning (What is supervised learning)

Required Libraries

  • numpy
  • nltk

Data Cleaning

  • Lemmatization (from NLTK)
  • Stop word removal
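
The two cleaning steps above can be sketched roughly as follows. This is a minimal illustration, assuming keywords are lowercased and split on whitespace; the function name `clean_keywords`, the tiny `STOP_WORDS` set, and the pluggable `lemmatize` argument are hypothetical and not taken from `classifier.py`. With NLTK installed, one would likely pass `WordNetLemmatizer().lemmatize` as the lemmatizer, which requires the WordNet data to be downloaded first.

```python
from typing import Callable, Iterable, List

# Illustrative stop word set; an assumption, not the list used by classifier.py.
STOP_WORDS = frozenset({"a", "an", "and", "the", "of", "with", "or"})


def clean_keywords(
    text: str,
    lemmatize: Callable[[str], str] = lambda w: w,
    stop_words: Iterable[str] = STOP_WORDS,
) -> List[str]:
    """Lowercase, split on whitespace, drop stop words, then lemmatize."""
    stop = set(stop_words)
    tokens = text.lower().split()
    return [lemmatize(tok) for tok in tokens if tok not in stop]


# Hypothetical usage with NLTK (requires nltk.download("wordnet")):
#   from nltk.stem import WordNetLemmatizer
#   clean_keywords("Chickens and Cabbages", WordNetLemmatizer().lemmatize)
```

Keeping the lemmatizer as a parameter makes the cleaning step easy to try out without NLTK data installed; the default simply leaves each token unchanged.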

Create data set (Mac - Terminal)

  1. Create a "build" directory in the same folder
    mkdir build
  2. Create a text file inside the "build" directory (e.g. food.txt)
  3. Each line of the text file should follow this format
    "<KEYWORDS>"|"<CLASS TYPE>"

Example

    "Cabbage Celery Chicory Corn"|"Vegetable"
    "Beef Chicken Fish"|"Meat"
  4. Type the following command to train the model
    python classifier.py -t -f food.txt -m food
  5. Two files will be created in the build folder:
    • stopwords.p
    • food_trained.p (Trained Model)
  6. The perplexity result and keyword weights will be displayed in the terminal
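
To make the data set format from the steps above concrete, here is a small sketch of how such lines could be parsed into keyword lists and class labels. The helper names `parse_line` and `parse_dataset` are hypothetical; the actual parsing inside `classifier.py` may differ.

```python
from typing import Iterable, List, Tuple


def parse_line(line: str) -> Tuple[List[str], str]:
    """Split one '"<KEYWORDS>"|"<CLASS TYPE>"' line into (keywords, label)."""
    keywords_part, sep, label_part = line.strip().partition("|")
    if not sep:
        raise ValueError(f"missing '|' separator: {line!r}")
    # Remove the surrounding double quotes, then split keywords on whitespace.
    keywords = keywords_part.strip().strip('"').split()
    label = label_part.strip().strip('"')
    return keywords, label


def parse_dataset(lines: Iterable[str]) -> List[Tuple[List[str], str]]:
    """Parse every non-blank line of a data set file."""
    return [parse_line(line) for line in lines if line.strip()]


sample = [
    '"Cabbage Celery Chicory Corn"|"Vegetable"',
    '"Beef Chicken Fish"|"Meat"',
]
records = parse_dataset(sample)
```

Running a file through a check like this before training is one way to catch lines with a missing separator early.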

How to classify after training (Mac - Terminal)

  1. In the terminal, type
    python classifier.py -c -m food -l "beef"
  2. The result will be shown in the terminal

Check commands (Mac - Terminal)

  1. In the terminal, type
    python classifier.py --sos

Commands

| Command | Type | Default Value | Function | Example |
| --- | --- | --- | --- | --- |
| --alpha | float | 0.005 | Alpha value | --alpha 0.001 |
| --beta | float | 0.005 | Beta value | --beta 0.001 |
| -t | boolean | false | Trigger the model training function | -t |
| -c | boolean | false | Trigger the classification function | -c |
| -m | string | none | Model name | -m food |
| -l | string | none | String to pass in for classification | -l "Beef Chicken" |
| -k | integer | 10 | Number of topics | -k 50 |
| -i | integer | 100 | Number of iterations | -i 200 |
| -f | string | none | Filename | -f food.txt |
| --sos | boolean | false | Display examples and the command list | --sos |
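
For readers who want to see how the command table maps onto code, the following is a rough `argparse` sketch that mirrors the flags, types, and defaults listed above. It is an assumption about how such a command line could be declared, not the actual argument handling in `classifier.py`; the function name `build_parser` is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags from the command table with their listed defaults."""
    parser = argparse.ArgumentParser(prog="classifier.py")
    parser.add_argument("--alpha", type=float, default=0.005, help="Alpha value")
    parser.add_argument("--beta", type=float, default=0.005, help="Beta value")
    parser.add_argument("-t", action="store_true", help="Train a model")
    parser.add_argument("-c", action="store_true", help="Classify a string")
    parser.add_argument("-m", type=str, default=None, help="Model name")
    parser.add_argument("-l", type=str, default=None, help="String to classify")
    parser.add_argument("-k", type=int, default=10, help="Number of topics")
    parser.add_argument("-i", type=int, default=100, help="Number of iterations")
    parser.add_argument("-f", type=str, default=None, help="Data set filename")
    parser.add_argument("--sos", action="store_true",
                        help="Display examples and the command list")
    return parser


# Equivalent to: python classifier.py -t -f food.txt -m food
args = build_parser().parse_args(["-t", "-f", "food.txt", "-m", "food"])
```

Boolean flags are modelled with `store_true`, so they default to false and become true only when present, matching the "boolean / false" rows in the table.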
