Todo

  • a better tagging method in Java.
  • address extraction.
  • object extraction.
  • CRF model in Java.
  • CRF model in Python.
  • a better segmentation implementation.
  • Cilin similarity.
  • HowNet similarity.
  • pinyin similarity.
  • Aho-Corasick (AC) + trie segmentation.
  • double-array trie segmentation.
  • sentiment tendency analysis.
  • literal value similarity.
  • training word2vec in Java.
  • optimizing the double-array trie by persisting its arrays, to avoid rebuilding them every time.
  • Aho-Corasick + double-array trie to optimize matching.
  • optimizing the core dictionary with persistence, to avoid rebuilding the AC tree every time.
  • optimizing the name dictionary with persistence, to avoid rebuilding the double arrays every time.
  • optimizing the idiom dictionary with persistence, to avoid rebuilding the double arrays every time.
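
The persistence items above share one idea: write the built arrays to disk once and read them back on later startups instead of rebuilding. Below is a minimal sketch of that idea, assuming the double-array trie is represented by two `int[]` arrays of equal length (conventionally called `base` and `check`). The class name `TrieArrayStore`, its methods, and the file layout are hypothetical and are not taken from this project.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: persists the two arrays of a double-array trie.
// Assumed file layout: [length][base values...][check values...], all as 4-byte ints.
final class TrieArrayStore {

    private TrieArrayStore() {
    }

    // Writes both arrays to the given file, replacing any previous content.
    static void save(Path file, int[] base, int[] check) throws IOException {
        if (base.length != check.length) {
            throw new IllegalArgumentException("base and check must have the same length");
        }
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            out.writeInt(base.length);
            for (int value : base) {
                out.writeInt(value);
            }
            for (int value : check) {
                out.writeInt(value);
            }
        }
    }

    // Reads the arrays back; element 0 of the result is base, element 1 is check.
    static int[][] load(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            int length = in.readInt();
            int[] base = new int[length];
            int[] check = new int[length];
            for (int i = 0; i < length; i++) {
                base[i] = in.readInt();
            }
            for (int i = 0; i < length; i++) {
                check[i] = in.readInt();
            }
            return new int[][] {base, check};
        }
    }
}
```

A caller would typically check whether the cache file exists, call `load` if it does, and otherwise build the trie from the dictionary and call `save`. The cache has to be invalidated whenever the source dictionary changes, for example by comparing file timestamps; that check is left out of the sketch.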