How to train Chinese word+character+ngram context features #16
Comments
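Same question here. I want to train word+character+ngram. With a window size of 5, how are the character or ngram contexts chosen? Are the characters taken directly from the words inside the word-to-word window, or are only the 5 characters before and after the center word used?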
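To make the two readings in this question concrete, here is a minimal Python sketch of both. It is not taken from ngram2vec and does not say which behavior the toolkit actually implements; the function names `chars_from_window_words` and `nearest_chars` and the sample sentence are hypothetical, for illustration only.

```python
def chars_from_window_words(tokens, center, window):
    """Reading A: take every character of the words inside the word window."""
    left = tokens[max(0, center - window):center]
    right = tokens[center + 1:center + 1 + window]
    return [ch for word in left + right for ch in word]


def nearest_chars(tokens, center, window):
    """Reading B: take only the `window` characters adjacent to the center word."""
    before = "".join(tokens[:center])
    after = "".join(tokens[center + 1:])
    left = before[max(0, len(before) - window):]
    right = after[:window]
    return list(left + right)


# Hypothetical word-segmented sentence; the center word is tokens[2].
tokens = ["我", "喜欢", "自然", "语言", "处理"]

context_a = chars_from_window_words(tokens, center=2, window=1)
context_b = nearest_chars(tokens, center=2, window=1)

print(context_a)  # characters of the neighbouring words "喜欢" and "语言"
print(context_b)  # only the single character on each side
```

With a window of 1, reading A yields four character contexts while reading B yields two, so the two readings produce different (center, context) pairs from the same segmented corpus.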
Same question. A detailed explanation would be appreciated.
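@sherrytong Same question as the original poster. Has there been any progress on this? I also want to train word+character+ngram and would like to know what the input should look like. Is it the concatenation (concatenate()) of the three, <word\character\ngram>, or something else? The exact input format is unclear to me. With word2vec via gensim, you train word embeddings by feeding in a word-segmented corpus, but here I am quite confused. I would appreciate an answer.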
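Hello, I have recently been using the ngram2vec toolkit and am a bit confused. To get word+character+ngram context features, how should I preprocess my corpus: segment it into words or into characters?
If it should be segmented into words, what arguments do I pass to the scripts to get character features? I could not find this part in the code.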