vectorizer.fit_transform error #1

Open
wangxuezhan opened this issue Nov 7, 2018 · 1 comment

Comments

@wangxuezhan

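Hello, author. When I run your source file TFIDF_space.py, the line tfidfspace.tdm = vectorizer.fit_transform(bunch.contents) raises TypeError: cannot use a string pattern on a bytes-like object
With the code from your blog I get TypeError: 'builtin_function_or_method' object is not iterable instead. What is causing this? I'm just starting to learn and don't fully understand it, so I'd appreciate your guidance! My WeChat is zhan10; I look forward to your reply!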

chongwangcc commented Nov 20, 2018

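Problem location: the line stpwrdlst = readfile(stopword_path).splitlines() in TFIDF_space.py.
Cause: the stop-word file is read in "rb" mode, so its contents come back as bytes; reading it in "r" mode fixes the error.
Environment: Windows 10, run from PyCharm.
Solution: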

  1. In Tools.py, add a new function:
# Read a file as UTF-8 text and return its contents as a str
def readfile_str(path):
    with open(path, "r", encoding="utf8") as fp:
        content = fp.read()
    return content
  2. In TFIDF_space.py, import the function:
    from Tools import readfile_str
  3. In TFIDF_space.py, replace stpwrdlst = readfile(stopword_path).splitlines() with stpwrdlst = readfile_str(stopword_path).splitlines()
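
For anyone who wants to see the cause in isolation, here is a minimal sketch that reproduces the first error message using only the standard library. It rests on a few assumptions: readfile_bytes is a stand-in for the repo's readfile (its body is not shown in this thread), the stop-word file is a temporary one created just for the demo, and the regular expression has the same shape as scikit-learn's documented default token_pattern. It is not the vectorizer's actual code path, only an illustration of why bytes stop words and a str pattern do not mix.

```python
import os
import re
import tempfile

# A str pattern with the same shape as scikit-learn's default token_pattern.
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")


def readfile_bytes(path):
    """Stand-in for the original readfile: binary mode, returns bytes."""
    with open(path, "rb") as fp:
        return fp.read()


def readfile_str(path):
    """The replacement from the steps above: text mode, returns str."""
    with open(path, "r", encoding="utf8") as fp:
        return fp.read()


def demo():
    # Hypothetical stop-word file, created only for this demonstration.
    with tempfile.NamedTemporaryFile(
        "w", encoding="utf8", suffix=".txt", delete=False
    ) as tmp:
        tmp.write("的\n了\nand\n")
        path = tmp.name
    try:
        bytes_words = readfile_bytes(path).splitlines()  # list of bytes
        str_words = readfile_str(path).splitlines()      # list of str
    finally:
        os.remove(path)

    # Applying a str pattern to a bytes stop word raises TypeError.
    try:
        TOKEN_PATTERN.findall(bytes_words[0])
        error = None
    except TypeError as exc:
        error = str(exc)

    # The same pattern works on str stop words.
    tokens = [TOKEN_PATTERN.findall(word) for word in str_words]
    return bytes_words, str_words, error, tokens


if __name__ == "__main__":
    bytes_words, str_words, error, tokens = demo()
    print(type(bytes_words[0]).__name__, type(str_words[0]).__name__)
    print("error:", error)
```

Both readers call splitlines() successfully, which is why the problem only surfaces later, when the stop words reach a regular expression compiled from a str pattern.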
