A few small suggestions #5
Comments
One more thing: a SEMENTIC_SEARCH_API_KEY is fairly hard to obtain. Consider adding a short sleep before each SEMENTIC request when SEMENTIC_SEARCH_API_KEY is empty, to avoid 429 Too Many Requests.
On a closer look, code for this already exists. My oversight.
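For readers following the thread, the throttling idea above could be sketched roughly as follows. This is only an illustration and not the project's existing code: the helper names throttle_if_no_key and send_with_throttle and the 1-second default delay are hypothetical, and only the SEMENTIC_SEARCH_API_KEY variable name comes from this issue.

```python
import os
import time

# Hypothetical default; the right delay depends on the API's actual rate limit.
DEFAULT_DELAY_SECONDS = 1.0


def throttle_if_no_key(api_key, delay=DEFAULT_DELAY_SECONDS, sleep=time.sleep):
    """Pause before a request when no API key is configured.

    Returns the number of seconds slept (0.0 when a key is present).
    The sleep function is injectable so the behavior can be tested quickly.
    """
    if api_key:
        return 0.0
    sleep(delay)
    return delay


def send_with_throttle(send_request, api_key=None,
                       delay=DEFAULT_DELAY_SECONDS, sleep=time.sleep):
    """Call send_request() after applying the no-key throttle."""
    throttle_if_no_key(api_key, delay, sleep)
    return send_request()


if __name__ == "__main__":
    # An empty or missing key triggers the pause before the request.
    key = os.environ.get("SEMENTIC_SEARCH_API_KEY", "")
    result = send_with_throttle(lambda: "response placeholder", api_key=key)
    print(result)
```

A fixed sleep is the simplest option; retrying with a growing delay after a 429 response would be a possible alternative.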
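The code contains two comments, # search before and # search after, that feel counterintuitive: # search before marks the search for future papers, yet "before" seems to describe the past. Suggested change: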
Thanks for the suggestions. You are welcome to join in and develop this together.
Great suggestions:
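I tried running this on my own machine. Great work! A few suggestions for improvement:
1. Add crawlers that look up each paper's journal ranking tier or impact factor, so that high-quality papers are prioritized for analysis;
2. Run more iterations of the loop for new papers, for example stopping only once enough papers from the latest year (e.g. 2024) have accumulated;
3. is_azure : False does not seem to take effect; perhaps a numeric check would work better?
4. PDF downloads often fail (especially from IEEE) even though the same file opens in a browser; perhaps add a proxy for this step, or use selenium to download?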
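As an illustration of suggestion 2, a stopping rule based on the number of recent papers could look something like the sketch below. Everything here is assumed for the example: collect_until_enough, the fetch_batch callback, and the "id" and "year" dictionary keys are hypothetical and do not come from the project's code.

```python
def collect_until_enough(fetch_batch, target_year, min_count, max_rounds=10):
    """Keep fetching batches until enough recent papers have accumulated.

    fetch_batch(round_index) is a hypothetical callback returning a list of
    dicts with "id" and "year" keys. The loop stops when at least min_count
    papers from target_year or later are collected, when a batch comes back
    empty, or after max_rounds rounds, whichever happens first.
    """
    collected = []
    seen_ids = set()
    for round_index in range(max_rounds):
        batch = fetch_batch(round_index)
        if not batch:
            break  # nothing more to fetch
        for paper in batch:
            if paper["id"] not in seen_ids:  # skip duplicates across rounds
                seen_ids.add(paper["id"])
                collected.append(paper)
        recent = sum(1 for p in collected if p["year"] >= target_year)
        if recent >= min_count:
            break  # enough papers from the latest year
    return collected


if __name__ == "__main__":
    # Fake data standing in for real search results.
    batches = [
        [{"id": 1, "year": 2022}, {"id": 2, "year": 2024}],
        [{"id": 2, "year": 2024}, {"id": 3, "year": 2024}],
        [{"id": 4, "year": 2024}],
    ]
    fetch = lambda i: batches[i] if i < len(batches) else []
    papers = collect_until_enough(fetch, target_year=2024, min_count=2)
    print([p["id"] for p in papers])
```

The max_rounds cap is there so the loop still terminates when the latest year simply has too few papers to reach the threshold.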