V2.1.4

@zhegexiaohuozi zhegexiaohuozi released this 25 Apr 07:33
· 3 commits to master since this release
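  • Support for plugging in a custom SeimiDownloader, which makes it easier to tailor data fetching to your own needs.
    Requests go through the system downloader by default; for special requests you can specify your own downloader, for example: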
public class MyCoustomDownloader implements SeimiDownloader {
    @Override
    public Response process(Request request) throws Exception {
        Response seimiResponse = new Response();
        seimiResponse.setSeimiHttpType(SeimiHttpType.OK_HTTP3);
        seimiResponse.setRealUrl(request.getUrl());
        seimiResponse.setUrl(request.getUrl());
        seimiResponse.setRequest(request);
        seimiResponse.setMeta(request.getMeta());
        seimiResponse.setBodyType(BodyType.TEXT);
        String content = webGetDo(request);
        seimiResponse.setContent(content);
        return seimiResponse;
    }

    @Override
    public Response metaRefresh(String s) throws Exception {
        // Depends on your use case; this can be left unimplemented.
        return null;
    }

    @Override
    public int statusCode() {
        return 200;
    }

    @Override
    public void addCookies(String s, List<SeimiCookie> list) {
        // TODO
    }
}

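Here webGetDo() stands for your own custom logic. It is not listed and is shown for illustration only; you are free to implement whatever logic you need. To route a particular request through the custom downloader, set it on that request: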

Request next = Request.build(url, MyCrawler::parseDetail);
next.setDownloader(MyCoustomDownloader.class);
push(next);
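
Since webGetDo() is left unimplemented in the downloader above, here is one possible shape for it: a minimal sketch, assuming Java 11+ and the JDK's built-in java.net.http.HttpClient. The class name WebGetSketch, the buildRequest helper, the timeouts and the User-Agent value are all hypothetical and not part of SeimiCrawler. This version takes a plain URL string, so process() would call it as webGetDo(request.getUrl()).

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper for illustration; not part of SeimiCrawler.
class WebGetSketch {
    // One shared client; the timeout and redirect policy are assumptions.
    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(10))
            .followRedirects(HttpClient.Redirect.NORMAL)
            .build();

    // Builds a plain GET request for the given URL. No network access happens here.
    public static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(30))
                .header("User-Agent", "my-crawler/0.1") // placeholder value
                .GET()
                .build();
    }

    // Sends the request and returns the response body as text.
    public static String webGetDo(String url) throws IOException, InterruptedException {
        HttpResponse<String> response =
                CLIENT.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

Keeping request construction separate from sending makes the custom logic easy to check without touching the network; any other client or data source could take its place, as long as process() ends up with a String to pass to setContent().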
  • Support for setting the number of worker threads per Crawler through the JVM parameter -Dseimi.crawler.thread-num=xx; the minimum value is 1.
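
To illustrate how a -D parameter like this typically reaches application code, the sketch below reads the system property and clamps it to the stated minimum of 1. It is not SeimiCrawler's actual implementation: the class ThreadNumSketch, the method resolveThreadNum and the fallback used when the property is missing or not a number are assumptions made for the example.

```java
// Hypothetical sketch for illustration; not SeimiCrawler's actual code.
class ThreadNumSketch {
    static final String PROPERTY = "seimi.crawler.thread-num";

    // Returns the thread count from the system property, or `fallback` when the
    // property is unset or not a valid integer (assumed behavior). Never below 1.
    public static int resolveThreadNum(int fallback) {
        String raw = System.getProperty(PROPERTY);
        int value = fallback;
        if (raw != null) {
            try {
                value = Integer.parseInt(raw.trim());
            } catch (NumberFormatException e) {
                // Keep the fallback for unparsable input.
            }
        }
        return Math.max(1, value);
    }
}
```

The property would be supplied at launch, for example `java -Dseimi.crawler.thread-num=8 -jar my-crawler.jar`, where the jar name is a placeholder.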