Browser resources are not released #77

Open
suntsao opened this issue Sep 6, 2024 · 1 comment

Comments

suntsao commented Sep 6, 2024

Environment

  • pulsar-all version: 1.13.2-SNAPSHOT
  • Browser: google-chrome-stable-127.0.6533.99-1.x86_64.rpm

Key business code

// Create a session and build the load options from the argument string.
PulsarSession session = PulsarContexts.createSession();
LoadOptions options = session.options("-parse -refresh");
// Run the status checks once the document is actually ready in the browser.
options.getEvent().getBrowseEventHandlers().getOnDocumentActuallyReady().addLast((page, driver, other) -> {
    FeaturedDocument loadDocument = session.parse(page);
    CheckStatus checkStatus = CheckStatus.SUCCESS;
    // Stop at the first handler that reports a non-success status.
    for (SiteStatusCheckHandle handler : handlers) {
        checkStatus = handler.process(page, loadDocument);
        if (CheckStatus.SUCCESS != checkStatus) {
            break;
        }
    }

    // Pass values as arguments so a '%' in the URL is not read as a format specifier.
    System.out.printf("%s -> %s%n", page.getUrl(), checkStatus);
    siteCheckService.handleCheckResultForUrl(page.getUrl(), checkStatus);
    return loadDocument;
});

// Start loading every site URL, wait for the context to finish, then close the session.
List<String> urls = this.siteMapper.getAllSite().stream().map(siteModel -> siteModel.getUrl()).toList();
session.loadAllAsync(urls, options);
session.getContext().await();
session.close();

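Problem description

The server has 12 GB of memory and there are only 1,600 URLs to crawl and check in total, yet after roughly 1,000 URLs the run can no longer proceed because memory is exhausted.
From what I can tell, the browser resources are not being released.

Is there something wrong with the way I wrote this? Am I failing to release resources correctly?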

@platonai
Owner

Use session.submit() instead of session.loadAllAsync().
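
The reply does not say why session.submit() helps. One plausible reading, which is an assumption and not confirmed anywhere in this thread, is that submitting URLs to a queue keeps only a bounded number of page loads in flight, whereas starting all loads up front lets browser resources grow with the URL count. The sketch below illustrates that general idea using only the JDK. It does not call PulsaR at all, and BoundedSubmitSketch, checkUrl, and runBounded are hypothetical names invented for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedSubmitSketch {
    // Counters for the simulated loads: currently running, highest seen, finished.
    static final AtomicInteger inFlight = new AtomicInteger();
    static final AtomicInteger peak = new AtomicInteger();
    static final AtomicInteger completed = new AtomicInteger();

    // Hypothetical stand-in for loading and checking one page in a browser tab.
    static void checkUrl(String url) {
        int now = inFlight.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(2); // simulate the time a page load would take
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            inFlight.decrementAndGet();
            completed.incrementAndGet();
        }
    }

    // Queues every URL on a fixed-size pool, waits for all of them,
    // and returns the highest number of loads that ran at the same time.
    static int runBounded(List<String> urls, int maxConcurrent) throws InterruptedException {
        inFlight.set(0);
        peak.set(0);
        completed.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        for (String url : urls) {
            pool.submit(() -> checkUrl(url)); // queued, not started immediately
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            pool.shutdownNow();
        }
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < 1600; i++) {
            urls.add("https://example.com/page/" + i); // placeholder URLs
        }
        int peakSeen = runBounded(urls, 8);
        System.out.println("completed=" + completed.get() + ", peak in-flight=" + peakSeen);
    }
}
```

With a pool of 8 workers, no more than 8 simulated loads run at once no matter how many URLs are queued, so resource use tracks the pool size and not the length of the URL list. Whether session.submit() behaves this way internally should be checked against the PulsaR documentation.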
