Page loaded but waitload/waitidle reach generates an error => No XHR in outputs #919

Open
alban-stourbe-wmx opened this issue Jun 6, 2024 · 1 comment
Assignees
Labels
Type: Bug Inconsistencies or issues which will cause an issue or problem for users or implementors.

Comments

Contributor

alban-stourbe-wmx commented Jun 6, 2024

Current Behavior:

I'm currently crawling a site in headless mode in order to capture as many XHR requests as possible.
While debugging, I noticed that some pages are fully loaded and stable in the browser, but katana does not consider them loaded. The code waits until the timeout is reached and then returns an error.

[Screenshot 2024-06-06 at 12:20:13]

The error is related to this code:
[Screenshot 2024-06-06 at 12:21:52]

The context deadline is exceeded because the timeout is reached, which makes the DOM inaccessible and causes the error.
[Screenshot 2024-06-06 at 12:15:31]

However, in the browser, the page seems to be fully loaded.
[Screenshot 2024-06-06 at 12:14:03]

According to the code, the page should have been closed and the response processed, but this is not the case. Instead, the code keeps waiting until the timeout.

The problem is that XHR and other requests detected before the timeout are not written to the output because of the error.
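One possible way to avoid losing those requests is to treat the deadline as a soft stop and still return whatever was captured. The following is a minimal, stdlib-only sketch of that idea and not katana's actual code: `visitResult`, `visit`, `demo`, the channel-based event source and the example URLs are all hypothetical names introduced for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// visitResult is a hypothetical container for what one page visit produced.
type visitResult struct {
	Requests []string // request URLs observed before the wait ended
	TimedOut bool     // true when the wait ended on the deadline
}

// visit collects request URLs from events until idle is closed or ctx ends.
// On a deadline it returns the partial result instead of an error.
func visit(ctx context.Context, events <-chan string, idle <-chan struct{}) (visitResult, error) {
	var res visitResult
	for {
		select {
		case u, ok := <-events:
			if !ok {
				events = nil // channel closed: stop selecting on it
				continue
			}
			res.Requests = append(res.Requests, u)
		case <-idle:
			return res, nil
		case <-ctx.Done():
			if errors.Is(ctx.Err(), context.DeadlineExceeded) {
				res.TimedOut = true
				return res, nil // keep what was captured so far
			}
			return res, ctx.Err()
		}
	}
}

// demo simulates a page that emits two requests and never becomes idle.
func demo() visitResult {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()

	events := make(chan string, 2)
	events <- "https://example.com/api/items"
	events <- "https://example.com/api/user"

	idle := make(chan struct{}) // never closed: the page never looks idle

	res, _ := visit(ctx, events, idle)
	return res
}

func main() {
	res := demo()
	fmt.Printf("captured %d requests, timedOut=%v\n", len(res.Requests), res.TimedOut)
}
```

With this shape the caller can write the captured requests to the output and log the timeout as a warning, rather than dropping the whole page result.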

Expected Behavior:

I'd like to understand why the page is considered unstable even though it appears stable when viewed in the browser.

Steps To Reproduce:

  1. Run Katana in headless mode.
  2. Choose a web page that triggers XHR and other requests when it loads.
  3. Open the browser for debugging.
  4. Observe that the page is fully loaded but is not closed in the browser, which means it will stay open until the configured timeout is reached.

Anything else:

It could also be a performance problem with my computer. However, I'm on macOS with:

  • 2.6 GHz 6-core Intel Core i7
  • 16 GB 2667 MHz DDR4

I also had the same problem on another PC.

Have a nice day!

alban-stourbe-wmx added the Type: Bug label Jun 6, 2024
dogancanbakir self-assigned this Jun 7, 2024
@Fly-Playgroud

Katana could try using rod's Page.WaitStable or Page.WaitDOMStable method instead of WaitIdle.
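To illustrate the general idea behind stability-based waiting, here is a minimal, stdlib-only sketch. It is not rod's implementation and does not call rod: `waitStable`, `errNotStable`, the `snapshot` callback and the two demo functions are hypothetical names, and the string snapshot stands in for whatever page state would be compared in practice.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errNotStable = errors.New("snapshot did not stabilise before the deadline")

// waitStable polls snapshot every interval and returns nil once the value has
// stayed unchanged for at least quiet. It gives up after timeout.
func waitStable(snapshot func() string, quiet, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	last := snapshot()
	stableSince := time.Now()
	for time.Now().Before(deadline) {
		if time.Since(stableSince) >= quiet {
			return nil
		}
		time.Sleep(interval)
		if cur := snapshot(); cur != last {
			last = cur
			stableSince = time.Now() // a change restarts the quiet period
		}
	}
	return errNotStable
}

// demoSettles simulates a page that changes a few times and then stops.
func demoSettles() bool {
	calls := 0
	snapshot := func() string {
		calls++
		if calls > 3 {
			return "dom-3"
		}
		return fmt.Sprintf("dom-%d", calls)
	}
	err := waitStable(snapshot, 30*time.Millisecond, 5*time.Millisecond, time.Second)
	return err == nil
}

// demoNeverSettles simulates a page that changes on every poll.
func demoNeverSettles() bool {
	calls := 0
	snapshot := func() string {
		calls++
		return fmt.Sprintf("dom-%d", calls)
	}
	err := waitStable(snapshot, 30*time.Millisecond, 5*time.Millisecond, 100*time.Millisecond)
	return errors.Is(err, errNotStable)
}

func main() {
	fmt.Println("settling page reported stable:", demoSettles())
	fmt.Println("changing page reported not stable:", demoNeverSettles())
}
```

The point of the design is that the wait ends as soon as the page stops changing for a short quiet period, so a page with a long-lived background connection would not have to run into the full timeout.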
