Normally the crawler does not re-request .js and .css pages, but when I used the -sb parameter to watch the browser-based crawling process, I found that Katana has the following problem.
For example, a plain .js or .css URL is not visited during the crawl, but if the same URL carries query parameters, such as .js?ver=1.1, the crawler will visit it, which leads to a huge number of extra crawler requests. Many pages nowadays append a version parameter to their js links, so I think this is a defect and hope it can be fixed. Thank you.
The screenshot below is what I captured when the crawler chose to visit such a js file (a js link with query parameters).
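To make the suspected behavior concrete, here is a minimal Go sketch (not Katana's actual code; the function names `naiveShouldSkip` and `shouldSkip` and the extension list are hypothetical) showing how an extension filter that checks the raw URL string misses `.js?ver=1.1`, while checking only the URL path's extension handles query parameters correctly:

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// skippedExtensions lists asset types the crawler should not re-request.
var skippedExtensions = map[string]bool{
	".js":  true,
	".css": true,
}

// naiveShouldSkip illustrates the suspected defect: matching a suffix of the
// raw URL string, so "app.js?ver=1.1" no longer ends in ".js" and slips through.
func naiveShouldSkip(rawURL string) bool {
	for ext := range skippedExtensions {
		if strings.HasSuffix(rawURL, ext) {
			return true
		}
	}
	return false
}

// shouldSkip checks the extension of the URL path only, so a query string
// like "?ver=1.1" does not affect the decision.
func shouldSkip(rawURL string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	return skippedExtensions[strings.ToLower(path.Ext(u.Path))]
}

func main() {
	for _, u := range []string{
		"https://example.com/static/app.js",
		"https://example.com/static/app.js?ver=1.1",
		"https://example.com/static/style.css?v=2",
	} {
		fmt.Printf("%-45s naive=%v fixed=%v\n", u, naiveShouldSkip(u), shouldSkip(u))
	}
}
```

Running this prints `naive=false fixed=true` for the parameterized URLs, which matches the behavior reported above.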
There is no -igq parameter in the documentation, and either way, isn't this a defect? I shouldn't need to use other parameters to work around it. Judging from the execution flow, the program is designed not to crawl these specific links, but in some special cases that design fails.