How to pass static configuration through url list file #86

Open
AlasdairGray opened this issue Nov 9, 2021 · 3 comments
Labels
bug Something isn't working

Comments

@AlasdairGray
Member

AlasdairGray commented Nov 9, 2021

@petrospaps I'm trying to override the dynamic=true parameter in the localconfig.properties file by adding a value to the end of each line in the URL list file.

Following the instructions in the README has not resulted in the static scraper being used. Below are the snippets of the url list file that I tried.

https://bgee.org/sitemap_main.xml,static
https://bgee.org/?page=gene&gene_id=ENSG00000274928,static
https://bgee.org/?page=gene&gene_id=ENSG00000274928, static

None of these overrode the dynamic setting in the local config file.

Even removing the setting from the local config file did not result in the static scraper being used.
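
A minimal sketch of the per-line override expected here, assuming a line has the form `<url>[,static|dynamic]` as in the snippets above. This is not the scraper's actual code: `UrlListLine` and `parse` are hypothetical names, and falling back to the properties-file value when no suffix is present is an assumption.

```java
import java.util.Locale;

/** Hypothetical parser for one line of a URL list file: "<url>[,static|dynamic]". */
final class UrlListLine {
    final String url;
    final boolean dynamic;

    private UrlListLine(String url, boolean dynamic) {
        this.url = url;
        this.dynamic = dynamic;
    }

    /**
     * @param line           raw line from the URL list file
     * @param defaultDynamic the dynamic value from the properties file (assumed fallback)
     */
    static UrlListLine parse(String line, boolean defaultDynamic) {
        String trimmed = line.trim();
        // Split on the LAST comma only, since a URL may itself contain commas.
        int comma = trimmed.lastIndexOf(',');
        if (comma >= 0) {
            // Trim and lower-case so ",static" and ", static" are treated alike.
            String suffix = trimmed.substring(comma + 1).trim().toLowerCase(Locale.ROOT);
            String url = trimmed.substring(0, comma).trim();
            if (suffix.equals("static")) {
                return new UrlListLine(url, false);
            }
            if (suffix.equals("dynamic")) {
                return new UrlListLine(url, true);
            }
        }
        // No recognised suffix: keep the whole line as the URL and use the default.
        return new UrlListLine(trimmed, defaultDynamic);
    }

    public static void main(String[] args) {
        String[] lines = {
            "https://bgee.org/sitemap_main.xml,static",
            "https://bgee.org/?page=gene&gene_id=ENSG00000274928,static",
            "https://bgee.org/?page=gene&gene_id=ENSG00000274928, static"
        };
        for (String line : lines) {
            // true mirrors dynamic=true in the local config file
            UrlListLine parsed = parse(line, true);
            System.out.println(parsed.url + " -> " + (parsed.dynamic ? "dynamic" : "static"));
        }
    }
}
```

With this logic all three lines above would resolve to the static scraper even though the default is dynamic, which is the behaviour the log output below does not show.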

I'm basing this on the following excerpt from the log file:

11:26:16.619 [INFO] hwu.elixir.scrape.scraper.examples.FileScraper - Attempting to scrape: https://bgee.org/?page=gene&gene_id=ENSG00000274928
11:26:16.619 [INFO] hwu.elixir.scrape.scraper.ScraperFilteredCore - dynamic scraping setting
11:26:27.713 [ERROR] hwu.elixir.scrape.scraper.ScraperCore - URL timed out: https://bgee.org/?page=gene&gene_id=ENSG00000274928. Trying JSoup.
11:26:28.295 [DEBUG] hwu.elixir.scrape.scraper.ScraperFilteredCore - Number of JSONLD sections: 0
@AlasdairGray AlasdairGray added the bug Something isn't working label Nov 9, 2021
@AlasdairGray
Member Author

Equivalent log messages when static is set in the configuration file

static scraping setting
11:31:29.682 [DEBUG] hwu.elixir.scrape.scraper.ScraperFilteredCore - Number of JSONLD sections: 0
11:31:29.728 [INFO] hwu.elixir.scrape.scraper.ScraperCore - https://bgee.org/?page=gene&gene_id=ENSA

@AlasdairGray
Member Author

Related to #85

petrospaps added a commit that referenced this issue Dec 10, 2021
@AlasdairGray
Member Author

Need to document how to annotate the configuration file


2 participants