Frequently Asked Questions
MWoffliner cannot scrape just any online MediaWiki instance.
Here are the prerequisites:
- MediaWiki version must be 1.17 or higher
- MediaWiki API must be activated, and at least one of the supported API end-points must be activated
- MediaWiki instance must be stable and able to provide proper responses for all articles requested
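The version prerequisite can be checked mechanically. The sketch below assumes the version is available as a string of the form "MediaWiki 1.39.5", such as the generator value returned by the MediaWiki siteinfo API. The helper names parse_version and meets_minimum_version are hypothetical and not part of MWoffliner.

```python
import re
from typing import Tuple

MINIMUM_VERSION = (1, 17)  # minimum stated in the prerequisites above


def parse_version(generator: str) -> Tuple[int, int]:
    """Extract (major, minor) from a string such as 'MediaWiki 1.39.5'."""
    match = re.search(r"(\d+)\.(\d+)", generator)
    if match is None:
        raise ValueError(f"no version number found in {generator!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum_version(
    generator: str, minimum: Tuple[int, int] = MINIMUM_VERSION
) -> bool:
    """Return True if the reported version is at least `minimum`."""
    # Compare as integer tuples so that 1.9 sorts below 1.17.
    return parse_version(generator) >= minimum
```

Comparing integer tuples rather than strings matters here: as text, "1.9" would sort above "1.17".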
The --mwUrl value is the MediaWiki base URL. It should be treated as a URL prefix to which the URL paths (for example the --mwWikiPath value) are appended. Usually the --mwUrl URL consists only of the protocol scheme and the domain name (for example https://en.wikipedia.org), but if the MediaWiki instance is not served from the root of the host, you might have to add a path. You can find the MediaWiki base URL by loading the main page of the remote MediaWiki instance; it is also given on its Special:Version page (for example, the one of Wikipedia in English).
The --mwWikiPath value is the MediaWiki wiki base URL path. This is the path visible in the Web browser when accessing any article, with the article ID appended directly after it. Usually this is just /wiki/. You can also use the index.php end-point path instead. For example, for Wikipedia in English, you can configure either /wiki/ or /w/index.php. You can find this path by loading the main page of the remote MediaWiki instance; it is also given on its Special:Version page (for example, the one of Wikipedia in English).
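To illustrate how the two values fit together, here is a minimal sketch that treats --mwUrl as a plain prefix and appends --mwWikiPath and an article ID to it, as described above. The function names are hypothetical, the example.org host is made up, and this is not necessarily how MWoffliner joins URLs internally.

```python
from urllib.parse import quote


def append_path(base_url: str, path: str) -> str:
    """Append a URL path to a base URL treated as a plain prefix."""
    # Normalise the slashes so exactly one separates prefix and path.
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def article_url(mw_url: str, mw_wiki_path: str, article_id: str) -> str:
    """Build the browser-visible URL of an article."""
    prefix = append_path(mw_url, mw_wiki_path)
    # The article ID is appended directly after the wiki path.
    return prefix + quote(article_id, safe=":/()")


print(article_url("https://en.wikipedia.org", "/wiki/", "Main_Page"))
# A MediaWiki that is not served from the root of the host:
print(append_path("https://example.org/mediawiki", "/wiki/"))
```

Plain concatenation is used on purpose: a resolver such as urllib.parse.urljoin would discard the /mediawiki part of the base when given an absolute path like /wiki/.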
The --mwActionApiPath value is the MediaWiki "traditional" (Action) API path. Usually this path is very similar to the --mwModulePath one, as api.php sits right beside load.php. You can find it on the Special:Version page. For example, for Wikipedia in English, this is /w/api.php.
The --mwModulePath value is the MediaWiki module load path. Usually this path is very similar to the --mwActionApiPath one, as load.php sits right beside api.php. You can find it on the Special:Version page. For example, for Wikipedia in English, this is /w/load.php.
The --mwRestApiPath value is the MediaWiki REST API URL path, used by the RestApi (desktop) HTML renderer. You can find it by looking for rest.php on the Special:Version page. For example, for Wikipedia in English, this is /w/rest.php.
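As a rough illustration of the three end-point paths, the sketch below derives a likely module path from the Action API path (relying on the remark that load.php usually sits beside api.php) and builds full end-point URLs from the base URL. The helper names are hypothetical, the sibling guess is only a heuristic, and the values actually shown on Special:Version should take precedence.

```python
import posixpath
from typing import Dict


def sibling_path(path: str, filename: str) -> str:
    """Return the path of `filename` in the same directory as `path`."""
    return posixpath.join(posixpath.dirname(path), filename)


def endpoint_urls(
    mw_url: str, action_api_path: str, module_path: str, rest_api_path: str
) -> Dict[str, str]:
    """Prefix each end-point path with the MediaWiki base URL."""
    base = mw_url.rstrip("/")
    paths = {
        "action_api": action_api_path,
        "module": module_path,
        "rest_api": rest_api_path,
    }
    return {name: base + "/" + path.lstrip("/") for name, path in paths.items()}


api_path = "/w/api.php"  # Wikipedia in English, as given above
module_path = sibling_path(api_path, "load.php")  # heuristic guess
urls = endpoint_urls("https://en.wikipedia.org", api_path, module_path, "/w/rest.php")
for name, url in urls.items():
    print(name, url)
```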
To retrieve HTML pages from a remote MediaWiki instance, MWoffliner relies on MediaWiki APIs. MediaWiki provides multiple ways to retrieve HTML pages but, depending on the MediaWiki version and the way it is set up, many of them might be unavailable. By default, MWoffliner does its best to pick the right API, giving priority to modern and mobile-friendly API end-points (see https://github.com/openzim/mwoffliner/wiki/API-end%E2%80%90points). To force the use of a specific one, use the --forceRender option.
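The sketch below shows one way to assemble a command line that forces a renderer. It makes several assumptions: the RestApi value is taken from the renderer name mentioned in the --mwRestApiPath description, --adminEmail is included on the assumption that MWoffliner expects a contact address, and the e-mail address is a placeholder. Check the tool's own help output for the accepted options and values.

```python
import shlex
from typing import List, Optional


def build_command(
    mw_url: str, admin_email: str, force_render: Optional[str] = None
) -> List[str]:
    """Assemble an argument list for a hypothetical mwoffliner run."""
    args = [
        "mwoffliner",
        f"--mwUrl={mw_url}",
        f"--adminEmail={admin_email}",  # assumed option, see the note above
    ]
    if force_render is not None:
        # Only override the automatic API selection when explicitly asked.
        args.append(f"--forceRender={force_render}")
    return args


command = build_command(
    "https://en.wikipedia.org", "admin@example.org", force_render="RestApi"
)
print(shlex.join(command))
```

Leaving force_render unset keeps the default behaviour, where MWoffliner picks the end-point itself.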