As of 1.1.0
this extention has been made to work with CKAN 2.9. While attempts have been made to maintain compatibility with prior version of CKAN, there may be issues. If any issues are discovered we are happy to accept PRs. Alternatively for compatibility <2.9 the 1.0.0
tag can be used.
Use pip
to install this plugin. This example installs it in /var/www
source /home/www-data/pyenv/bin/activate
pip install -e git+https://github.com/mediasuitenz/ckanext-oaipmh.git#egg=ckanext-oaipmh --src /var/www
cd /var/www/ckanext-oaipmh
pip install -r requirements.txt
python setup.py develop
Make sure the ckanext-harvest extension is installed as well.
Important: You need to have a sysadmin user called "harvest" on your CKAN instance!
- add
oaipmh_harvester
tockan.plugins
indevelopment.ini
(orproduction.ini
) - restart your webserver
- with the web browser go to
<your ckan url>/harvest/new
- as URL fill in the base URL of an OAI-PMH conforming repository, e.g. http://boris.unibe.ch/cgi/oai2 for more see http://www.openarchives.org/Register/BrowseSites
- select Source type
OAI-PMH Harvester
- if your OAI-PMH needs credentials, add the following to the "Configuration" section:
{"username": "foo", "password": "bar" }
- if you only want to harvest a specific set, add the following to the "Configuration" section:
{"set": "baz"}
- if you want to harvest data in a specific metadata format, add the following to the "Configuration" section:
{"metadata_prefix": "oai_dc"}
(currentlyoai_dc
andoai_ddi
are supported) - if your OAI-PMH source does not support HTTP POST and you want to enforce HTTP GET, add the following to the "Configuration" section:
{"force_http_get": true}
(defaults tofalse
) - Save
- on the harvest admin click Reharvest
On the command line do this:
- activate the python environment
cd
to the ckan directory, e.g./usr/lib/ckan/default/src/ckan
- start the consumers:
# ckan >= 2.9
ckan harvester gather-consumer
ckan harvester fetch-consumer
# ckan < 2.9
paster --plugin=ckanext-oaipmh harvester gather_consumer
paster --plugin=ckanext-oaipmh harvester fetch_consumer
- run the job:
# ckan >= 2.9
ckan harvester run
# ckan < 2.9
paster --plugin=ckanext-oaipmh harvester run
The harvester should now start and import the OAI-PMH metadata.
To make it easier to develop, tests are setup that allow to do that:
. ~/default/bin/activate
cd /var/www/ckanext-oaipmh
nosetests --logging-filter=ckanext.oaipmh.harvester --ckan --with-pylons=test.ini ckanext/oaipmh/tests
In this example the logging filter is used to only show messages of the harvester.