When I run the code in preprocess_dump.py to process the latest Wikidata dump (as of April 16) with 28 processes, I get the following error in process 28. However, the code still seems to be running and producing processed tables.
Do you know whether the error is something I should care about, or can I just ignore it?
Thanks a lot!
Process Process-28:
Traceback (most recent call last):
  File "**/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "**/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "**/simple-wikidata-db/simple_wikidata_db/preprocess_utils/worker_process.py", line 151, in process_data
    out_queue.put(process_json(ujson.loads(json_obj), language_id))
  File "**/simple-wikidata-db/simple_wikidata_db/preprocess_utils/worker_process.py", line 91, in process_json
    datatype = claim['mainsnak']['datatype']
KeyError: 'datatype'
I haven't had a chance to try to reproduce the error, but it looks like at least one of the claim objects doesn't have a datatype key. I haven't seen this error before, so I wonder if it's something in the most recent dump.
One small fix would be to skip all claims that don't have a datatype key and count how many you drop (or write them to an error log file).
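A minimal sketch of that idea, not the actual code in worker_process.py: the helper name `iter_valid_claims` and the `dropped` counter are made up for illustration, and the input is assumed to have the shape of the `claims` field of a Wikidata entity (a dict mapping each property ID to a list of claim dicts).

```python
import logging
from collections import Counter

logger = logging.getLogger(__name__)


def iter_valid_claims(claims, dropped):
    """Yield (property_id, claim, datatype) for claims that carry a datatype.

    Hypothetical helper. `claims` is assumed to map property IDs to lists of
    claim dicts; `dropped` is a Counter that records skipped claims per property.
    """
    for property_id, claim_list in claims.items():
        for claim in claim_list:
            # .get() avoids the KeyError when 'mainsnak' or 'datatype' is missing.
            datatype = claim.get("mainsnak", {}).get("datatype")
            if datatype is None:
                dropped[property_id] += 1
                logger.warning(
                    "Skipping claim %s of property %s: no datatype",
                    claim.get("id"),
                    property_id,
                )
                continue
            yield property_id, claim, datatype


if __name__ == "__main__":
    example_claims = {
        "P31": [
            {"id": "Q1$a", "mainsnak": {"datatype": "wikibase-item"}},
            {"id": "Q1$b", "mainsnak": {"snaktype": "value"}},  # no datatype
        ],
    }
    dropped = Counter()
    kept = list(iter_valid_claims(example_claims, dropped))
    print(len(kept), "kept;", sum(dropped.values()), "dropped")
```

With one counter per worker process, the totals can be reported when each worker finishes, which shows whether the missing key affects a handful of claims or a large share of the dump.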
Thanks a lot for your reply.
I do not know much about Wikidata, so I'm not sure whether the missing datatype key is something recent.
I'll try your suggestion.
However, even with the error reported, the code seems to work fine and all the output tables look OK.