The dbmgr tool needs to be faster/more frugal with memory #13

Open
jkoshy opened this issue Apr 13, 2011 · 1 comment

jkoshy commented Apr 13, 2011

A complete planet.osm dump currently holds on the order of a billion nodes, ninety million ways, and just under a million relations, per the current statistics for the OSM database.

To process a data set of this size in a reasonable amount of time, the dbmgr ingestion tool needs to be considerably faster and more frugal with memory.

@ghost ghost assigned jkoshy Apr 13, 2011

jkoshy commented Apr 20, 2011

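An ingestion run was profiled with Python's cProfile module; the 25 most expensive functions by internal time are listed below.

JSON decoding dominates the run: the standard library's JSON scanner (`iterscan`) accounts for about 1,250 of the 2,356 CPU seconds in cumulative time, so a faster JSON decoder may help.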

>>> p.sort_stats('time').print_stats(25)
Wed Apr 20 22:52:58 2011    prof.1

       606063949 function calls (538093040 primitive calls) in 2355.761 CPU seconds

   Ordered by: internal time
   List reduced from 569 to 25 due to restriction <25>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
2894113/2423987  293.772    0.000 1037.676    0.000 /usr/lib/python2.6/json/decoder.py:162(JSONObject)
62802266/179064  252.699    0.000 1249.725    0.007 /usr/lib/python2.6/json/scanner.py:38(iterscan)
  1841908  236.037    0.000  236.037    0.000 {built-in method acquire}
    89532  191.165    0.002  191.165    0.002 {method 'sendall' of '_socket.socket' objects}
   186671  100.853    0.001  100.853    0.001 {method 'recv' of '_socket.socket' objects}
 33943035   86.064    0.000   86.064    0.000 {_json.scanstring}
4965746/89532   84.138    0.000 1247.784    0.014 /usr/lib/python2.6/json/decoder.py:206(JSONArray)
 12299462   83.452    0.000  131.961    0.000 /usr/lib/python2.6/json/decoder.py:60(JSONNumber)
  2423987   81.286    0.000   81.286    0.000 ./apiserver/osmelement.py:138(from_mapping)
   897109   73.387    0.000  143.122    0.000 ./dbmgr/dbm_input.py:39(_make_osm_iterator)
128843635   67.806    0.000   67.806    0.000 {built-in method end}
 50502797   52.652    0.000   52.652    0.000 {built-in method scanner}
   897497   50.951    0.000   94.286    0.000 ./dbmgr/dbm_stats.py:98(increment_stats)
   153426   50.850    0.000  212.364    0.001 ./datastore/lrucache.py:258(insert_slab)
 31401133   46.769    0.000   46.769    0.000 {built-in method match}
 11241805   45.127    0.000   91.058    0.000 /usr/lib/python2.6/json/decoder.py:152(JSONString)
  3330996   32.307    0.000   76.072    0.000 ./apiserver/osmelement.py:116(__init__)
   153426   32.257    0.000   48.994    0.000 ./datastore/lrucache.py:233(_remove_slab_items)
   387550   27.715    0.000   27.742    0.000 {map}
  4443187   27.343    0.000   27.343    0.000 ./datastore/slabutil.py:35(_make_numeric_slabkey)
 43465635   24.734    0.000   24.734    0.000 {getattr}
   632818   24.690    0.000   91.443    0.000 /usr/lib/python2.6/threading.py:116(acquire)
 43700595   24.624    0.000   24.624    0.000 {built-in method span}
34895320/34895075   17.521    0.000   17.521    0.000 {len}
   897108   17.395    0.000 2285.756    0.003 ./dbmgr/dbm_ops.py:50(add_element)
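
For anyone wanting to reproduce a listing like the one above, here is a minimal sketch of the cProfile/pstats workflow. The `decode_records` function and the sample data are hypothetical stand-ins for the real ingestion path, which is not shown in this issue; only the `sort_stats('time').print_stats(25)` call is taken from the session above.

```python
import cProfile
import io
import json
import pstats


def decode_records(lines):
    """Hypothetical workload: decode one JSON document per line."""
    return [json.loads(line) for line in lines]


# Made-up sample input; the real run ingested OSM data instead.
sample = ['{"id": %d, "lat": 1.5, "lon": 2.5, "tags": {"k": "v"}}' % i
          for i in range(1000)]

# Profile a single call of the workload.
profiler = cProfile.Profile()
records = profiler.runcall(decode_records, sample)

# Sort by internal time and print the top 25 entries, as in the session above.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)
stats.sort_stats('time').print_stats(25)
report = buf.getvalue()
print(report)
```

A profile saved to disk (for example with `python -m cProfile -o prof.1 script.py`) can be loaded the same way by passing the file name to `pstats.Stats`.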

jkoshy pushed a commit that referenced this issue Apr 21, 2011
* Add an `as_mapping()` method to `class OSMElement`; this translates
  an `OSMElement` instance into a Python object that can be directly
  converted to JSON.
* Use the `as_mapping()` method when converting elements and slabs
  to JSON prior to transmitting them on the wire.
* Use the `cjson` library instead of the default JSON library bundled
  with Python.

Issue:	#13
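
The change described in this commit could look roughly like the sketch below. It is an illustration under assumptions, not the project's code: the `OSMElement` fields, the `encode`/`decode` helper names, and the signature of `from_mapping()` are all hypothetical (only the names `as_mapping()`, `from_mapping`, and `OSMElement` appear in this thread). `cjson` is used when it is installed; otherwise the sketch falls back to the standard `json` module.

```python
import json

try:
    # python-cjson, a C implementation exposing encode() and decode().
    import cjson
    encode, decode = cjson.encode, cjson.decode
except ImportError:
    # Fallback (assumption): the standard library when cjson is unavailable.
    encode, decode = json.dumps, json.loads


class OSMElement(object):
    """Hypothetical stand-in for the project's OSMElement class."""

    def __init__(self, namespace, elem_id, tags=None):
        self.namespace = namespace
        self.id = elem_id
        self.tags = dict(tags or {})

    def as_mapping(self):
        """Return plain dicts, strings and numbers that a JSON encoder
        can serialize directly, without a custom encoder hook."""
        return {"namespace": self.namespace,
                "id": self.id,
                "tags": dict(self.tags)}

    @classmethod
    def from_mapping(cls, mapping):
        """Rebuild an element from a decoded mapping (assumed signature)."""
        return cls(mapping["namespace"], mapping["id"], mapping.get("tags"))


node = OSMElement("node", 42, {"amenity": "cafe"})
wire = encode(node.as_mapping())          # element -> JSON text
restored = OSMElement.from_mapping(decode(wire))
```

Keeping the conversion to plain Python objects in `as_mapping()` means the encoder can be swapped without touching the element class.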
jkoshy pushed a commit to jkoshy/osm-api-server that referenced this issue Apr 22, 2011