Out of memory running parse.py #8

Open
yschimke opened this issue Mar 5, 2013 · 2 comments

yschimke commented Mar 5, 2013

I worked around this by setting:

export JAVA_TOOL_OPTIONS="-Xmx4G"

[info] imported 3600000 alternateNames so far
[info] imported 3700000 alternateNames so far
[error] java.lang.OutOfMemoryError: GC overhead limit exceeded
[error] at java.lang.String.substring(String.java:1913)
[error] at java.lang.String.split(String.java:2303)
[error] at java.lang.String.split(String.java:2355)
[error] at com.foursquare.twofishes.importers.geonames.AlternateNamesReader$$anonfun$readAlternateNamesFile$1.apply(AlternateNamesReader.scala:25)
[error] at com.foursquare.twofishes.importers.geonames.AlternateNamesReader$$anonfun$readAlternateNamesFile$1.apply(AlternateNamesReader.scala:20)
[error] at scala.collection.Iterator$class.foreach(Iterator.scala:660)
[error] at scala.collection.Iterator$$anon$29.foreach(Iterator.scala:607)
[error] at com.foursquare.twofishes.importers.geonames.AlternateNamesReader$.readAlternateNamesFile(AlternateNamesReader.scala:20)
[error] at com.foursquare.twofishes.importers.geonames.GeonamesParser.loadAlternateNames(GeonamesParser.scala:500)
[error] at com.foursquare.twofishes.importers.geonames.GeonamesParser$$anonfun$loadIntoMongo$1.apply$mcV$sp(GeonamesParser.scala:100)
[error] at com.foursquare.twofishes.importers.geonames.GeonamesParser$$anonfun$loadIntoMongo$1.apply(GeonamesParser.scala:100)
[error] at com.foursquare.twofishes.importers.geonames.GeonamesParser$$anonfun$loadIntoMongo$1.apply(GeonamesParser.scala:100)
[error] at com.twitter.util.Duration$.inMilliseconds(Time.scala:346)
[error] at com.foursquare.twofishes.util.Helpers$.duration(Helpers.scala:8)
[error] at com.foursquare.twofishes.importers.geonames.GeonamesParser$.loadIntoMongo(GeonamesParser.scala:99)
[error] at com.foursquare.twofishes.importers.geonames.GeonamesParser$.main(GeonamesParser.scala:76)
[error] at com.foursquare.twofishes.importers.geonames.GeonamesParser.main(GeonamesParser.scala)
[error] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[error] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
[error] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[error] at java.lang.reflect.Method.invoke(Method.java:601)
[error] at scala.tools.nsc.util.ScalaClassLoader$$anonfun$run$1.apply(ScalaClassLoader.scala:78)
[error] at scala.tools.nsc.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:24)
[error] at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.asContext(ScalaClassLoader.scala:88)
[error] at scala.tools.nsc.util.ScalaClassLoader$class.run(ScalaClassLoader.scala:78)
[error] at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.run(ScalaClassLoader.scala:101)
[error] at scala.tools.nsc.ObjectRunner$.run(ObjectRunner.scala:33)
[error] at scala.tools.nsc.ObjectRunner$.runAndCatch(ObjectRunner.scala:40)
[error] at scala.tools.nsc.MainGenericRunner.runTarget$1(MainGenericRunner.scala:56)
[error] at scala.tools.nsc.MainGenericRunner.process(MainGenericRunner.scala:80)
[error] at scala.tools.nsc.MainGenericRunner$.main(MainGenericRunner.scala:89)
[error] at scala.tools.nsc.MainGenericRunner.main(MainGenericRunner.scala)
java.lang.RuntimeException: Nonzero exit code returned from runner: 1
at scala.sys.package$.error(package.scala:27)
[trace] Stack trace suppressed: run last indexer/compile:run-main for the full output.
error Nonzero exit code returned from runner: 1
[error] Total time: 227 s, completed Mar 5, 2013 5:54:54 PM

yschimke (Author) commented Mar 6, 2013

One other thing I noticed while looking into this with jvisualvm: the large heap settings of 6-12GB are applied to SBT (xsbt.boot.Boot), while the process that was failing was a forked Scala process (scala.tools.nsc.MainGenericRunner) running com.foursquare.twofishes.importers.geonames.GeonamesParser.main. That forked process had the standard heap settings until I set JAVA_TOOL_OPTIONS.
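
For anyone reproducing this, here is a minimal shell sketch of that workaround. It relies on the JVM reading JAVA_TOOL_OPTIONS at startup, which is why the setting reaches the forked process as well as SBT. The helper name `with_heap` is made up for illustration, and the final check assumes a HotSpot `java` on the PATH.

```shell
# Hypothetical helper: append a max-heap flag to JAVA_TOOL_OPTIONS,
# unless an -Xmx flag is already present there.
with_heap() {
  case " ${JAVA_TOOL_OPTIONS:-} " in
    *" -Xmx"*) ;;  # an -Xmx flag is already set; leave it alone
    *) JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS:+$JAVA_TOOL_OPTIONS }-Xmx$1" ;;
  esac
  export JAVA_TOOL_OPTIONS
}

with_heap 4G

# Optional check (assumes a HotSpot JVM): print the max heap a newly
# started JVM would get with the current environment.
if command -v java >/dev/null 2>&1; then
  java -XX:+PrintFlagsFinal -version 2>/dev/null | grep MaxHeapSize || true
fi
```

A HotSpot JVM also prints `Picked up JAVA_TOOL_OPTIONS: ...` on stderr when the variable is set, which is a quick sign that a given process actually saw it.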

blackmad (Contributor) commented Mar 6, 2013

Thanks for finding that. I hadn't had issues with OOM before, but that
makes a lot of sense.

I keep meaning to trim the memory requirements of the builder. The two issues
are that expanding mongo records is slow, so it's nice to read everything
into memory before writing to mongo, and that some of the output files
require an in-memory sort of the keys.

I also keep meaning to publish prebuilt indexes, along with pre-matched
open polygons. I'm going to try to find time for that in the next few weeks.

--dave
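
As an illustration of the sorting point only, and not how the builder works today: a key sort can be done outside the JVM heap with sort(1), which (at least in GNU coreutils) spills to temporary files when the input is larger than its buffer. The sketch below assumes a tab-separated file with the key in the first column; the function name `sort_by_key` and the file layout are assumptions made for the example.

```shell
# Hypothetical helper: sort a tab-separated file by its first column
# using sort(1) instead of an in-memory sort inside the JVM.
# Usage: sort_by_key INPUT OUTPUT [TMPDIR]
sort_by_key() {
  sbk_tab="$(printf '\t')"
  # LC_ALL=C gives plain byte-order comparison of keys.
  # -T chooses where sort may write its temporary files.
  LC_ALL=C sort -t "$sbk_tab" -k1,1 -T "${3:-${TMPDIR:-/tmp}}" -o "$2" "$1"
}
```

Byte order under `LC_ALL=C` may differ from the ordering the index files expect, so the comparison would need to match whatever the reader of those files assumes.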
