Error converting tiffs to n5 #34

Open
BioinfoTongLI opened this issue Nov 10, 2021 · 4 comments

@BioinfoTongLI
Hello there!

I am trying to prepare the data by converting TIFF files to N5 using spark-local/convert-tiff-tiles-n5.py.
The package was built by following the README.
Running spark-local/convert-tiff-tiles-n5.py -i /to/my/json gives me the error below. All the TIFFs are in the same folder as the JSON file.

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
21/11/10 21:25:16 WARN Utils: Your hostname, imaging-gpu-tl10 resolves to a loopback address: 127.0.0.1; using 10.0.101.148 instead (on interface ens3)
21/11/10 21:25:16 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
21/11/10 21:25:16 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
error.log: command not found
21/11/10 21:25:18 ERROR Executor: Exception in task 29.0 in stage 0.0 (TID 29)
java.lang.RuntimeException: dimensionality mismatch
	at org.janelia.stitching.ConvertTIFFTilesToN5Spark.convertTileToN5(ConvertTIFFTilesToN5Spark.java:199)
	at org.janelia.stitching.ConvertTIFFTilesToN5Spark.lambda$convertTilesToN5$cbf5f68e$1(ConvertTIFFTilesToN5Spark.java:161)
	at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1040)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
	at scala.collection.Iterator$class.foreach(Iterator.scala:893)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
	at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
	at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
	at scala.collection.AbstractIterator.to(Iterator.scala:1336)
	at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
	at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1336)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1336)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:939)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:939)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
21/11/10 21:25:18 WARN TaskSetManager: Lost task 29.0 in stage 0.0 (TID 29, localhost, executor driver): java.lang.RuntimeException: dimensionality mismatch
	at org.janelia.stitching.ConvertTIFFTilesToN5Spark.convertTileToN5(ConvertTIFFTilesToN5Spark.java:199)
	at org.janelia.stitching.ConvertTIFFTilesToN5Spark.lambda$convertTilesToN5$cbf5f68e$1(ConvertTIFFTilesToN5Spark.java:161)
	at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1040)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
	at scala.collection.Iterator$class.foreach(Iterator.scala:893)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
	at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
	at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
	at scala.collection.AbstractIterator.to(Iterator.scala:1336)
	at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
	at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1336)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1336)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:939)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:939)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

21/11/10 21:25:18 ERROR TaskSetManager: Task 29 in stage 0.0 failed 1 times; aborting job
21/11/10 21:25:18 ERROR Executor: Exception in task 30.0 in stage 0.0 (TID 30)
java.lang.RuntimeException: dimensionality mismatch
	at org.janelia.stitching.ConvertTIFFTilesToN5Spark.convertTileToN5(ConvertTIFFTilesToN5Spark.java:199)
	at org.janelia.stitching.ConvertTIFFTilesToN5Spark.lambda$convertTilesToN5$cbf5f68e$1(ConvertTIFFTilesToN5Spark.java:161)
	at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1040)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
	at scala.collection.Iterator$class.foreach(Iterator.scala:893)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
	at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
	at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
	at scala.collection.AbstractIterator.to(Iterator.scala:1336)
	at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
	at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1336)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1336)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:939)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:939)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 29 in stage 0.0 failed 1 times, most recent failure: Lost task 29.0 in stage 0.0 (TID 29, localhost, executor driver): java.lang.RuntimeException: dimensionality mismatch
	at org.janelia.stitching.ConvertTIFFTilesToN5Spark.convertTileToN5(ConvertTIFFTilesToN5Spark.java:199)
	at org.janelia.stitching.ConvertTIFFTilesToN5Spark.lambda$convertTilesToN5$cbf5f68e$1(ConvertTIFFTilesToN5Spark.java:161)
	at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1040)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
	at scala.collection.Iterator$class.foreach(Iterator.scala:893)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
	at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
	at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
	at scala.collection.AbstractIterator.to(Iterator.scala:1336)
	at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
	at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1336)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1336)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:939)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:939)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1602)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1590)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1589)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1589)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
	at scala.Option.foreach(Option.scala:257)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1823)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1772)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1761)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2034)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2055)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2074)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2099)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:939)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
	at org.apache.spark.rdd.RDD.collect(RDD.scala:938)
	at org.apache.spark.api.java.JavaRDDLike$class.collect(JavaRDDLike.scala:361)
	at org.apache.spark.api.java.AbstractJavaRDDLike.collect(JavaRDDLike.scala:45)
	at org.janelia.stitching.ConvertTIFFTilesToN5Spark.convertTilesToN5(ConvertTIFFTilesToN5Spark.java:174)
	at org.janelia.stitching.ConvertTIFFTilesToN5Spark.run(ConvertTIFFTilesToN5Spark.java:121)
	at org.janelia.stitching.ConvertTIFFTilesToN5Spark.main(ConvertTIFFTilesToN5Spark.java:100)
Caused by: java.lang.RuntimeException: dimensionality mismatch
	at org.janelia.stitching.ConvertTIFFTilesToN5Spark.convertTileToN5(ConvertTIFFTilesToN5Spark.java:199)
	at org.janelia.stitching.ConvertTIFFTilesToN5Spark.lambda$convertTilesToN5$cbf5f68e$1(ConvertTIFFTilesToN5Spark.java:161)
	at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1040)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
	at scala.collection.Iterator$class.foreach(Iterator.scala:893)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
	at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
	at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
	at scala.collection.AbstractIterator.to(Iterator.scala:1336)
	at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
	at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1336)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1336)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:939)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:939)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

And the json file looks like this:

[
    {
        "index": 0,
        "file": "02_03_01.ome.tif",
        "position": [
            -6036,
            6036
        ],
        "size": [
            2160,
            2160
        ],
        "pixelResolution": [
            0.1481,
            0.1481
        ],
        "type": "GRAY16"
    }
]

Any ideas what I did wrong?
Thanks!
Tong

@bogovicj
Contributor

Hi @BioinfoTongLI ,

The default block size for N5 datasets is 128,128,64 (see here), but it looks like you're exporting a 2D dataset. Is that correct?

If so, try re-running adding -b 128,128, so the block dimensions match your image dimensions.
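One way to check this before launching Spark is to compare the number of dimensions in each tile entry against the block size. The following is a minimal sketch and not part of the package: the helper name `check_tile_dimensions` is hypothetical, and it assumes the tile configuration is a JSON list of entries with `position`, `size`, and `pixelResolution` arrays, as in the file posted above. It only mirrors the reasoning in this comment; what the converter checks internally is not shown here.

```python
import json


def check_tile_dimensions(tiles, block_size):
    """Return a list of human-readable problems; empty if everything agrees.

    `tiles` is the parsed tile configuration (a list of dicts) and
    `block_size` is a sequence of ints such as (128, 128) or (128, 128, 64).
    """
    problems = []
    for tile in tiles:
        name = tile.get("file", "<unknown>")
        ndim = len(tile["size"])
        # position, size and pixelResolution should describe the same axes.
        for key in ("position", "pixelResolution"):
            if key in tile and len(tile[key]) != ndim:
                problems.append(
                    f"{name}: {key} has {len(tile[key])} entries, size has {ndim}"
                )
        # The block size needs one entry per image axis.
        if len(block_size) != ndim:
            problems.append(
                f"{name}: block size has {len(block_size)} entries, size has {ndim}"
            )
    return problems


# The entry from the JSON file above, checked against the default 3D
# block size and against a 2D block size.
example = json.loads("""
[{"index": 0, "file": "02_03_01.ome.tif",
  "position": [-6036, 6036], "size": [2160, 2160],
  "pixelResolution": [0.1481, 0.1481], "type": "GRAY16"}]
""")
problems_default = check_tile_dimensions(example, (128, 128, 64))
problems_2d = check_tile_dimensions(example, (128, 128))
print(problems_default)
print(problems_2d)
```

With the 2D entry above, the three-element default block size is reported as a mismatch, while a two-element block size passes.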

@BioinfoTongLI
Author

Hi @bogovicj, I've tried both 3D and 2D data, each with and without -b.
For the 2D version, adding -b changed the error from
Caused by: java.lang.RuntimeException: dimensionality mismatch
to Caused by: java.lang.NullPointerException.
For 3D, the error is always Caused by: java.lang.RuntimeException: dimensionality mismatch, with or without -b.

@bogovicj
Contributor

@BioinfoTongLI

Let's deal with one case at a time.

2D

Let me make sure I understand what you did. You ran the script convert-tiff-tiles-n5.py with your 2D data and initially got the dimensionality mismatch error, then you added a -b option and you got a NullPointerException.

Is that correct? If so, please provide the full error trace for the NullPointerException you're getting now.

3D

Did you run the same script with 3D TIFFs, or are they 2D TIFFs that you want to assemble into a 3D N5?
More detail here would help, as would the full error traces (or at least the first ~5 lines of each).
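For long Spark logs like the one above, a short script can pull out each exception header with its first few stack frames. This is a rough sketch, assuming the log is plain text in which stack frames start with whitespace followed by `at `; the function name `summarize_exceptions` and the frame limit are illustrative, not part of any tool mentioned here.

```python
import re

# An exception header: either a bare class name ending in Exception/Error,
# or a "Caused by: ..." line.
HEADER = re.compile(r"^(Caused by: )?[\w.$]+(Exception|Error)\b")


def summarize_exceptions(log_text, max_frames=5):
    """Return each exception header followed by at most `max_frames` frames."""
    out = []
    frames_left = 0
    for line in log_text.splitlines():
        if HEADER.match(line):
            out.append(line)
            frames_left = max_frames
        elif frames_left > 0 and line.lstrip().startswith("at "):
            out.append(line)
            frames_left -= 1
        else:
            # Any other line ends the current trace excerpt.
            frames_left = 0
    return out


sample = """21/11/10 21:25:18 ERROR Executor: Exception in task 29.0 in stage 0.0 (TID 29)
java.lang.RuntimeException: dimensionality mismatch
\tat org.janelia.stitching.ConvertTIFFTilesToN5Spark.convertTileToN5(ConvertTIFFTilesToN5Spark.java:199)
\tat org.janelia.stitching.ConvertTIFFTilesToN5Spark.lambda$convertTilesToN5$cbf5f68e$1(ConvertTIFFTilesToN5Spark.java:161)
\tat org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1040)
"""
summary = summarize_exceptions(sample, max_frames=2)
print("\n".join(summary))
```

Timestamped log lines are skipped, so only the exception headers and the frames directly beneath them are kept.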

@BioinfoTongLI
Author

Hi @bogovicj ,

Is that correct? If so, please provide the full error trace for the NullPointerException you're getting now

Yes, exactly. And sure, here is the output for each condition.

2d_with_b.log
2d_default.log

Did you run the same script with 3D TIFFs, or are they 2D TIFFs that you want to assemble into a 3D N5?

They are true canonical 3D ome.tifs. The Z dimension can be read correctly with QuPath and Fiji.
And here are the logs for each condition.

3d_default.log
3d_with_b.log

Just to clarify, I ran the script with 2D options on 2D data and 3D options on 3D data.
Many thanks for your help! And let me know if you need anything else.

Best,
Tong
