windowing and type conversion possible using `input_prep_fn`, `output_prep_fn`? #16
If you still have the profiling results, can you tell where the astype conversion is happening (i.e. are you sure flowdec is doing it)? I don't recall there being anything in there that tries to manage types beyond the implicit cast that comes with specifying data array inputs, but you could try something like:

```python
def input_prep_fn(tensor_name, tensor):
    # tensor_name will be one of 'data' or 'kernel'
    return tf.cast(tensor, tf.float32)
```

But I don't think this will change anything, since that should be what happens without the explicit cast. On the output side of things though, perhaps something like this will help?

```python
def output_prep_fn(tensor_name, tensor, inputs):
    # This is only applied to the single result tensor so tensor_name can be ignored
    return tf.cast(tf.clip_by_value(tensor, tf.uint16.min, tf.uint16.max), tf.uint16)
```

Those would then get passed in when constructing the deconvolver. When the tensors are passed to those functions, any operations that you add get appended to the TF graph and will most likely run on a GPU. Does that help?

The type conversion and apodization are easy to express as tensorflow operations, but the window function might be trickier. If you can tell me more about that, I could probably help find what tensorflow functions would make it efficient. Either way though, I think all of the above should work using `input_prep_fn`/`output_prep_fn` since it's basically like altering the source code used to build the TF graph.
Thanks for opening the discussion. I've been following your work for quite a while now and I must say it's really impressive!

which gives:

Both parts should do the same, but the latter one is approx. 6x faster. Any ideas?

Another thing I found is that the tf.angle() function is not (yet) implemented on GPU, which causes expensive CPU<->GPU copying all the time.

Another aspect is the long time it takes to run the first Session.run(). I guess TF does a set of pre-compilations in order to simplify the graph, but for a simple 3D convolution it already takes longer than the equivalent model in e.g. Matlab/numpy. Did you face this problem at some point, and if so, do you have any suggestions to solve it? Best
Thanks @beniroquai ! That is odd that the one initialization is so much slower, but depending on what you were working towards, a placeholder is probably a better starting point for building a graph anyhow (vs building it with constants).

I'd imagine option #2 is good if you want to use tensorflow much like you'd use numpy, where everything happens line by line. I think you can generally get better performance overall with option #1, but it's a bit harder to use and there's always that "first run" slowness. I don't think there is a good way around that outside of eager execution, but it's usually not too hard to specify inputs using placeholders instead of constants to make a reusable graph that will execute quickly after the first time (and maybe run some dummy values through once on startup if you've got a real-time use case?).
Thanks for the detailed reply @eric-czech
This is a misunderstanding. The type conversions were not happening in flowdec; they were happening in my own batch code before and after each session run. As I noticed this was quite slow, I was looking for ways to also do this with tensorflow, and when poring over your code I realized that this might be possible using the `input_prep_fn`/`output_prep_fn` hooks. Thanks for those examples.
Yes, that's what I was aiming for. My original question was really: how do I declare this apodization array so that the tensorflow graph knows about it? At the time I was looking at a code snippet from the flowdec source and wondering whether I would have to add an additional line for an apodization array. I'm still not quite clear on this. Then, do I have to add it to the input dictionary?
So do I need to pass a tf.Variable? Apologies if that all sounds a bit confused, but I'm only just getting into tensorflow.
With the window function I basically meant multiplication with an apodization array (this could be a sine window or similar which I can easily pre-compute using numpy, so nothing tricky there).

Finally, there is a question about persistence. Currently I pass in the kernel each time I run a deconvolution. As the kernel is the same for each run, I wonder whether it could be kept on the GPU instead.
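For concreteness, the kind of pre-computed window I mean can be built with a few lines of numpy; the shape here is just an example, and the sine window is only one of several possible choices:

```python
import numpy as np

# Separable 3D sine (Hann-like) apodization window, built by broadcasting
# one 1D window per axis; the shape (32, 64, 64) is just an example.
shape = (32, 64, 64)
wz, wy, wx = [np.sin(np.pi * (np.arange(n) + 0.5) / n) for n in shape]
apod = wz[:, None, None] * wy[None, :, None] * wx[None, None, :]
# `apod` tapers smoothly toward the volume edges and would simply be
# multiplied with the image before deconvolution
```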
Thank you very much for the comprehensive answer! I tried some of your suggestions, but I guess the first sess.run() is always slow as it compiles the graph. Saving and loading the graph might be an option. Eager execution not really, but maybe TF 2.0 makes the difference? Best
Hey @VolkerH sorry for the delay, just getting back from vacation. I see what you mean now and thanks for laying that out. I think you could safely avoid having to add new tf.Variable tensors and instead stick to adding an apodization array as a constant in the `input_prep_fn`, since the type and shape declaration within the graph would be implicit in how you define that function (i.e. when you multiply the image tensor by the apodization matrix as a numpy array, Tensorflow would infer the type and shape of that numpy array before adding it as a tensor to the graph forever). I hope that makes sense, and if it helps further, here is an example that places both an apodization array and a PSF as constants on the graph instead of having to feed them in with placeholders:

```python
import tensorflow as tf
import numpy as np
from flowdec import restoration as fd_restoration
from flowdec import data as fd_data

# Load Hollow Bars volume downsampled to 25%
acq = fd_data.bars_25pct()
print(acq.shape())
# > {'actual': (32, 64, 64), 'data': (32, 64, 64), 'kernel': (32, 64, 64)}

# Create a dummy apodization array
apod = np.ones_like(acq.data)

def input_prep_fn(tensor_name, tensor):
    # Multiply image tensor by apodization matrix (multiplication will convert `apod` to a Tensor)
    if tensor_name.startswith('data'):
        return tensor * apod
    # Return psf as a constant with explicit conversion to Tensor since the output
    # of these functions must be a Tensor
    if tensor_name.startswith('kernel'):
        return tf.constant(acq.kernel)
    raise ValueError('Tensor %s not supported' % tensor_name)

# Initialize with above function
algo = fd_restoration.RichardsonLucyDeconvolver(3, input_prep_fn=input_prep_fn).initialize()

# Pass the kernel as a tiny array to ultimately be ignored
# * `algo` could then be reused on other images with both the PSF and apodization array as constants
res = algo.run(fd_data.Acquisition(data=acq.data, kernel=np.ones((1, 1, 1))), niter=25)
```

I was experimenting with this, and I think the real trick to keeping anything in GPU memory for multiple images will be to change the API to not create a new TensorFlow session each time (which is what happens on every algo.run call). That should overlap a good bit with #17, and at a quick glance it looks like most of the FFT operations support batch dimensions, so it shouldn't be too hard to add support for multiple images and PSFs at some point, with broadcasting to support the many-images + one-PSF case.

Oh also, unfortunately I made a mistake in how I was setting references to the input placeholders that needed to be fixed for the above example to work, but I just pushed the change.
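As a rough numpy sketch of that broadcasting idea (not flowdec's implementation), a single PSF spectrum can be broadcast against a whole batch of image spectra so it is only computed once; all sizes here are arbitrary toy values:

```python
import numpy as np

# Convolve a batch of images with one shared PSF via FFT.
rng = np.random.default_rng(0)
imgs = rng.random((4, 8, 8))              # batch of 4 toy images
psf = rng.random((8, 8))
psf /= psf.sum()                          # normalize the PSF

otf = np.fft.fft2(psf)                              # one spectrum, shape (8, 8)
spec = np.fft.fft2(imgs, axes=(-2, -1))             # batch of spectra, shape (4, 8, 8)
conv = np.fft.ifft2(spec * otf, axes=(-2, -1)).real  # (8, 8) broadcasts over (4, 8, 8)

# Same result as convolving each image separately in a loop
ref = np.stack([np.fft.ifft2(np.fft.fft2(im) * otf).real for im in imgs])
```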
Thanks very much for your time and that example. This answers my original question (I guess the issue can be closed).
Hi,

I'm just trying to get into tensorflow to be able to modify flowdec to my needs. There are two things I am trying to achieve:

1. type conversions (currently done outside of tensorflow using astype(np.*))
2. windowing (multiplication with a pre-computed apodization array)

While looking through the flowdec source code to see where I could add these things, I noticed the `input_prep_fn` and `output_prep_fn` stubs, and I am wondering whether I could somehow use these for the above-mentioned purposes. However, in both cases I somehow need to allocate additional arrays (or "tensors"). I notice that inputs and outputs are passed in as dictionaries. So can I achieve these objectives by initializing the deconvolution object with some additional key/value pairs in the input/output dictionaries and passing in appropriate `input_prep_` and `output_prep_` functions, or do I need to make modifications to the actual code in flowdec/restoration.py? Some guidance on how to approach this would be highly appreciated.
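To clarify what I mean, here is a toy sketch of the hook pattern I have in mind; `ToyDeconvolver` and its internals are entirely made up for illustration and are not flowdec's actual API:

```python
import numpy as np

class ToyDeconvolver:
    # Illustrative only: a processor that applies user-supplied prep functions
    # before and after its core step (the core algorithm is elided here).
    def __init__(self, input_prep_fn=None, output_prep_fn=None):
        self.input_prep_fn = input_prep_fn or (lambda name, t: t)
        self.output_prep_fn = output_prep_fn or (lambda name, t, inputs: t)

    def run(self, data, kernel):
        inputs = {
            'data': self.input_prep_fn('data', np.asarray(data)),
            'kernel': self.input_prep_fn('kernel', np.asarray(kernel)),
        }
        result = inputs['data']  # stand-in for the actual deconvolution
        return self.output_prep_fn('result', result, inputs)

algo = ToyDeconvolver(
    input_prep_fn=lambda name, t: t.astype(np.float32),
    output_prep_fn=lambda name, t, inputs: np.clip(t, 0, 65535).astype(np.uint16),
)
out = algo.run(np.array([1.5, 70000.0]), np.ones(3))
```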