second run of acquire(events) ends in freeze of MM with no images taken; specific to using "DA Z Stage" as Core-Focus #194

Closed
linshaova opened this issue Dec 1, 2020 · 27 comments
Labels
bug Something isn't working

Comments

@linshaova

linshaova commented Dec 1, 2020

Bug report

Bug summary

Code for reproduction

from pycromanager import Acquisition, multi_d_acquisition_events

# The first run succeeds; the second run of the block below freezes, with the image window showing black.
# (save_dir and save_name are assumed to be defined earlier in the session.)
with Acquisition(directory=save_dir, name=save_name) as acq:
    events = multi_d_acquisition_events(num_time_points=4, z_start=0, z_end=20, z_step=1)
    acq.acquire(events)

# If there's no Z involved in "events" then running this multiple times works fine

Expected outcome

This should run as many times as I ask it to
Actual outcome
Only the first run works. When I run it again, the image window pops up but everything freezes. Micro-Manager can't even be closed normally; it has to be force-closed from Task Manager.

Version Info

  • Operating system: Windows 10 build 1909
  • pycromanager version: 0.7.1
  • Micro-Manager version: 2.0-gamma nightly build 20201130
  • Python version: 3.7.9
  • Python environment (command line, IDE, Jupyter notebook, etc): IPython 7.19.0
@linshaova linshaova added the bug Something isn't working label Dec 1, 2020
@henrypinkard
Member

I was not able to reproduce this on Windows or OSX using the demo configuration. Maybe this is a consequence of the specific Z-drive you are using? Do you see the same behavior when using the Micro-Manager demo configuration?

@linshaova
Author

Thanks for your quick response! Using the demo config it seems to work fine.

My Z-drive is a simple DA Z Stage (voltage-controlled piezo) and it has no trouble using MM's Multi-D Acq. Plus, it works fine in the first run.

Here are two more symptoms that may be related and may help the diagnosis:

  1. If I add a non-zero time_interval_s parameter to the multi_d_acquisition_events call, the acquisition also hangs after the first stack is acquired, and the subsequent time points are never acquired.
  2. If, after a successful first run of the original script I shared, I exit and re-launch IPython with the MM session still running, the same script also hangs.

@linshaova
Author

I have some more info. My system happens to have another Thorlabs motor stage that can function as a focus stage. So I set that as the Core-Focus device and then the same script has no trouble repeatedly running.

So it certainly looks like a DA Z Stage-specific issue. To be more specific, my DA Z Stage is using a NI-DAQ device's analog-output line that connects to the piezo's input. Again, this hasn't had any issue running with MM's Multi-D.

@linshaova linshaova changed the title second run of acquire(events) ends in freeze of MM with no images taken second run of acquire(events) ends in freeze of MM with no images taken; specific to using _DA Z Stage_ as _Core-Focus_ Dec 2, 2020
@linshaova linshaova changed the title second run of acquire(events) ends in freeze of MM with no images taken; specific to using _DA Z Stage_ as _Core-Focus_ second run of acquire(events) ends in freeze of MM with no images taken; specific to using "DA Z Stage" as Core-Focus Dec 2, 2020
@henrypinkard
Member

Okay, good testing work! It seems like it's either a problem with that specific device adapter or the acquisition engine of pycro-manager failing to reset something. I want to figure out a bit more specifically what your hardware is doing. Can you run the code below and let me know what it prints?

def hook_fn(event):
    print(event)
    return event

with Acquisition(directory=save_dir, name=save_name, pre_hardware_hook_fn=hook_fn) as acq:
    events = multi_d_acquisition_events(num_time_points=4, z_start=0, z_end=20, z_step=1)
    acq.acquire(events)

@linshaova
Author

linshaova commented Dec 2, 2020

Seems like not much useful info:

[{'axes': {'z': 0, 'time': 0}, 'z': 0, 'properties': []}, {'axes': {'z': 1, 'time': 0}, 'z': 1, 'properties': []}, {'axes': {'z': 2, 'time': 0}, 'z': 2, 'properties': []}, {'axes': {'z': 3, 'time': 0}, 'z': 3, 'properties': []}, {'axes': {'z': 4, 'time': 0}, 'z': 4, 'properties': []}, {'axes': {'z': 5, 'time': 0}, 'z': 5, 'properties': []}, {'axes': {'z': 6, 'time': 0}, 'z': 6, 'properties': []}, {'axes': {'z': 7, 'time': 0}, 'z': 7, 'properties': []}, {'axes': {'z': 8, 'time': 0}, 'z': 8, 'properties': []}, {'axes': {'z': 9, 'time': 0}, 'z': 9, 'properties': []}, {'axes': {'z': 10, 'time': 0}, 'z': 10, 'properties': []}, {'axes': {'z': 11, 'time': 0}, 'z': 11, 'properties': []}, {'axes': {'z': 12, 'time': 0}, 'z': 12, 'properties': []}, {'axes': {'z': 13, 'time': 0}, 'z': 13, 'properties': []}, {'axes': {'z': 14, 'time': 0}, 'z': 14, 'properties': []}, {'axes': {'z': 15, 'time': 0}, 'z': 15, 'properties': []}, {'axes': {'z': 16, 'time': 0}, 'z': 16, 'properties': []}, {'axes': {'z': 17, 'time': 0}, 'z': 17, 'properties': []}, {'axes': {'z': 18, 'time': 0}, 'z': 18, 'properties': []}, {'axes': {'z': 19, 'time': 0}, 'z': 19, 'properties': []}, {'axes': {'z': 20, 'time': 0}, 'z': 20, 'properties': []}, {'axes': {'z': 0, 'time': 1}, 'z': 0, 'properties': []}, {'axes': {'z': 1, 'time': 1}, 'z': 1, 'properties': []}, {'axes': {'z': 2, 'time': 1}, 'z': 2, 'properties': []}, {'axes': {'z': 3, 'time': 1}, 'z': 3, 'properties': []}, {'axes': {'z': 4, 'time': 1}, 'z': 4, 'properties': []}, {'axes': {'z': 5, 'time': 1}, 'z': 5, 'properties': []}, {'axes': {'z': 6, 'time': 1}, 'z': 6, 'properties': []}, {'axes': {'z': 7, 'time': 1}, 'z': 7, 'properties': []}, {'axes': {'z': 8, 'time': 1}, 'z': 8, 'properties': []}, {'axes': {'z': 9, 'time': 1}, 'z': 9, 'properties': []}, {'axes': {'z': 10, 'time': 1}, 'z': 10, 'properties': []}, {'axes': {'z': 11, 'time': 1}, 'z': 11, 'properties': []}, {'axes': {'z': 12, 'time': 1}, 'z': 12, 'properties': []}, {'axes': {'z': 13, 'time': 1}, 
'z': 13, 'properties': []}, {'axes': {'z': 14, 'time': 1}, 'z': 14, 'properties': []}, {'axes': {'z': 15, 'time': 1}, 'z': 15, 'properties': []}, {'axes': {'z': 16, 'time': 1}, 'z': 16, 'properties': []}, {'axes': {'z': 17, 'time': 1}, 'z': 17, 'properties': []}, {'axes': {'z': 18, 'time': 1}, 'z': 18, 'properties': []}, {'axes': {'z': 19, 'time': 1}, 'z': 19, 'properties': []}, {'axes': {'z': 20, 'time': 1}, 'z': 20, 'properties': []}, {'axes': {'z': 0, 'time': 2}, 'z': 0, 'properties': []}, {'axes': {'z': 1, 'time': 2}, 'z': 1, 'properties': []}, {'axes': {'z': 2, 'time': 2}, 'z': 2, 'properties': []}, {'axes': {'z': 3, 'time': 2}, 'z': 3, 'properties': []}, {'axes': {'z': 4, 'time': 2}, 'z': 4, 'properties': []}, {'axes': {'z': 5, 'time': 2}, 'z': 5, 'properties': []}, {'axes': {'z': 6, 'time': 2}, 'z': 6, 'properties': []}, {'axes': {'z': 7, 'time': 2}, 'z': 7, 'properties': []}, {'axes': {'z': 8, 'time': 2}, 'z': 8, 'properties': []}, {'axes': {'z': 9, 'time': 2}, 'z': 9, 'properties': []}, {'axes': {'z': 10, 'time': 2}, 'z': 10, 'properties': []}, {'axes': {'z': 11, 'time': 2}, 'z': 11, 'properties': []}, {'axes': {'z': 12, 'time': 2}, 'z': 12, 'properties': []}, {'axes': {'z': 13, 'time': 2}, 'z': 13, 'properties': []}, {'axes': {'z': 14, 'time': 2}, 'z': 14, 'properties': []}, {'axes': {'z': 15, 'time': 2}, 'z': 15, 'properties': []}, {'axes': {'z': 16, 'time': 2}, 'z': 16, 'properties': []}, {'axes': {'z': 17, 'time': 2}, 'z': 17, 'properties': []}, {'axes': {'z': 18, 'time': 2}, 'z': 18, 'properties': []}, {'axes': {'z': 19, 'time': 2}, 'z': 19, 'properties': []}, {'axes': {'z': 20, 'time': 2}, 'z': 20, 'properties': []}, {'axes': {'z': 0, 'time': 3}, 'z': 0, 'properties': []}, {'axes': {'z': 1, 'time': 3}, 'z': 1, 'properties': []}, {'axes': {'z': 2, 'time': 3}, 'z': 2, 'properties': []}, {'axes': {'z': 3, 'time': 3}, 'z': 3, 'properties': []}, {'axes': {'z': 4, 'time': 3}, 'z': 4, 'properties': []}, {'axes': {'z': 5, 'time': 3}, 'z': 5, 'properties': 
[]}, {'axes': {'z': 6, 'time': 3}, 'z': 6, 'properties': []}, {'axes': {'z': 7, 'time': 3}, 'z': 7, 'properties': []}, {'axes': {'z': 8, 'time': 3}, 'z': 8, 'properties': []}, {'axes': {'z': 9, 'time': 3}, 'z': 9, 'properties': []}, {'axes': {'z': 10, 'time': 3}, 'z': 10, 'properties': []}, {'axes': {'z': 11, 'time': 3}, 'z': 11, 'properties': []}, {'axes': {'z': 12, 'time': 3}, 'z': 12, 'properties': []}, {'axes': {'z': 13, 'time': 3}, 'z': 13, 'properties': []}, {'axes': {'z': 14, 'time': 3}, 'z': 14, 'properties': []}, {'axes': {'z': 15, 'time': 3}, 'z': 15, 'properties': []}, {'axes': {'z': 16, 'time': 3}, 'z': 16, 'properties': []}, {'axes': {'z': 17, 'time': 3}, 'z': 17, 'properties': []}, {'axes': {'z': 18, 'time': 3}, 'z': 18, 'properties': []}, {'axes': {'z': 19, 'time': 3}, 'z': 19, 'properties': []}, {'axes': {'z': 20, 'time': 3}, 'z': 20, 'properties': []}]

@henrypinkard
Member

Everything printed in one big list, instead of one per line, which means the acquisition engine is trying to sequence over Z and time. I wonder if this is the behavior you want?

In your setup, is your camera sending out TTL triggers that tell the Z stage to move to the next position?
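
A minimal sketch of a hook that makes this distinction explicit. It assumes, based only on the output above, that a sequenced acquisition hands the hook one list of events while a non-sequenced one hands it a single dict per call. `describe_hook_payload` is a hypothetical helper, not part of pycromanager.

```python
def describe_hook_payload(event):
    """Summarize what a pre-hardware hook received.

    Assumption: a list means the engine bundled the events into one
    hardware sequence; a dict means a single event.
    """
    if isinstance(event, list):
        # Collect the distinct time indices covered by the bundled events.
        times = sorted({e["axes"]["time"] for e in event if "time" in e["axes"]})
        return "sequenced: {} events, time points {}".format(len(event), times)
    return "single event: axes={}".format(event["axes"])


def hook_fn(event):
    print(describe_hook_payload(event))
    return event  # return the event(s) unchanged, as in the hook above
```

Passing this `hook_fn` as `pre_hardware_hook_fn` should print one short summary line instead of the full event list.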

@linshaova
Author

Yes, I'm using TTL triggers from an Andor Zyla to tell the Z stage to move at each exposure, and it works with MM's Multi-D Acq.

I'm not sure what "sequence over Z and time" means. Maybe the engine is doing that, but I didn't knowingly do anything to make that happen. All I wanted was to have pycro-manager reproduce what Multi-D Acq. can do. Thanks!

@henrypinkard
Member

"Sequence over Z and time" just means that it's using TTL triggers to run the whole time series of Z-stacks without communicating with software between successive time points, as opposed to running a TTL-triggered Z-stack, talking to software, and repeating. I'm not sure if this is any different from MM, but one thing to try would be explicitly breaking this up by calling acq.acquire(events) multiple times, with each call covering a different time point. You could do this by calling multi_d_acquisition_events to get the list of events and then manually splitting this list into sub-lists where each one has only one value for 'time' in all of its events.

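The splitting described above could be sketched as follows. `split_events_by_time` is a hypothetical helper, and whether separate acquire calls actually stop the engine from sequencing across time points is untested here.

```python
from itertools import groupby


def split_events_by_time(events):
    """Group consecutive events that share the same 'time' axis value.

    Relies on the events already being ordered by time, as in the
    printed list earlier in this thread.
    """
    return [
        list(group)
        for _, group in groupby(events, key=lambda e: e["axes"].get("time"))
    ]


# Intended use (needs a running Micro-Manager, so left as a comment):
# with Acquisition(directory=save_dir, name=save_name) as acq:
#     events = multi_d_acquisition_events(num_time_points=4, z_start=0, z_end=20, z_step=1)
#     for stack in split_events_by_time(events):
#         acq.acquire(stack)
```
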
Looking at the code just now, I notice that there's no stopStageSequence call in the acquisition engine. Maybe this is required by some, but not all, Z drives.

I compiled a new version. If you go here, you can download AcqEngJ-0.7.9.jar. Could you drop this into the plugins/micro-manager folder in your installation of MM, and get rid of the previous version to test?

@linshaova
Author

linshaova commented Dec 9, 2020

Thanks for the new compile! It behaves much as before: after the first successful run, subsequent runs of the same script pop up the acquired-image window, but nothing else happens and everything stops immediately, with "Finished" in the window title and a black image. An empty folder is created in the save folder. The only improvement is that nothing hangs and I can properly close MM and quit the IPython session.

If I add a non-zero time_interval_s to the multi_d_acquisition_events call, only the first stack is acquired and after the requested interval expires, the acquisition quits and properly finishes. And this behavior is consistent; i.e., in every run of a session including the first, the first stack is acquired and then quits. Thanks!

@henrypinkard
Member

OK, so it's definitely an internal thing in the acquisition engine. Can you check the core logs (in the core_logs folder of your MM install directory) to see if they point to a specific error? If not, the two options are: 1) Try to reproduce with a minimal example making the calls to the core directly. 2) Run the acquisition engine from source and debug it (so I would need to remote-control your system).

@henrypinkard
Member

Here are the relevant bits of the acquisition engine, stripped out as Java code; this can easily be translated to Python. This was before I added stopStageSequence. If you were able to reproduce this error from a script, it would be much easier to see where things go wrong.

DoubleVector zSequence = new DoubleVector() ;

// Loop and add all z positions
zSequence.add(zPos);
core_.loadStageSequence(zStage, zSequence);
core_.prepareSequenceAcquisition(core_.getCameraDevice());
core_.startStageSequence(zStage);

while (core_.isSequenceRunning()) {
                 //wait
}
core_.stopSequenceAcquisition();

//N is number of images
core_.startSequenceAcquisition(N, 0, true);

//loop and call N times to throw away the images
core_.popNextTaggedImage();
 

@henrypinkard
Member

Or maybe @nicost or @marktsuchida have ideas on what core calls are missing to properly clean up everything

@nicost
Member

nicost commented Dec 9, 2020

Not sure, but it would be useful to generate a trouble report (Help > Report a Problem). You can either post it here (very long!) or send it using the built-in function, and I can have a look at it.

Also @henrypinkard, in the code above, I would start the StageSequence after the camera Sequence has stopped. Otherwise, it is possible that triggers will be sent to the stage that will move it to undesired positions.

Does this acqEngine also incorporate Property Sequences? These are very useful for things like fast laser channel switching.

@henrypinkard
Member

Whoops, I wrote that out of order. What's actually going on is below:

DoubleVector zSequence = new DoubleVector() ;

// Loop and add all z positions
zSequence.add(zPos);
core_.loadStageSequence(zStage, zSequence);
core_.prepareSequenceAcquisition(core_.getCameraDevice());
core_.startStageSequence(zStage);

//N is number of images
core_.startSequenceAcquisition(N, 0, true);

//loop and call N times to throw away the images
core_.popNextTaggedImage();

while (core_.isSequenceRunning()) {
                 //wait
}
core_.stopSequenceAcquisition();

Is there anything obviously wrong here?

Yes, it's built to do property sequencing, but as far as I know no one has tested it, so it might need more work.

@henrypinkard
Member

Also @henrypinkard, in the code above, I would start the StageSequence after the camera Sequence has stopped. Otherwise, it is possible that triggers will be sent to the stage that will move it to undesired positions.

Don't they need to be running at the same time?

@nicost
Member

nicost commented Dec 9, 2020

Yes, I thought you had code there to stop a previous camera sequence acquisition, which would have to happen before starting the stage sequence. What you have now looks correct (indeed, stage and property sequences need to start before the camera sequence starts).

@linshaova
Author

Hi @nicost , thanks for offering help. I sent you a bug report from MM; let me know if you receive it from lin.shao_at_yale.edu. In the bug report, it might be a bit confusing because the focus device, the DA Z Stage in my case, is referred to as Galvo.

@henrypinkard I understand what you want me to do with that script, but there are some details I don't know how to handle in Python. For example, how do I create a zSequence instance of type mmcorej.DoubleVector in Python? Also, I translated the same thing to a Beanshell script; when running it, however, I got an error message about "circular buffer is empty" at the line where core_.popNextTaggedImage() is called. How do I allocate a buffer for the acquisition?

@henrypinkard
Member

See here. Call bridge.construct_java_object('mmcorej.DoubleVector'), then add individual steps in the sequence using its API

I don't think you have to initialize it; you just need to keep looping until it says it's not empty (i.e. once the images get added to it).

@linshaova
Author

Thanks for the pointers. I managed to write a Python script; see below. The same Java exception "Circular buffer is empty" occurred (error messages following the script). I still think some initialization step is missing.

from pycromanager import Bridge
bridge = Bridge()
core_ = bridge.get_core()

nZ = 40
zStart = -20
zStep = 1
zSequence = bridge.construct_java_object('mmcorej.DoubleVector', args=[nZ])

for i in range(nZ):
    zSequence.set(i, zStart + i*zStep)

zStage = core_.get_focus_device()
core_.load_stage_sequence(zStage, zSequence)
core_.prepare_sequence_acquisition(core_.get_camera_device())
core_.start_stage_sequence(zStage)

nT = 3
tot = nZ * nT
core_.start_sequence_acquisition(tot, 0, True)

for i in range(tot):
    core_.pop_next_tagged_image()  #exception occurs here

while core_.is_sequence_running():
    pass

core_.stop_sequence_acquisition()

Error messages:

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
~\Documents\pycroMM\testdbg.py in <module>
     23
     24 for i in range(tot):
---> 25     core_.pop_next_tagged_image()
     26
     27 while core_.is_sequence_running():

~\miniconda3\lib\site-packages\pycromanager\core.py in <lambda>(instance, signatures_list, *args)
    373                 fn = lambda instance, *args, signatures_list=tuple(
    374                     methods_with_name
--> 375                 ): instance._translate_call(signatures_list, args)
    376                 fn.__name__ = method_name_modified
    377                 fn.__doc__ = "{}.{}: A dynamically generated Java method.".format(

~\miniconda3\lib\site-packages\pycromanager\core.py in _translate_call(self, method_specs, fn_args)
    493
    494         self._socket.send(message)
--> 495         return self._deserialize(self._socket.receive())
    496
    497     def _deserialize(self, json_return):

~\miniconda3\lib\site-packages\pycromanager\core.py in receive(self, timeout)
    161         if self._debug:
    162             print("DEBUG, recieved: {}".format(message))
--> 163         self._check_exception(message)
    164         return message
    165

~\miniconda3\lib\site-packages\pycromanager\core.py in _check_exception(self, response)
    166     def _check_exception(self, response):
    167         if "type" in response and response["type"] == "exception":
--> 168             raise Exception(response["value"])
    169
    170     def close(self):

Exception: java.lang.Exception: Circular buffer is empty.
mmcorej.MMCoreJJNI.CMMCore_popNextImageMD__SWIG_0(Native Method)
mmcorej.CMMCore.popNextImageMD(CMMCore.java:1052)
mmcorej.CMMCore.popNextTaggedImage(CMMCore.java:188)
mmcorej.CMMCore.popNextTaggedImage(CMMCore.java:193)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
org.micromanager.internal.zmq.ZMQServer.runMethod(ZMQServer.java:339)
org.micromanager.internal.zmq.ZMQServer.parseAndExecuteCommand(ZMQServer.java:394)
org.micromanager.internal.zmq.ZMQServer.lambda$initialize$1(ZMQServer.java:103)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)

@henrypinkard
Member

You just need the code to swallow the exceptions until images are available. Corrected version below. It works for me with demo configuration (after turning on sequencing in the demo z).

from pycromanager import Bridge
bridge = Bridge()
core_ = bridge.get_core()

nZ = 40
zStart = -20
zStep = 1
zSequence = bridge.construct_java_object('mmcorej.DoubleVector', args=[nZ])

for i in range(nZ):
    zSequence.set(i, zStart + i*zStep)

zStage = core_.get_focus_device()
core_.load_stage_sequence(zStage, zSequence)
core_.prepare_sequence_acquisition(core_.get_camera_device())
core_.start_stage_sequence(zStage)

nT = 3
tot = nZ * nT
core_.start_sequence_acquisition(tot, 0, True)

i = 0
while i < tot:
    try:
        core_.pop_next_tagged_image()  # raises while the circular buffer is still empty
        i += 1
    except Exception:
        pass  # no image available yet; try again

while core_.is_sequence_running():
    pass

core_.stop_sequence_acquisition()
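
The retry loop above spins forever if the images never arrive. A hedged sketch of the same idea with a short sleep and a timeout follows; `pop_images` and its parameters are hypothetical names, and it assumes any exception from the pop call means "buffer still empty".

```python
import time


def pop_images(pop_fn, n_images, timeout_s=10.0, poll_s=0.01):
    """Call pop_fn until n_images were returned, or give up after
    timeout_s seconds without a new image."""
    images = []
    last_success = time.monotonic()
    while len(images) < n_images:
        try:
            images.append(pop_fn())
            last_success = time.monotonic()
        except Exception:
            # Assumed to mean "Circular buffer is empty": wait briefly, retry.
            if time.monotonic() - last_success > timeout_s:
                raise TimeoutError(
                    "only {} of {} images arrived".format(len(images), n_images)
                )
            time.sleep(poll_s)
    return images
```

In the script above this would be called as `pop_images(core_.pop_next_tagged_image, tot)` in place of the `while i < tot` loop.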

@linshaova
Author

Okay, I changed the script as suggested, and I got the same result as before: the first run goes through and the second run makes everything hang (both MM and IPython have to be force-quit). Thanks!

@henrypinkard
Member

Can you figure out on which line it hangs?

@linshaova
Author

Good reminder... It seems to hang at the line of core_.load_stage_sequence(zStage, zSequence)
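
One way to locate a hang like this is to print a line before each call, so the last line shown identifies the call that never returned. A minimal sketch; `traced` is a hypothetical helper, not part of pycromanager.

```python
def traced(name, fn, *args):
    """Print the call before running it, so the last line printed
    marks the call that hangs."""
    print("calling {}{}".format(name, args), flush=True)
    result = fn(*args)
    print("returned from {}".format(name), flush=True)
    return result


# Intended use in the script above (needs hardware, so left as a comment):
# traced("load_stage_sequence", core_.load_stage_sequence, zStage, zSequence)
```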

@henrypinkard
Member

Based on this, I think this is an issue with your device adapter (i.e. not specific to pycromanager), though I have no idea why this works with the MM MDA. There is perhaps something in there that's masking the problem with the device adapter. I think you should probably open an issue on the main micro-manager repository, since that's where the device adapters live.

Maybe you can also experiment with calling stop_stage_sequence at the end of the script, either before or after stop_sequence_acquisition? Or calling it multiple times? If you can figure out whatever hack is making this work in the MDA, there could be ways of duplicating it. Unfortunately, it is not straightforward to figure this out from the other MM MDA acquisition engine, because it's written in a language called Clojure that nobody understands.
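
A sketch of that experiment, assuming the core object exposes start_stage_sequence, stop_stage_sequence and stop_sequence_acquisition as in the script above. `run_sequence_with_cleanup` and `stop_stage_first` are hypothetical names, and which order of the two stop calls (if either) helps with a given device adapter is not known here.

```python
def run_sequence_with_cleanup(core, z_stage, acquire_fn, stop_stage_first=False):
    """Start the stage sequence, run acquire_fn, and always attempt
    both stop calls afterwards, even if acquire_fn raises.

    stop_stage_first flips the order of the two stop calls so that
    both orders can be tried.
    """
    core.start_stage_sequence(z_stage)
    try:
        acquire_fn()
    finally:
        stops = [
            lambda: core.stop_sequence_acquisition(),
            lambda: core.stop_stage_sequence(z_stage),
        ]
        if stop_stage_first:
            stops.reverse()
        for stop in stops:
            stop()
```

Here `acquire_fn` would wrap the start_sequence_acquisition call and the image-popping loop from the script above.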

@linshaova
Author

Thanks! I tried calling stop_sequence_acquisition as you suggested, and now on the second run load_stage_sequence throws an exception instead of hanging (I start MM from the command line, so I can see all the printed output in the command window):

2020-12-11T12:18:19.526560 tid11520 [IFO,dev:AO1] Error calling WriteAnalogF64: Task specified is invalid or does not exist.
                                    [           ] Status Code: -200088
2020-12-11T12:18:19.527561 tid11520 [IFO,App] Error: Line 15: run-time error : Error in device "Galvo": (Error message unavailable) (-200088)

which is still not right but allows proper MM shutdown at least.

More importantly, I can reproduce the same symptoms using a ~directly translated Beanshell script (attached here testSequenceAcq.bsh; suffix changed to .txt to allow the upload), which confirms your suspicion that this is a device adapter issue (NI DAQ in this case) in MM itself. I suspect some mishandling of NI DAQmx calls, if those are what MM uses.

I'm also curious why MDA works. While I have no clue so far, I did notice a key difference between running MDA- and the pycroM-based acquisition: in the former, there's a noticeable pause (based on laser shutting) between time points even if the Interval is set to 0 ms whereas in the latter, the acquisition is continuous without a break if interval is set to 0.

In any case, thanks for your patience with me and all the help with uncovering an issue unrelated to pycro-manager at all! I'll open this issue in MM's repository.

@henrypinkard
Member

You're welcome, glad you're able to get a little closer to figuring it out.

I'm also curious why MDA works. While I have no clue so far, I did notice a key difference between running MDA- and the pycroM-based acquisition: in the former, there's a noticeable pause (based on laser shutting) between time points even if the Interval is set to 0 ms whereas in the latter, the acquisition is continuous without a break if interval is set to 0.

I think this is almost certainly because I've yet to implement property sequencing in the acquisition engine. Should be straightforward. I'll let you know when I've got it in there.

And if you feel inspired to delve into the confusing mess that is the MDA acquisition engine, it's here.

@henrypinkard
Member

Closing this for now. I made another issue for property sequencing: #207
