flush() does not wait on Windows #118

rdiez · 2018-12-20T09:19:38Z

First of all, many thanks for PureJavaComm. I have been using it for years without problems.

I am writing a number of small JavaFX applications that send data to rather simple serial port devices. The devices do not normally respond. I normally test the applications on Linux (my main development platform) and on a Windows 7 PC.

I recently started seeing problems on several Windows 7 PCs. The data sent is abruptly cut off. The other side just gets one byte, or a few of them. This happens with as little as 3 bytes of data.

Interestingly enough, this issue does not happen with an FTDI-based USB to RS-232 adapter. But it happens with an FTDI-based USB to RS-485 adapter. And with most other RS-232 adapters too, for example, with one based on a cp210x chip from Silicon Labs, or with Prolific-based adapters.

The Linux PC is fine, even with those same adapters, and with a virtual serial port too (created with socat PTY).

I am using PureJavaComm version 1.0.2 . By the way, I find it somewhat disturbing that the source code version in GitHub is not up to date with the binary release. I noticed because there is no release tag for version 1.0.2. It is also mentioned in the following issue: #116

I am also using JNA version 4.5.2. JNA has been upgraded recently quite a lot for a library described as "mature". I wonder whether I should upgrade it again, and whether that would impact PureJavaComm. I could not find any mention of what JNA versions PureJavaComm has been tested against, or whether the JNA version does not matter much.

This is what my source code looks like:

ByteArrayOutputStream bos = blah blah...

OutputStream outputStream = serialPort.getOutputStream();

outputStream.write( bos.toByteArray() );

outputStream.flush();

serialPort.close();

On Linux, and on Windows with the FTDI adapter, the call to flush() waits until all data has been sent. But on Windows, with the other RS-232 or RS-485 adapters, it does not wait. The call to close() apparently drops most of the outgoing bytes.

I have looked at the source code behind flush(), and it does check for eventual errors coming from tcdrain(). But I am not seeing any exceptions.

Can anyone help me with this issue?

I have been thinking about implementing a workaround with the OUTPUT_BUFFER_EMPTY event. But all I could find were incomplete examples. What is the best way to block a thread until all bytes have been sent? Are such events called on a separate thread, and do I need to worry about synchronisation? Is OUTPUT_BUFFER_EMPTY only for the internal buffer, or it is fired when the last bit has actually been sent over the serial interface?

Or is there any way you can poll the serial port to see if all data has been sent?

Thanks in advance,
rdiez

nyholku · 2018-12-20T09:45:29Z

Hi,

I will try to get the source and versions tags all in sync during the Christmas break.
JNA version should matter very little. if JNA fails in anyway then you'd likely get a serious crash.
to try OUTPUT_BUFFER_EMPTY simply add SerialPortEventListener into the port. Your event listener will then be called when the buffer is empty and you can then close the port. Might be safer though to set up a semaphore in your own code, wait() on that and notifyAll() in the eventlistener and close the port when the wait returns. This is because I'm not 100% sure if closing the port while the eventlistener is being called has been tested properly.
if your are just sending a few known number of bytes then you know how long it takes and you could just sleep() with some margin, workaround, I know but might save the day.
does not look like a good design to continuously open/close devices, feels like asking for trouble
fundamentally I think the problem is with the drivers, outputStream.flush() is simple call to FlushFileBuffers which as far as I can google should do the job and for some devices as you testified it does what it says on the label.

wbr Kusti

rdiez · 2018-12-20T10:56:42Z

Many thanks for your quick response.

About point 5), I do not want my application to block the serial port under Windows for a long time. Normally, the user spends some time filling in the data and clicking on various options, and then presses the "Send" button once. Therefore, I think that opening and closing the port is justified in this scenario.

Sometimes you need to change the cable to another device and use another one of the small applications. You may end up with 2 them open on your PC. Having to press some "connect" and "disconnect" button every time, so that each application locks and releases the serial port, would probably inconvenience my users.

About point 6), I did try searching the Internet, but I could not find anybody complaining that FlushFileBuffers does not work.

I do not know much about the Windows API, but I have some concerns about a possible interaction of FlushFileBuffers() with the overlapped I/O mode. After all, overlapping tends to mean "do no block". I found the following comment in the following page:

http://forums.codeguru.com/showthread.php?147771-Serial-port-question

in this case, you might need to setup an event and WaitForSingle/MultipleObject() until output is done.

If FlushFileBuffers() always blocks, then there is no easy wait to asynchronously wait for it to finish. If it does start the flush in an "overlapped" manner, then the code should wait for it to finish before returning, shouldn't it?

About point 3), I can play a little with OUTPUT_BUFFER_EMPTY. Sun's original documentation states that this is an optional event that may not always be available. Can you make a statement about it in PureJavaComm? I mean, is it always available, and if not, on which platforms?

I'll post again when I have done some more testing.

Thanks again,
rdiez

rdiez · 2018-12-20T12:38:07Z

There is one thing I haven't quite understood yet in src/jtermios/windows/JTermiosImpl.java . Please keep in mind that I am struggling here, because I do not really known that much about the Windows API.

Routine write() will start an overlapped write to the serial port, but will not wait for it to complete. Say that the same thread quickly calls flush() afterwards, which goes down to tcdrain() and then to FlushFileBuffers().

Will the data written in an overlapped manner be considered to be in the serial port buffer already? Or is there a race condition here?

I mean, should tcdrain() start in the same way as write(), by checking m_WritePending first, and waiting for any in-flight overlapped operation to finish, before calling FlushFileBuffers()?

nyholku · 2018-12-20T12:58:11Z

That is a good question! Obviously I expected that any write submitted should eventually complete and thus the as it is should work. But, given the rather complex and cumbersome Windows API (why could they not just copy *nix) I would not be too surprised if what you speculate is actually true. Feel free to test that by adding that check and wait there before FlushFileBuffers()s, that would be a valuable data point re debugging.

rdiez · 2018-12-20T13:22:28Z

For the record, OUTPUT_BUFFER_EMPTY is not enough. It does improve the situation quite a lot: instead of letting only the first byte through, waiting for this event lets the first data chunks out, and only loses the last chunk. Evidence is piling up that flush() is not waiting correctly.

I am using the compiled .jar at the moment. The further testing you suggested will require changes to my NetBeans projects. I think that will have to wait until after the Christmas break. Hopefully you will have uploaded the latest sources then, so that we will be at the same level.

rdiez · 2019-01-09T13:46:55Z

I have investigated a little further:

Adding to the beginning of tcdrain() the same m_WritePending check as write() does shows some improvement. Instead of dropping most of the data sent when the serial port is closed, it just loses a number of bytes at the end. Many of the bytes at the beginning do get sent.

This all happens with a single call to write for 1813 bytes, which is below the internal buffer size, so there is no internal data fragmentation at all.

I have been trying to call WaitCommEvent() for EV_TXEMPTY after FlushFileBuffers() in tcdrain(), but I cannot get it to work. That event never triggers at that point.

I cannot find a way to avoid losing the bytes at the end, other than just waiting for a while before closing the port, which is not a very elegant solution.

The source code for the Windows implementation of JTermiosImpl.java shows a number of warnings, at least with NetBeans 8.2. I would fix at least some of them.
I am not inclined to create pull requests because we know that the repository is not up to date (see above). Merge conflicts would create unnecessary work.
Using your project is increasing straining my trust. After all, you are distributing a binary that is not backed up by up-to-date sources. I cannot tell whether I am hitting known issues that have been fixed later on. I cannot verify that the binaries match the sources.

Best regards,
rdiez

nyholku · 2019-01-09T18:24:12Z

I just pushed 1.0.2 rebranded as 1.0.3 cause I could not be sure that the 1.0.2 binaries out there were 100% compatible with my local and github source code. But now everything is hopefully in sync.

rdiez mentioned this issue Jan 11, 2019

Cannot pass NULL to WriteFile() in lpNumberOfBytesWritten, should use JNA definitions #120

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

flush() does not wait on Windows #118

flush() does not wait on Windows #118

rdiez commented Dec 20, 2018

nyholku commented Dec 20, 2018

rdiez commented Dec 20, 2018

rdiez commented Dec 20, 2018

nyholku commented Dec 20, 2018

rdiez commented Dec 20, 2018

rdiez commented Jan 9, 2019

nyholku commented Jan 9, 2019

flush() does not wait on Windows #118

flush() does not wait on Windows #118

Comments

rdiez commented Dec 20, 2018

nyholku commented Dec 20, 2018

rdiez commented Dec 20, 2018

rdiez commented Dec 20, 2018

nyholku commented Dec 20, 2018

rdiez commented Dec 20, 2018

rdiez commented Jan 9, 2019

nyholku commented Jan 9, 2019