Navigation data stream freezes in noisy WiFi environments #3

Open
bruceadams opened this issue Aug 19, 2013 · 13 comments
Comments

@bruceadams
Contributor

First, I want to say again how much I enjoyed your presentation at LambdaJam.

One of the challenges you faced at LambdaJam was a noisy WiFi environment, which (if I understood what you were saying correctly) caused the collection of navigation data to freeze. Just a couple days ago, I saw @jimweirich have a similar problem during his presentation at Steel City Ruby.

In large part inspired by your LambdaJam talk, I now have an AR Drone and have been experimenting with this library. I'd like to figure out what is happening and then see if I can enhance the library to tolerate glitches.

Do you or @jimweirich have any insights into what is happening or how to better handle whatever is happening?

@bruceadams
Contributor Author

Hmm. Have you already fixed this with 22721ed?

@gigasquid
Owner

Maybe. The trouble is that I can't easily simulate a noisy WiFi environment. I only really saw it at Lambda Jam in the hotel venue. I bumped the timeouts during the conference and it seemed to help some, but I didn't try the reconnects. If you have any ideas, let me know.

@jimweirich

The issue at Steel City Ruby didn't seem to be the data stream freezing, because the data stream continued to update on the terminal. The problem seemed to be that the drone was not recognizing the video target, and I'm not sure why that was happening.

Oh wait. Just had a thought as I was typing.

The drone camera was pointed toward the audience with an open set of windows in the background. It might be possible that the glare from the windows was preventing the drone from seeing the target. I've seen that happen at the node copter event in Edinburgh.

@bruceadams
Contributor Author

@jimweirich Ooh, yes, I can easily imagine daylight coming through those huge windows would be a problem for the camera.

I was thinking of a different problem. For a little while, you had the screen refreshing with navigation data, then it stopped updating. You kicked it (metaphorically speaking) to get the data refresh going again. That looked similar to my (vague) understanding of what @gigasquid was fighting at Lambda Jam.

@bruceadams
Contributor Author

I'm back to looking at this. In my house (city location, moderately noisy WiFi environment), I find that receiving nav data often stops after about two seconds (both on an older checkout and on current master). I'm trying to figure out how to learn more about what is happening. Did you see this issue during #cincycopter yesterday? (By the way, #cincycopter sounded like a blast!)

@jimweirich

Yes, I saw it at cincycopter. I was just smoke testing a new version of Argus, so I didn't investigate further. So both the Clojure and Ruby libraries are having problems here.

@gigasquid
Owner

Bruce,
What code are you running that is freezing? I agree that there is still a problem here. Most of the time when I see it, it manifests itself in a SocketTimeout exception when I am trying to read the navigation data.
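For anyone wanting to experiment with the timeout-and-retry idea, here is a minimal sketch in Python rather than the library's Clojure. It assumes the navigation data arrives on a socket that has a read timeout set. `read_navdata` and the `retrigger` callback are hypothetical names, and what `retrigger` actually has to send to wake the stream up is left open.

```python
import socket


def read_navdata(sock, retrigger, retries=3, bufsize=4096):
    """Read one datagram, calling retrigger() after each timeout.

    `retrigger` is a hypothetical callback that asks the drone to start
    sending navigation data again. The final timeout is re-raised so the
    caller can decide whether to reconnect from scratch.
    """
    for attempt in range(retries + 1):
        try:
            return sock.recv(bufsize)
        except socket.timeout:
            if attempt == retries:
                raise
            retrigger()
```

With a real UDP socket you would call `sock.settimeout(...)` first so that `recv` raises `socket.timeout` instead of blocking forever.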

@bruceadams
Contributor Author

I've been running a slight variant of your nav_test with more debug output.

Also, I added time stamps to log messages; see pull request #4. (I'm barely able to read a log file without time stamps in it.)

The other thing I've been fiddling with, which may be causing spurious failures, is removing the call to communication-check. It's past flight time in my household (quiet hours), so I can't double check that right now.

@gigasquid
Owner

Ah yes - you do need the communication check in there. If you look at the navigation data - there are two settings that get sent back

:communication :ok, :com-watchdog :ok

If you don't send a command to the drone within a certain amount of time, the com-watchdog will be reported as a problem. If the problem goes uncorrected, the communication will drop. The communication check sends a com watchdog reset if there is a problem, which keeps the communication going.
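Roughly, the check amounts to the following. This is an illustrative Python sketch, not the library's actual code: the dictionary keys mirror the `:com-watchdog` keyword above, and `send_reset` is a hypothetical callback standing in for whatever sends the watchdog reset.

```python
def needs_watchdog_reset(navdata):
    """True when the parsed navdata reports a com-watchdog problem."""
    return navdata.get("com-watchdog") == "problem"


def communication_check(navdata, send_reset):
    """Send a watchdog reset only when the drone asks for one.

    Returns True if a reset was sent, False otherwise.
    """
    if needs_watchdog_reset(navdata):
        send_reset()
        return True
    return False
```

Running this on every navdata packet is what keeps the link alive when no other commands are going out.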

I had a bug in the code earlier, when I was working with multiple drones: it didn't send the communication watchdog reset properly, and I saw the same thing. Current master should have the problem corrected.

@bruceadams
Contributor Author

Oops. Yes. Turning the communications reset back on corrects the problem I was seeing. Sorry for my false diagnosis.

There is a related thing I've been struggling to understand.

  • Once the first "Watchdog Reset" message is logged, it is logged again a lot, for nearly every navigation data packet received.
  • Turning off the communication-check does not appear to harm my ability to give the drone commands.

I was confused into doubting that the :reset-watchdog command was having any impact.

Looking harder at the navigation data, I can see that the "Watchdog Reset" messages stop for about 0.2 seconds after each command is sent to the drone, well, except for the :reset-watchdog command itself.

I gather this library does not maintain a steady stream of outgoing communications to the drone? (I haven't found any code that looks like it does.)

Is there a way to do a simple "takeoff" (a command that takes several seconds for the drone to complete) without triggering the :problem message from the drone?

@gigasquid
Owner

From the drone docs:

  • If the drone does not receive any traffic for more than 50ms, it will set its ARDRONE_COM_WATCHDOG_MASK bit in the ardrone_state field (2nd field) of the navdata packet. To exit this mode, the client must send the AT Command AT*COMWDG.
  • If the drone does not receive any traffic for more than 2000ms, it will stop all communication with the client and internally set the ARDRONE_COM_LOST_MASK bit in its state variable. The client must then reinitialize the network communication with the drone.
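To make the two bullets concrete, here is a small Python sketch (not the library's Clojure) that reads the state word from a navdata packet and builds the reset command. The header value and bit positions are assumptions recalled from the AR.Drone SDK headers, so check them against your SDK version. `com_status` and `comwdg_command` are hypothetical helper names.

```python
import struct

# Assumed values, recalled from the AR.Drone SDK headers; verify before use.
NAVDATA_HEADER = 0x55667788
ARDRONE_COM_LOST_MASK = 1 << 13
ARDRONE_COM_WATCHDOG_MASK = 1 << 30


def com_status(packet):
    """Decode the two communication flags from a raw navdata packet.

    Assumes the packet starts with two little-endian 32-bit fields:
    the header, then ardrone_state (the "2nd field" in the docs).
    """
    header, state = struct.unpack_from("<II", packet, 0)
    if header != NAVDATA_HEADER:
        raise ValueError("not a navdata packet")
    return {
        "com-watchdog": "problem" if state & ARDRONE_COM_WATCHDOG_MASK else "ok",
        "communication": "problem" if state & ARDRONE_COM_LOST_MASK else "ok",
    }


def comwdg_command(seq):
    """Build an AT*COMWDG command carrying the given sequence number."""
    return "AT*COMWDG={}\r".format(seq).encode("ascii")
```

The sequence number has to keep increasing along with every other AT command sent to the drone.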

If you want to have a steady stream of commands without using the watchdog reset, you can try using the

 (drone-do-for seconds :take-off) 

It repeats a command every 30 ms for as long as you define.
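The idea behind repeating a command can be sketched like this. It is a Python illustration of the technique, not the implementation of `drone-do-for`: `do_for` and `send_command` are hypothetical names, and the clock and sleep functions are parameters only so the loop can be exercised without real delays.

```python
import time


def do_for(seconds, send_command, interval=0.03,
           clock=time.monotonic, sleep=time.sleep):
    """Call send_command() about every `interval` seconds for `seconds`.

    Returns the number of times the command was sent.
    """
    deadline = clock() + seconds
    sent = 0
    while clock() < deadline:
        send_command()
        sent += 1
        sleep(interval)
    return sent
```

Because something goes out every 30 ms, the drone's 50 ms watchdog should not get a chance to trip while the loop is running.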

@bruceadams
Contributor Author

Thank you.

I learned from Jim's presentation that the drone wants periodic (and fast) communication. In learning this code, I didn't understand the watchdog reset as a normal thing, especially when I saw it happening every 0.01 seconds. It looked like failure recovery, leading me to struggle to figure out what normal was.

@gigasquid
Owner

I had thought about changing out the communication model to something else - I don't know if it would help or not.

Right now it is just using plain old Java Sockets. I could use the java.nio.channels stuff or even use aleph, which is built on top of it.
