Previously configured peer never gets undiscovered, even if removed from the peer list. #520

Open
OscarMrZ opened this issue Oct 30, 2024 · 2 comments

OscarMrZ commented Oct 30, 2024

Bug report

Required Info:

  • Operating System:
    • Ubuntu 22.04
  • Installation type:
    • Binaries, from humble
  • Version or commit hash:
    • humble
  • DDS implementation:
    • rmw_cyclonedds
  • Client library (if applicable):
    • N/A

Steps to reproduce issue

I'm configuring unicast discovery, trying to replicate the behavior achieved in rolling with the new env variables.

This is the config in my pc:

<CycloneDDS>
    <Domain>
        <General>
            <AllowMulticast>false</AllowMulticast>
            <MaxMessageSize>65500B</MaxMessageSize>
        </General>
        <Discovery>
            <ParticipantIndex>auto</ParticipantIndex>
            <Peers>
                <Peer Address="localhost"/>
                <Peer Address="robot-hostname"/>
            </Peers>
            <MaxAutoParticipantIndex>500</MaxAutoParticipantIndex>
        </Discovery>
        <Internal>
            <SocketReceiveBufferSize min="10MB"/>
            <Watermarks>
                <WhcHigh>500kB</WhcHigh>
            </Watermarks>
        </Internal>
    </Domain>
</CycloneDDS>

And this is the config on my robot

<CycloneDDS>
    <Domain>
        <General>
            <AllowMulticast>false</AllowMulticast>
            <MaxMessageSize>65500B</MaxMessageSize>
        </General>
        <Discovery>
            <ParticipantIndex>auto</ParticipantIndex>
            <Peers>
                <Peer Address="localhost"/>
            </Peers>
            <MaxAutoParticipantIndex>500</MaxAutoParticipantIndex>
        </Discovery>
        <Internal>
            <SocketReceiveBufferSize min="10MB"/>
            <Watermarks>
                <WhcHigh>500kB</WhcHigh>
            </Watermarks>
        </Internal>
    </Domain>
</CycloneDDS>

To the best of my knowledge, this would be analogous to Peer A (my pc) configured with localhost and a static peer in the list and Peer B configured with localhost and no static peers on the list.

After a clean restart of everything ROS 2 related, I publish a simple test message from the robot:

ros2 topic pub /hello std_msgs/msg/String "data: 'hello'"

and on my pc I run

ros2 topic list

As expected, after a little while I can see the robot's topic on the pc, and I can confirm that all the traffic is unicast. After that, I would like to disconnect from the robot by removing it from the peer list, so that Peer A stops receiving messages from it and stops seeing its topics. To do that, I terminate all the ROS 2 processes on the pc (SIGTERM) and change the Cyclone config on the pc so that it only specifies the localhost peer.
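
For reference, a sketch of what the pc config looks like after this change, assuming the only edit is dropping the robot's `Peer` entry and every other setting stays as in the original pc config (which makes it identical to the robot's config):

```xml
<CycloneDDS>
    <Domain>
        <General>
            <AllowMulticast>false</AllowMulticast>
            <MaxMessageSize>65500B</MaxMessageSize>
        </General>
        <Discovery>
            <ParticipantIndex>auto</ParticipantIndex>
            <Peers>
                <!-- Only localhost remains; the robot-hostname peer has been removed -->
                <Peer Address="localhost"/>
            </Peers>
            <MaxAutoParticipantIndex>500</MaxAutoParticipantIndex>
        </Discovery>
        <Internal>
            <SocketReceiveBufferSize min="10MB"/>
            <Watermarks>
                <WhcHigh>500kB</WhcHigh>
            </Watermarks>
        </Internal>
    </Domain>
</CycloneDDS>
```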

Expected behavior

The robot (Peer B) stops sending messages to my pc once the lease duration specified in the SPDP message has expired. This would also mean that my pc is no longer aware of the topics on Peer B after it is removed from the list of peers. I shouldn't need to restart Peer B, which is a robot that shouldn't care whether or not I am connected.

Actual behavior

Peer B continuously sends INFO_TS messages to my pc and never undiscovers it. This makes my pc discover Peer B again and see all its topics, even though Peer B is not in the peer list. Only killing all ROS 2 processes on the robot stops this behavior and achieves what I expect.

Additional information

Attached is a traffic capture demonstrating this behavior. Please let me know if this is due to a misunderstanding on my side about the undiscovery process, or if I am missing some configuration parameters. Thank you very much!

example_compressed.pcapng.gz

  • The point where I kill ROS 2 and switch to a config without the robot as a peer is where the IGMP messages start to appear.
  • After that, even with Peer B not in my pc's peer list, it gets discovered and I can see the topic.
  • With only one publisher, the INFO_TS messages are sent at a rather slow rate, but when testing with all the robot nodes running (250+), the traffic is not negligible.
  • robot is 10.0.0.119, pc 10.0.0.185
OscarMrZ changed the title from "INFO_TS messages constantly being sent from previous peer" to "Previously configured peer never gets undiscovered, even if removed from the peer list." on Oct 30, 2024
@mjcarroll
Member

@eboasson this sounds like a Cyclone-specific configuration issue rather than something at the RMW layer. Would you mind taking a look?

@OscarMrZ
Author

To add a little more info: I'm receiving an INFO_TS message from the ros2 daemon and from my publisher, specifically every 8 seconds (which happens to be the heartbeat interval). However, this happens even with best-effort QoS, and to the best of my knowledge this should not be the case.

@mjcarroll I opened it here because I'm not sure whether this is Cyclone-specific or a problem in the rmw implementation. Do you think it should be closed and reopened elsewhere?
