Previously configured peer never gets undiscovered, even if removed from the peer list. #520

OscarMrZ · 2024-10-30T16:05:51Z

Bug report

Required Info:

Operating System:
- Ubuntu 22.04
Installation type:
- Binaries, from humble
Version or commit hash:
- humble
DDS implementation:
- rmw_cyclonedds
Client library (if applicable):
- N/A

Steps to reproduce issue

I'm configuring unicast discovery, trying to replicate the behavior achieved in rolling with the new env variables.

This is the config in my pc:

<CycloneDDS>
    <Domain>
        <General>
            <AllowMulticast>false</AllowMulticast>
            <MaxMessageSize>65500B</MaxMessageSize>
        </General>
        <Discovery>
            <ParticipantIndex>auto</ParticipantIndex>
            <Peers>
                <Peer Address="localhost"/>
                <Peer Address="robot-hostname"/>
            </Peers>
            <MaxAutoParticipantIndex>500</MaxAutoParticipantIndex>
        </Discovery>
        <Internal>
            <SocketReceiveBufferSize min="10MB"/>
            <Watermarks>
                <WhcHigh>500kB</WhcHigh>
            </Watermarks>
        </Internal>
    </Domain>
</CycloneDDS>

And this is the config on my robot

<CycloneDDS>
    <Domain>
        <General>
            <AllowMulticast>false</AllowMulticast>
            <MaxMessageSize>65500B</MaxMessageSize>
        </General>
        <Discovery>
            <ParticipantIndex>auto</ParticipantIndex>
            <Peers>
                <Peer Address="localhost"/>
            </Peers>
            <MaxAutoParticipantIndex>500</MaxAutoParticipantIndex>
        </Discovery>
        <Internal>
            <SocketReceiveBufferSize min="10MB"/>
            <Watermarks>
                <WhcHigh>500kB</WhcHigh>
            </Watermarks>
        </Internal>
    </Domain>
</CycloneDDS>

To the best of my knowledge, this would be analogous to Peer A (my pc) configured with localhost and a static peer in the list and Peer B configured with localhost and no static peers on the list.

After a clean restart of everything ROS 2 related, I publish a simple test message from the robot:

ros2 topic pub /hello std_msgs/msg/String "data: 'hello'"

and in my pc

ros2 topic list

And as expected, after a little while, I can see the topic from the robot on the pc and can confirm that all the traffic is unicast. After that, I would like to disconnect from the robot by removing it from the peer list, so that Peer A stops receiving messages from it and stops seeing its topics. To do that, I terminate all the ROS 2 processes (SIGTERM) and change the Cyclone config on the pc so that it now specifies only the localhost peer.
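
For reference, the pc config after this change would look roughly like the following. This is a sketch reconstructed from the description above, assuming everything except the removed robot-hostname peer stays identical to the first config:

```xml
<CycloneDDS>
    <Domain>
        <General>
            <AllowMulticast>false</AllowMulticast>
            <MaxMessageSize>65500B</MaxMessageSize>
        </General>
        <Discovery>
            <ParticipantIndex>auto</ParticipantIndex>
            <Peers>
                <!-- robot-hostname peer removed; only localhost remains -->
                <Peer Address="localhost"/>
            </Peers>
            <MaxAutoParticipantIndex>500</MaxAutoParticipantIndex>
        </Discovery>
        <Internal>
            <SocketReceiveBufferSize min="10MB"/>
            <Watermarks>
                <WhcHigh>500kB</WhcHigh>
            </Watermarks>
        </Internal>
    </Domain>
</CycloneDDS>
```

With this change, the pc config matches the one on the robot.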

Expected behavior

The first robot (Peer B) stops sending messages to my pc after the lease duration specified in the SPDP message has expired. This would also mean that my pc shouldn't be aware of the topics in Peer B after removing it from the list of peers. I shouldn't need to restart Peer B, which is a robot that shouldn't care whether or not I am connected.

Actual behavior

Peer B continuously sends INFO_TS messages to my PC, never undiscovering it. This makes my pc discover Peer B again, seeing all its topics even if it is not in the peer list. Only killing all ros2 processes on the robot stops this behavior and achieves what I expect.

Additional information

You can find attached a traffic capture demonstrating this behavior. Please let me know if this may be due to some misunderstanding on my side about the undiscovery process, or if I am missing some configuration parameters. Thank you very much!

example_compressed.pcapng.gz

The point in the capture where I kill ros2 and switch to a config without the robot as a peer is where the IGMP messages start to appear.
After that, even with Peer B not in my pc peer list, it gets discovered and I can effectively see the topic.
With only one publisher, the INFO_TS messages are sent at a rather slow rate, but when testing with all the robot nodes on (250+), the traffic is not negligible.
robot is 10.0.0.119, pc 10.0.0.185

mjcarroll · 2024-11-08T15:54:18Z

@eboasson this sounds like a Cyclone-specific configuration issue rather than something at the RMW layer, would you mind taking a look?

OscarMrZ · 2024-11-21T12:18:19Z

To add a bit more info, I'm receiving an INFO_TS message from the ros2 daemon and from my publisher, specifically every 8 seconds (which happens to be the heartbeat interval). However, this happens even with best-effort QoS, and to the best of my knowledge this should not be the case.

@mjcarroll I opened it here because I'm not sure whether this is Cyclone-specific or a problem in the rmw implementation. Do you think it should be closed and reopened elsewhere?

OscarMrZ changed the title ~~INFO_TS messages constantly being sent from previous peer~~ Previously configured peer never gets undiscovered, even if removed from the peer list. Oct 30, 2024

mjcarroll assigned eboasson Nov 8, 2024
