due to select() paho-mqtt is unable to connect if more than 1024 file handle are used #819

Open
showfuture opened this issue Feb 27, 2024 · 6 comments
Labels
Status: Available (no one has claimed responsibility for resolving this issue)
Type: Enhancement (a new feature for a minor or major release)

Comments

@showfuture

Problem description:

Python version: 3.9
Paho-MQTT version: 1.6.1
With 1000 threads, each running its own client connected to the MQTT broker, only a few hundred clients manage to connect; the rest fail. The cause appears to be the _socketpair_compat function called from loop_start.
Raising the system file handle limit to 65535 does not help.
However, if the _socketpair_compat call is commented out, all clients connect successfully.

Question:

Is there any way to solve this problem?

@github-actions (bot) added the "Status: Available" label on Feb 27, 2024
@JamesParrott

If you really need 1000 threads I would strongly suggest a library with native async support, e.g.:

https://github.com/toreamun/asyncio-paho

@PierreF
Contributor

PierreF commented Feb 28, 2024

That's a nice issue... the cause is pretty obscure to find if you have never seen this kind of problem. tl;dr: we should no longer use select()

Here is how to reproduce the same issue with even stranger code:

import paho.mqtt.client as mqtt
import time

# Here the magic happens :)
files = [open("/etc/hosts") for _ in range(1019)]

mqttc = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
mqttc.connect("mqtt.eclipseprojects.io")
mqttc.loop_start()

time.sleep(5)  # Give network the time to do the handshake
print(mqttc.is_connected())

This will fail: the client will not be connected. To fix this code, just change the number 1019 to 1018 :)

More seriously, the issue is:

>>> mqttc._sockpairR
<socket.socket fd=1024, family=2, type=1, proto=0, laddr=('127.0.0.1', 52282), raddr=('127.0.0.1', 45195)>
>>> select.select([mqttc._sockpairR], [], [], 1)  # This is approximately what loop does
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: filedescriptor out of range in select()

The issue is that select() (only on Linux?) can't work with FDs >= 1024:

WARNING: select() can monitor only file descriptors numbers that are less than FD_SETSIZE (1024)
-- https://manpages.debian.org/unstable/manpages-dev/select.2.en.html
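
The same ValueError can be reproduced without paho-mqtt at all. Below is a minimal sketch, assuming Linux and a hard RLIMIT_NOFILE above 1024; the helper names are made up for illustration and are not part of paho. It duplicates one end of a socket pair onto a descriptor >= 1024 using fcntl's F_DUPFD, then hands that socket to select():

```python
import fcntl
import resource
import select
import socket

FD_SETSIZE = 1024  # the limit quoted from the man page; assumed here, not queried


def ensure_soft_limit(wanted=2048):
    """Raise the soft RLIMIT_NOFILE so that descriptors >= 1024 can exist."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= wanted:
        return
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))


def socket_on_high_fd(min_fd=FD_SETSIZE):
    """Return (high, peer): a connected pair where high.fileno() >= min_fd."""
    low, peer = socket.socketpair()
    # F_DUPFD returns the lowest free descriptor that is >= min_fd.
    new_fd = fcntl.fcntl(low.fileno(), fcntl.F_DUPFD, min_fd)
    low.close()  # the duplicate keeps the connection alive
    return socket.socket(fileno=new_fd), peer


def select_accepts(sock):
    """True if select.select() accepts this socket, False on ValueError."""
    try:
        select.select([sock], [], [], 0)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    ensure_soft_limit()
    high, peer = socket_on_high_fd()
    print(high.fileno(), select_accepts(high))
    high.close()
    peer.close()
```

This also suggests why raising the file handle limit to 65535 did not help: a higher limit only allows more descriptors to exist, it does not change what select() can monitor.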

In your program, you should have about 340 connections working. The socket pair (as its name says) creates 2 FDs. 340 * 3 (the MQTT socket and the two sockets of the socket pair) = 1020. Then add stdin, stdout and stderr -> 1023.
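
The arithmetic can be written down as a small estimate. This is only a sketch of the reasoning above, under simplifying assumptions that are mine: descriptors are handed out lowest-first with no gaps, each client holds exactly three of them, and a client only works if all three are below 1024. The function name is hypothetical.

```python
FD_SETSIZE = 1024     # select() limit quoted from the man page
FDS_PER_CLIENT = 3    # the MQTT socket + the two ends of the socket pair
STDIO_FDS = 3         # stdin, stdout, stderr


def max_select_clients(already_open=STDIO_FDS,
                       per_client=FDS_PER_CLIENT,
                       fd_setsize=FD_SETSIZE):
    """Estimate how many clients fit entirely below fd_setsize."""
    usable = max(fd_setsize - already_open, 0)
    return usable // per_client


if __name__ == "__main__":
    clients = max_select_clients()
    print(clients, clients * FDS_PER_CLIENT + STDIO_FDS)  # 340 1023
```

Under the same simplified model, holding 1019 extra files open leaves room for 0 clients and holding 1018 leaves room for 1, which is consistent with the reproduction script. The model ignores any temporary descriptors, so treat the numbers as approximate.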

The immediate fix is to not use select(), which means not using loop(), loop_start() or loop_forever(). This mostly means using the external loop, a.k.a. asyncio (either with a third-party library that wraps it, or directly - there are some examples).
It should also be possible to use multiple processes to spread the connections and avoid reaching FD number 1024, but I think this is too complex for the need.
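
For illustration, here is a rough sketch of an external loop that does not go through select(). It is not taken from the paho examples: run_external_loop is a hypothetical helper, and it only relies on the client's documented external-loop methods socket(), want_write(), loop_read(), loop_write() and loop_misc(). Waiting is done with the standard selectors module rather than asyncio to keep the sketch short. TLS buffering and reconnection are not handled.

```python
import selectors
import time


def run_external_loop(client, duration, poll_interval=1.0):
    """Drive a paho-style client for `duration` seconds without select()."""
    deadline = time.monotonic() + duration
    with selectors.DefaultSelector() as sel:
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock = client.socket()
            if sock is None:  # not connected, or the connection was lost
                break
            events = selectors.EVENT_READ
            if client.want_write():
                events |= selectors.EVENT_WRITE
            # Register for one wait only; simple rather than efficient.
            sel.register(sock, events)
            try:
                ready = sel.select(timeout=min(poll_interval, remaining))
            finally:
                sel.unregister(sock)
            for _key, mask in ready:
                if mask & selectors.EVENT_READ:
                    client.loop_read()
                if mask & selectors.EVENT_WRITE:
                    client.loop_write()
            client.loop_misc()  # keepalive and other housekeeping


if __name__ == "__main__":
    # Requires the third-party paho-mqtt package (2.x API, as used above).
    import paho.mqtt.client as mqtt

    mqttc = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
    mqttc.connect("mqtt.eclipseprojects.io")
    run_external_loop(mqttc, duration=5)
    print(mqttc.is_connected())
```

The point of the design is that loop_start() is never called, so the wait on the MQTT socket never passes through select(). With many clients, one selector shared by all sockets would scale better than one loop per client.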

The right fix is to change paho so that it stops using select() and uses a modern solution (probably Python selectors).
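
To give an idea of what that could look like, here is a hedged sketch of a select()-style wait built on the standard selectors module. wait_for_io is a hypothetical name, not an existing paho function, and this is not a patch against paho's actual loop code.

```python
import selectors
import socket


def wait_for_io(readers, writers, timeout):
    """Roughly select.select(readers, writers, [], timeout), via selectors."""
    wanted = {}
    for sock in readers:
        wanted[sock] = wanted.get(sock, 0) | selectors.EVENT_READ
    for sock in writers:
        wanted[sock] = wanted.get(sock, 0) | selectors.EVENT_WRITE

    readable, writable = [], []
    with selectors.DefaultSelector() as sel:
        for sock, events in wanted.items():
            sel.register(sock, events)
        for key, mask in sel.select(timeout):
            if mask & selectors.EVENT_READ:
                readable.append(key.fileobj)
            if mask & selectors.EVENT_WRITE:
                writable.append(key.fileobj)
    return readable, writable


if __name__ == "__main__":
    a, b = socket.socketpair()
    b.send(b"x")  # makes `a` readable
    print(wait_for_io([a], [a], timeout=1))
    a.close()
    b.close()
```

On Linux, DefaultSelector is backed by epoll, which does not have the FD_SETSIZE restriction. On platforms where it falls back to a select-based selector, the original limitation would still apply.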

@PierreF changed the title from "due to the _socketpair_compat function in loop_start, fails to connect" to "due to select() paho-mqtt is unable to connect if more than 1024 file handle are used" on Feb 28, 2024
@showfuture
Author

If you really need 1000 threads I would strongly suggest a library with native async support, e.g.:

https://github.com/toreamun/asyncio-paho

Thanks, I will try!

@showfuture
Author

This mostly means use the external loop a.k.a an asyncio (either with a third-party that wrap it, or directly - there is some example)

Can you provide me with some examples or other packages that can solve this problem?

@PierreF
Contributor

PierreF commented Feb 29, 2024

This mostly means use the external loop a.k.a an asyncio (either with a third-party that wrap it, or directly - there is some example)

Can you provide me with some examples or other packages that can solve this problem?

I'm not using paho-mqtt with asyncio, so I don't really know of one. I've seen https://github.com/sbtinstruments/aiomqtt mentioned in another issue.
You can also look at:

@showfuture
Author

@MattBrittan added the "Type: Enhancement" label on Jul 17, 2024