I asked some of my buddies on ACCU (The Association of C and C++ Users) and got an interesting answer: try ChatGPT! So I did, and got an answer that looks like a way forward. It said that SIGCHLD can be avoided by using "the double fork technique". The idea is that instead of calling system() you fork, but then the child also forks, creating a grandchild. The child exits and the grandchild does the work. The parent has to do some form of IPC to know when the grandchild has finished. ChatGPT suggested using a good old-fashioned Unix pipe. This really does seem like it might work. I think I'll raise a bug report ticket for this SIGCHLD issue with this as the suggested solution.
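A minimal sketch of that double fork idea in C, to make the moving parts concrete. The helper name `run_double_fork` is hypothetical and this is not how librdkafka invokes kinit today; it only illustrates the technique under the assumption that the caller wants the command's wait status back over a pipe. One caveat worth stating: the short-lived intermediate child still raises one SIGCHLD in the original process when it exits, so this changes which process sees the command's SIGCHLD rather than removing every SIGCHLD.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not librdkafka code: run argv[0] via a double fork.
 * Returns the command's wait status (as filled in by waitpid), or -1 if
 * the command could not be started or its status could not be read. */
int run_double_fork(char *const argv[])
{
    int fds[2];
    int status = -1;
    ssize_t n;
    pid_t child;

    assert(argv != NULL && argv[0] != NULL);

    if (pipe(fds) == -1)
        return -1;

    child = fork();
    if (child == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (child == 0) {
        /* Intermediate child: create the grandchild, then exit at once. */
        close(fds[0]);
        pid_t grandchild = fork();
        if (grandchild != 0)
            _exit(grandchild == -1 ? 1 : 0);

        /* Grandchild: no longer a child of the original process. It runs
         * the command and waits for it, so the SIGCHLD for the command is
         * delivered here and not in the original process. */
        int st = -1;
        pid_t cmd = fork();
        if (cmd == 0) {
            close(fds[1]);
            execvp(argv[0], argv);
            _exit(127); /* exec failed */
        }
        if (cmd > 0) {
            while (waitpid(cmd, &st, 0) == -1 && errno == EINTR)
                ;
        }
        /* Report the wait status to the original process over the pipe. */
        if (write(fds[1], &st, sizeof st) != (ssize_t)sizeof st)
            _exit(1);
        _exit(0);
    }

    /* Original process: reap the intermediate child. If a SIGCHLD handler
     * elsewhere has already reaped it, waitpid fails with ECHILD and the
     * loop simply ends. */
    close(fds[1]);
    while (waitpid(child, NULL, 0) == -1 && errno == EINTR)
        ;

    /* Block until the grandchild reports, or the pipe closes with no data. */
    do {
        n = read(fds[0], &status, sizeof status);
    } while (n == -1 && errno == EINTR);
    close(fds[0]);

    return n == (ssize_t)sizeof status ? status : -1;
}
```

Called with an argv array such as `{"sh", "-c", "exit 3", NULL}`, the returned value can be inspected with `WIFEXITED` and `WEXITSTATUS` just like the result of system(). The pipe carries only a single `int`, which keeps the read atomic; a real implementation would probably also want a timeout.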
When librdkafka is used in conjunction with Kerberos, librdkafka uses the POSIX system() function to run kinit. system() does a fork and exec, and the parent does a waitpid. That is normally not a problem, but it can be a problem for applications that use librdkafka. Applications that use SIGCHLD for their own purposes will receive this signal when librdkafka runs kinit. The call to waitpid will cause the invoking thread to receive that signal. I am working on a Kafka program that uses a framework that employs SIGCHLD for its own purposes. This framework is also used to create threads and dispatch framework events to them. One of these threads is my Kafka producer. Whenever it invokes kinit via librdkafka, the thread receives SIGCHLD, which the framework reports in its own special logfile. After a while the log fills up with these messages. A bug report has been filed against our software for issuing high volumes of these messages. I need to find a fix somehow.
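For readers less familiar with what system() does under the hood, here is a simplified stand-in, assuming a Unix-like platform. The name `simple_system` is made up for illustration, and the real function does more (it also blocks SIGCHLD in the caller and ignores SIGINT and SIGQUIT while it waits). The point is only to show the fork, exec and waitpid steps; the SIGCHLD itself is generated for the calling process when the child exits.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Illustrative only: a stripped-down version of what system() does.
 * Returns the child's wait status, or -1 on failure. */
int simple_system(const char *command)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: hand the command line to the shell. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: wait for the child, retrying if a signal interrupts us.
     * The child's exit is what generates SIGCHLD for this process. */
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return status;
}
```

Because the command runs as a direct child of the caller, any SIGCHLD handler installed elsewhere in the process (for example by a framework) can see its exit.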
I've done a bit of reading up on system() and I find that it is down to that call to waitpid. Other developers have hit this problem, where their app calls a library that calls system() and the app also uses facilities that rely on SIGCHLD. The suggestion people made then was that the library should not use system(). It can use fork and exec and then coordinate between parent and child using some communication mechanism of choice. Unix domain sockets were suggested. That's not the only way; there are a number of options. I wonder if the librdkafka developers would consider doing this to help out people who have to use librdkafka in conjunction with another library that uses SIGCHLD.
I can understand if people think this is very niche and that system() is fine for 99.99% of use cases. I agree that it is unusual to find a library that uses SIGCHLD for its own purposes and tells its users not to use that signal. But from googling I find that although this case is not common, it's not rare either. Other people have hit this problem. They are not users of librdkafka; they hit SIGCHLD when using system() for other reasons. It does make me wonder if Linux should provide another function that does the same job as system() but that doesn't call waitpid. I will do some investigation to see if there is any mileage there. But in the meantime, could librdkafka be changed please?