Crash in mod_sofia #2417

Closed
zooptwopointone opened this issue Apr 2, 2024 · 4 comments
Labels
bug Something isn't working

Comments

zooptwopointone commented Apr 2, 2024

Describe the bug
We have been seeing FreeSWITCH versions above 1.10.7 crash every once in a while, generally within two weeks of starting. The older version does not have this issue. The crash always shows up in mod_sofia. I have about 5 servers running, and all of them eventually hit this problem.

To Reproduce

This is not something I can reproduce manually. It just occurs sometimes, and it seems to happen only under load.
We mostly use bridged calls, mod_lua, and xml_curl for the dialplan.

Expected behavior
No crashes.

Package version or git hash

1.10.11-release+git20231222T180831Zf24064f~64bit (git f24064f 2023-12-22 18:08:31Z 64bit)

Running on Debian 12

Trace logs
No useful logs that I can find.

backtrace from core file
I have a full backtrace of this, but it is 20 MB. I also didn't want to post the full details here, as the dump contains phone numbers.

Core was generated by /usr/local/freeswitch/bin/freeswitch -u freeswitch -g freeswitch -ncwait -nonat.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f1b581c89bc in sofia_outgoing_channel (session=0x0, var_event=0x7f186410e6b0, outbound_profile=0x7f18640c6200, new_session=0x7f18458d8410, pool=<optimized out>, flags=<optimized out>, cancel_cause=0x0) at mod_sofia.c:4970
4970        memcpy(&sa.sin_addr, he->h_addr, sizeof(struct in_addr));

Please let me know how I can provide the full information more securely.

zooptwopointone (Author) commented

I found that this code path is only reached when the sip_gethostbyname option is enabled. Sometimes the crash happened inside the small chunk of code that handles the lookup results, at the memcpy.

I have since disabled this option, as I no longer need it, and that has resolved the crashes. Something is still going wrong in that code, though.

The systems had two different setups for DNS. Some had no DNS cache and hit the DNS servers for every request; others used nscd or dnsmasq. All of them had the same issue.

This is informational for anyone who hits this same issue.
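
For anyone debugging the same crash: the backtrace shows the fault on the memcpy from he->h_addr, and whether the surrounding code validates he is not visible from that one frame. Below is a minimal sketch of a defensive copy, assuming the lookup result is a standard struct hostent. The helper name copy_first_ipv4 is hypothetical and is not FreeSWITCH code. It only validates the result before copying; it does not make gethostbyname itself safe to call from several threads.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of FreeSWITCH): copy the first IPv4 address
 * out of a hostent, checking every field the copy depends on.
 * Returns 0 on success, -1 if the result is unusable. */
static int copy_first_ipv4(const struct hostent *he, struct in_addr *out)
{
    if (!he || !out) {
        return -1; /* lookup failed or no destination */
    }
    if (he->h_addrtype != AF_INET) {
        return -1; /* not an IPv4 result */
    }
    if (he->h_length != (int) sizeof(struct in_addr)) {
        return -1; /* unexpected address size */
    }
    if (!he->h_addr_list || !he->h_addr_list[0]) {
        return -1; /* empty address list; h_addr would be NULL */
    }

    memcpy(out, he->h_addr_list[0], sizeof(*out));
    return 0;
}
```

A caller would check the return value and fail the outbound call attempt cleanly instead of copying from a NULL or stale pointer.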

andywolk (Contributor) commented

We need a backtrace to analyze. One line from the backtrace is not enough.

zooptwopointone (Author) commented

I am going to close this, as I no longer have the backtrace to give. I also worked around the issue by not using this feature. I had enabled it to deal with short DNS names, which caused the system to append domains and try to find a name. I no longer use the feature, so this is no longer an issue for me. In case someone else runs into it, they can capture the core dump, and hopefully this thread will be helpful.

azzbcc (Contributor) commented Dec 20, 2024

I think this is a bug in gethostbyname. I hit the same crash when using LibKS.

signalwire/libks#62
signalwire/libks#211
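
One possible explanation, not confirmed anywhere in this thread: gethostbyname is not guaranteed to be thread-safe, because it may return a pointer to static storage that another thread's lookup can overwrite. That would fit a crash that only appears under load. Below is a minimal sketch of the same IPv4 lookup done with getaddrinfo, which POSIX requires to be thread-safe and which returns results owned by the caller. The name resolve_ipv4 is hypothetical; it is not a FreeSWITCH or libks function.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not part of FreeSWITCH or libks): resolve `host`
 * to its first IPv4 address without touching shared static buffers.
 * Returns 0 on success, -1 on failure. */
static int resolve_ipv4(const char *host, struct in_addr *out)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;

    if (!host || !out) {
        return -1;
    }

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;      /* IPv4 only, matching sockaddr_in */
    hints.ai_socktype = SOCK_DGRAM; /* avoid one result per socket type */

    if (getaddrinfo(host, NULL, &hints, &res) != 0 || !res || !res->ai_addr) {
        if (res) {
            freeaddrinfo(res);
        }
        return -1;
    }

    /* With ai_family == AF_INET, ai_addr points at a struct sockaddr_in. */
    memcpy(out, &((struct sockaddr_in *) res->ai_addr)->sin_addr, sizeof(*out));
    freeaddrinfo(res);
    return 0;
}
```

Because the result list is allocated per call and released with freeaddrinfo, two threads resolving names at the same time do not share a result buffer.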
