Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed AudioMediaPort use after free #4200

Merged
merged 4 commits into from
Dec 10, 2024
Merged

Fixed AudioMediaPort use after free #4200

merged 4 commits into from
Dec 10, 2024

Conversation

sauwming
Copy link
Member

@sauwming sauwming commented Dec 6, 2024

As described in #3928 point 1, due to the now async nature of the conference bridge, we need to make sure that the port is not prematurely destroyed.

==41294==ERROR: AddressSanitizer: heap-use-after-free on address 0x6190000f3178 at pc 0x00010296eecc bp 0x00016dec8bb0 sp 0x00016dec8ba8
READ of size 8 at 0x6190000f3178 thread T19
    #0 0x10296eec8 in op_remove_port conference.c:1728
    #1 0x10296e1b4 in handle_op_queue conference.c:355
    #2 0x102969ae0 in get_frame conference.c:2416

0x6190000f3178 is located 1016 bytes inside of 1056-byte region [0x6190000f2d80,0x6190000f31a0)
freed by thread T0 here:
    #1 0x1024b1388 in MyAudioMediaPort::~MyAudioMediaPort()

@sauwming sauwming self-assigned this Dec 6, 2024
@sauwming sauwming added this to the release-2.16 milestone Dec 6, 2024
@sauwming
Copy link
Member Author

sauwming commented Dec 6, 2024

To be honest, the existing port warning confused me.

If I just comment port.on_destroy in the patch, it will give the warning: Warning, media port %s is using group lock, but it does not seem to have a pool., even though I already use my own pool, and it's not obvious what users should do next to fix the situation.

I'm not sure if my proposed fix is right though, so please correct it if it's inaccurate.

Please also help check for the other media ports in pjsua2 media.cpp, @nanangizz, because I see similar patterns in Tonegen and AudioMedia, but since they use group lock, they should be safe?

@sauwming sauwming requested a review from nanangizz December 6, 2024 08:30
@nanangizz
Copy link
Member

nanangizz commented Dec 6, 2024

To be honest, the existing port warning confused me.

If I just comment port.on_destroy in the patch, it will give the warning: Warning, media port %s is using group lock, but it does not seem to have a pool., even though I already use my own pool, and it's not obvious what users should do next to fix the situation.

Right, it sounds a bit confusing indeed. Let me try to clarify it, and then help me clarify it for port developers :)

To avoid premature destroy, port must use its own pool, never use app's pool (as app may call pjmedia_port_destroy() & release app's pool immediately after pjmedia_conf_remove_port()). Unfortunately there is no way to verify this (whether the port has & uses its own pool). A workaround is to check on_destroy availability, based on assumption that it should be the right place to release the pool.

Note that the port's group lock, while being needed to properly time the port's pool release, is actually optional, because the bridge can check and install one automatically if there is none.

If the port already has a group lock, implicitly it means that it is already aware of possible premature destroy, it should have its own pool & the pool release timing should have been managed using the group lock.

If the port does not have group lock, the bridge will call pjmedia_port_init_grp_lock() to install one (to time the port destroy). And I was thinking that the function can also be used to check on_destroy availability.

@@ -321,7 +326,8 @@ static pj_status_t get_frame(pjmedia_port *port, pjmedia_frame *frame)

static pj_status_t put_frame(pjmedia_port *port, pjmedia_frame *frame)
{
AudioMediaPort *mport = (AudioMediaPort *) port->port_data.pdata;
struct port_data *pdata = (struct port_data *)port->port_data.pdata;
AudioMediaPort *mport = pdata->mport;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that get/put_frame() may be called after the AudioMediaPort instance is destroyed, so perhaps mport field need to be reset in AudioMediaPort destructor. Even perhaps better to protect all access to the field with (group) lock.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch, this is more complicated than I anticipated and never seems to be mentioned anywhere.
Perhaps need to add this in the PR desc: Due to the async nature of conf port removal, media port must be prepared to receive further get/put_frame() callbacks until the removal completes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to backward compatibility issue in PR & 2.15 release note.

@sauwming
Copy link
Member Author

sauwming commented Dec 9, 2024

Right, it sounds a bit confusing indeed. Let me try to clarify it, and then help me clarify it for port developers :)

I would suggest:
(a) Warning, port %s on_destroy() not found. To avoid premature destroy, media port must manage its own pool. App shall not release this pool, but let the port do so in on_destroy() or in its grp lock handler.

(b) Warning, port %s on_destroy() not found. To avoid premature destroy, media port must manage its own pool, which can only be released in on_destroy() or in its grp lock handler.

(c) Warning, port %s on_destroy() not found. To avoid premature destroy, media port must control its own lifetime. See PR #3928.

@nanangizz
Copy link
Member

Right, it sounds a bit confusing indeed. Let me try to clarify it, and then help me clarify it for port developers :)

I would suggest: (a) Warning, port %s on_destroy() not found. To avoid premature destroy, media port must manage its own pool. App shall not release this pool, but let the port do so in on_destroy() or in its grp lock handler.

(b) Warning, port %s on_destroy() not found. To avoid premature destroy, media port must manage its own pool, which can only be released in on_destroy() or in its grp lock handler.

(c) Warning, port %s on_destroy() not found. To avoid premature destroy, media port must control its own lifetime. See PR #3928.

I'd vote for (b), with "see PR #3928 for more info" perhaps?

* and then destroys the pool immediately, it may cause crash as the port
* may have not really been destroyed and may still be accessed.
* When port has a pool, it usually implements on_destroy() for releasing
* If port uses app's pool, it needs to implement on_destroy() for releasing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"If port uses app's pool" implies that port may still use app's pool, while it must not. Even perhaps we should clarify somewhere here that port must create & use its own pool?

Copy link
Member

@nanangizz nanangizz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did quick check on the code, Tonegen & AudioMedia seems okay. Also done a quick test for Tonegen using pjsua2_demo.cpp, while AudioMedia should already have been tested when testing Android.

@sauwming sauwming merged commit bb02bf3 into master Dec 10, 2024
36 checks passed
@sauwming sauwming deleted the amport-fix branch December 10, 2024 04:12
nanangizz pushed a commit that referenced this pull request Dec 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants