Use QueueUserAPC to run NtCancelIoFile on each thread #125


@stevefan1999-personal stevefan1999-personal commented Oct 15, 2024

NtCancelIoFileEx cancels outstanding requests on the handle regardless of which thread in the process issued them, as the documentation describes:

The NtCancelIoFileEx function allows you to cancel requests in threads other than the calling thread. The NtCancelIoFile function only cancels requests in the same thread that called the NtCancelIoFile function. NtCancelIoFileEx cancels only outstanding I/O on the handle; it does not change the state of the handle. This means that you cannot rely on the state of the handle, because you cannot know whether the operation was completed successfully or canceled.

So merely calling NtCancelIoFile inside the NtCancelIoFileEx implementation is not enough, because NtCancelIoFile only affects the calling thread. We can work around this by queuing an APC to each thread of the current process (if we can open it with THREAD_SET_CONTEXT) and cancelling the I/O request in the context of that thread. This eventually cancels all outstanding I/O unless it is networked, which is a very rare case (unless you work with SMB).

Partially fixes #80, so that mio can run without having to resort to a single-threaded runtime.

@stevefan1999-personal

I tried to replace it with the internal thread enumeration function, but I'm not sure how to include it:

#if (YY_Thunks_Target < __WindowsNT6)

    // Minimum supported client: Windows Vista [desktop apps | UWP apps]
    // Minimum supported server: Windows Server 2008 [desktop apps | UWP apps]
    __DEFINE_THUNK(
        ntdll,
        12,
        NTSTATUS,
        NTAPI,
        NtCancelIoFileEx,
        HANDLE handle,
        IO_STATUS_BLOCK* io,
        IO_STATUS_BLOCK* io_status
    )
    {
        if (const auto _pfnNtCancelIoFileEx = try_get_NtCancelIoFileEx())
        {
            return _pfnNtCancelIoFileEx(handle, io, io_status);
        }
        struct CancelIoData 
        {
            HANDLE handle;
            IO_STATUS_BLOCK* io_status;
        };

        StringBuffer<char> _Buffer;
        auto _pProcessInfo = GetCurrentProcessInfo(_Buffer);
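        // If the process snapshot is unavailable, skip the per-thread loop and
        // fall through to cancelling I/O on the calling thread only.
        // (A bare `break` here would not compile: there is no enclosing loop.)
        const ULONG _uThreadCount = _pProcessInfo ? _pProcessInfo->ThreadCount : 0;

        const auto _uCurrentThreadId = GetCurrentThreadId();

        for (ULONG i = 0; i != _uThreadCount; ++i)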
        {
            auto& _Thread = _pProcessInfo->Threads[i];

            if (_uCurrentThreadId == static_cast<DWORD>(reinterpret_cast<UINT_PTR>(_Thread.ClientId.UniqueThread)))
                continue;

            auto _hThread = OpenThread(THREAD_SET_CONTEXT, FALSE, static_cast<DWORD>(reinterpret_cast<UINT_PTR>(_Thread.ClientId.UniqueThread)));
            if (!_hThread)
            {
                continue;
            }

            QueueUserAPC([](ULONG_PTR param) {
                auto data = (CancelIoData*)param;
#ifndef __USING_NTDLL_LIB
                const auto NtCancelIoFile = try_get_NtCancelIoFile();
                if (!NtCancelIoFile)
                {
                    // Normally this should never be reached.
                    delete data;
                    return;
                }
#endif

                NtCancelIoFile(data->handle, data->io_status);
                delete data;
            }, _hThread, (ULONG_PTR) new CancelIoData {handle, io_status});

            CloseHandle(_hThread);
        }

#ifndef __USING_NTDLL_LIB
        const auto NtCancelIoFile = try_get_NtCancelIoFile();
        if (!NtCancelIoFile)
        {
            // Normally this should never be reached.
            return STATUS_NOT_SUPPORTED;
        }
#endif
        // Worst case: cancel all of the calls.
        return NtCancelIoFile(handle, io_status);
    }
#endif


MouriNaruto commented Oct 16, 2024

You should not reference any Wine (LGPL) content, because of licensing issues with this project's MIT license.

Kenji Mouri


MouriNaruto commented Oct 16, 2024

What about using the CancelIo API instead of the NtCancelIoFile API internally?

Using a better-documented API will give more flexibility for future compatibility.

Kenji Mouri


// The downlevel logic cancels every I/O operation on this file! Make do with it for now.
return CancelIo(hFile);
// https://github.com/wine-mirror/wine/blob/100645ac4de77879f6a36181234f76794bcbecb2/dlls/kernelbase/file.c#L3005

}
#endif

NtCancelIoFile(data->handle, data->io_status);


stevefan1999-personal commented Oct 16, 2024

Strange, why is the APC call stuck with GetModuleFileNameA?

nvm: it's the base of the stack, but the APC call is indeed deadlocked

@mingkuang-Chuyu mingkuang-Chuyu self-requested a review October 19, 2024 05:18

mingkuang-Chuyu commented Oct 19, 2024

Thank you for your support. But I'm worried that this PR will cause other issues.

  1. APCs are asynchronous: an APC may still be executing after the CancelIoEx function returns.

  2. If the target thread never runs APCs (it never enters an alertable wait), the queued data is never freed, which causes leaks.
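
Both concerns can be illustrated with a small, portable toy model. This is only a sketch: `ModelThread`, `QueueApc`, `AlertableWait`, `QueueCancel`, and the live-payload counter are hypothetical names invented for illustration and are not Windows APIs. The model assumes only what the list above states: a queued APC runs later, when the target thread chooses to run it, and not before the queuing function returns.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>

// Counts live payloads so that a leak becomes observable in the model.
static int g_live_payloads = 0;

// Stand-in for the CancelIoData payload in the PR (an int replaces HANDLE).
struct CancelIoData {
    int handle;
    explicit CancelIoData(int h) : handle(h) { ++g_live_payloads; }
    ~CancelIoData() { --g_live_payloads; }
};

// Hypothetical model of a thread's user-mode APC queue.
class ModelThread {
public:
    using ApcFn = void (*)(std::uintptr_t);

    // Models queuing an APC: the call returns immediately, nothing runs yet.
    void QueueApc(ApcFn fn, std::uintptr_t param) {
        pending_.emplace_back(fn, param);
    }

    // Models an alertable wait: only here do the queued APCs actually run.
    void AlertableWait() {
        while (!pending_.empty()) {
            const auto apc = pending_.front();
            pending_.pop_front();
            apc.first(apc.second);
        }
    }

    std::size_t PendingApcs() const { return pending_.size(); }

private:
    std::deque<std::pair<ApcFn, std::uintptr_t>> pending_;
};

// Same shape as the APC body in the PR: use the payload, then free it.
inline void CancelApc(std::uintptr_t param) {
    delete reinterpret_cast<CancelIoData*>(param);
}

// Same shape as the queuing side: allocate and hand ownership to the APC.
inline void QueueCancel(ModelThread& target, int handle) {
    target.QueueApc(&CancelApc,
                    reinterpret_cast<std::uintptr_t>(new CancelIoData{handle}));
}
```

In this model, after `QueueCancel` returns, the payload is still alive and the cancellation has not happened yet (concern 1). If `AlertableWait` is never called on the target, the payload is never deleted (concern 2).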


stevefan1999-personal commented Oct 19, 2024

Thank you for your support. But I'm worried that this PR will cause other issues.

  1. APCs are asynchronous: an APC may still be executing after the CancelIoEx function returns.

  2. If the target thread never runs APCs (it never enters an alertable wait), the queued data is never freed, which causes leaks.

I agree, after reading the APC documentation and the overlapped I/O changes since Vista. I still think we should exit early if overlapped != null, though, because cancelling all requests actually behaved even worse: the cancellation not only corrupted mio's state machine but, in one private case, also corrupted a database's async commit. At this point, doing nothing is simply better, and we just let the code track the I/O state manually.

Sadly, this is a fundamental overlapped I/O change and there is no workaround, because the network model in Windows XP (dating back to NT 4) is still single-thread-first, from a time when multicore programming was not yet commercially available.
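
The early-exit idea above can be sketched as a small decision function. This is a minimal sketch under stated assumptions, not the code that was merged: `CancelIoExFallback`, the `DownlevelCancel` enum, and the `cancel_thread_io` callback are hypothetical names, and the callback stands in for whatever the thunk would really call (for example `CancelIo(hFile)`), so the logic can be exercised without the Windows headers.

```cpp
#include <functional>

// Hypothetical outcome type for the downlevel fallback.
enum class DownlevelCancel {
    CancelledThreadIo,       // no specific request given: cancelled the thread's I/O
    SkippedSpecificRequest,  // a specific OVERLAPPED was given: did nothing
    CancelFailed             // the underlying cancel call reported failure
};

// `overlapped` models the lpOverlapped argument of CancelIoEx.
// `cancel_thread_io` is an assumed stand-in for a call such as CancelIo(hFile).
inline DownlevelCancel CancelIoExFallback(
    const void* overlapped,
    const std::function<bool()>& cancel_thread_io) {
    // The downlevel API cannot cancel one request selectively, so when the
    // caller names a specific request, cancel nothing rather than everything.
    if (overlapped != nullptr) {
        return DownlevelCancel::SkippedSpecificRequest;
    }
    return cancel_thread_io() ? DownlevelCancel::CancelledThreadIo
                              : DownlevelCancel::CancelFailed;
}
```

The design choice here is to prefer a no-op to an over-broad cancel: the caller then keeps tracking the I/O state itself, as described above.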

@mingkuang-Chuyu mingkuang-Chuyu changed the base branch from master to Fea/YY/PendingPR October 22, 2024 01:50
@mingkuang-Chuyu mingkuang-Chuyu merged commit 89330b3 into Chuyu-Team:Fea/YY/PendingPR Oct 22, 2024
Successfully merging this pull request may close these issues.

Make Rust programs support Windows XP RTM (please post all Rust program compatibility issues in this thread)
3 participants