Using SimpleVisor to run dynamic recompiled code emulating non-x86-64 code #19
Comments
After some thought, it seems that what I'm looking for is mostly a safe way to reroute the FS and GS segments. Having virtual memory that differs from that of a Windows process is neither desirable nor necessary. When running the generated code with its own ABI, FS and GS will point to a 4GB guest data cache and a 4GB host basic block cache, respectively. When leaving the generated code to execute external code, FS and GS must be restored to their Windows values. If external events such as interrupts occur, FS and GS must also be restored to their Windows values. I'm not fluent enough with hypervisors to determine whether this is possible and how to implement it efficiently with, say, SimpleVisor. The reasons why FS and GS may be perfect for emulating guest mapping are:
Now some questions:
Best regards. |
Hi! The short answer is: SimpleVisor would probably not be a good match for your goals. I think the BareFlank project might be closer to what you need. That being said, if your main goal is having FS/GS point to something else, what you really want is a custom LDT. Why not simply do that? On x86 you can do it from user mode with documented APIs. On x64, you can use a driver. |
I'm only considering an x64 host machine. I'm not familiar with the idea of using a driver for this (any pointer to a source showing how to alter the FS/GS segments through a driver?). If I must go through an ioctl to save or restore the FS/GS segments, it sounds awfully slow as a method. I was hoping that transparently using VM exits to determine which FS/GS segments to save/restore would be faster. Thanks anyway. |
Hi,
Are you familiar with the idea of an LDT?
A driver IOCTL is significantly faster than a VMEXIT.
Best regards,
Alex Ionescu
|
I don't think that manipulating the FS/GS segments from a driver would solve the issue, as they are likely to be restored (maybe from the ETHREAD structure?) on a task switch. Also, Windows uses them for exception handling, syscalls in WOW64 processes, etc. What you need is a KVM alternative on Windows. I think KVM is what you're asking for (except it is for Linux): https://lwn.net/Articles/658511/ |
Have you taken a look at our hyperkernel? It still needs a lot of work (the scheduler cannot preempt yet), but we can at least provide a very simple C/C++ environment in a guest. With what is there, you should be able to set up whatever environment you want. We should have a more complete scheduler once I get the interrupt management code into the extended APIs, as I need a clock to finish the scheduler. |
@ionescu007 yes, I am. I'm familiar with real and protected mode (although it was a long time ago), but not with hypervisor mode. The LDT solution may not work the way you seem to think. Again, I'm only considering 64-bit user-land code: no 32-bit code and no WOW64 stuff.
Suppose I have a driver that lets me save/restore FS/GS in a specific thread. I still need to protect this thread against interrupts (anything that can interrupt my user-land code to run asynchronous Windows code that may need the original FS/GS values). NOTE: I think I heard somewhere that FS is a NULL segment (selector 0) when running in a 64-bit thread and that it is unconditionally reset to the NULL selector whenever the thread is switched. EDIT: found it in here
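To make the save/restore requirement concrete, here is a minimal sketch that only simulates the bookkeeping. It assumes nothing about SimpleVisor or any driver API: `seg_bases`, `seg_switch`, `enter_guest` and `leave_guest` are hypothetical names, and no real MSR is read or written. The two MSR numbers are the architectural ones for the FS and GS bases on x86-64.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural MSR numbers of the segment bases on x86-64 (reference only). */
#define IA32_FS_BASE 0xC0000100u
#define IA32_GS_BASE 0xC0000101u

/* Simulated copy of the two base registers; no real MSR access happens here. */
typedef struct {
    uint64_t fs_base;
    uint64_t gs_base;
} seg_bases;

/* Hypothetical per-thread bookkeeping for the FS/GS swap. */
typedef struct {
    seg_bases live;   /* what the CPU would currently hold */
    seg_bases host;   /* Windows values saved while generated code runs */
    seg_bases guest;  /* dcache / icache bases used by generated code */
    int in_guest;     /* nonzero while the guest values are live */
} seg_switch;

/* Save the Windows bases and install the guest ones. */
void enter_guest(seg_switch *s) {
    assert(s != 0 && !s->in_guest);
    s->host = s->live;
    s->live = s->guest;
    s->in_guest = 1;
}

/* Put the Windows bases back before any external code or event handler runs. */
void leave_guest(seg_switch *s) {
    assert(s != 0 && s->in_guest);
    s->live = s->host;
    s->in_guest = 0;
}
```

The open question in this thread is who calls `leave_guest` when an interrupt arrives in the middle of generated code. The sketch does not answer that; it only shows the state that has to be swapped.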
|
@buraktamturk |
I have, but it raises more questions. First, is there a version that runs with Windows as the host? Second, it appears to use mingw64 and gcc. Is that just for the driver, or would I need to compile my whole program with mingw64 and gcc? |
It supports Windows 8.1 and 10. We will have support for BSD and MacOS soon. You build the driver with VS and the hypervisor is built with Clang. We got rid of GCC in master. |
@rianquinn My main question is: can a hypervisor intercept an external event, such as an interrupt that needs to be handled by Windows, and make sure that the VM exit restores the FS/GS segments (MSRs), and that returning to the dynarec code saves them and sets the dynarec-specific FS/GS values (MSRs)? I suppose the dynarec code would execute in a virtual CPU, and any call to external code (that is, code needing the Windows environment) would exit this virtual CPU so that Windows can execute the needed code, after which the hypervisor resumes the virtual CPU. What I'm not sure about is how the virtual CPU works. Is it associated with one host logical processor? Is there a way to be sure that any external events (exceptions or interrupts) are handled outside the virtual CPU? And so on... I lack the kind of answers that would point to the best solution. |
@rianquinn |
Currently you build guest VMs with Clang as well. In the future we want to support PE/COFF as well. We just started the hyperkernel so there is a lot of work to do. |
As for your other question, each guest is given its own set of vcpus so Windows has a VMCS and so does the guest. You have complete control of all of the segments and registers from a vcpu including interrupts. |
I see no other way to discuss whether SimpleVisor could be used to run dynamically recompiled code emulating non-x86-64 code.
I have written an emulator which runs some PSP programs (based on a customized MIPS32 ISA) and I would like to extend some of that work to other guest ISAs.
The emulator uses an HLE (High Level Emulation) principle: user-land guest code is dynamically recompiled into native host code, while kernel- and hardware-related guest code is compiled directly from native host source, so there is no need to emulate any functionality or hardware at a lower level. Basically, a syscall calls a native host function instead of emulating guest instructions step by step inside the syscall.
The dynamic compiler emits x86-64 instructions and uses its own ABI, so to speak: up to 12 GPRs are available to the integer register allocator. By not complying with the usual Windows 64-bit ABI, I get faster emulation. The key is that the chains of basic blocks are built entirely by the dynarec, so the usual ABI state does not need to be saved and restored between host basic blocks (only when calling a syscall). Some details can be found here and there.
There are several areas I would like to address with a hypervisor like yours, to check whether they are feasible:
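As an illustration of the HLE principle described above, here is a minimal sketch of a syscall dispatch table. All names (`guest_cpu`, `hle_table`, `hle_syscall`, `hle_add`) are hypothetical and are not taken from the emulator; the register numbers follow the usual MIPS convention (arguments in $4/$5, result in $2).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest CPU state: only what this sketch needs. */
typedef struct {
    uint32_t gpr[32];  /* MIPS-style general-purpose registers */
} guest_cpu;

/* A native host function that stands in for a guest kernel service. */
typedef void (*hle_handler)(guest_cpu *cpu);

/* Hypothetical service: adds a0 ($4) and a1 ($5), result in v0 ($2). */
static void hle_add(guest_cpu *cpu) {
    cpu->gpr[2] = cpu->gpr[4] + cpu->gpr[5];
}

/* Fallback for syscall numbers with no native implementation. */
static void hle_unimplemented(guest_cpu *cpu) {
    cpu->gpr[2] = 0xFFFFFFFFu;
}

/* Table indexed by guest syscall number. */
static const hle_handler hle_table[] = { hle_add };

/* Called from generated code instead of emulating the guest kernel. */
void hle_syscall(guest_cpu *cpu, uint32_t number) {
    assert(cpu != NULL);
    size_t count = sizeof hle_table / sizeof hle_table[0];
    hle_handler handler = number < count ? hle_table[number] : hle_unimplemented;
    handler(cpu);
}
```

The guest never executes kernel instructions here: the recompiled code reaches `hle_syscall`, which runs ordinary host code and writes the result back into the guest register file.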
To execute the chains of generated code on a dedicated logical processor. A call to a guest syscall would exit to native Windows code, either to execute some functionality or to recompile new guest code. Ideally, I want the generated code to live in the first 4GB of the virtual address range, so that the ICache array can keep 32-bit function pointers as entries, since most ISAs I want to emulate have 32-bit pointers. I sometimes wonder whether this logical processor would need its own memory mapping. Is that possible with SimpleVisor, or would it be better to keep the same memory mapping as the running Windows program?
To get a perfect memory emulation which mimics the guest memory mapping. I used some Windows-specific tricks to get very fast memory access. While that may be enough for emulating the PSP, it may not be for other architectures. Another possibility I can see is to use the FS segment while running generated code on its logical processor (I'm not sure whether GS can also be used for another purpose as long as the logical processor doesn't exit; can it?). This way, the FS segment could map the whole 4GB of guest memory (also called the dcache), and a simple MOV or MOVBE with an FS prefix could access this guest memory. If GS can also be used, it could map the huge icache (for each guest address, it maps either a recompiled basic block to jump into, or recompiler code that creates a new basic block) and would allow very fast execution of chains of host basic blocks.
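The ICache idea in this point can be sketched in portable C as follows. This is only an approximation under stated assumptions: standard C cannot express a 32-bit function pointer on a 64-bit host, so each 32-bit entry holds a small handle into a block pool rather than a code address below 4GB. `icache`, `block_pool`, `icache_install`, `icache_lookup` and the table sizes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GUEST_TEXT_WORDS 1024u  /* hypothetical: tiny guest code window */
#define BLOCK_SLOTS 16u         /* hypothetical: capacity of the block pool */

/* A host basic block: runs, then returns the next guest pc. */
typedef uint32_t (*host_block)(uint32_t guest_pc);

static host_block block_pool[BLOCK_SLOTS]; /* slot 0 means "not compiled" */
static uint32_t block_count = 1;
static uint32_t icache[GUEST_TEXT_WORDS];  /* one 32-bit entry per guest word */

/* Record a freshly recompiled block for a guest address. */
uint32_t icache_install(uint32_t guest_pc, host_block fn) {
    assert(fn != NULL && block_count < BLOCK_SLOTS);
    block_pool[block_count] = fn;
    icache[(guest_pc >> 2) % GUEST_TEXT_WORDS] = block_count;
    return block_count++;
}

/* Return the block for a guest address, or NULL if the caller must recompile. */
host_block icache_lookup(uint32_t guest_pc) {
    uint32_t handle = icache[(guest_pc >> 2) % GUEST_TEXT_WORDS];
    return handle ? block_pool[handle] : NULL;
}

/* Stand-in for a recompiled block covering two guest instructions. */
static uint32_t sample_block(uint32_t guest_pc) {
    return guest_pc + 8;
}
```

With real 32-bit code addresses the lookup collapses to one load and one indirect jump, which is why keeping the generated code below 4GB matters for this design.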
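What an FS-prefixed access would compute is simply "segment base plus zero-extended 32-bit guest address". The sketch below models that in portable C with an explicit base array instead of a segment register, and with a 64 KiB buffer instead of the full 4GB. `guest_ram`, `guest_ptr`, `guest_load32_be` and `guest_store32_be` are hypothetical names; the big-endian variants correspond to what a single MOVBE would do for a big-endian guest.

```c
#include <assert.h>
#include <stdint.h>

#define GUEST_RAM_SIZE 0x10000u  /* hypothetical: 64 KiB instead of 4 GB */

/* Stands in for the region that FS.base would point to. */
static uint8_t guest_ram[GUEST_RAM_SIZE];

/* base + guest address, with a bounds check the 4 GB mapping would not need. */
static uint8_t *guest_ptr(uint32_t guest_addr, uint32_t size) {
    assert(size <= GUEST_RAM_SIZE && guest_addr <= GUEST_RAM_SIZE - size);
    return guest_ram + guest_addr;
}

/* Big-endian 32-bit load from guest memory. */
uint32_t guest_load32_be(uint32_t guest_addr) {
    const uint8_t *p = guest_ptr(guest_addr, 4);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Big-endian 32-bit store to guest memory. */
void guest_store32_be(uint32_t guest_addr, uint32_t value) {
    uint8_t *p = guest_ptr(guest_addr, 4);
    p[0] = (uint8_t)(value >> 24);
    p[1] = (uint8_t)(value >> 16);
    p[2] = (uint8_t)(value >> 8);
    p[3] = (uint8_t)value;
}
```

If the segment covered the whole 4GB, every 32-bit guest address would be valid by construction, so the bounds check and the explicit base addition would both disappear from the generated code.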
For those two points, do you think SimpleVisor may help?
Best regards.