Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Support for get_frame() #25

Open
VIPTankz opened this issue Nov 7, 2022 · 51 comments
Open

Add Support for get_frame() #25

VIPTankz opened this issue Nov 7, 2022 · 51 comments

Comments

@VIPTankz
Copy link

VIPTankz commented Nov 7, 2022

I would very much appreciate the extra functionality to be able to call a method and have it return a numpy array of the RGB values of the current dolphin frame. I am a Deep Reinforcement Learning researcher, and this would help me very much as I want to allow an AI to use the raw pixel data to learn from. Thanks

@Felk
Copy link
Owner

Felk commented Nov 15, 2022

Hey, that sounds useful. Though using numpy is currently problematic (see #9 and the workaround there has a few other problems regarding reloading scripts I wasn't able to fix yet). What I can try to do is offer the RGB values as a regular python object. Would that still be useful, even if using numpy is somewhat problematic?

@VIPTankz
Copy link
Author

That all sounds great, provided its still possible for me to convert the python objects into a numpy array within my script (I'm currently using the workaround you linked in the post to allow me to use numpy). That would be great though! Speed is the most critical thing for me currently, but I imagine it should be considerably faster than the approaches I'm currently using which can only achieve around 70fps.

@VIPTankz
Copy link
Author

VIPTankz commented Dec 2, 2022

Have you had any luck with this so far? I spent a while looking around myself with no luck, so am very interested to hear if you have been able to make progress!

@Felk
Copy link
Owner

Felk commented Dec 7, 2022

Unfortunately, no. Right now I am hardly touching any code outside of work, so I am not sure if I'll get my toes wet any time soon, sorry. I'll get back to you if I get back into it any time soon

@Felk
Copy link
Owner

Felk commented Dec 7, 2022

API-wise, I can probably hook up the already present screenshot functionality. Similarly to the savestate API I would then offer to either save to a location or return a python bytes object (or something similar). Does that sound good?

@VIPTankz
Copy link
Author

VIPTankz commented Dec 7, 2022

A python bytes object sounds like it would work to me. Would this require constant writing to the disk? If not it sounds amazing to me

@TryTwo
Copy link

TryTwo commented Dec 8, 2022

You might want to get it from dump XFB target. It's an unscaled, pixel-perfect output. Not sure though.

@VIPTankz
Copy link
Author

VIPTankz commented Dec 8, 2022

XFB dump would work well if its pixel perfect since I will need to rescale the image after anyway (Reinforcement Learning uses tiny images like under 100x100 pixels). Thanks for the advice!

@Felk
Copy link
Owner

Felk commented Dec 8, 2022

I've taken a look and I propose this API:

from dolphin import event

def show_screenshot(width: int, height: int, data: bytes):
    print(f"received {width}x{height} image of length {len(data)}")
    # data is RGBA, so its size is width*height*4

event.on_framedrawn(show_screenshot)

Here's an example of displaying the image using the Pillow library:

import sys
# add local python 3.8 installation to path, where Pillow is installed
sys.path.append("c:/users/felk/appdata/roaming/python/python38/site-packages")
from PIL import Image

from dolphin import event

def show_screenshot(width: int, height: int, data: bytes):
    print(f"received {width}x{height} image of length {len(data)}")
    image = Image.frombytes('RGBA', (width,height), data, 'raw')
    image.show()

event.on_framedrawn(show_screenshot)

The image size depends on the emulation window size. This seems to be a Dolphin limitations: screenshots/framedumps must always be the exact size of the output window size.

You should be able to turn that bytes data in RGBA format into numpy no problem, hopefully.
Does that look workable?

@VIPTankz
Copy link
Author

VIPTankz commented Dec 9, 2022

Yeah that looks great thanks!

@Felk
Copy link
Owner

Felk commented Dec 12, 2022

added in c48006d

I'll close this once I updated the documentation and uploaded a new pre-release build

@VIPTankz
Copy link
Author

Thanks so much!

@BLCRAFT210
Copy link

The image size depends on the emulation window size. This seems to be a Dolphin limitations: screenshots/framedumps must always be the exact size of the output window size.

I'm able to get framedumps at internal resolution by checking a box in the Advanced section of the graphics settings. Perhaps this applies to event.frameadvance() as well. I can check in a couple of hours. If it doesn't, I'm sure there must be some way to get internal resolution data.

@VIPTankz
Copy link
Author

Will be interesting to know, but either way I should be good to work around it. Being able to resize the window would actually benefit me since if it makes it faster it will be good to make the window tiny. Also will the framedumping work if the renderer is set to the null renderer? I assume not?

@Felk
Copy link
Owner

Felk commented Dec 12, 2022

I'm able to get framedumps at internal resolution by checking a box in the Advanced section of the graphics settings. Perhaps this applies to event.frameadvance() [sic, framedrawn?] as well. I can check in a couple of hours

Good to know! Please do and let me know, I probably can't check myself within the next few days

Also will the framedumping work if the renderer is set to the null renderer? I assume not?

I also assume not. Though I wanna test (unless @BLCRAFT210 is faster :P)

@BLCRAFT210
Copy link

image
Confirmed: checking this box dumps frames at internal resolution. I haven't checked whether the null renderer works.

@BLCRAFT210
Copy link

Yeah, the null renderer just outputs blank images.

@Felk
Copy link
Owner

Felk commented Dec 14, 2022

Alright, thanks for confirming!

@VIPTankz
Copy link
Author

VIPTankz commented Dec 14, 2022

I think there may be an issue with the recent version. I believe there is a memory leak of some sort, as when I leave this running Dolphin's memory usage just increases until my computer runs out of RAM.

from dolphin import event, gui
import sys
sys.path.append("C:\\Users\\TYLER\\AppData\\Local\\Programs\\Python\\Python38\\Lib\\site-packages")
import numpy as np
from PIL import Image
import cv2
import random
import time

def show_screenshot(width: int, height: int, data: bytes):
    #print(f"received {width}x{height} image of length {len(data)}")
    # data is RGBA, so its size is width*height*4
    pass

red = 0xffff0000
frame_counter = 0
start = time.time()
count = 0
await event.on_framedrawn(show_screenshot)

image

I do not know if this is an issue with the new version, a setting in dolphin or something else. I tested this using multiple games and with and without the no-subinterpreters argument for dolphin

@VIPTankz
Copy link
Author

VIPTankz commented Dec 17, 2022

Given how quickly the RAM usage grows, I believe when event.framedrawn() is called, the frame data isn't removed from RAM, hence causing the issue.

@Felk
Copy link
Owner

Felk commented Dec 18, 2022

Yep, there seem to me at least 2 memory leaks related to improper python object reference counting. I'll fix it. Thanks for letting me know

@VIPTankz
Copy link
Author

Thanks!

@Felk
Copy link
Owner

Felk commented Dec 18, 2022

fixed in 26cdedc, please try that one!

@VIPTankz
Copy link
Author

Just tried it and it seems to be working correctly! Thank you very much!

@VIPTankz
Copy link
Author

VIPTankz commented Dec 24, 2022

I've continued been playing around with the new version and I believe there is an issue with a maximum speed. I noticed this as the program would randomly crash if I left it running for a long time. Below is the code I used to recreate/fix the error. I believe the code which "fixes" the error isn't an actual fix, but rather just slows the program to prevent the error, I think any code which delays the speed of the program would work. I'm not sure what the underlying cause is - maybe something to do with locks that happens at high speeds? I'm really not sure

import sys
sys.path.append("C:\\Users\\TYLER\\AppData\\Local\\Programs\\Python\\Python38\\Lib\\site-packages")
from PIL import Image
from dolphin import event,gui
from copy import deepcopy
import numpy as np
import time
    
white = 0xffffffff

steps = 0
start = time.time()
while True:
    
    (width,height,data) = await event.framedrawn()

    #Adding the two lines below fixes the problem?!?
    
    with open('leak.txt', 'a') as f:
        pass

    
    img = Image.frombytes('RGBA', (width,height), data, 'raw')
    steps += 1
    fps = round(steps / (time.time() - start))
    gui.draw_text((10, 10), white, "FPS: " + str(fps))
    

@Felk
Copy link
Owner

Felk commented Dec 24, 2022

Thanks for relentlessly testing the feature 😁 very much appreciated! I'll get back into it and also try to test it more thoroughly after Christmas 🎄

@VIPTankz
Copy link
Author

No problem! Reinforcement Learning tends to test everything it touches very thoroughly! It's also worth noting that while running that test, the window was pretty tiny (about 200x200), to try and get the maximum speed. Even with the "fix" it does still crash sometimes, just far less (about once per 100k calls to event.framedrawn()). Merry Christmas! 🎄

@VIPTankz
Copy link
Author

VIPTankz commented Jan 9, 2023

Hope you had a nice Christmas and Happy New Year! Just wondering if you've had any time to look for what was causing it to crash?

@Felk
Copy link
Owner

Felk commented Jan 9, 2023

Hey! Unfortunately, no. I was trying to sync the fork before fixing any bugs, and that turns out to be quite tedious still. I'll try to get back to this in the coming weeks

@VIPTankz
Copy link
Author

VIPTankz commented Jan 9, 2023

No problem, thanks!

@BLCRAFT210
Copy link

It seems there is no documentation for on_framedrawn and framedrawn in the events stub. I can add it, unless you want to write it yourself.

@VIPTankz
Copy link
Author

After doing much analysis of this, I've concluded the error only occurs when the window is substantially shrunk (less than 400x400 pixels). I have settled however with just using internal resolution for my use case, despite being minorly slower. I wanted to make others who may be using this aware of the issue, but I believe it probably isn't important enough to focus on. In my testing, this error only occurred when running Dolphin at 500+ fps. I'm going to close the issue now as it appears to work as intended for the most part, thank you very much for your help!

@Felk
Copy link
Owner

Felk commented Feb 22, 2023

Glad to hear it works good enough for you. I'll leave this issue open though, to track the missing documentation and also to eventually take a look at it. Might be a timing issue, because I don't really care about synchronizing python code with emulation

@Felk Felk reopened this Feb 22, 2023
@VIPTankz
Copy link
Author

Hello, I've recently been updating my code to work with the newest version of this fork (the previous version was quite old, using python 3.8 rather than 3.11). In doing this, I have run into another issue with the await event.framedrawn().

Using event.framedrawn() now seems to cause dolphin to freeze and become unresponsive. This happens somewhat unreliably though, sometimes occurring after a few seconds but sometimes occurring after many minutes. If I try to read from memory as well though, the crashing seems to happen considerably faster, almost immediately every time.

The previous issue I mentioned in the thread where Dolphin would crash and close down if I ran Dolphin too fast (usually at 400/500fps from having a tiny window) still exists, however I believe to be unrelated and is not an issue I need solving.

Code to recreate issue:

from dolphin import event, gui, savestate, memory, controller
import time

red = 0xffff0000

steps = 0
start = time.time()
while True:

    #using framedrawn works fine
    #await event.frameadvance()

    #this seems to crash, but quite slowly
    (width, height, data) = await event.framedrawn()

    steps += 1
    fps = round(steps / (time.time() - start))
    gui.draw_text((10, 10), red, "AVG FPS: " + str(fps))

    #this line causes dolphin to crash MUCH faster when used with framedrawn
    lives = memory.read_u8(0x8106483B)

@Felk
Copy link
Owner

Felk commented Sep 26, 2023

Thank you for this excellent reproducer! It appears to be a classic deadlock. Something on Dolphin's CPU thread is waiting to acquire the GIL, and something holding the GIL is waiting to acquire the CPU Thread guard lock.
For the record and future me:

  • FrameAdvance events are fired even if there are no listeners. That might not do much then, but it still does try to acquire the GIL for a short amount of time.
  • FrameAdvance events are fired from Dolphin's CPU thread: A VideoInterface callback is run on the CPU thread which calls the OnFrameBegin method. FrameAdvance is currently wired up right there
  • FrameDrawn events are fired from the Framedumper thread. The event is currently hooked up right next to the code that emits screenshots.

Now what happens is the following:

  • The python script starts running normally for some time.
  • The FrameDrawn event triggers the execution of some python code due to the await event.framedrawn(). While the Python code is being executed, the GIL is held.
  • Concurrently, the FrameAdvance event is fired from the CPU thread. It tries to acquire the GIL but waits since the GIL still held by the currently running Python code. The CPU thread waits.
  • In the meantime, the Python code reaches memory.read_u8(0x8106483B), which makes it call back into Dolphin Code. Reading memory tries to take a lock on the CPU thread first. Since the CPU thread is still busy, this call waits.
  • Deadlock!

The CPU Thread looks like this:
image

and the FrameDumper thread looks like this:
image

@Felk
Copy link
Owner

Felk commented Sep 26, 2023

Can you try this build? I scheduled all events to run on the CPU/emulation thread. I'm not entirely sure this is the best solution, but I'm interested in how it runs for you:

(.exe file only because of filesize limit on github, remaining files are identical to preview3)
dolphin-x64-exe-scripting-preview3-all-events-on-cpu-thread.zip

@VIPTankz
Copy link
Author

Yeah sure, will give it a try now

@VIPTankz
Copy link
Author

Appears to stop the crashing! Will this have any impact on the speed?

@Felk
Copy link
Owner

Felk commented Sep 26, 2023

I haven't measured to what extent, but theoretically yes, this can induce some slowdown

@VIPTankz
Copy link
Author

I did some testing to see to what extent this would be.
This was tested on the Super Mario Galaxy home screen.

With the new version - 240fps
Old Version - 250fps

Perhaps minorly slower, but is pretty negligible and is good enough for me! Thanks again!

@Felk
Copy link
Owner

Felk commented Sep 26, 2023

Alright, that's great to hear! I pushed the change to master and will take another look at the performance sometime else. Also I still got some crashes locally, so I assume the original crashing issue is still not resolved for now.

@VIPTankz
Copy link
Author

Hey, after doing some more testing I think I found an issue with the new version. Specifically, when using the controller input, if I use the event.framedrawn() I get stuttered movement. ie the game happens as though I am quickly pressing/releasing a button, even if I am just holding it down. I assume this is something to do with the work you previously mentioned.

Below is the code I used to create the error, however once again it includes a savestate from Super Mario Galaxy.

from dolphin import event, gui, savestate, memory, controller
import time
red = 0xffff0000
import random

wii_dic = {"A": False,
                "B": False,
                "One": False,
                "Two": False,
                "Plus": False,
                "Minus": False,
                "Home": False,
                "Up": False,
                "Down": False,
                "Left": False,
                "Right": False}
nun_dic = {"C": False,
                "Z": False,
                "StickX": 0,
                "StickY": 0}

steps = 0
start = time.time()

for i in range(4):
    (width, height, data) = await event.framedrawn()

savestate.load_from_slot(1)

while True:

    #this works
    #await event.frameadvance()

    #this causes stuttered movement
    (width, height, data) = await event.framedrawn()

    steps += 1
    fps = round(steps / (time.time() - start))
    gui.draw_text((10, 10), red, "AVG FPS: " + str(fps))

    time.sleep(0.01)
    nun_dic["StickX"] = -1

    controller.set_wii_nunchuk_buttons(0, nun_dic)
    controller.set_wiimote_buttons(0, wii_dic)

@Felk
Copy link
Owner

Felk commented Sep 28, 2023

Interesting, thanks again for the reproducer. I haven't dug in yet, but let me collect a few thoughts anyway:

First off, the desired state is: Inputs set by a script hold for a single frame and then expire.
Setting inputs is implemented by storing the desired input and hooking into the "input overrides" hooks supplied by upstream. This way, when the hook is invoked I can substitute the original controller state with the stored one. To make inputs expire after a frame, I have the controller module listen to the frameadvance event internally. It then clears all stored inputs.

Now theoretically, if the script uses any other event than frameadvance and therefore sets the inputs asynchronously to the frameadvance event, the events may get cleared before the override hook gets invoked, making them ineffective. Now the thing is I kinda anticipated this originally and added a used flag to each stored custom input which gets set each time an input substitution happens in the respective override hook. I then only clear inputs after a frameadvance event if that flag is set.

Sounds good in theory. But in practice this is probably flawed in many ways. For example I currently make the (wrong?) assumption that inputs will be polled deterministically around the same time each time once per frame. If that assumption does not hold, I can imagine of at least one scenario that would lead to the odd behavior you described:

time             ------------------------------------------------------>

frameadvance     |----------|----------|----------|----------|----------
framedrawn           |----------|----------|----------|----------|------
input override       AAAAAAAA   AAAAAAAA   AAAAAAAA   AAAAAAAA   AAAAAAA
input polled            |----------|-----|----------|---------------|---
effective input         A          A   not A      not A             A

Or maybe it should work as-is and the implementation is just buggy.

@Felk
Copy link
Owner

Felk commented Oct 11, 2023

Note that the way the framedrawn is prototyped right now seems fundamentally flawed and I disabled the event entirely on the master branch for now (6f0f5ed). I'm currently searching for a way to implement it properly. Once I manage to do that I hopefully also fix the occasional crashes. In the meantime I asked in the dolphin forum for some opinions: https://forums.dolphin-emu.org/Thread-schedulung-events-from-other-threads-for-scripting-purposes (post approval pending)

@VIPTankz
Copy link
Author

VIPTankz commented Nov 5, 2023

Hi there. This isn't a "solution", but I did find a temporary workaround. For anyone else working with this issue, using framedrawn in conjuntion with event.on_frameadvance(), then applying actions in a callback does appear to work as correctly.

@Felk
Copy link
Owner

Felk commented Nov 12, 2023

Hey @VIPTankz, funny thing, I had a comment draft here that basically was going to suggest doing exactly that. Right now it's only "safe" (as in, does the right thing without movement stutter) to set inputs from within the frameadvance event. But you decide what inputs you wanna have inside the framedrawn event. So my suggestion, which probably is what you did, looks like this:

from dolphin import event, controller

desired_inputs = {}

def framedrawn_handler(width, height, data):
    global desired_inputs
    desired_inputs = {"A": True}  # actually process image data and determine next input

def frameadvance_handler():
    global desired_inputs
    controller.set_wii_nunchuk_buttons(0, desired_inputs)
    pass

event.on_framedrawn(framedrawn_handler)
event.on_frameadvance(frameadvance_handler)

This works because the rough sequence of events inside the emulator looks like this:

                                +-----+          +-----+                +---------+                        
                                | CPU |          | GPU |                | python  |                        
                                +-----+          +-----+                +---------+                        
    -----------------------------\ |                |                        |                             
    | End Frame #1 (end scanout) |-|                |                        |                             
    |----------------------------| |                |                        |                             
                                   |                |                        |                             
                                   | swap           |                        |                             
                                   |--------------->|                        |                             
                                   |                |                        |                             
                                   | copy to XFB    |                        |                             
                                   |--------------->|                        |                             
                                   |                |                        |                             
                                   |                | frame drawn event      |                             
                                   |                |----------------------->|                             
                                   |                |                        |                             
                                   |                |                        | determine controller inputs 
                                   |                |                        |---------------------------- 
                                   |                |                        |                           | 
                                   |                |                        |<--------------------------- 
---------------------------------\ |                |                        |                             
| Begin Frame #2 (start scanout) |-|                |                        |                             
|--------------------------------| |                |                        |                             
                                   |                |                        |                             
                                   | frameadvance event                      |                             
                                   |---------------------------------------->|                             
                                   |                |                        |                             
                                   |        set determined controller inputs |                             
                                   |<----------------------------------------|                             
                  ---------------\ |                |                        |                             
                  | Emulation... |-|                |                        |                             
                  |--------------| |                |                        |                             
    -----------------------------\ |                |                        |                             
    | End Frame #2 (end scanout) |-|                |                        |                             
    |----------------------------| |                |                        |                             
                                   |                |                        |          

using the await style should also work reliably, and looks better for this usecase actually (EDIT this has some timing issues I'm trying to figure out. The above example using event.on_framedrawn(framedrawn_handler) is more reliable right now):

from dolphin import event, controller

while True:
    (width, height, data) = await event.framedrawn()
    desired_inputs = {"A": True}  # actually process image data and determine next input
    await event.frameadvance()
    controller.set_wii_nunchuk_buttons(0, desired_inputs)

I'm working out on how to make this obvious, or make it impossible to mis-use the API. However I haven't quite worked out the details on that yet but I might have to change the API surface in backwards-incompatible ways in the future (or offer a compatibility layer). In the meantime, this is probably the way to go using the framedrawn in conjunction with inputting right now.

@VIPTankz
Copy link
Author

VIPTankz commented Oct 9, 2024

Hi there again. I've been working with dolphin-x64-exe-scripting-preview3-all-events-on-cpu-thread.zip for over a year now (as the main branch does not have support for event.framedrawn()), and have experienced crashing consistently throughout this time. Can be fairly infrequent, but enough to be quite a pain.

Even running a very basic script which doesn't use framedrawn is giving me the issue:

from dolphin import event, gui, savestate, memory, controller

red = 0xffff0000
frame_counter = 0

while True:
    await event.frameadvance()

    frame_counter += 1
    # draw on screen
    gui.draw_text((10, 10), red, f"Frame: {frame_counter}")

I've had this issue when running at 100% speed, and when running on unlimited. The error does seem to happen more when running the program more quickly, however I've still had crashes in both cases. The frequency of the crashes varies greatly, some happening within a few hundred frames, and others occurring over 100k frames in.

My old PC was windows 10 and the new one is windows 11, and both machines have this issue.
Here's the Spec for both machines:
GPU - RTX4090, CPU - Intel i9-14900k, RAM 64gb 5800mhz

This issue has been driving me crazy for something like a year now, so any help is much appreciated. (Also please make sure any new versions have support for framedrawn, I can't use it without it!)

@Prograndma
Copy link

God bless you tanks and felk, you're saving my masters degree rn.

@mrontio
Copy link

mrontio commented Nov 27, 2024

Hello, I'm taking a crack at this but I have a few questions.

As I understand it, we want to avoid calling the EmitEvent from the FrameDumper thread because that doesn't guarantee thread safety.

I have studied the C++/Python interface and gather there's a static EventHub through which the C++ can communicate with Python (and vice versa?).

With my current understanding, this is how I see we can achieve this:

  • The FrameDumper object registers FlushFrameDump to the AfterFrameEvent, which means that after each frame is rendered there is a procedure for turning a GPU frame into a screenshot / bytes.
  • We change FlushFrameDump s.t. it copies a frame to a location the CPU knows of and can access.
  • The CPU thread picks up on the lock / signal, and then Emits the event (like below) with the data, guaranteeing thread safety and communicating with Python
    // API::GetEventHub().EmitEvent(API::Events::FrameDrawn{frame.width, frame.height, frame.data});

I'm more than happy to implement these changes (in fact I want to so these hours of exploring pay off), but @Felk, if I could ask a few questions:

  1. What's the optimal way of achieving communication via the (HookableEvent) AfterFrameEvent?
    • Is this feasible with the current implementation, or does a shared memory implementation need to be made?
  2. Is this emitting the {width, height, bytes} from the CpuThread enough to maintain functionality?
  3. Is there a better way of achieving this that I'm not seeing? I'm trying to implement it via HookableEvent thanks to your comment.

Thank you!

@mrontio
Copy link

mrontio commented Nov 29, 2024

The HookableEvent system stores hooks as a CallbackType, which to me says that there is currently no bi-directional communication with a hook handle.

using CallbackType = std::function<void(CallbackArgs...)>;

I'm going to start extending HookBase to include a mutually exclusive (or lockless) queue that can be passed as part of the handle.

The general idea then is to either:

  • Pass pointers to an existing frame, which will require a copy to that frame.
  • Pass the entire frame through the queue, which we can probably achieve with a shared circular buffer.

@Felk
Copy link
Owner

Felk commented Nov 30, 2024

Hello @mrontio, thanks for tackling this issue! I'll try to answer your question and give you all the insights I have also collected so far. Though it's been a while and because I never thought this through to fruition I may have forgotten some details.

First, let's discuss the statement "Events must happen inside the CPU thread"

Summarized, yes, this is to guarantee thread safety. I come from a background where scripting in an emulator just doesn't have any multi-threading problems because there's just one big emulation thread. It makes for a nice easy programming model, but it's just not the reality in Dolphin (or supposedly other modern emulators too). To avoid the issue for now, "Events must happen inside the CPU thread" is the restriction currently put in place.

Here's an easy scenario illustrating how things go wrong when threads are mixed:

  • The GPU thread produces a frame and calls into python code that registered on the frame drawn event.
  • The script chooses to, in response, call some methods to write into emulated memory.
  • From Dolphin's view, the GPU thread just made calls into the asynchronously running CPU/emulation thread. This easily violates Dolphins multi-threading invariants, causing all sorts of nasty crashes and hangs.

Trying to keep up the single-threaded model might lead to a more intuitive programming interface in the long run. In fact, not wanting to expose script authors to the horrors of multithreading sounds like a very good idea. The drawback to this approach is a probable performance hit.

But there's a different path. In particular, phire has quite precise and insightful ideas on his vision of scripting in Dolphin. I suggest reading this discussion on some previous attempt to get Scripting started upstream. There's also a design document, though that's quite a bit more than what my scope here is.

To summarize: his idea is to just allow all threads to trigger events, but to avoid multithreading issues through clever API design. In the above scenario, that would mean the event callback of the frame drawn event just wouldn't be allowed to call into the CPU thread.

Regarding your suggested implementation

Your current attempt does not go down the multithreading-in-python path, and instead aims to move the frame drawn event to the CPU thread somehow, If I understand correctly. Ignoring the technical details for now (whether it uses the hookable event system or not), that approach sounds perfectly reasonable to me.

Regarding whether emitting width, height, bytes is enough, I see no reason why not.

Now onto the hookable event system. Because the CPU thread and GPU thread work concurrently by design, and the hookable events are video-related events inside the CPU thread, none of the events 100% correspond to a "frame has finished drawing". Unfortunately, I didn't understand the details of your implementation proposal, but it sounds like you want the CPU thread to either receive an event/signal when the frame is done drawing, or wait on some lock until drawing is done?

  • If you want to signal the CPU, I'm not sure what's the best way to do that is. Just make sure to avoid using RunAsCPUThread or RunOnCPUThread, as those are illegal to use from anywhere but the CPU or Host thread.
    The only existing way I know is through CoreTiming#ScheduleEvent. I could have sworn the GPU thread already sends some signals to the CPU thread this way (vi interrupts), but I couldn't find the code just now.
  • If you want the CPU thread to wait on a lock, you need to be careful to not harm performance. E.g. if you wait right after the GPU thread just started drawing (e.g. right after the AfterFrameEvent event), you could accidentally kill all concurrency between the GPU and CPU thread.
    Also, there is already one place where such locking occurs: The AfterFrameEvent can get triggered before the frame is actually finished if GPU drawing lags behind CPU emulation. The GPU thread calls FinishFrameData(); in its AfterFrameEvent to ensure the frame is finished before initiating a swap.
  • If you didn't mean any of those two, I apologize for misunderstanding

Now my approach, which I never went through with, was similar to yours: grab the frame data, but from the CPU thread. My question then used to be: what event known to the CPU fulfills the following criteria:

  • The frame is fully drawn
  • Somehow re-uses the FrameDumper logic to turn framebuffer data into user-friendly image data (width, height, bytes).
  • Keep the performance characteristics of concurrent frame drawing to not slow things down much.
  • Ideally use the earliest possible event to reduce latency

So the rough idea is: Grab the frame data when it gets copied from EFB to XFB. Here's a forum post of MayImilae explaining some more details about EFB vs XFB

I used to think the best place was at the start of scanout (or at the end for some improved accuracy regarding VI register shenanigans during scanout? See the comments in VideoInterfaceManager::BeginField and VideoInterfaceManager::EndField for that), since that's a well defined event at which the full frame data exists in the XFB. However, with this approach the "field-ness" of frames must be kept in mind, to not introduce the same problem as #13.

Now we're back to the AfterFrameEvent. In theory, that's the best place, since that's literally after the frame data just got copied from EFB to XFB. But the XFB data is framebuffer data, not usable RGB data we want. That event currently is the signal for the frame dumper to asynchronously start turning the frame data into image data, e.g. for screenshot saving. This is where I basically stopped. In an easy world, the CPU just synchronously turns the XFB data into frame data right then and there. I don't know how expensive that would even be, or if it's too expensive how to utilize the asynchronous nature of the framedumper to get the frame data once it's processed. I wish you good luck experimenting!

Another somewhat unrelated thing to keep in mind is that, ideally, the framedumping would only happen when someone actually registered it as an event, as it costs performance. Right now that is done by returning true in IsFrameDumping when the respective event is registered.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants