Example outputs at double speed #88
I can't reproduce this on my machine, but I think it has something to do with the number of frames that the VHS instance is able to capture. Can you try setting the framerate to 24 (the default is 60)?
|
To explain this bug further: VHS tries to capture a frame every 1/framerate seconds (by default, every 16.7 milliseconds). However, if the capturing process takes longer, say 30 milliseconds per frame, then VHS won't have enough frames to render the GIF at 60 FPS, but it will still try to. It assumes we are using 60 FPS when we really only have roughly 30 FPS, which results in a sped-up GIF. One solution is to time how long the frame capture takes (on average) and then use that as the actual frame rate, so that the GIF isn't sped up. |
That looks better!
Results in Any ideas why it might be slower to capture a frame? Here are some system/environment stats
Would I do this by counting the number of frames and dividing by the gif length or something? |
I have no idea why frames would be capturing slowly on that machine, it seems to be very fast. Do you have a lot of applications open like chrome, slack, VS Code etc... while recording the GIF?
We have some logic to see how long a frame capture took so that we can sleep for the rest of the time: Line 199 in 1508f49
So we would probably measure each frame and average the number of frames we captured each second. |
Interesting, so in docker you're able to get 60 frames per second? That's definitely really strange. |
Really appreciate all the info you've given, it's super helpful! I'll try and see if something is going wrong in VHS. |
Ok thanks! Extra bits of info is that I installed vhs via |
This might be a super long shot, but what happens if you set Chrome as your default browser? VHS uses a Chromium browser, and since Brave is Chromium-based, it might be using that. I have no idea though, this is a long shot. |
No luck - installed latest chrome, set as default, restarted computer, same output from tape |
Gotcha, it was a long shot. Really appreciate you trying it out and ruling that possibility out! |
Here's another long shot: are your devices battery powered when running vhs? In other words: might another energy profile limit rendering performance of the remote controlled browser instance? |
I don't know if #110 can solve it or not. |
They weren't plugged in, but unfortunately same result plugged in!
Thanks for the help @ysmood! I built off e369116 and tried various framerates, but unfortunately the playback speeds still differ. From the outputs below, speed peaks around 50fps, the default setting. As I understand your comment here, the bug is that frames can't be captured quickly enough, so there are too few total frames, and rendering the GIF at the assumed framerate then results in a sped-up GIF. If that is right, it's strange that it slows down again above 50fps, right? Also, all the output GIFs are about the same size (30-32kB), which might be expected, but is also interesting. All of them were generated like this (70fps as example)
|
I have the same problem on Arch Linux (with sway). Also on a relatively fast system (Intel i7-1065G7). |
Revisiting this - after upgrading to MacOS Ventura, things look good!
Same versions as before
|
Ah, nevermind, when I increase the width and height of the terminal, it gets much worse. Even changing the framerate doesn't help then. |
I'm seeing this same thing in my environment, which, like robinovitch61 said, seems to be exacerbated by the output size. This is set to a 14s sleep, but you can see Maven reports less than 8 seconds for compile time.
demo.tape
Output demo.gif
Set Theme "Catppuccin Mocha"
Set FontSize 32
Set Width 2400
Set Height 900
Set Framerate 24
Type "mvn clean install"
Sleep 250ms
Enter
Sleep 14s
I'm running on a Fedora 37 VM with kitty, though I'm ssh'ing into the terminal from a Windows box, if that matters.
ffmpeg configuration
|
After a weekend of working on this problem, I have a solution to this issue. I firstly need to explain a few things and why this has been such a difficult problem to diagnose.
Why
The default frame rate configured in VHS is
It gets even worse, these requests are blocking calls. We can't send key press requests while we are waiting for the images. They are queued up. So what happens if we can't do these 4 actions in
When it comes to assembling these frames in a video, it is simply joining them all together and saying each frame is
Improvements
There are lots of little improvements we can do to improve this situation. This is what I've done and experimented with:
Here is an output of the
This is a recording at
I also know I have fewer frames than expected (likely around 50 but it doesn't add up exactly). As you can see, having metrics helps considerably. Running this in a container shows how poorly Chrome performs when using software rendering instead of hardware accelerated (
Overall, these improvements helped but still did not address the root cause of the problem.
Solution
The proper solution to this issue is to stop assuming the frames are received at their expected interval and start tracking exactly when a frame is captured. Here is a sample of what this looks like:
The
Be aware, we can't magically guarantee that every frame will be captured at the frame rate you request. That comes down to the performance of the hardware used for recording. However, we can tell the user that they are dropping frames. This knowledge will allow them to tune their tape settings to match their hardware capabilities. Assembling the frames we do have using the offset will mask this problem and prevent the recording from being sped up. This gives us a consistent-speed video and allows us to solve issues like these.
@maaslalani It will take me a few days to begin putting up the pull requests that address the improvements I mentioned. I still also need to figure out how to unit test this. |
@mikelorant thanks for this excellent analysis! Those improvements in performance and visibility sound great |
Addressing a few different comments in this thread:
This would never work because not all parts of the recording capture at the same frame rate. Depending on complexity of the image, the rendering time is significantly different. A mostly blank screen with a few lines of text encodes far quicker than a very busy screen filled with many colours and characters.
This is because there is a switch from Chrome using a hardware renderer to a software renderer. We are switching from GPU to CPU. In general, software rendering is terrible in containers and we should highly discourage it. However, in some cases it can improve performance if the hardware GPU is busy or has weak performance.
This one took me a while to really understand. Grabbing images is done by using the go-rod method I've been experimenting with some ideas to solve this problem, but ttyd is making this difficult because while I can open multiple tabs in our headless Chrome, I can't open multiple pages that map back to the same ttyd session. Having multiple tabs to the same ttyd session would allow me to send key presses to one tab and record images from the other. I'm probably going to experiment with the idea of using a shared tmux session between multiple tabs and see if it works well.
Making the terminal larger just increases the size of the image and increases the latency for image generation. Lowering the frame rate will help, but in most cases you need it to drop to single digits, which really doesn't look great. What matters most is finding out how long it takes to get and write a complex image from the terminal. How many milliseconds that takes is what will determine what a reasonable frame rate can be set. |
@ysmood Having some challenges making some improvements, especially around trying to reduce the cost of capturing the frames. The current way of taking a screenshot, which has to be done for both the text and cursor, is:
text, textErr := vhs.TextCanvas.CanvasToImage("image/png", quality)
As I understand it, this internally translates in
Instead of capturing individually the two canvas elements of xterm.js, it seems better to do the following:
req := proto.PageCaptureScreenshot{
	Format:           proto.PageCaptureScreenshotFormatPng,
	OptimizeForSpeed: true,
}
text, textErr := vhs.Page.Screenshot(true, &req)
This would allow us to get both the text and cursor combined. However this one method takes 3x longer than doing a single
I'll quickly mention, I know
Do you know why the performance is significantly less? Are there any other options to capturing the screen in one go? Currently we define the two canvases as:
vhs.TextCanvas, _ = vhs.Page.Element("canvas.xterm-text-layer")
vhs.CursorCanvas, _ = vhs.Page.Element("canvas.xterm-cursor-layer")
If we could capture them both via the following this would be a big win but it still doesn't include the cursor 😢
vhs.Canvas, _ = vhs.Page.Element("canvas")
Are there points here that may be better discussed on the |
@mikelorant streaming the canvases was actually a performance improvement, we originally performed screenshots and that was much slower: |
For what it's worth, I have some ideas about how to get much better performance. It involves some of the work done for Instead of using |
Agreed, I do have a working branch which can handle dropped frames and stitches things together properly. But the blocking issues with receiving frames and sending events is an architecture issue. |
I do have a bit of capacity now to look into how you did this with |
Yeah I played around a bit with running VHS scripts using to turn the captured pane into an SVG. Happy to experiment with alternative solutions. I think using a PTY (https://github.com/creack/pty/tree/master) is the way to go rather than tmux for more programmatic control |
Using |
Yup, other than |
This one is cross platform (works on Windows) and written by our very own @aymanbagabas: https://github.com/aymanbagabas/go-pty |
@maaslalani I think I have found what we need to make this work. We need two components:
For the Pseudo TTY we have the two recommendations you mentioned earlier, this part isn't a problem. The virtual terminal has been tricky to find because all the best implementations seem to be Rust based. We effectively want a headless terminal. Thankfully, I think I found something that fits our needs and is written in Go.
Would be interested if you think this would be a viable solution. I plan to do some experiments to see if this can do the job. The author (@vito) is someone I rate very highly as he was one of the main developers for Concourse CI and is now part of the Dagger team. |
Hey @mikelorant, yes I believe you are correct, we essentially need a headless terminal. I would be happy with that solution but I think there's a way to do it by rendering SVG (using freeze code). You would execute the commands in a PTY / Headless Terminal and then every frame capture the ANSI (essentially screenshot the terminal state), now that we have all the frames, we can render each to an SVG then combine those to a GIF. Does that align with your thinking? I don't mind if we do it with midterm, so long as everything works correctly. I do think your approach makes sense as well. |
This is still making |
Would love to see a resolution here. Glad that there are workarounds. Still having the same issues on the latest OS versions and VHS. |
Thanks for the kind words 🥲 For what it's worth I'd be happy to help if using Midterm seems like a viable path to a fix. This issue affects me too, and Midterm is pretty fun to work on, so if y'all run into issues I'll try my best to address them. |
When I run the demo tape:
I get the following, which seems to be at 2x speed compared to the example in the README: