-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Correction of FrameUtility V3 #48
base: master
Are you sure you want to change the base?
Conversation
Warning, FrameUtility.add method haven't been tested since I don't understand what is the expected behaviour
Without sys, the test where failing
I haven't correct the test for FrameUtility.add since I don't understand what is the expected behaviour
I haven't been able to test it With act, the the workflow was blocked
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed the general logic of the code, suggested some structural changes and exposed some doubts that I couldn't figure out myself.
Once this is solved I can take a look at the details of the implementation, at the documentation and at the tests.
|
||
|
||
class Timestamps: | ||
"""Timestamps object contains informations about the timestamps of an video. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This documentation will need to have a TLDR on top, as a normal user will not be interested in all the technical details. We can put all the technical discussion in a note, and put the TLDR inside the description of the parameter "rounding_method".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moreover, users reading documentation need to know that they need to call one of the factory methods (from_fps, from_timestamps_file, from_video_file).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this sound good to you?
"""Timestamps object contains informations about the timestamps of an video.
Both Constant Frame Rate (CFR) and Variable Frame Rate (VFR) videos are supported.
To create an Timestamps object, you need to call one of the following method ``from_fps``, ``from_timestamps_file``, ``from_video_file``.
Ex:
timestamps = Timestamps.from_fps(Fraction(24000, 1001))
Parameters:
rounding_method (RoundingMethod): A rounding method. 99% of the time, you want to use RoundingMethod.ROUND.
Note:
Video player have 2 methods to deal with timestamps. Some floor them and other round them.
This can lead to difference when displaying the subtitle.
Ex:
Player - Method - proof
mpv - round - https://github.com/mpv-player/mpv/blob/7480efa62c0a2a1779b4fdaa804a6512aa488400/sub/sd_ass.c#L499
FFmpeg - floor - https://github.com/FFmpeg/FFmpeg/blob/fd1712b6fb8b7acc04ccaa7c02b9a5c9f233cfb3/libavfilter/vf_subtitles.c#L194-L196
VLC - floor - https://code.videolan.org/videolan/vlc/-/blob/df6394ea8003e035a281b6818e6432c7d492ed2f/modules/codec/libass.c#L453-454
https://code.videolan.org/videolan/vlc/-/blob/df6394ea8003e035a281b6818e6432c7d492ed2f/include/vlc_tick.h#L120-132
MPC-HC - floor - https://github.com/clsid2/mpc-hc/blob/0994fd605a9fb4d15806d0efdd6399ba1bc5f984/src/Subtitles/LibassContext.cpp#L843
Important note:
Matroska (.mkv) file are an exception !!!
If you want to be compatible with mkv, use RoundingMethod.ROUND.
By default, they only have a precision to milliseconds instead of nanoseconds like most format.
For more detail see:
1- https://mkvtoolnix.download/doc/mkvmerge.html#mkvmerge.description.timestamp_scale
2- https://matroska.org/technical/notes.html#timestampscale-rounding
timestamps (List[int], optional): A list of [timestamps](https://en.wikipedia.org/wiki/Timestamp) in milliseconds encoded as integers.
It represent each frame [presentation timestamp (PTS)](https://en.wikipedia.org/wiki/Presentation_timestamp)
normalize (bool, optional): If True, it will shift the timestamps to make them start from 0. If false, the option does nothing.
fpms (Fraction, optional): The fpms.
last_frame_time (Fraction, optional): The last frame time not rounded.
"""
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better, but it's still not correct, because you created a more general method (and not only one that "contains information about the timestamps of a video").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you update the doc for this?
I don't understand what you want
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I will do it later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The math is correct, but we are missing the most important explanation: why, to obtain a frame, you need to add 0.5 to the ms BEFORE multiplying with fps * 1/1000. Why isn't it done just by:
round(ms * fps * 1/1000)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which line are you talking about?
The general formula is: ms=method_of_rounding(frame * fps *1/1000)
where method_of_rounding
is floor
or round
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mhhhh... at a second glance, the formula isn't even right: you are talking about "ms_to_frames", but the formula is actually for "frames_to_ms"
That aside, my understanding is that the result of this formula yields a floating point number, but a frame
is an integer. We must then decide which integer we should return.
In python3, there is the round
method for that, but reading the code from the other (mpv/mpc-hc/libass...) it seems like everyone has opted for what you implemented as well for PyonFX. I can't understand why there was a need to implement a custom "round" and "floor" method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mhhhh... at a second glance, the formula isn't even right: you are talking about "ms_to_frames", but the formula is actually for "frames_to_ms"
Warning, I start with ms=method_of_rounding(frame * fps *1/1000)
which is effectively frames_to_ms
, but from this equation, I isolate frame
, so it become ms_to_frames
!
In python3, there is the
round
method for that, but reading the code from the other (mpv/mpc-hc/libass...) it seems like everyone has opted for what you implemented as well for PyonFX. I can't understand why there was a need to implement a custom "round" and "floor" method.
Python round isn't the same has the mathematical one. Don't ask me why, I don't know.
Ex:
Python: round(0.5) = 0
Mathematic: round(0.5) = 1
This is why I implemented a custom round.
Now for the floor, simply to not import the one form math.floor. This is purely esthetic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mhhhh... Ok.
Then, what I would do is to expose our doubts in the python documentation.
We need to say that we just mimicked what other people did, and that we don't have the full knowledge about this matter. I'll do it later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then, what I would do is to expose our doubts in the python documentation.
Which doubts are you talking about? The round method doesn't follow the mathematical rounding convention. See this article for more information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just wondering why the authors decided to go for a custom rounding method.
It doesn't seem to me that:
# We use the upper bound
upper_bound = (ms + 0.5) * fps * 1/1000
# Then, we trunc the result
trunc_frame = int(upper_bound)
# If the upper_bound equals to the trunc_frame, this means that we don't respect the inequation because it is "greater than", not "greater than or equals".
# So if it happens, this means we need to return the previous frame
if upper_bound == trunc_frame:
return trunc_frame - 1
else:
return trunc_frame
does implement the mathematical rounding. To me it seems like just a... strange rounding.
The mathematical rounding for this case would just be:
int(ms * fps * 1/1000 + 0.5)
If you don't have an explanation as of why they implemented this rounding it's ok, but it seems important to me for it to be documented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just wondering why the authors decided to go for a custom rounding method.
Because frames_to_ms
needs to be exactly the opposite of ms_to_frames
. I am the authors of these equation (I mean, I didn't found them online. It is purely math deduction).
To calculate frames_to_ms
, it is well known that the equation is:
Now, from the equation, our goal is to create a formula to be able to calculate an ms_to_frames
function. To do so, we need to isolate
Since there are two possible roundingMethod options (floor, round), there are two different inequations that can be deduced from the original equation, which are written in the document.
Additionally, from the inequation, we can also deduce an algorithm
The mathematical rounding for this case would just be:
int(ms * fps * 1/1000 + 0.5)
No, because
Here is a proof by contradiction
With your method $frame = round(ms \times fps \times {1 \over 1000})$
With what is wrotten in the proof doc
Little reminder to show which frame correspond to which ms
Conclusion
Like you can see, your method return
The user cannot change the value of rounding_method, because if the Timestamps object have been created from from_timestamps_file or from_video_file, the timestamps list have been rounded depending on the current rounding_method
class RoundingMethod(Enum): | ||
FLOOR: Callable[[Fraction], int] = lambda ms: int(ms) | ||
ROUND: Callable[[Fraction], int] = lambda ms: int(ms + Fraction("0.5")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no need to define them as callable, as we never call them. Simply keeping them as enum should be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…ew parameter ordering
""" | ||
|
||
if os.path.isfile(path_to_timestamps_file_or_content): | ||
with open(path_to_timestamps_file_or_content, "r") as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may need to specify the encoding to "utf-8" because timestamps can contains comments with non-ascii character.
mkvtoolnix doesn't seems to care about the encoding. It will just convert the bytes to char, so equivalent to ascii.
But if we set the encoding to "ascii", it will just raise an exception when the file contains a non-ascii char, so it would be safer to use utf-8
What do you think about it?
ffprobe can be slow for big mkv file (1 gb or more) mkvextract is way faster, so if it is available, we use it
6940066
to
546597c
Compare
Had another quick glance at the code: do you think we could avoid uploading a new font (thus making use of the font we already fetch in the workflow)? Also, how much does it take to generate the fake videos? Could we generate them on the fly while testing to keep the codebase minimal? Finally, folder name "timestamps" doesn't seem to me to represent well what's inside. If you manage to remove the file I told you, we could drop it and move the python generator file outside of it, naming it as something like "utils.py". |
If i do that, how could we generate the video locally (sewhen we don't run the workflow but simply running the file
Yes we could. It take less then 10 seconds. Important to note, we need to have installed mkvmerge AND ffmpeg to be able to run that script.
If you are talking about the folder |
|
||
if ffprobe_output_dict["format"]["format_name"] == "matroska,webm": | ||
# We only do this check for .mkv file. See the note about mkv in the class documentation | ||
time_base = Fraction(ffprobe_output_dict["streams"][0]["time_base"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, the _from_mkvextract
method doesn't perform a check for the time_base like ffmpeg because when I wrote this code, it was a information that mkvtoolnix didn't expose.
But, I asked if it could be added here.
It was introduced in version 82.0, which was released on 2 January 2024. We could perform a version check and if the user has a V82 or higher, then we perform the check.
Also, I propose to add a new file named mkvtoolnix.py
and extract the timestamps in that file since timestamps.py
is becoming too big.
video_path, | ||
"-J", | ||
] | ||
mkvmerge_output = subprocess.run(cmd, capture_output=True, text=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add --command-line-charset UTF-8 --output-charset UTF-8
to the command. It is necessary when the user asks for a filepath that contains special characters that the locale encoding (which be queried with locale.getencoding()
) doesn't support.
mkvmerge_output = subprocess.run(cmd, capture_output=True, text=True) | |
cmd = [ | |
mkvmerge_path, | |
video_path, | |
"-J", | |
"--command-line-charset", "UTF-8", | |
"--output-charset", "UTF-8", | |
] | |
mkvmerge_output = subprocess.run(cmd, capture_output=True, text=True, encoding="utf-8") |
"0:" + timestamps_file_path | ||
] | ||
|
||
mkvextract_output = subprocess.run(cmd, capture_output=True, text=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same problem has here
"-print_format", | ||
"json", | ||
] | ||
ffprobe_output = subprocess.run(cmd, capture_output=True, text=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ffprobe seems to always output the result in utf-8 has said here.
ffprobe_output = subprocess.run(cmd, capture_output=True, text=True) | |
ffprobe_output = subprocess.run(cmd, capture_output=True, text=True, encoding="utf-8") |
Finally, this should be the last version of FrameUtility.
For reference, here are the previous version: #37, #46
Why it is needed?
The previous FrameUtility would not work with VFR video, now yes.
Finally, the previous FrameUtility was more a hack than anything else.
What has been done
from_fps
,from_video_file
,from_timestamps_file
)Convert.time
)Convert.ms_to_frames
andConvert.frames_to_ms
FrameUtility
to use theTimestamps
class.What still need to be been done
FrameUtility.add
to useTimestamps
. I don't understand what is the expected behaviour of this method.