forked from PyAV-Org/PyAV
-
Notifications
You must be signed in to change notification settings - Fork 0
/
TODO.txt
150 lines (102 loc) · 5.26 KB
/
TODO.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
The Big Picture
===============
- Lots of documentation
- Uniform time representation
- High-level APIs
NEXT
====
- see do_streamcopy in ffmpeg/ffmpeg.c for hints about how to proceed for remuxing
- expose various protocols as a Protocol descriptor?
- be able to add our own protocols
- "pycall" protocol for a general function callback?
- "pyiter" protocol for an iterator interface?
- "python" protocol for calling read/write/seek on an arbitrary object
- see avformat/file.c for a really simple protocol, and avformat/allformats.c
for how to register them
- AVIOContext is in avformat/avio.h
- avio_open2 is in avformat/aviobuf.c
- either (1) manually open a "__pycall__:" file, and then set a pointer to the
object to call, or (2) fully construct everything ourselves (via ffio_fdopen)
or (3) write our own version of ffio_fdopen from avformat/aviobef.c
Encoding
========
- Stream.encode should return a list of packets. Otherwise, resampling
can easily result in a massive buffer in a fifo or codec
- Make everything mutable. Later, we can introduce a PseudoMutable base class
for everything that has a .assert_mutable() method and public is_read_only
Time
====
- What are the various rates and scales?
see: http://ffmpeg.org/pipermail/libav-user/2011-December/001041.html
see: http://stackoverflow.com/questions/12234949/ffmpeg-time-unit-explanation-and-av-seek-frame-method/16739755#16739755
- Stream.guessed_rate or AVStream.r_frame_rate:
FFmpeg's guess as to the frame rate. Often 24/1.
- Stream.average_rate or AVStream.avg_frame_rate:
Average frame rate that has been inspected in the file. This is what VLC
reports in the media information. Often near 24/1.
It may calculate this from 20 frames; see: http://ffmpeg.org/doxygen/trunk/libavformat_2utils_8c_source.html#l02768
- Stream.time_base or AVStream.time_base:
Duration of one time unit for AVStream times.
- Stream.rate or CodecContext.time_base:
The length of a single frame as reported by the file. Sometimes 1/24,
sometimes really really wrong.
possibilities for discrepencies:
- .mov supports a fixed frame rate
- .mp4 does not support a fixed frame rate?
- see: http://ffmpeg.org/doxygen/trunk/libavformat_2utils_8c_source.html#l02647
High-Level API (Conveniences)
=============================
- Container.demux(video=True)
- Container.demux(video=2, audio=2) first 2 video and audio streams OR
Container.demux(video=0, audio=0) first video and audio
- Container.decode(*args_for_demux, **kwargs_for_demux) -> straight to frames
- Stream.mux(frame) -> encode, flush, and mux
- Container.__del__ -> flush the streams and close
- Container.add_stream(template=stream)
Unsorted
========
- atexit.register something to clean up FFmpeg threads
- Investigate adding Python-defined formats. Then we could pull frames from
webcams via OS X APIs.
- Directly use the Numpy C-API
- see: http://docs.scipy.org/doc/numpy/reference/c-api.array.html
- see: https://github.com/cython/cython/wiki/tutorials-numpy#c-api-initalization
- Move the logic regard initializating a VideoReformatter into the object
itself instead of the VideoFrame.reformat method.
- Protocol for streams to pull packets from iterators. Then secondary streams
get easier: container.add_stream('audio_codec', source=av.open('music.aiff').iter_packets(only_primary=True))
- Create Codec Descriptor
- Make av.utils.SmartPointer to wrap around a pointer?
runtime_library_dirs
- Look at how stream alloc works; can we manually do this and register the
stream on a format context later?
- Context use a StringIO or similar to write into/from?
- Create wrapper around AVIOContext for arbitrary Python object with read/write
methods.
- libavdevice? Pull from video input?
- Various AudioLayout/AudioFormat/VideoFormat attributes should be writable.
- is_mutable flags on various objects (including formats, layouts, contexts,
streams, etc.) could guard the __set__ methods of properties.
- Should methods be: to_bytes, as_bytes, tobytes, asbytes??
- Plane.array_format could be a format string for array.array
- Plane.update_from_array(array.array)
- If it were HD RGB then it would be 1920 * 1080 * 3 long.
- Plane.update_from_ndarray(numpy.ndarray)
- If it were HD RGB then it would have shape (1080, 1920, 3)
- Plane.update(input_)
And then it tries if it is a buffer, memoryview, bytes, etc..
- Stream.encode(...) -> list of packets, automatically checking buffer)
- Context.mux(...) -> take a single packet, or an iterator of them:
context.mux(stream.encode(frame))
- Context.add_stream(codec_name, frame_rate) -> Context.streams.append(Stream(codec_name, frame_rate))?
- TestCase.rms_diff(one, two) -> Root-mean-square diff
- try to wrap API of testsrc filters
- FFmpeg tutorial: http://dranger.com/ffmpeg/
- also has function reference: http://dranger.com/ffmpeg/functions.html
- updated tutorial code: https://github.com/chelyaev/ffmpeg-tutorial
- Even out more of the differences:
- See README of https://github.com/chelyaev/ffmpeg-tutorial
- Replicate av_frame_get_best_effort_timestamp
http://ffmpeg.org/pipermail/ffmpeg-devel/2011-February/104327.html
http://pastebin.com/Aq8eDZw3/
http://web.archiveorange.com/archive/v/yR2T4bybpYnYCUXmzAI5