Intrinsic RGB 4096 step conversion
Further speedup by using SSE intrinsics to convert between YC48 and RGB. Internally, RGB data is stored as floats in the range 0 to 4096. Because 4096 steps are used, the color lookup table no longer has to be regenerated when the color mode changes. As a side benefit, color mode changes no longer need to be tracked.
Benchmark code was simplified following @yumetodo's suggestion (no logging yet, however).
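A minimal sketch of what such an SSE conversion can look like, not the plugin's actual code. It assumes planar float inputs with Y in 0 to 4096 and Cb/Cr in -2048 to 2048, and uses the standard BT.601 coefficients; the function name `yc_to_rgb4` and the planar layout are hypothetical, chosen only for illustration.

```cpp
#include <xmmintrin.h>  // SSE intrinsics

// Hypothetical helper: converts four pixels at once from Y/Cb/Cr to RGB.
// Assumption: Y is 0..4096, Cb/Cr are -2048..2048, BT.601 coefficients.
// Output stays on the same 0..4096 float scale.
inline void yc_to_rgb4(const float* y, const float* cb, const float* cr,
                       float* r, float* g, float* b)
{
    const __m128 vy  = _mm_loadu_ps(y);
    const __m128 vcb = _mm_loadu_ps(cb);
    const __m128 vcr = _mm_loadu_ps(cr);

    // R = Y + 1.402 * Cr
    __m128 vr = _mm_add_ps(vy, _mm_mul_ps(vcr, _mm_set1_ps(1.402f)));
    // G = Y - (0.344136 * Cb + 0.714136 * Cr)
    __m128 vg = _mm_sub_ps(vy,
        _mm_add_ps(_mm_mul_ps(vcb, _mm_set1_ps(0.344136f)),
                   _mm_mul_ps(vcr, _mm_set1_ps(0.714136f))));
    // B = Y + 1.772 * Cb
    __m128 vb = _mm_add_ps(vy, _mm_mul_ps(vcb, _mm_set1_ps(1.772f)));

    // Clamp every channel to the 0..4096 range.
    const __m128 lo = _mm_setzero_ps();
    const __m128 hi = _mm_set1_ps(4096.0f);
    vr = _mm_min_ps(_mm_max_ps(vr, lo), hi);
    vg = _mm_min_ps(_mm_max_ps(vg, lo), hi);
    vb = _mm_min_ps(_mm_max_ps(vb, lo), hi);

    _mm_storeu_ps(r, vr);
    _mm_storeu_ps(g, vg);
    _mm_storeu_ps(b, vb);
}
```

Because the result stays on the same 0 to 4096 float scale whichever channels are selected, any table indexed by it can be built once, which matches the point above about not regenerating the lookup table.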
Benchmark mode will be turned off automatically when rendering to file.
Speed is now more closely proportional to the number of channels being processed. On my Core i7:
Y-mode@720P: 4ms
R-mode@720P: 5ms
RG-mode@720P: 10ms
RGB-mode@720P: 15ms
So even in RGB mode that is roughly 60 fps with a single plugin running, and roughly 30 fps with both in effect.