Hello,
I have some questions about inverting the StyleGAN-V generator on the FaceForensics dataset.
For a single image (one frame), projection works well.
However, when I try to project a video (multiple frames at once), the projected video contains almost identical frames across all time steps.
Is this expected behavior?
To project a video (16 frames in my case), I changed some code in src/scripts/project.py as follows:
1. Adjust the time steps (0 to 16 frames). In line 59, I changed
ts = torch.zeros(num_videos, 1, device=device)
to
ts = torch.arange(num_videos, 16, device=device)
(see the first sketch after this list for the shape I am aiming for).
2. Make the motion code trainable (comment out line 110 and uncomment line 109).
3. Extract target_features of the real video per frame, and change the distance so that it is measured between whole videos rather than between individual frames. For example, in line 140,
dist = (target_features - synth_features).square().sum()
where the batch dimension of target_features and synth_features now holds the 16 frames of a single video, not different images as in the original code (see the second sketch after this list).
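For clarity, what I am trying to achieve with the ts change in step 1 is one timestep per frame for each video. A minimal sketch of the intended tensor, assuming the generator expects a [num_videos, num_frames] tensor of frame indices (this shape is my reading of the code, not something stated in the repo):

import torch

device = "cpu"
num_videos, num_frames = 1, 16

# Original project.py uses a single zero timestep per video:
# ts = torch.zeros(num_videos, 1, device=device)
# What I intend instead: frame indices 0..15 for every video.
ts = torch.arange(num_frames, device=device).unsqueeze(0).repeat(num_videos, 1)
print(ts.shape)  # torch.Size([1, 16])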
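And to make step 3 concrete, this is roughly how I compute the loss now. It is a simplified sketch, not the exact project.py code; extract_features is only a placeholder for the perceptual feature extractor used in project.py:

import torch

def video_feature_distance(target_frames, synth_frames, extract_features):
    # target_frames / synth_frames: [16, C, H, W] -- all frames of ONE video
    # stacked along the batch dimension (the original code puts independent
    # images there instead).
    target_features = extract_features(target_frames)  # [16, D] per-frame features
    synth_features = extract_features(synth_frames)    # [16, D]
    # Sum over both the frame and feature dimensions, so the distance is a
    # single scalar per video rather than one value per frame.
    dist = (target_features - synth_features).square().sum()
    return dist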
Thanks,