option for max_sequence_length of video generation #699

Merged: 1 commit, Oct 1, 2024

wr0124 (Collaborator) commented Sep 23, 2024

Adds an option, --vid_max_sequence_length, that sets the maximum sequence length for video generation. The value of --data_temporal_number_frames must not exceed --vid_max_sequence_length.
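
The constraint above could be enforced with a small startup check. This is a minimal sketch only: the function name `check_temporal_frames` is hypothetical and is not taken from this PR, and the actual option handling in train.py may differ.

```python
def check_temporal_frames(number_frames: int, max_sequence_length: int) -> None:
    """Raise ValueError if the clip length exceeds the maximum sequence length.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    if number_frames < 1:
        raise ValueError("--data_temporal_number_frames must be at least 1")
    if number_frames > max_sequence_length:
        raise ValueError(
            f"--data_temporal_number_frames ({number_frames}) must not exceed "
            f"--vid_max_sequence_length ({max_sequence_length})"
        )


# Values from the example command in this PR: 8 frames, maximum length 24.
check_temporal_frames(8, 24)  # passes silently
```

Failing early with an explicit message is usually easier to diagnose than an indexing error raised deep inside the model.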

Example invocation, which runs successfully:

python3 -W ignore::UserWarning train.py \
--dataroot /path/to/online_mario2sonic_full_mario \
--checkpoints_dir /path/to/checkpoints \
--name mario_vid \
--gpu_ids 0 \
--model_type palette \
--output_print_freq 1 \
--output_display_freq 1 \
--data_dataset_mode self_supervised_temporal_labeled_mask_online \
--train_batch_size 1 \
--train_iter_size 1 \
--model_input_nc 3 \
--model_output_nc 3 \
--data_relative_paths \
--train_G_ema \
--train_optim adamw \
--G_netG unet_vid \
--data_online_creation_crop_size_A 32 \
--data_online_creation_crop_size_B 32 \
--data_crop_size 32 \
--data_load_size 32 \
--data_online_creation_rand_mask_A \
--train_G_lr 0.0001 \
--dataaug_no_rotate \
--G_diff_n_timestep_train 6 \
--G_diff_n_timestep_test 3 \
--data_temporal_number_frames 8 \
--data_temporal_frame_step 1 \
--data_online_creation_mask_delta_A_ratio 0.12 0.12 \
--alg_diffusion_cond_image_creation computed_sketch \
--alg_diffusion_cond_computed_sketch_list canny \
--alg_diffusion_vid_canny_dropout 0.1 0.8 \
--alg_diffusion_cond_sketch_canny_range 500 1000 \
--vid_max_sequence_length 24

@wr0124 wr0124 changed the title from "feat(ml): option for max_sequence_lenght of video generation" to "option for max_sequence_lenght of video generation" Sep 23, 2024
@wr0124 wr0124 requested review from royale and beniz September 23, 2024 19:31
@beniz beniz changed the title from "option for max_sequence_lenght of video generation" to "option for max_sequence_length of video generation" Sep 25, 2024
docs/options.md Outdated
@@ -70,7 +70,7 @@ Here are all the available options to call with `train.py`
| --G_unet_mha_num_heads | int | 1 | number of heads in the mha architecture |
| --G_unet_mha_res_blocks | array | [2, 2, 2, 2] | distribution of resnet blocks across the UNet stages, should have same size as --G_unet_mha_channel_mults |
| --G_unet_mha_vit_efficient | flag | | if true, use efficient attention in UNet and UViT |
-| --G_unet_vid_max_frame | int | 24 | max frame number for unet_vid in the PositionalEncoding |
+| --vid_max_sequence_length | int | 25 | max frame number for unet_vid in the PositionalEncoding |
Contributor

Rename this to --G_unet_vid_max_sequence_length, since it is a property of the unet_vid architecture, not of the data or the video.
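
As an illustration of the suggested name, the option could be declared as below. This is a sketch using the standard-library argparse, which is an assumption: the project's real option definitions are not shown in this conversation. The default of 25 and the help text are copied from the docs/options.md line under review.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; the project may define its options differently."""
    parser = argparse.ArgumentParser()
    # The G_unet_vid_ prefix ties the option to the unet_vid generator
    # architecture rather than to the dataset or the video.
    parser.add_argument(
        "--G_unet_vid_max_sequence_length",
        type=int,
        default=25,  # assumed default, taken from the docs line under review
        help="max frame number for unet_vid in the PositionalEncoding",
    )
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
opts = build_parser().parse_args(["--G_unet_vid_max_sequence_length", "15"])
print(opts.G_unet_vid_max_sequence_length)
```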

@@ -130,7 +130,7 @@ Generator
+------------------------------------------------+-----------------+---------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| --G_unet_mha_vit_efficient | flag | | if true, use efficient attention in UNet and UViT |
+------------------------------------------------+-----------------+---------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-| --G_unet_vid_max_frame | int | 24 | max frame number for unet_vid in the PositionalEncoding |
+| --vid_max_sequence_length | int | 25 | max frame number for unet_vid in the PositionalEncoding |
Contributor

Don't modify this; it is generated automatically.

@@ -379,7 +379,7 @@ def __init__(
attention_block_types=("Temporal_Self", "Temporal_Self"),
cross_frame_attention_mode=None,
temporal_position_encoding=False,
-temporal_position_encoding_max_len=25,
+temporal_position_encoding_max_len=None,
Contributor

Keep the default value of 25: None cannot work here and should not be used at init.
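
To illustrate why an integer default matters, here is a sketch of a sinusoidal positional-encoding table in plain Python. It assumes the temporal positional encoding precomputes one row per frame position up to max_len, which is a common design; the actual unet_vid implementation is not shown in this conversation, and the names `sinusoidal_table` and `add_positions` are hypothetical.

```python
import math


def sinusoidal_table(max_len: int, d_model: int) -> list[list[float]]:
    """Precompute max_len rows of sinusoidal position encodings (sketch)."""
    # The table size is fixed at construction, so None cannot be used here.
    if not isinstance(max_len, int) or max_len < 1:
        raise ValueError("max_len must be a positive integer")
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    table = []
    for pos in range(max_len):
        row = []
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            row.extend([math.sin(angle), math.cos(angle)])
        table.append(row)
    return table


def add_positions(frames: list[list[float]], table: list[list[float]]) -> list[list[float]]:
    """Add the encoding of position t to the feature vector of frame t."""
    if len(frames) > len(table):
        raise ValueError(
            f"sequence of {len(frames)} frames exceeds max_len={len(table)}"
        )
    return [[x + p for x, p in zip(frame, table[t])] for t, frame in enumerate(frames)]


table = sinusoidal_table(max_len=25, d_model=4)
encoded = add_positions([[0.0] * 4 for _ in range(8)], table)  # 8 frames fit in 25
```

Under this assumption, the number of frames per sequence is bounded by max_len, which matches the constraint stated in the PR description.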

@@ -439,7 +438,7 @@ def __init__(
upcast_attention=False,
cross_frame_attention_mode=None,
temporal_position_encoding=False,
-temporal_position_encoding_max_len=25,
+temporal_position_encoding_max_len=None,
Contributor

Set a default value instead.

wr0124 (Collaborator, Author) commented Sep 25, 2024

The code works with:
python3 -W ignore::UserWarning train.py \
--dataroot /data1/juliew/dataset/online_mario2sonic_full_mario \
--checkpoints_dir /data1/juliew/checkpoints \
--name mario_vid \
--gpu_ids 0 \
--model_type palette \
--output_print_freq 1 \
--output_display_freq 1 \
--data_dataset_mode self_supervised_temporal_labeled_mask_online \
--train_batch_size 1 \
--train_iter_size 1 \
--model_input_nc 3 \
--model_output_nc 3 \
--data_relative_paths \
--train_G_ema \
--train_optim adamw \
--G_netG unet_vid \
--data_online_creation_crop_size_A 32 \
--data_online_creation_crop_size_B 32 \
--data_crop_size 32 \
--data_load_size 32 \
--data_online_creation_rand_mask_A \
--train_G_lr 0.0001 \
--dataaug_no_rotate \
--G_diff_n_timestep_train 6 \
--G_diff_n_timestep_test 3 \
--data_temporal_number_frames 8 \
--data_temporal_frame_step 1 \
--data_online_creation_mask_delta_A_ratio 0.12 0.12 \
--alg_diffusion_cond_image_creation computed_sketch \
--alg_diffusion_cond_computed_sketch_list canny \
--alg_diffusion_vid_canny_dropout 0.1 0.8 \
--alg_diffusion_cond_sketch_canny_range 500 1000 \
--G_unet_vid_max_sequence_length 15

@beniz beniz merged commit 12cfc1b into jolibrain:master Oct 1, 2024
2 checks passed