Hello, I am trying to run the supported UNet3D application code from the LBANN GitHub repository, but it fails.
From the distconv environment and its related source code, it looks like an input with the "labels" data_field is not supported yet. The source code also mentions that Distconv currently only supports CosmoFlow data.
Is it possible to run the unet3d application on LBANN, or am I missing something? If you know anything about this, please advise.
This is the main function of my source code, which I modified from the unet3d example. The omitted functions are the same as in the original. Thank you.
if __name__ == '__main__':
    desc = ('Construct and run the 3D U-Net on a 3D segmentation dataset.'
            'Running the experiment is only supported on LC systems.')
    parser = argparse.ArgumentParser(description=desc)
    lbann.contrib.args.add_scheduler_arguments(parser)
    # (Omit parser.add_argument section)
    lbann.contrib.args.add_optimizer_arguments(
        parser,
        default_optimizer="adam",
        default_learning_rate=0.001,
    )
    args = parser.parse_args()
    args.procs_per_node=4
    parallel_strategy = get_parallel_strategy_args(
        sample_groups=args.mini_batch_size,
        depth_groups=args.depth_groups)

    # Construct layer graph
    volume = lbann.Input(data_field='samples')
    segmentation = lbann.Input(data_field='labels')
    output = UNet3D()(volume)
    ce = lbann.CrossEntropy([output, segmentation])
    layers = list(lbann.traverse_layer_graph([volume, segmentation]))
    obj = lbann.ObjectiveFunction([ce])
    for l in layers:
        l.parallel_strategy = parallel_strategy

    # Setup model
    metrics = [lbann.Metric(ce, name='CE', unit='')]
    callbacks = [lbann.CallbackPrint(),
                 lbann.CallbackTimer(),
                 lbann.CallbackGPUMemoryUsage(),
                 lbann.CallbackProfiler(skip_init=True),
                 ]
    # # TODO: Use polynomial learning rate decay (https://github.com/LLNL/lbann/issues/1581)
    # callbacks.append(
    #     lbann.CallbackPolyLearningRate(
    #         power=1.0,
    #         num_epochs=100,
    #         end_lr=1e-5))
    model = lbann.Model(epochs=args.num_epochs,
                        layers=layers,
                        objective_function=obj,
                        callbacks=callbacks,
                        )

    # Setup optimizer
    optimizer = lbann.contrib.args.create_optimizer(args)

    # Setup data reader
    data_reader = create_unet3d_data_reader(
        train_dir=args.train_dir,
        test_dir=args.test_dir)

    # Setup trainer
    trainer = lbann.Trainer(mini_batch_size=args.mini_batch_size)

    # Runtime parameters/arguments
    environment = lbann.contrib.args.get_distconv_environment(
        num_io_partitions=args.depth_groups)
    if args.dynamically_reclaim_error_signals:
        environment['LBANN_KEEP_ERROR_SIGNALS'] = 0
    else:
        environment['LBANN_KEEP_ERROR_SIGNALS'] = 1
    lbann_args = ['--use_data_store']

    # Run experiment
    kwargs = lbann.contrib.args.get_scheduler_kwargs(args)
    lbann.contrib.launcher.run(
        trainer, model, data_reader, optimizer,
        job_name=args.job_name,
        environment=environment,
        lbann_args=lbann_args,
        batch_job=args.batch_job,
        **kwargs)
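As an aside, the error-signal toggle near the end of the script can be checked on its own, without launching a job. The following is only a minimal sketch: it assumes the environment is a plain dict, and `with_error_signal_setting` is a hypothetical helper name, not an LBANN function. The `EXAMPLE_VAR` entry is a placeholder, not a real distconv variable.

```python
def with_error_signal_setting(environment, dynamically_reclaim):
    """Return a copy of `environment` with LBANN_KEEP_ERROR_SIGNALS set.

    Hypothetical helper (not part of LBANN). It mirrors the if/else in the
    script: 0 when error signals are reclaimed dynamically, otherwise 1.
    """
    updated = dict(environment)  # copy so the caller's dict is not modified
    updated['LBANN_KEEP_ERROR_SIGNALS'] = 0 if dynamically_reclaim else 1
    return updated


if __name__ == '__main__':
    base = {'EXAMPLE_VAR': 1}  # placeholder entry for illustration only
    print(with_error_signal_setting(base, True))
    print(with_error_signal_setting(base, False))
```

Returning a copy keeps the dict produced by the environment setup call untouched, so both settings can be compared side by side.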
@JBae2 There is a bug in the current UNet3D model: the Python representation of the model has drifted from some of the internal changes that have occurred in LBANN. This issue is currently being worked on in PR #2151, but that work is not yet complete.