Perplexity gpu mem optimization #345

Merged
mosheraboh merged 4 commits into master from perplexity_gpu_mem_optimizatio on Mar 20, 2024
Conversation

mosheraboh (Collaborator)

No description provided.

moshiko.raboh added 2 commits March 17, 2024 22:51
@@ -47,11 +46,7 @@ def example_seq_gen_0(seed: int = 1234) -> Dict[str, Any]:
        [
            (
                "perplexity",
                CI(
Collaborator:
Why did you remove that?

I see that the ground truth remained the same, but now we don't cover CI() in the examples (i.e., test coverage).

Collaborator (Author):

We do cover CI in other examples, I hope.
I removed it because perplexity is now computed at the batch level (more optimized), and I'm not sure it's a metric we would want a CI for.
We can bring it back when necessary.
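For context, a minimal sketch (hypothetical names and shapes, not the PR's exact code) of what a batch-level perplexity update looks like: each batch reduces to two scalars, the summed negative log-likelihood and the token count, so the per-sample values that a CI() wrapper would bootstrap over are no longer retained.

```python
import torch

def perplexity_update(probs: torch.Tensor, target: torch.Tensor, ignore_index: int = -100):
    """probs: [batch, seq_len, vocab]; target: [batch, seq_len].
    Returns the batch's summed negative log-likelihood and its token count."""
    mask = target != ignore_index
    safe_target = target.masked_fill(~mask, 0)  # any valid index; excluded by the mask below
    token_probs = probs.gather(dim=-1, index=safe_target.unsqueeze(-1)).squeeze(-1)
    neg_log_likelihood = -torch.log(token_probs[mask]).sum()
    return neg_log_likelihood, mask.sum()

# at epoch end: perplexity = exp(mean negative log-likelihood over all tokens)
# perplexity = torch.exp(total_nll / total_tokens)
```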

target=target,
metric_per_batch_func=partial(
    _perplexity_update, ignore_index=ignore_index
),
log_probs="log_probs",  # collect log_probs - output of _perplexity_update
Collaborator:

I think it's out of the scope of this PR, but still sharing:

I don't like passing arbitrary key-value pairs as **kwargs to the parent class where they have an actual purpose.

From MetricPerBatchDefault description:

        :param kwargs: specify keywords and value arguments you want to collect from the source data.
                can be strings (key names) and/or actual values
                to collect from the results dictionary: add a "results:" prefix to the key name

I would rather see an explicit argument, e.g. key_values_to_collect: dict.

What's your take on that?
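For illustration, a hypothetical sketch of the suggested explicit-argument variant (the names here are assumptions, not the actual fuse API):

```python
from typing import Any, Callable, Dict, Optional

class MetricPerBatchDefault:  # signature sketch only, not the real class
    def __init__(
        self,
        metric_per_batch_func: Callable,
        key_values_to_collect: Optional[Dict[str, Any]] = None,
    ) -> None:
        # e.g. key_values_to_collect={"log_probs": "log_probs"}; a "results:"
        # prefix on a key name would keep the collect-from-results behavior
        self._key_values_to_collect = key_values_to_collect or {}

# call sites would then name the mechanism explicitly instead of using **kwargs:
# MetricPerBatchDefault(metric_per_batch_func=..., key_values_to_collect={"log_probs": "log_probs"})
```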

Collaborator (Author):

I agree with you.
But it's indeed out of this PR's scope.

# avoid overflow
if preds.dtype == torch.float16:
    preds = preds.to(torch.float32)
preds = torch.clamp(preds, min=1e-10)
Collaborator:

Why do you clamp after moving from float16 to float32? Shouldn't float32 be more stable? I would have guessed the other way around: clamp before moving from float32 into float16.

Collaborator:

Plus, I think values in (0, 1e-10) cannot exist in float16, so why clamp after moving to float32?

Collaborator (Author):

I'm getting exact zeros in float16 precision, which causes log() to return -inf.
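To make that concrete, a small runnable illustration of why the order matters: clamping in float16 would do nothing, because the floor 1e-10 itself rounds to 0 in float16.

```python
import torch

# Probabilities below float16's smallest subnormal (~6e-8) round to exactly 0.0,
# and log(0.0) is -inf.
p = torch.tensor([1e-10], dtype=torch.float16)
print(p)                               # tensor([0.], dtype=torch.float16)
print(torch.log(p.to(torch.float32)))  # tensor([-inf])

# Upcast first, then clamp, to restore a finite floor:
p32 = torch.clamp(p.to(torch.float32), min=1e-10)
print(torch.log(p32))                  # tensor([-23.0259])
```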

@@ -106,6 +117,10 @@ def _perplexity_compute(
    log_probs: List[np.ndarray],
    token_num: List[np.ndarray],
) -> float:
    # avoid overflow on large epochs
    log_probs = [e.astype(np.float64) for e in log_probs]
Collaborator:

numerical stability?

Collaborator (Author):

These might be large numbers with many GPUs, so I took the safe side here.
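For illustration, a sketch of how the upcast fits into the epoch-level aggregation (assumed semantics: per-batch sums of negative log-likelihood with matching token counts, not the verbatim fuse code):

```python
import numpy as np
from typing import List

def perplexity_compute(log_probs: List[np.ndarray], token_num: List[np.ndarray]) -> float:
    # accumulate in float64 to stay safe across many batches and GPUs
    total_nll = np.sum([e.astype(np.float64).sum() for e in log_probs])
    total_tokens = np.sum([e.sum() for e in token_num])
    return float(np.exp(total_nll / total_tokens))
```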

def _df_dict_apply(
    data: pd.Series, func: Callable, batch: bool = False
) -> pd.Series:
    if batch:
Collaborator:

Not sure I understood the use case: data is always a single data point (e.g., a series), and func might operate at the batch level? And where func expects a batch, you pass batch=True?

I suggest renaming batch or adding a short description to make it clearer :))

Collaborator (Author):

The metric has some preprocessing func that works at the batch level.
Here we are not running a training loop, so to avoid duplication we run the function on each sample as if it were a batch. To do that, we convert the single sample to a batch, call the function, and then squeeze the result back to a sample.
As I mentioned, these are dark places in fuse that I would refactor if we had some time to spend on it.
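In pseudocode, that sample-as-batch adapter might look like this (a hypothetical sketch, not fuse's actual implementation):

```python
import numpy as np
import pandas as pd
from typing import Callable

def df_dict_apply(data: pd.Series, func: Callable, batch: bool = False) -> pd.Series:
    # when func expects a batch, wrap the single sample as a batch of size 1,
    # apply the function, then squeeze the result back to a single sample
    sample = dict(data)
    if batch:
        batched = {k: np.expand_dims(np.asarray(v), axis=0) for k, v in sample.items()}
        out = func(batched)
        out = {k: np.squeeze(np.asarray(v), axis=0) for k, v in out.items()}
    else:
        out = func(sample)
    return pd.Series(out)
```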

@SagiPolaczek (Collaborator) left a comment:

LGTM!
I added some questions inline :)

Could you please also describe the GPU memory optimization part? Maybe how you found it and what exactly fixes it (at a high level, of course).

mosheraboh merged commit bc21aad into master on Mar 20, 2024
5 checks passed
mosheraboh deleted the perplexity_gpu_mem_optimizatio branch on March 20, 2024 14:06