Generalize float precision conversion #1261

emricksinisonos · 2023-11-14T10:07:21Z

Description

Conversion from f32 to f16 was already supported. This PR adds f16 to f32 conversion and makes the converter implementation more generic.

It also breaks the high-level tract API by replacing the half() API with:

f32_to_f16(): equivalent to half(), f32 -> f16 conversion
f16_to_f32(): f16 -> f32 conversion
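
As background on the f16 -> f32 direction: every IEEE 754 binary16 value is exactly representable in binary32, so widening loses no information, whereas f32 -> f16 has to round. The sketch below widens raw f16 bits by hand using only the standard library. It is an illustration under that assumption, not tract's implementation; the helper name `f16_bits_to_f32` is hypothetical and is not part of the tract API.

```rust
/// Widen an IEEE 754 binary16 (f16) value, given as raw bits, to f32.
/// Hypothetical helper for illustration only; not part of tract.
fn f16_bits_to_f32(h: u16) -> f32 {
    // Split the 16 bits into sign (1 bit), exponent (5 bits), fraction (10 bits).
    let sign = ((h & 0x8000) as u32) << 16;
    let exp = ((h >> 10) & 0x1f) as u32;
    let frac = (h & 0x03ff) as u32;

    let bits = match (exp, frac) {
        // Signed zero.
        (0, 0) => sign,
        // Subnormal f16: value is frac * 2^-24, which is a normal f32.
        (0, _) => {
            // 2^-24 as an f32: biased exponent 127 - 24 = 103, zero fraction.
            let two_pow_minus_24 = f32::from_bits(103 << 23);
            sign | (frac as f32 * two_pow_minus_24).to_bits()
        }
        // Infinity.
        (0x1f, 0) => sign | 0x7f80_0000,
        // NaN: keep the payload in the top fraction bits.
        (0x1f, _) => sign | 0x7f80_0000 | (frac << 13),
        // Normal number: rebias the exponent (127 - 15 = 112) and
        // left-align the 10-bit fraction in the 23-bit f32 fraction.
        _ => sign | ((exp + 112) << 23) | (frac << 13),
    };
    f32::from_bits(bits)
}
```

For instance, `f16_bits_to_f32(0x3C00)` is `1.0` and `f16_bits_to_f32(0x7BFF)` is `65504.0`, the largest finite f16 value.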

kali · 2023-11-22T08:30:21Z

We should keep --half-floats as an alias() for f32-to-f16

emricksinisonos added 2 commits November 14, 2023 11:04

Generalise float precision model conversion (f16 -> f32 and f32 -> f16)

1b60a00

Update API to support both conversions

f3a434d

emricksinisonos force-pushed the task/generalize-float-precision-conversion branch from f2f68e6 to f3a434d November 14, 2023 10:08

Some fixes

05719e5

emricksinisonos marked this pull request as ready for review November 14, 2023 11:25

emricksinisonos added 2 commits November 21, 2023 17:30

rename float_precision_translator to floats

7fcde41

Some renaming

889baf2

emricksinisonos requested a review from kali November 21, 2023 16:59

Add alias for half-floats

cb5a527

kali merged commit 43b1463 into main Nov 22, 2023
43 checks passed
