Implement BF16 using generic_float class #3578

richagadgil · 2024-10-30T22:38:34Z

Uses the generic_float class (#3522) to create a bf16 class.

BF16 has 1 sign bit, 8 exponent bits, and 7 mantissa bits, so the type is defined as: bf16 = migraphx::generic_float<7, 8>;
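As a rough illustration of that bit layout (not code from this PR), the sketch below treats a BF16 value as the upper 16 bits of an IEEE-754 binary32 float. The helper names `float_to_bf16_bits`, `bf16_bits_to_float`, and `split_bf16` are hypothetical, and the conversion truncates rather than rounds, which is a simplification; it also assumes `float` is a 32-bit IEEE-754 type.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; they are not part of MIGraphX.

// Keep the top 16 bits of a binary32 float: 1 sign, 8 exponent, 7 mantissa bits.
// Assumption: plain truncation; a production conversion would normally round.
std::uint16_t float_to_bf16_bits(float x)
{
    std::uint32_t u = 0;
    std::memcpy(&u, &x, sizeof(u));
    return static_cast<std::uint16_t>(u >> 16);
}

// Widen BF16 bits back to float by zero-filling the low 16 mantissa bits.
float bf16_bits_to_float(std::uint16_t b)
{
    const std::uint32_t u = static_cast<std::uint32_t>(b) << 16;
    float x = 0.0f;
    std::memcpy(&x, &u, sizeof(x));
    return x;
}

struct bf16_fields
{
    unsigned sign;
    unsigned exponent;
    unsigned mantissa;
};

// Split the 16 bits into the three fields described above.
bf16_fields split_bf16(std::uint16_t b)
{
    return {(b >> 15) & 0x1u, (b >> 7) & 0xFFu, b & 0x7Fu};
}

// Small self-check showing how the helpers fit together.
void check_bf16_examples()
{
    const std::uint16_t bits = float_to_bf16_bits(-2.5f);
    const bf16_fields f      = split_bf16(bits);
    assert(f.sign == 1 and f.exponent == 128 and f.mantissa == 32);
    assert(bf16_bits_to_float(bits) == -2.5f);
}
```

Because BF16 keeps the same 8-bit exponent as binary32, it covers roughly the same range as float while giving up mantissa precision.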

Summary of changes:

generic_float.cpp: Change the subnormal conversion (the exponent == 0 case) to differentiate between FP16 and BF16 types
migraphx.h, shape.hpp, hip_gemm_impl.cpp, gemm_impl.cpp: Add BF16 shape type
type_traits.hpp: Add traits for BF16 type
tests/: Add tests for BF16 type
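To make the subnormal item above more concrete, here is a minimal sketch of how a bit pattern with a configurable number of mantissa and exponent bits could be decoded. It is not the generic_float implementation; `decode_bits` is a hypothetical function, and it only shows that the scale of a subnormal value depends on both field widths, so FP16 (10 mantissa bits, 5 exponent bits) and BF16 (7 and 8) give different results for the same pattern.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical decoder for illustration; this is not the generic_float code.
template <unsigned MantissaBits, unsigned ExponentBits>
double decode_bits(std::uint32_t bits)
{
    static_assert(MantissaBits + ExponentBits + 1 <= 32, "format must fit in 32 bits");
    const std::uint32_t mantissa_mask = (1u << MantissaBits) - 1u;
    const std::uint32_t exponent_mask = (1u << ExponentBits) - 1u;
    const int bias                    = (1 << (ExponentBits - 1)) - 1;

    const std::uint32_t mantissa = bits & mantissa_mask;
    const std::uint32_t exponent = (bits >> MantissaBits) & exponent_mask;
    const bool negative          = ((bits >> (MantissaBits + ExponentBits)) & 1u) != 0;

    double magnitude = 0.0;
    if(exponent == exponent_mask)
    {
        // All-ones exponent: infinity when the mantissa is zero, otherwise NaN.
        magnitude = mantissa == 0 ? std::numeric_limits<double>::infinity()
                                  : std::numeric_limits<double>::quiet_NaN();
    }
    else if(exponent == 0)
    {
        // Subnormal: no implicit leading 1, and the exponent is fixed at 1 - bias.
        magnitude = std::ldexp(static_cast<double>(mantissa),
                               1 - bias - static_cast<int>(MantissaBits));
    }
    else
    {
        // Normal: restore the implicit leading 1 and remove the bias.
        magnitude = std::ldexp(static_cast<double>(mantissa | (1u << MantissaBits)),
                               static_cast<int>(exponent) - bias -
                                   static_cast<int>(MantissaBits));
    }
    return negative ? -magnitude : magnitude;
}
```

Under these assumptions, `decode_bits<7, 8>(0x0001)` yields 2^-133 for BF16, while `decode_bits<10, 5>(0x0001)` yields 2^-24 for FP16.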

TedThemistokleous · 2024-11-14T19:37:22Z

src/py/migraphx_py.cpp

+{
+    static std::string format()
+    {
+        // TODO: no standard format in numpy for bf16


Should this be an issue or is this tracked somewhere already?

TedThemistokleous · 2024-11-18T15:02:15Z

@richagadgil update your description. I think we've done a pass on this with some comments.

pfultz2 · 2024-11-20T22:28:40Z

For the CI failures, just add an if statement to skip bf16 for now in the compile_math case in jit.cpp. That should get it passing, and we can work on enabling it in a later PR.

migraphx-bot · 2024-11-21T02:19:22Z

Test	Batch	Rate new 47a181	Rate old 0f36aa	Diff	Compare
torchvision-resnet50	64	3,254.36	3,261.99	-0.23%	✅
torchvision-resnet50_fp16	64	6,989.58	6,984.41	0.07%	✅
torchvision-densenet121	32	2,435.66	2,434.46	0.05%	✅
torchvision-densenet121_fp16	32	4,085.94	4,068.77	0.42%	✅
torchvision-inceptionv3	32	1,628.48	1,630.14	-0.10%	✅
torchvision-inceptionv3_fp16	32	2,745.44	2,746.22	-0.03%	✅
cadene-inceptionv4	16	764.67	765.59	-0.12%	✅
cadene-resnext64x4	16	810.97	809.78	0.15%	✅
slim-mobilenet	64	7,467.12	7,474.57	-0.10%	✅
slim-nasnetalarge	64	208.49	208.58	-0.05%	✅
slim-resnet50v2	64	3,442.62	3,441.49	0.03%	✅
bert-mrpc-onnx	8	1,148.81	1,150.80	-0.17%	✅
bert-mrpc-tf	1	466.40	465.54	0.18%	✅
pytorch-examples-wlang-gru	1	419.06	420.06	-0.24%	✅
pytorch-examples-wlang-lstm	1	473.00	381.98	23.83%	🔆
torchvision-resnet50_1	1	761.83	750.44	1.52%	✅
cadene-dpn92_1	1	402.29	398.35	0.99%	✅
cadene-resnext101_1	1	382.31	382.96	-0.17%	✅
onnx-taau-downsample	1	345.79	346.08	-0.08%	✅
dlrm-criteoterabyte	1	33.34	33.35	-0.02%	✅
dlrm-criteoterabyte_fp16	1	52.76	52.68	0.15%	✅
agentmodel	1	8,177.28	8,091.53	1.06%	✅
unet_fp16	2	58.83	58.77	0.09%	✅
resnet50v1_fp16	1	926.85	943.16	-1.73%	✅
resnet50v1_int8	1	994.66	1,012.12	-1.72%	✅
bert_base_cased_fp16	64	1,170.84	1,169.97	0.07%	✅
bert_large_uncased_fp16	32	363.41	363.75	-0.10%	✅
bert_large_fp16	1	198.71	199.03	-0.16%	✅
distilgpt2_fp16	16	2,197.22	2,201.98	-0.22%	✅
yolov5s	1	535.63	539.79	-0.77%	✅
tinyllama	1	43.65	43.42	0.53%	✅
vicuna-fastchat	1	177.06	175.75	0.75%	✅
whisper-tiny-encoder	1	418.16	418.02	0.03%	✅
whisper-tiny-decoder	1	423.43	428.37	-1.15%	✅

Check results before merge 🔆

migraphx-bot · 2024-11-21T02:19:24Z

✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance

✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance

✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

✅ agentmodel: PASSED: MIGraphX meets tolerance

✅ unet: PASSED: MIGraphX meets tolerance

✅ resnet50v1: PASSED: MIGraphX meets tolerance

✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

✅ bert_large: PASSED: MIGraphX meets tolerance

✅ yolov5s: PASSED: MIGraphX meets tolerance

✅ tinyllama: PASSED: MIGraphX meets tolerance

✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

richagadgil added 30 commits October 10, 2024 17:53

first pass at integrating generic float

c51c1ce

fix namespaces

134b408

fix mantissa

d4fa6eb

refactor

0b60841

refactor

7a646f1

add fp

ebe819b

fixed generic float class

379a77a

add fp32 test

174384c

remove import

787b651

update tests

1d1fa1c

fp16 tests that work

1791092

update tests

a2eb005

updated fp16 and fp32 tests

ff8ffc7

half tests

e36fd65

underflow and overflow tests

9ac4e2a

generate map

f05fd31

add more tests

cb4d92d

fix names

0cc1946

update tests

85a761b

remove and

65cf9ae

disable warning

fbabf54

fix tidy warning

549f5e6

migraphx py fix

d302e5d

add increments

8d475e3

fix warnings

a0fd055

disable duplicate branch warning

41379fe

add countzero_std

0c29c7b

ci error

4b012a8

simplify countl

dbaa3a8

fix ci

b2bd2a0

richagadgil added 7 commits November 11, 2024 15:43

Merge branch 'develop' into bf16

4cb96ad

remove imports

ffd4ba2

Merge branch 'develop' into bf16

8a10da3

ref tests

1565a0e

migraphx_py fix

e6d1155

fix test cae by index

867e960

add rocblas type

9852da5

causten requested a review from CharlieL7 November 13, 2024 20:52

fix tgts err

bf50653

CharlieL7 reviewed Nov 14, 2024

src/include/migraphx/shape.hpp (outdated, resolved)

CharlieL7 reviewed Nov 14, 2024

test/op_shape_test.cpp (outdated, resolved)

CharlieL7 requested a review from TedThemistokleous November 14, 2024 18:59

TedThemistokleous reviewed Nov 14, 2024

pfultz2 reviewed Nov 18, 2024

src/api/include/migraphx/migraphx.h (outdated, resolved)

pfultz2 reviewed Nov 18, 2024

test/op_shape_test.cpp (outdated, resolved)

richagadgil added 2 commits November 18, 2024 17:47

address changes

0ebd220

Merge branch 'develop' into bf16

043e322

CharlieL7 approved these changes Nov 18, 2024

TedThemistokleous approved these changes Nov 18, 2024

TedThemistokleous added the roadmap label (Tasks to finish for a release) Nov 18, 2024

pfultz2 approved these changes Nov 18, 2024

richagadgil force-pushed the bf16 branch from d1acec9 to 043e322 November 19, 2024 00:24

Merge branch 'develop' into bf16

21746a5

skip jit tests

47a1810

causten merged commit 952a257 into develop Nov 21, 2024
40 of 45 checks passed

causten deleted the bf16 branch November 21, 2024 02:33
