Build: $ g++ main.cpp fp32.cc FP16G.cc fp16.cc -o main
Run: $ ./main
C-based bit-level FP16/FP32 operators for designing and verifying 16-bit/32-bit floating-point hardware operators.
- FP32 Addition
- FP32 Subtraction
- FP32 Multiplication
- FP32 Division (the last bit of the mantissa may be wrong due to round-up)
- FP16 Addition/Subtraction (supports normal and subnormal numbers)
- FP16 Multiplication (supports normal and subnormal numbers)

TODO list
- Add FP16 Division operator
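
One way to check bit-level operators like the ones listed above is to compare their raw result bits with the host's native arithmetic. The sketch below is not taken from this repository: `float_to_bits`, `split_fp32`, `ulp_distance` and `fp16_bits_to_float` are hypothetical helper names, and it assumes the host `float` is IEEE 754 binary32. It splits an FP32 word into sign, exponent and mantissa, measures how many last-place units two results differ by, and decodes FP16 bits (normal and subnormal) into a float for comparison.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit IEEE 754 float");

// Hypothetical helper: reinterpret a float as its raw binary32 bits.
static uint32_t float_to_bits(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Sign (1 bit), biased exponent (8 bits), mantissa (23 bits) of an FP32 word.
struct Fp32Fields {
    uint32_t sign;
    uint32_t exponent;
    uint32_t mantissa;
};

static Fp32Fields split_fp32(uint32_t bits) {
    return { bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu };
}

// Distance in units in the last place between two FP32 bit patterns.
// Only meaningful for finite values of the same sign, so that is asserted.
static uint32_t ulp_distance(uint32_t a, uint32_t b) {
    assert((a >> 31) == (b >> 31));
    return a > b ? a - b : b - a;
}

// Decode raw binary16 bits into a float. Every FP16 value is exactly
// representable in FP32, so no rounding happens here.
static float fp16_bits_to_float(uint16_t h) {
    const uint32_t sign = (h >> 15) & 1u;
    const uint32_t exp  = (h >> 10) & 0x1Fu;
    const uint32_t man  = h & 0x3FFu;
    float value;
    if (exp == 0) {
        // Zero or subnormal: no hidden bit, value = man * 2^-24.
        value = std::ldexp(static_cast<float>(man), -24);
    } else if (exp == 31) {
        // All-ones exponent: infinity (man == 0) or NaN.
        value = man ? NAN : INFINITY;
    } else {
        // Normal: hidden bit set, value = (1024 + man) * 2^(exp - 25).
        value = std::ldexp(static_cast<float>(man | 0x400u),
                           static_cast<int>(exp) - 25);
    }
    return sign ? -value : value;
}
```

For example, a division test could compute `ulp_distance(float_to_bits(a / b), result_bits)`, where `result_bits` stands for the output of the bit-level operator, and accept a distance of at most 1 while the last-bit rounding issue noted above is open.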