Support fp8_e4m3/fp8_e5m2
Narsil committed Nov 17, 2023
1 parent 96061e9 commit c70ba7e
Showing 2 changed files with 13 additions and 3 deletions.
8 changes: 5 additions & 3 deletions README.md
@@ -99,7 +99,9 @@ Notes:
from traditional tensor libraries perspective (torch, tensorflow, numpy, ..).
- 0-rank Tensors (tensors with shape `[]`) are allowed, they are merely a scalar.
- The byte buffer needs to be entirely indexed, and cannot contain holes. This prevents
the creation of polyglot files.
- Endianness: Little-endian.
- Order: 'C' or row-major.

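The no-holes constraint above can be checked mechanically. As a rough illustration (a hypothetical helper, not code from the safetensors crate), assume each tensor contributes a half-open `(start, end)` byte range into the data buffer:

```rust
/// Returns true if the byte ranges exactly tile `buf_len` bytes:
/// no holes, no overlaps, nothing past the end of the buffer.
fn covers_entirely(mut ranges: Vec<(usize, usize)>, buf_len: usize) -> bool {
    ranges.sort();
    let mut cursor = 0;
    for (start, end) in ranges {
        // Any gap (start > cursor), overlap (start < cursor), or
        // inverted range invalidates the buffer.
        if start != cursor || end < start {
            return false;
        }
        cursor = end;
    }
    cursor == buf_len
}

fn main() {
    assert!(covers_entirely(vec![(4, 10), (0, 4)], 10)); // exact tiling
    assert!(!covers_entirely(vec![(0, 4), (5, 10)], 10)); // hole at byte 4
    assert!(!covers_entirely(vec![(0, 4), (3, 10)], 10)); // overlap
    println!("ok");
}
```

Rejecting any layout that fails this check is what rules out polyglot files: no byte in the buffer can hide content that is not claimed by some tensor.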

### Yet another format?
@@ -113,7 +115,7 @@ formats.
Let's take a look at alternatives and why this format is deemed interesting.
This is my very personal and probably biased view:

| Format | Safe | Zero-copy | Lazy loading | No file size limit | Layout control | Flexibility | Bfloat16
| Format | Safe | Zero-copy | Lazy loading | No file size limit | Layout control | Flexibility | Bfloat16/Fp8
| ----------------------- | --- | --- | --- | --- | --- | --- | --- |
| pickle (PyTorch) |||| 🗸 || 🗸 | 🗸 |
| H5 (Tensorflow) | 🗸 || 🗸 | 🗸 | ~ | ~ ||
@@ -133,7 +135,7 @@ some tensors in it without scanning the whole file (distributed setting) ?
- Layout control: Lazy loading is not necessarily enough: if the information about tensors is spread out across your file, then even though it is lazily accessible you might have to read most of the file to reach the available tensors (incurring many DISK -> RAM copies). Controlling the layout to keep fast access to single tensors is important.
- No file size limit: Is there a limit to the file size?
- Flexibility: Can I save custom code in the format and be able to use it later with zero extra code ? (~ means we can store more than pure tensors, but no custom code)
- Bfloat16: Does the format support native bfloat16 (meaning no weird workarounds are
- Bfloat16/Fp8: Does the format support native bfloat16/fp8 (meaning no weird workarounds are
necessary)? This is becoming increasingly important in the ML world.


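For readers unfamiliar with the two fp8 variants the table now tracks: F8_E4M3 packs 1 sign, 4 exponent (bias 7), and 3 mantissa bits; F8_E5M2 packs 1 sign, 5 exponent (bias 15), and 2 mantissa bits. A hedged Rust sketch of decoding them to `f32`, following the Micikevicius et al. paper linked from the Rust doc comments (illustrative only, not part of this commit):

```rust
/// Decode an F8_E5M2 byte (1 sign / 5 exponent / 2 mantissa, bias 15).
fn f8_e5m2_to_f32(b: u8) -> f32 {
    let sign = if b & 0x80 != 0 { -1.0f32 } else { 1.0 };
    let exp = ((b >> 2) & 0x1f) as i32;
    let man = (b & 0x03) as f32;
    match exp {
        0 => sign * (man / 4.0) * 2f32.powi(-14), // subnormal
        0x1f if man == 0.0 => sign * f32::INFINITY,
        0x1f => f32::NAN,
        _ => sign * (1.0 + man / 4.0) * 2f32.powi(exp - 15),
    }
}

/// Decode an F8_E4M3 byte (1 sign / 4 exponent / 3 mantissa, bias 7).
/// E4M3 trades infinities away for range: only S.1111.111 is NaN.
fn f8_e4m3_to_f32(b: u8) -> f32 {
    let sign = if b & 0x80 != 0 { -1.0f32 } else { 1.0 };
    let exp = ((b >> 3) & 0x0f) as i32;
    let man = (b & 0x07) as f32;
    match (exp, man as u8) {
        (0, _) => sign * (man / 8.0) * 2f32.powi(-6), // subnormal
        (0x0f, 7) => f32::NAN,
        _ => sign * (1.0 + man / 8.0) * 2f32.powi(exp - 7),
    }
}

fn main() {
    assert_eq!(f8_e5m2_to_f32(0x3c), 1.0); // 0b0_01111_00
    assert_eq!(f8_e4m3_to_f32(0x38), 1.0); // 0b0_0111_000
    assert!(f8_e4m3_to_f32(0x7f).is_nan());
    println!("ok");
}
```

Since safetensors only declares the dtype and stores raw bytes, decoding like this is the consumer's job; the format itself just needs the 1-byte-per-element bookkeeping the Rust change below provides.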
8 changes: 8 additions & 0 deletions safetensors/src/tensor.rs
@@ -641,6 +641,12 @@ pub enum Dtype {
U8,
/// Signed byte
I8,
/// FP8 (E5M2) <https://arxiv.org/pdf/2209.05433.pdf>
#[allow(non_camel_case_types)]
F8_E5M2,
/// FP8 (E4M3) <https://arxiv.org/pdf/2209.05433.pdf>
#[allow(non_camel_case_types)]
F8_E4M3,
/// Signed integer (16-bit)
I16,
/// Unsigned integer (16-bit)
@@ -670,6 +676,8 @@ impl Dtype {
Dtype::BOOL => 1,
Dtype::U8 => 1,
Dtype::I8 => 1,
Dtype::F8_E5M2 => 1,
Dtype::F8_E4M3 => 1,
Dtype::I16 => 2,
Dtype::U16 => 2,
Dtype::I32 => 4,

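The new `size()` arms mean an fp8 tensor costs exactly one byte per element. A small standalone sketch of how that feeds into buffer sizing (re-declaring a toy subset of the enum rather than depending on the safetensors crate):

```rust
// Toy subset mirroring the Dtype::size() mapping from the diff above;
// F8_E5M2/F8_E4M3 are the new one-byte-per-element variants.
#[allow(non_camel_case_types, dead_code)]
#[derive(Clone, Copy, Debug)]
enum Dtype {
    F8_E5M2,
    F8_E4M3,
    I16,
    F32,
}

impl Dtype {
    fn size(&self) -> usize {
        match self {
            Dtype::F8_E5M2 | Dtype::F8_E4M3 => 1,
            Dtype::I16 => 2,
            Dtype::F32 => 4,
        }
    }
}

/// Number of bytes a tensor of `shape` and `dtype` occupies in the buffer.
fn nbytes(shape: &[usize], dtype: Dtype) -> usize {
    shape.iter().product::<usize>() * dtype.size()
}

fn main() {
    // A [2, 3] fp8 tensor needs 6 bytes; the same shape in f32 needs 24.
    assert_eq!(nbytes(&[2, 3], Dtype::F8_E4M3), 6);
    assert_eq!(nbytes(&[2, 3], Dtype::F32), 24);
    println!("ok");
}
```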