RIKEN-RCCS/accelerator_for_ozIMMU


Accelerator for ozIMMU

Acceleration codes for the Ozaki-scheme on integer matrix multiplication units.

Important Notice

To use these codes, ozIMMU is required.

Therefore, users must agree to the license terms of ozIMMU in addition to the license for these codes.

When citing these codes, please also include a citation for ozIMMU.

Usage

Compile ozIMMU after replacing the files in the src directory of the original ozIMMU with the same-named files from this repository.

  • Codes in src_errfree_sum reduce the amount of accumulation performed in FP64 in ozIMMU.
  • Codes in src_nearest_split offer an alternative splitting method and produce more accurate results than ozIMMU when the number of slices is the same.
  • Codes in src_nearest_split+errfree_sum provide a hybrid of the two methods above and produce more accurate results faster than ozIMMU.

Complex matrix multiplication is not provided.
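
The replacement step described above can be sketched as a small shell helper. This is only an illustration, not a script shipped with this repository: the function name overlay_src and the example checkout paths are assumptions, and the build itself should follow ozIMMU's own instructions.

```shell
#!/bin/sh
# overlay_src VARIANT_DIR OZIMMU_SRC_DIR
# Hypothetical helper: copies every regular file in VARIANT_DIR over the
# same-named file in OZIMMU_SRC_DIR. Files that exist only in
# OZIMMU_SRC_DIR are left untouched.
overlay_src() {
    variant_dir=$1
    ozimmu_src=$2
    if [ ! -d "$variant_dir" ] || [ ! -d "$ozimmu_src" ]; then
        echo "overlay_src: both arguments must be existing directories" >&2
        return 1
    fi
    for f in "$variant_dir"/*; do
        [ -f "$f" ] || continue      # skip subdirectories and empty globs
        cp "$f" "$ozimmu_src/" || return 1
    done
}

# Example call (the checkout locations are assumptions):
#   overlay_src accelerator_for_ozIMMU/src_nearest_split+errfree_sum ozIMMU/src
# Afterwards, build ozIMMU as described in its own README.
```

Choose exactly one of the three directories per build; because the files share names, copying a second variant overwrites the first.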

Citation

@misc{uchino2024performanceenhancementozakischeme,
      title={Performance Enhancement of the Ozaki Scheme on Integer Matrix Multiplication Unit},
      author={Yuki Uchino and Katsuhisa Ozaki and Toshiyuki Imamura},
      year={2024},
      eprint={2409.13313},
      archivePrefix={arXiv},
      primaryClass={cs.DC},
      url={https://arxiv.org/abs/2409.13313},
}
