Update src/routes/blogs/accelerating-phi-2/+page.svx
Co-authored-by: Sophie Schoenmeyer <[email protected]>
MaanavD and sophies927 authored Feb 27, 2024
1 parent cef1ea7 commit 1a3b85a
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/routes/blogs/accelerating-phi-2/+page.svx
@@ -28,7 +28,7 @@ url: 'https://onnxruntime.ai/blogs/accelerating-phi-2'
 
 In a fast-moving landscape where speed and efficiency are paramount, [ONNX Runtime](https://onnxruntime.ai/blogs/accelerating-llama-2) (ORT) allows users to easily integrate the power of generative AI models into their apps and services with improved optimizations that yield faster inferencing speeds and effectively lower costs. These include state-of-the-art fusion and kernel optimizations to help improve model performance. The recent [ONNX Runtime 1.17 release](https://github.com/microsoft/onnxruntime/releases/tag/v1.17.0) improves the inference performance of several Gen AI models, including Phi-2, Mistral, CodeLlama, Orca-2, and more. ONNX Runtime is a complete solution for small language models (SLMs) from training to inference, showing significant speedups compared to other frameworks. With support for float32, float16, and int4, ONNX Runtime's inference enhancements provide maximum flexibility and performance.
 
-In this blog we will cover significant optimization speed up for both training and inference for latest GenAI models like Phi-2, Mistral, CodeLlama, SD-Turbo, SDXL-Turbo, Llama2, and Orca-2. For these model architectures ONNX Runtime significantly improves performance across a spectrum of batch size and prompt length when compared against other frameworks like PyTorch, and Llama.cpp. These optimizations using ONNX Runtime is now also available using [Olive](https://github.com/microsoft/Olive/tree/main/examples/).
+In this blog, we will cover significant optimization speed up for both training and inference for the latest GenAI models like Phi-2, Mistral, CodeLlama, SD-Turbo, SDXL-Turbo, Llama2, and Orca-2. For these model architectures, ONNX Runtime significantly improves performance across a spectrum of batch sizes and prompt lengths when compared against other frameworks like PyTorch, and Llama.cpp. These optimizations using ONNX Runtime are now also available using [Olive](https://github.com/microsoft/Olive/tree/main/examples/).
 # Quick Links
 - [Phi-2](#phi-2)
 - [Mistral](#mistral)
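
For context on the inference path the edited paragraph describes, here is a minimal sketch of running an exported ONNX model with ONNX Runtime's Python API. The model filename, execution provider, and toy input below are illustrative assumptions, not part of this commit; a real Phi-2 export typically declares additional inputs such as attention masks and past key/value caches.

```python
# Minimal sketch: single inference step with ONNX Runtime (Python API).
# "phi-2-int4.onnx" is a hypothetical exported model path, not from this repo.
import numpy as np
import onnxruntime as ort

# Create an inference session; ORT applies its fusion/kernel optimizations
# when building the session for the chosen execution provider.
session = ort.InferenceSession(
    "phi-2-int4.onnx",                    # hypothetical int4-quantized export
    providers=["CPUExecutionProvider"],   # e.g. CUDAExecutionProvider on GPU
)

# Bind a toy batch of token IDs to the model's first declared input and run.
input_name = session.get_inputs()[0].name
input_ids = np.array([[1, 2, 3, 4]], dtype=np.int64)  # placeholder prompt tokens
outputs = session.run(None, {input_name: input_ids})
print(outputs[0].shape)  # e.g. logits shaped (batch, sequence, vocab)
```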
