Add an asset #160

russellkim · 2024-03-30T08:41:52Z

SOLAR - https://arxiv.org/abs/2312.15166

name: SOLAR
organization: Upstage.ai
description:

We present a methodology for scaling LLMs called depth up-scaling (DUS), which encompasses architectural modifications and continued pretraining. Specifically, we integrated Mistral 7B weights into the upscaled layers and then continued pretraining the entire model.
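As a rough illustration of the architectural step, the layer arrangement can be sketched as below. This is a minimal sketch, not the authors' code: the layer counts (a 32-layer base with 8 layers dropped from each copy) are taken from the linked paper rather than from this description, and `depth_up_scale` is a hypothetical helper name. It only computes which base-model layers end up in the up-scaled model; continued pretraining is not shown.

```python
def depth_up_scale(n_layers: int = 32, m_removed: int = 8) -> list[int]:
    """Return the base-model layer indices of the up-scaled model, in order.

    Assumption (from the linked paper, not this issue): the base model is
    duplicated, the last `m_removed` layers are dropped from the first copy,
    the first `m_removed` layers are dropped from the second copy, and the
    two trimmed copies are concatenated.
    """
    if not 0 <= m_removed < n_layers:
        raise ValueError("m_removed must satisfy 0 <= m_removed < n_layers")
    base = list(range(n_layers))
    first_copy = base[: n_layers - m_removed]  # keeps layers 0 .. n-m-1
    second_copy = base[m_removed:]             # keeps layers m .. n-1
    return first_copy + second_copy


layer_map = depth_up_scale()
print(len(layer_map))  # number of layers in the up-scaled model
```

Each entry of `layer_map` names the base-model layer whose weights would initialize that position in the deeper model; the middle layers appear twice, once from each copy.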

SOLAR-10.7B outperforms models with up to 30B parameters, even surpassing the recent Mixtral 8x7B model. For detailed information, please refer to the experimental table. SOLAR-10.7B is also a robust and adaptable base for fine-tuning: simple instruction fine-tuning of the pre-trained SOLAR-10.7B model yields significant performance improvements (SOLAR-10.7B-Instruct-v1.0).

created date: 2023
url: https://arxiv.org/abs/2312.15166
model card: https://huggingface.co/upstage/SOLAR-10.7B-v1.0
modality: text
analysis:
size: 10.7B

rishibommasani · 2024-04-01T22:57:45Z

Thanks, this looks great - could you add a PR @russellkim?

russellkim · 2024-04-08T01:28:23Z

@rishibommasani Thanks, please review it: #167
