Back Code Instruction: Generating Benchmark Dataset for Fortran Language

zhu-zhu-ding/FortranCoder

About

  • FortranCoder is a model empowered by Evol-Code, a novel approach to improving Fortran programming for LLMs.
  • Evol-Code extends the diversity of instructions with data collected from real programming scenarios to improve the overall programming capabilities of LLMs.
  • FortranEval is a benchmark dataset that comprehensively evaluates the Fortran programming capabilities of LLMs. It covers both functions and subroutines, and spans scientific computing as well as general programming tasks.

💫Important!

🏅 FortranCoder-DS-6.7B outperforms gpt-3.5-turbo-1106 and DeepSeek-Coder-6.7B-Instruct (its base model) on FortranEval: 32.5% vs. 29.5% / 27.4% on pass@1, and 80.8% vs. 72.6% / 70.9% on pass_c@1!

🤖Models

| Model | Checkpoint | Size | pass@1 | pass_c@1 | License |
| --- | --- | --- | --- | --- | --- |
| FortranCoder-DS-6.7B | 🤗 HF_Link | 6.7B | 32.5 (27.4) | 80.8 (70.9) | DeepSeek |

Numbers in parentheses are the scores of the base model, DeepSeek-Coder-6.7B-Instruct.
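pass@1 here is presumably the fraction of FortranEval problems solved on the first attempt; if multiple samples per problem are drawn, the standard unbiased pass@k estimator from the HumanEval paper applies. A minimal sketch (the sample counts below are illustrative, not values from this repo):

import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021): probability that at
    least one of k samples drawn from n generations (c of them correct)
    passes the tests."""
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Hypothetical example: 20 generations for one problem, 7 pass the tests.
print(pass_at_k(n=20, c=7, k=1))  # ≈ 0.35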

📚Dataset

FortranEval: The first benchmark dataset for evaluating the Fortran programming capabilities of LLMs.

Evol-Code-Fortran: Fortran instruction fine-tuning data generated using the Evol-Code method.
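Both datasets are released as JSON Lines files. A minimal loading sketch (the exact field names are assumptions; check the released files for the real schema):

import json

# Hypothetical loader for a FortranEval split; field names are assumptions.
with open("FortranEval/FortranEval_base_function.jsonl", encoding="utf-8") as f:
    problems = [json.loads(line) for line in f if line.strip()]

print(len(problems))
print(problems[0].keys())  # e.g. task_id, prompt, test (assumed)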

Fine-Tuning

The training script supports DeepSpeed. First install the required packages:

pip install -r requirements.txt

The provided script finetune_deepseekcoder.py follows the same setup as DeepSeek-Coder's official fine-tuning script.
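The launcher used in the command below, finetune/lora_deepseekcoder.py, trains LoRA adapters on top of the base model. A rough sketch of that kind of setup with peft (the rank, alpha, and target modules are assumptions, not the repo's actual hyperparameters):

import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model_path = "{your_path}"  # base DeepSeek-Coder-6.7B-Instruct checkpoint
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16)

# Hypothetical LoRA hyperparameters; the repo's script defines its own.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

Launch the full fine-tuning run with DeepSpeed: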

DATA_PATH="{your_path}"
OUTPUT_PATH="{your_path}"
MODEL_PATH="{your_path}"

#wandb login
export CUDA_VISIBLE_DEVICES=1,2



deepspeed --num_gpus 2 --master_port 6002 finetune/lora_deepseekcoder.py \
    --model_name_or_path $MODEL_PATH \
    --data_path $DATA_PATH \
    --output_dir $OUTPUT_PATH \
    --num_train_epochs 2 \
    --model_max_length 2048 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 4 \
    --evaluation_strategy "no" \
    --save_strategy "epoch" \
    --save_steps 100 \
    --save_total_limit 100 \
    --learning_rate 1e-5 \
    --warmup_steps 10 \
    --logging_steps 1 \
    --lr_scheduler_type "cosine" \
    --gradient_checkpointing True \
    --report_to "wandb" \
    --deepspeed finetune/configs/ds_config_zero3.json \
    --bf16 True
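
The command points to finetune/configs/ds_config_zero3.json. If you need to write your own, a minimal ZeRO stage-3 config along these lines usually works with the Hugging Face Trainer integration (the values below are generic assumptions, not the repo's shipped config):

import json

# Hypothetical minimal ZeRO-3 config; the repo ships its own under
# finetune/configs/ds_config_zero3.json.
ds_config = {
    "bf16": {"enabled": "auto"},
    "zero_optimization": {
        "stage": 3,
        "overlap_comm": True,
        "contiguous_gradients": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
    "train_batch_size": "auto",
    "gradient_clipping": "auto",
}

with open("ds_config_zero3.json", "w") as f:
    json.dump(ds_config, f, indent=2)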

Inference

The inference script is in the inference directory.

python inference.py \
--data_path {your_path}/FortranCoder/FortranEval/FortranEval_base_function.jsonl \
--model_path {your_path}/DeepSeek-Coder/model/deepseek-coder-6.7b-instruct \
--lora_path {your_path}/FortranCoder/lora_model/train_Evol_Code/model/deepseek-coder-6.7b-instruct \
--save_path {your_path}/FortranCoder/inference/1.jsonl \
--data_type fortran   #fortran or HumanEval


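# Alternatively, inference_gpt.py runs the same evaluation with a GPT-series model (e.g. gpt-3.5-turbo):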
# python inference_gpt.py \
# --data_path {your_path}/FortranCoder/FortranEval/FortranEval_base_function.jsonl \
# --save_path {your_path}/FortranCoder/inference/1.jsonl \
# --data_type fortran   #fortran or HumanEval
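
Under the hood, inference with a LoRA checkpoint amounts to loading the base model and attaching the adapter; a hedged sketch with peft (paths follow the placeholders above; the repo's inference.py may differ in detail):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "{your_path}/DeepSeek-Coder/model/deepseek-coder-6.7b-instruct"
lora_path = "{your_path}/FortranCoder/lora_model/train_Evol_Code/model/deepseek-coder-6.7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(base_path, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    base_path, torch_dtype=torch.bfloat16, trust_remote_code=True
).cuda()

# Attach the LoRA adapter produced by fine-tuning; merge_and_unload() folds
# the adapter weights into the base model for faster generation (optional).
model = PeftModel.from_pretrained(base, lora_path)
model = model.merge_and_unload()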

🚤Quick Start

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "{your_path}"  # path to the FortranCoder-DS-6.7B checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()

messages=[
    { 'role': 'user', 'content': "write a quick sort algorithm in Fortran."}
]

inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
# tokenizer.eos_token_id is the id of <|EOT|> token
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, top_k=50, top_p=0.95, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))
