
Evaluation doesn't work on Windows #45

Open

peter-ch opened this issue Jun 19, 2024 · 4 comments

@peter-ch

After getting a score of 0 every time, I looked at the samples.jsonl_results.jsonl file, and the result for every sample is: "failed: module 'signal' has no attribute 'setitimer'"

This seems like a Windows/Unix issue.
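
For context, signal.setitimer and signal.SIGALRM only exist on Unix, and the evaluator's timeout handling relies on both. A quick check (hypothetical snippet, not part of human-eval) makes the difference visible:

import signal

# On Windows both of these print False; on Linux/macOS they print True.
# human_eval's time_limit() needs both, hence the AttributeError above.
print(hasattr(signal, "setitimer"))
print(hasattr(signal, "SIGALRM"))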

@Ephrem-Adugna

Same issue here

@mfwong1223

For Windows, I replaced the signal module with the threading module, changing

@contextlib.contextmanager
def time_limit(seconds: float):
    def signal_handler(signum, frame):
        raise TimeoutException("Timed out!")
    signal.setitimer(signal.ITIMER_REAL, seconds)
    signal.signal(signal.SIGALRM, signal_handler)
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)
to

import threading
@contextlib.contextmanager
def time_limit(seconds: float):
    def signal_handler():
        raise TimeoutException("Timed out!")
    timer = threading.Timer(seconds, signal_handler)
    timer.start()
    try:
        yield
    finally:
        timer.cancel()
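
As a rough usage sketch (TimeoutException is the exception class already defined in human_eval/execution.py; run_candidate() is a hypothetical stand-in for the exec() call the evaluator makes):

try:
    with time_limit(3.0):
        run_candidate()  # hypothetical stand-in for executing the completion
except TimeoutException:
    print("Timed out!")

One caveat: threading.Timer runs the handler in its own thread, so the TimeoutException is raised there rather than in the code under test. This avoids the Windows error, but it relaxes the hard timeout rather than enforcing it.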

@Ephrem-Adugna

The above didn't work for me; I ended up just running it inside a Linux VM using WSL.

@CynicalWilson

Same issue here. For every LLM I load in LM Studio and test against HumanEval via the script below, I get 0/0, with every sample failing with the same signal error.

HumanEval.py:

import os
import json
from human_eval.data import write_jsonl, read_problems
from human_eval.evaluation import evaluate_functional_correctness
from local_llm_client import client

def generate_one_completion(prompt):
    messages = [{"role": "user", "content": prompt}]
    response = client.chat_completion_create(messages)
    return response['choices'][0]['message']['content']

def generate_completions(problems, output_file):
    samples = []
    for task_id, problem in problems.items():
        prompt = problem["prompt"]
        completion = generate_one_completion(prompt)
        samples.append({"task_id": task_id, "completion": completion})
    
    write_jsonl(output_file, samples)

if __name__ == "__main__":
    problems = read_problems()
    output_file = "completions.jsonl"
    
    generate_completions(problems, output_file)
    
    results = evaluate_functional_correctness(output_file)
    print(json.dumps(results, indent=2))

local_llm_client.py:

import requests
import json

class LocalLLMClient:
    def __init__(self, base_url="http://localhost:4445"):
        self.base_url = base_url

    def chat_completion_create(self, messages, temperature=0.7, max_tokens=-1, stream=False):
        url = f"{self.base_url}/v1/chat/completions"
        headers = {"Content-Type": "application/json"}
        data = {
            "model": "nxcode-cq-7b-orpo-q8_0",  # Adjust this to match your model name
            "messages": messages,
            "temperature": temperature,
            "max_tokens": max_tokens,
            "stream": stream
        }

        response = requests.post(url, headers=headers, json=data)
        response.raise_for_status()
        return response.json()

client = LocalLLMClient()
