Automatically generate draft answers for student questions #5331
With the increasing capabilities of LLMs, it is only a matter of time before they become powerful/cheap enough to use them inside Dodona. A first step might be to generate draft answers for questions from students. Here's how it might function: when a student asks a question, an LLM generates a draft answer, which a human reviewer (the teacher or teaching assistant) edits and approves before it is sent to the student.

This approach minimizes risk since each AI-generated answer undergoes human review and editing. Moreover, it's not time-sensitive: if the AI draft is inadequate or fails, the situation remains as it is today. The potential time savings, however, could be substantial.

Since this would be our first LLM integration, it will involve some research aspects.
Labels: low priority (things we want to see implemented at some point)
bmesuere added the medium priority label (things we want to see implemented relatively soon) on Jan 30, 2024
Some old code I wrote to generate answers based on questions as a stand-alone script:

```js
import OpenAI from "openai";
import { JSDOM } from "jsdom";

// Dodona API token (left blank here)
const dodonaHeaders = new Headers({
  "Authorization": ""
});

// OpenAI API key (left blank here)
const openai = new OpenAI({
  apiKey: ""
});

const systemPrompt = "Your goal is to help a teaching assistant answer student questions for a university-level programming course. You will be provided with the problem description, the code of the student, and the question of the student. Your answer should consist of 2 parts. First, very briefly summarize what the student did wrong to the teaching assistant. Second, provide a short response to the question aimed at the student in the same language as the student's question.";

const questionId = 148513;

// Collect the problem description, student code, and question for a given question id.
async function fetchData(questionId) {
  // fetch question data from https://dodona.be/nl/annotations/<ID>.json
  let r = await fetch(`https://dodona.be/nl/annotations/${questionId}.json`, { headers: dodonaHeaders });
  const questionData = await r.json();
  const lineNr = questionData.line_nr;
  const question = questionData.annotation_text;
  const submissionUrl = questionData.submission_url;

  // fetch submission data
  r = await fetch(submissionUrl, { headers: dodonaHeaders });
  const submissionData = await r.json();
  const code = submissionData.code;
  const exerciseUrl = submissionData.exercise;

  // fetch exercise data
  r = await fetch(exerciseUrl, { headers: dodonaHeaders });
  const exerciseData = await r.json();
  const descriptionUrl = exerciseData.description_url;

  // fetch description
  r = await fetch(descriptionUrl, { headers: dodonaHeaders });
  const descriptionHtml = await r.text();
  const description = htmlToText(descriptionHtml);

  return { description, code, question, lineNr };
}

// Send the collected context to the chat completions API and return the draft answer.
async function generateAnswer({ description, code, question, lineNr }) {
  const response = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: `Description: ${description}\nCode: ${code}\nQuestion on line ${lineNr}: ${question}` }
    ]
  });
  console.log(response.choices[0].message);
  return response.choices[0].message.content;
}

// Strip the exercise description down to plain text, dropping Dodona page scripts.
function htmlToText(html) {
  const dom = new JSDOM(html);
  const text = dom.window.document.body.textContent
    .split("\n")
    .map(l => l.trim())
    .filter(line => !line.includes("I18n"))
    .filter(line => !line.includes("dodona.ready"))
    .join("\n");
  return removeTextAfterSubstring(text, "Links").trim();
}

function removeTextAfterSubstring(str, substring) {
  const index = str.indexOf(substring);
  if (index === -1) {
    return str; // substring not found
  }
  return str.substring(0, index);
}

const data = await fetchData(questionId);
console.log(data);
await generateAnswer(data);
```
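For anyone who wants to rerun this, the script expects the two credentials to be filled in at the top. A small, hypothetical variant that reads them from the environment instead (assuming Node 18+, which also provides the global `fetch` used above; the variable names are illustrative):

```js
// Hypothetical variant: pull credentials from the environment instead of hard-coding them.
// DODONA_TOKEN and OPENAI_API_KEY are illustrative variable names.
const dodonaHeaders = new Headers({
  "Authorization": process.env.DODONA_TOKEN ?? ""
});

// The openai client reads OPENAI_API_KEY from the environment by default,
// so the explicit apiKey option can be dropped entirely.
const openai = new OpenAI();
```

It could then be run with something like `DODONA_TOKEN=... OPENAI_API_KEY=... node draft-answer.mjs` (the file name is illustrative).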
I tested the runtime performance of a few models on my Mac Studio (64 GB memory). I could not validate the output of codellama-70b since it seems to use a different prompt format.
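For reference, a timing run like this can be scripted. Below is a minimal sketch of how one could measure generation time for a locally served model, assuming it is exposed through Ollama's REST API; the model name, prompt, and `systemPrompt` reference are placeholders, not the exact setup used above:

```js
// Minimal sketch: time a local model served by Ollama (default port 11434).
// Model name and user prompt are placeholders.
const model = "codellama:13b";
const start = Date.now();

const response = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model,
    stream: false, // wait for the full answer so the whole generation is timed
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: "..." } // same description/code/question prompt as above
    ]
  })
});

const result = await response.json();
console.log(`${model}: ${(Date.now() - start) / 1000}s`);
console.log(result.message.content);
```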
I played around with the various models this afternoon and noted some early observations.
bmesuere added the low priority label (things we want to see implemented at some point) and removed the medium priority label (things we want to see implemented relatively soon) on Feb 18, 2024