Dialogue Knowledge Tracing

This repository contains the code for the paper "Exploring Knowledge Tracing in Tutor-Student Dialogues using LLMs". The primary contributions are code for 1) our language model-based LLMKT and DKT-Sem models, 2) running DKT-family and BKT models on dialogue knowledge tracing, and 3) automatically annotating dialogues with knowledge component and correctness labels using the OpenAI API.

If you use our code or find this work useful in your research, please cite us:

@inproceedings{scarlatos2024exploringknowledgetracingtutorstudent,
      title={Exploring Knowledge Tracing in Tutor-Student Dialogues using LLMs},
      author={Alexander Scarlatos and Ryan S. Baker and Andrew Lan},
      year={2025},
      booktitle={Proceedings of the 15th Learning Analytics and Knowledge Conference, {LAK} 2025, Dublin, Ireland, March 3-7, 2025},
      publisher={{ACM}},
}

Annotated Data

Annotated versions of the CoMTA and MathDial datasets, which include per-turn knowledge component and correctness labels, are available in data/annotated and can be loaded as-is for knowledge tracing training.

These versions of the datasets are subject to their original licenses. The license for CoMTA is available in data/annotated/COMTA_LICENSE.txt, and MathDial is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.

Setup

Download Data

This step is not necessary to reproduce our knowledge tracing results since we release the annotated data in data/annotated. However, you can follow the steps below to replicate our workflow or to experiment with custom data annotation.

Achieve the Core (ATC): Download the ATC HuggingFace dataset and put standards.jsonl and domain_groups.json under data/src/ATC/. At the time of releasing this code, the data was not accessible via HuggingFace due to a bug. If the data is still not accessible, you can contact us or the authors of the paper to request a copy.

CoMTA: Download the CoMTA data file and put it under data/src.

MathDial: Clone the MathDial repo and put the root under data/src.

Environment

We used Python 3.10.12 in the development of this work. Run the following to set up a Python environment:

python -m venv dk
source dk/bin/activate
pip install -r requirements.txt

Also add the following to your environment:

export OPENAI_API_KEY=<your key here> # For automated annotation via OpenAI
export CUBLAS_WORKSPACE_CONFIG=:4096:8 # For enabling deterministic operations
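
Before starting a long run, it can help to confirm that both variables are actually visible to Python. The following is a minimal sketch using only the standard library; the helper name `missing_env_vars` is hypothetical and is not part of this repository.

```python
import os

# Variable names taken from the export lines above.
REQUIRED_VARS = ("OPENAI_API_KEY", "CUBLAS_WORKSPACE_CONFIG")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("Environment looks ready.")
```

Since OPENAI_API_KEY is only used for automated annotation, a missing key should only matter if you plan to run the annotation step.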

Prepare Dialogues for KT (Run Annotation with OpenAI)

This step is not necessary to reproduce our results because we release the annotated datasets, but it is documented here for reference.

Dialogue KT requires each dialogue turn to be annotated with correctness and knowledge component (KC) labels. We automate this process with LLM prompting via the OpenAI API. You can run the following to tag correctness and ATC standard KCs on the two datasets:

python main.py annotate --mode collect --openai_model gpt-4o --dataset comta
python main.py annotate --mode collect --openai_model gpt-4o --dataset mathdial

To see statistics on the resulting labels, run:

python main.py annotate --mode analyze --dataset comta
python main.py annotate --mode analyze --dataset mathdial

Train and Evaluate KT Methods

Each of the following commands runs train/test cross-validation on the CoMTA data for a different model:

python main.py train --dataset comta --crossval --model_type lmkt --model_name lmkt_comta         # LLMKT
python main.py train --dataset comta --crossval --model_type dkt-sem --model_name dkt-sem_comta   # DKT-Sem
python main.py train --dataset comta --crossval --model_type dkt --model_name dkt_comta           # DKT
python main.py train --dataset comta --crossval --model_type dkvmn --model_name dkvmn_comta       # DKVMN
python main.py train --dataset comta --crossval --model_type akt --model_name akt_comta           # AKT
python main.py train --dataset comta --crossval --model_type saint --model_name saint_comta       # SAINT
python main.py train --dataset comta --crossval --model_type simplekt --model_name simplekt_comta # simpleKT
python main.py train --dataset comta --crossval --model_type bkt                                  # BKT

Check the results folder for metric summaries and turn-level predictions for analysis.
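
The eight commands above differ only in the model type and model name, so they can be generated in a loop. The sketch below is a convenience wrapper under stated assumptions, not part of the repository: `build_train_command` is a hypothetical helper, and passing a dataset other than comta assumes the train command accepts the same dataset names as the annotate command.

```python
import shlex

# Model types copied from the commands above.
MODEL_TYPES = ["lmkt", "dkt-sem", "dkt", "dkvmn", "akt", "saint", "simplekt", "bkt"]


def build_train_command(model_type, dataset="comta"):
    """Build the argument list for one cross-validation run."""
    cmd = ["python", "main.py", "train", "--dataset", dataset,
           "--crossval", "--model_type", model_type]
    # The BKT command above takes no --model_name, so none is added for it.
    if model_type != "bkt":
        cmd += ["--model_name", f"{model_type}_{dataset}"]
    return cmd


if __name__ == "__main__":
    # Print the commands; nothing is executed here.
    for model_type in MODEL_TYPES:
        print(shlex.join(build_train_command(model_type)))
```

To execute a command rather than print it, each list can be passed to `subprocess.run(cmd, check=True)`.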

To see all training options, run:

python main.py train --help

Hyperparameter Sweep

We run a grid search to find the optimal hyperparameters for the DKT-family models. For example, to run a search for DKT on CoMTA, run the following (--crossval is implied and --model_name is set automatically):

python main.py train --dataset comta --hyperparam_sweep --model_type dkt

The output will indicate the model that achieved the highest validation AUC. To get its performance on the test folds, run:

python main.py test --dataset comta --crossval --model_type dkt --model_name <copy from output> --emb_size <get from model_name>
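
Conceptually, a grid search of this kind evaluates every combination of candidate settings and keeps the one with the highest validation AUC. The sketch below illustrates that selection logic only. The grid values are assumptions (they are simply the values that appear in the best-hyperparameter lists in this README), and `evaluate` is a hypothetical callback; the actual grid and training loop are defined in the repository's code.

```python
from itertools import product

# Assumed grid values for illustration; the repository defines its own grid.
LEARNING_RATES = [1e-4, 2e-4, 5e-4, 1e-3, 2e-3, 5e-3]
EMB_SIZES = [16, 32, 64, 128, 256, 512]


def grid_search(evaluate, lrs=LEARNING_RATES, emb_sizes=EMB_SIZES):
    """Return (best_config, best_auc) over every (lr, emb_size) pair.

    `evaluate` is a hypothetical callback that trains one model with the
    given settings and returns its validation AUC.
    """
    best_config, best_auc = None, float("-inf")
    for lr, emb_size in product(lrs, emb_sizes):
        auc = evaluate(lr=lr, emb_size=emb_size)
        if auc > best_auc:
            best_config, best_auc = {"lr": lr, "emb_size": emb_size}, auc
    return best_config, best_auc
```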

Best Hyperparameters Found

CoMTA:

DKT-Sem: lr=2e-4, emb_size=256
DKT: lr=1e-3, emb_size=32
DKVMN: lr=1e-4, emb_size=16
AKT: lr=5e-3, emb_size=32
SAINT: lr=1e-3, emb_size=32
simpleKT: lr=2e-4, emb_size=16

MathDial:

DKT-Sem: lr=2e-3, emb_size=512
DKT: lr=5e-3, emb_size=256
DKVMN: lr=1e-3, emb_size=128
AKT: lr=2e-4, emb_size=64
SAINT: lr=2e-4, emb_size=64
simpleKT: lr=5e-4, emb_size=256
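
For scripted experiments, the values above can be kept in a small lookup table. This is a sketch that only transcribes the lists above; the keys reuse the --model_type names from the training commands, and `best_hyperparams` is a hypothetical helper, not part of the repository.

```python
# Values transcribed from the lists above: dataset -> model_type -> settings.
BEST_HYPERPARAMS = {
    "comta": {
        "dkt-sem": {"lr": 2e-4, "emb_size": 256},
        "dkt": {"lr": 1e-3, "emb_size": 32},
        "dkvmn": {"lr": 1e-4, "emb_size": 16},
        "akt": {"lr": 5e-3, "emb_size": 32},
        "saint": {"lr": 1e-3, "emb_size": 32},
        "simplekt": {"lr": 2e-4, "emb_size": 16},
    },
    "mathdial": {
        "dkt-sem": {"lr": 2e-3, "emb_size": 512},
        "dkt": {"lr": 5e-3, "emb_size": 256},
        "dkvmn": {"lr": 1e-3, "emb_size": 128},
        "akt": {"lr": 2e-4, "emb_size": 64},
        "saint": {"lr": 2e-4, "emb_size": 64},
        "simplekt": {"lr": 5e-4, "emb_size": 256},
    },
}


def best_hyperparams(dataset, model_type):
    """Return a copy of the best settings for one dataset and model type."""
    return dict(BEST_HYPERPARAMS[dataset][model_type])
```

The --emb_size flag appears in the test command earlier in this README; the flag for the learning rate is not shown here, so check `python main.py train --help` before passing these values on the command line.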

Visualize Learning Curves

To generate the learning curve graphs, run the following (they will be placed in results):

python main.py visualize --dataset comta --model_name <trained model to visualize predictions for>
