Supporting code for *Explainable Knowledge Graph Embedding: Inference Reconciliation for Knowledge Inferences Supporting Robot Actions*.
- This repo has been tested on a system running Ubuntu 18.04 LTS, PyTorch 1.2.0, and either a CPU or an Nvidia GPU (GeForce GTX 1060 6GB or better).
- For GPU functionality, Nvidia drivers, CUDA, and cuDNN are required.
- Clone this repo using:

  ```
  git clone git@github.com:adaruna3/explainable-kge.git
  ```
- To install using conda, run:

  ```
  conda env create -f xkge_env.yml
  ```

  in the repo root and activate the environment via `conda activate xkge_env`.
- Create a file:

  ```
  vim ~/[PATH_TO_ANACONDA]/envs/xkge_env/lib/python3.6/site-packages/xkge.pth
  ```

  containing a single line with the absolute path to this repo. This file lets conda find the `explainable_kge` module when doing imports (Python's `site` module adds any paths listed in `.pth` files to `sys.path`).
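  For example, assuming Anaconda is installed at `~/anaconda3` and this repo was cloned to `~/explainable-kge` (both placeholder paths; substitute your own), the file can be created in one line:

  ```
  # .pth files in site-packages are appended to sys.path at interpreter startup
  echo "$HOME/explainable-kge" > ~/anaconda3/envs/xkge_env/lib/python3.6/site-packages/xkge.pth
  ```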
- Get submodules:

  ```
  git submodule update --init
  ```
- Modify the `cd` path in `explainable_kge/run_pra.sh` to the absolute path of the `pra` submodule on your PC.
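  For example, if the repo was cloned to `/home/<user>/explainable-kge` (a placeholder path; the exact surrounding lines in `run_pra.sh` will differ), the edited `cd` line should point at the submodule:

  ```
  cd /home/<user>/explainable-kge/pra
  ```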
- The `pra` submodule requires sbt. Make sure to install it and run `sbt test` inside the `pra` submodule.
- After activating the conda environment, run `python`. Python version 3.6 should start. Next, check that `import torch` works. Then, for GPU usage, check that `torch.cuda.is_available()` is `True`.
- If all these checks pass, the installation should be working.
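- The same checks can be run non-interactively from the shell, e.g.:

  ```
  # Should print a 3.6.x Python version, the torch version, and True (GPU) or False (CPU-only)
  python -c "import sys, torch; print(sys.version); print(torch.__version__); print(torch.cuda.is_available())"
  ```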
- Knowledge Graph Embedding models: TuckER implemented here
- Interpretable Graph Feature models: XKE implemented here, and our approach XKGE
- Sub-graph Feature Extraction (SFE): SFE implemented here
- Datasets: VH+_CLEAN_RAN and VH+_CORR_RAN, both of which are used in the paper.
- Evaluation of Interpretable Graph Feature Model:
  - Remove logs in `./explainable_kge/logger/logs` and model checkpoints in `./explainable_kge/models/checkpoints` from any previously run experiments.
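    A minimal cleanup sketch (the glob paths are an assumption; double-check before deleting):

    ```
    rm -rf ./explainable_kge/logger/logs/* ./explainable_kge/models/checkpoints/*
    ```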
  - Run alg_eval.sh using:

    ```
    ./explainable_kge/experiments/scripts/alg_eval.sh
    ```
  - If this script runs correctly, it will produce 5 output folders in `./explainable_kge/logger/logs`, one for each fold of cross-validation: `VH+_CLEAN_RAN_tucker_[X]`, where [X] is the fold number, 0 to 4. Additionally, inside `./explainable_kge/logger/logs/VH+_CLEAN_RAN_tucker_0/results/` there should be 3 PDF files, each containing a set of results corresponding to the last 4 rows of the table (the last row and the 4th-from-last row are repeats, since our model without locality or decision trees is XKE).
  - Some of the folds may error out due to internal configurations of the SFE code, resulting in missing PDF result files. This shows up in the Python stack trace as one of the runs involving `explainable_setting.py` in alg_eval.sh failing. It can be fixed by removing the entire `VH+_CLEAN_RAN_tucker_[X]` directory and re-running the commands in `./explainable_kge/experiments/scripts/alg_eval.sh` to regenerate the directory, followed by the corresponding plotting command to generate the PDF. Please see alg_eval.sh.
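    For example, if fold 2 failed (a hypothetical fold number):

    ```
    # Remove the partial output for the failed fold...
    rm -rf ./explainable_kge/logger/logs/VH+_CLEAN_RAN_tucker_2
    # ...then re-run that fold's commands from alg_eval.sh, followed by the
    # corresponding plotting command (see alg_eval.sh for the exact invocations)
    ```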
  - The locality parameter in alg_eval.sh for our approach was selected by checking a range of localities and plotting the performance. Those results can be generated with locality_dr.sh using:

    ```
    ./explainable_kge/experiments/scripts/locality_dr.sh
    ```

    The output PDF will again be in `./explainable_kge/logger/logs/VH+_CLEAN_RAN_tucker_0/`.
- Evaluation of Explanation Preferences:
  - Remove logs in `./explainable_kge/logger/logs` and model checkpoints in `./explainable_kge/models/checkpoints` from any previously run experiments.
  - Run user_pref_gen_exp.sh using:

    ```
    ./explainable_kge/experiments/scripts/user_pref_gen_exp.sh
    ```
  - If this script runs correctly, it will produce 5 output folders in `./explainable_kge/logger/logs`, one for each fold of cross-validation: `VH+_CLEAN_RAN_tucker_[X]`, where [X] is the fold number, 0 to 4. Additionally, inside `./explainable_kge/logger/logs/VH+_CLEAN_RAN_tucker_0/results/` there will be a file used for the AMT user study containing all robot interaction scenarios and explanations, `user_preferences_amt_explanations.json`. We selected 15 unique instances from this file so that each instance provides an explanation about a unique triple (there are multiple grounded explanations repeated per triple in `user_preferences_amt_explanations.json`, so we randomly selected the 15 unique instances).
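    One possible way to make this selection programmatically is sketched below; the entry key `triple`, the flat-list schema, and the file names are assumptions, so adapt them to the actual contents of the JSON:

    ```
    # Keep one randomly chosen grounded explanation for each of 15 random triples
    python - <<'PY'
    import json, random
    with open("user_preferences_amt_explanations.json") as f:
        entries = json.load(f)
    by_triple = {}
    for e in entries:
        by_triple.setdefault(str(e["triple"]), []).append(e)  # group duplicates by triple
    picked = [random.choice(group) for group in random.sample(list(by_triple.values()), 15)]
    with open("user_preferences_amt_explanations_15.json", "w") as f:
        json.dump(picked, f, indent=2)
    PY
    ```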
  - After editing `user_preferences_amt_explanations.json` to get only 15 interactions, copy the file and post to AMT using the `xkge_amt` submodule. See the README in `xkge_amt`.
  - Run the user study using AMT, or skip that step using our provided user results:
    - Our results from the user study are provided in `./amt_data/user_pref_study/` with one json file per user.
    - Analyze the results from our user study by running:

      ```
      python ./explainable_kge/logger/viz_utils.py --config_file ./explainable_kge/experiments/configs/std_tucker_dt_vh+_clean_ran.yaml --options "{'plotting': {'mode': 'amt1', 'output_pdf': 'user_preferences'}}"
      ```

      This command will produce an output PDF at `./explainable_kge/logger/logs/VH+_CLEAN_RAN_tucker_0/results/user_preferences.pdf` that matches the results reported in the paper. Note that to match our results, you need to copy the `user_preferences_amt_explanations.json` in the `xkge_amt` repo to `./explainable_kge/logger/logs/VH+_CLEAN_RAN_tucker_0/results/`.
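      For instance, assuming the edited file sits at the top level of the `xkge_amt` submodule (adjust the source path if it lives elsewhere):

      ```
      cp ./xkge_amt/user_preferences_amt_explanations.json ./explainable_kge/logger/logs/VH+_CLEAN_RAN_tucker_0/results/
      ```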
- Validation of Explanations for Downstream Tasks:
  - Remove logs in `./explainable_kge/logger/logs` and model checkpoints in `./explainable_kge/models/checkpoints` from any previously run experiments.
  - Run user_feedback_gen_exp.sh using:

    ```
    ./explainable_kge/experiments/scripts/user_feedback_gen_exp.sh
    ```
  - If this script runs correctly, it will produce 1 output folder, `./explainable_kge/logger/logs/VH+_CLEAN_RAN_tucker_0/`. Additionally, inside `./explainable_kge/logger/logs/VH+_CLEAN_RAN_tucker_0/results/` there will be a file used for the AMT user study containing all explanations, `explanations_decision_tree_local3_best_clean_filtered.json`. Copy the file and post to AMT using the `xkge_amt` submodule. See the README in `xkge_amt`.
  - Run the user study using AMT, or skip that step using our provided user results. All results from the AMT study should be copy-pasted into `./amt_data/user_feedback_study/`.
    - Our results from the user study are provided in `./amt_data/user_feedback_study/` with one json file per user.
  - Given the AMT results, we now need to plot and analyze the user responses to get the majority-vote average accuracy and confirm that the response distribution meets our statistical assumptions.
    - Run:

      ```
      python explainable_kge/logger/viz_utils.py --config_file ./explainable_kge/experiments/configs/std_tucker_dt_vh+_corr_ran.yaml --options "{'plotting': {'mode': 'amt2', 'output_pdf': 'initial_rq2_plots'}}"
      ```

      to plot the AMT user responses.
    - Assuming the above command runs correctly, a PDF will be generated at `./explainable_kge/logger/logs/VH+_CLEAN_RAN_tucker_0/results/amt_initial_rq2_plots.pdf`.
    - Examine the PDF to find the mean majority-vote accuracy. The value will be labeled `Majority Mean` within a large table containing `Practice` and `Test` columns, within roughly the last 10 pages of the PDF. This mean value is critical to find, as it serves as the input to all the following steps.
  - Now that we have the majority-vote average accuracy (i.e. `Majority Mean`), modify the `noise_reduction_rate` and `correction_rate` in the first line of `./explainable_kge/experiments/scripts/user_feedback_eval.sh` to the observed value (i.e. `{'feedback': {'noise_reduction_rate': 0.XXX, 'correction_rate': 0.XXX}}`). If you are using our data, you can leave it as is.
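    For example, if the observed `Majority Mean` were 0.853 (a hypothetical value), both rates could be patched in place as below; this assumes the rates appear as numeric literals on that line, so verify the result before running:

    ```
    sed -i -e "s/'noise_reduction_rate': [0-9.]*/'noise_reduction_rate': 0.853/" \
           -e "s/'correction_rate': [0-9.]*/'correction_rate': 0.853/" \
           ./explainable_kge/experiments/scripts/user_feedback_eval.sh
    ```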
    - To generate the results from the paper, run user_feedback_eval.sh using:

      ```
      ./explainable_kge/experiments/scripts/user_feedback_eval.sh
      ```

      This script will first generate the corrected dataset, then train/test a new embedding using the updated dataset. Additionally, it will train/test an embedding with no corrections. The test scores of both embeddings should closely match the reported scores within the paper.
- The simulation experiments are not yet included in this repo. Please see rail_tasksim for the simulator.
- Bring the repo closer to completion; expect more incremental changes.
- Add the simulation experiment as part of the demo code.