We are updating the code a lot; it would be great to have even just really simple tests to run to ensure nothing breaks. Let's brainstorm some ideas and what to apply them to: podcast, 247, glove, just one conversation? How do we evaluate success?
All of this should be automated in a Makefile target.

What are we testing:
- generate glove and gpt2 embeddings
  - one subject, 625
  - two big conversations
  - result: base and embedding pickles
- encoding:
  - run encoding for 5 good electrodes with glove/gpt2 and plot
  - result: average encoding

Create a standard for what we will compare against in the future, but first we need to ensure the current code replicates previous results for all electrodes and conversations.
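One way to pin down that "standard" is a tolerance-based comparison of a fresh run against a stored baseline pickle. A minimal sketch of the idea; the function name, file path, and tolerance values here are assumptions, not existing repo code:

```python
import pickle
import numpy as np

def matches_baseline(new, baseline, rtol=1e-4, atol=1e-8):
    """Return True if a new result array replicates the stored baseline
    within floating-point tolerance (exact equality is too strict across
    library versions and hardware)."""
    new, baseline = np.asarray(new), np.asarray(baseline)
    if new.shape != baseline.shape:
        return False
    return bool(np.allclose(new, baseline, rtol=rtol, atol=atol))

# In practice the baseline would come from a frozen results folder, e.g.:
#   baseline = pickle.load(open("results-baseline/encoding.pkl", "rb"))
# (hypothetical path). Here we use synthetic data to show the behavior:
baseline = np.linspace(0.0, 0.5, 161)          # fake average-encoding curve
assert matches_baseline(baseline + 1e-9, baseline)      # tiny drift: OK
assert not matches_baseline(baseline + 0.1, baseline)   # real change: flagged
```

The tolerance matters: pinning to exact equality will fail on harmless numerical noise, while too loose a tolerance will miss regressions in the encoding.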
A hacky way to manage results folders:

```shell
mv results results-old
mkdir results
# DO TEST
mv results results-test
mv results-old results
```
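The swap above can be wrapped in a small script with a guard so an existing backup is never clobbered. A sketch; the DO TEST step stays a placeholder for whatever the real test target ends up being, and the demo setup lines only exist so the snippet runs standalone:

```shell
#!/bin/sh
set -e

# Demo setup so this sketch runs standalone; in the repo, results/ already exists.
mkdir -p results
touch results/encoding.pkl

# Refuse to overwrite a previous backup from an interrupted run.
if [ -e results-old ]; then
    echo "results-old already exists; aborting" >&2
    exit 1
fi

mv results results-old
mkdir results

# DO TEST: placeholder for the real test run, which writes into results/
touch results/encoding.pkl

mv results results-test
mv results-old results
```

If the test step fails midway, `set -e` stops the script with `results-old` still intact, so nothing is lost; the original `results/` just needs one `mv` to restore.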