Matbench task model accuracy for xtal2png representation #50
Comments
The following tutorial uses the grayscale MNIST dataset for classification and might be one of the easiest to adapt to |
@faris-k did a classification task on |
It does leave a question on my mind: why are the regression results so poor (much worse than dummy), whereas the classification results are OK (a bit better than dummy)? A follow-up computational experiment (that I think we should leave on the back burner until further notice) is to use the classification model, but with bins for the classes (e.g. formation energy between 0 and 0.05). Implementing ordinal classification would be extra work, so first treat it as categorical. I'm putting this here more for future reference as things progress with

It's also interesting in the sense that hyperparameter-tuned XGBoost did a pretty good job on the regression task (~4x better than the CNN regression), and this was with much less information. We'll see if the results still hold when we double-check that data leakage wasn't coming into play. #51 and specifically #51 (comment) |
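For reference, the binning idea above (treating formation-energy ranges as categorical classes) could be sketched roughly as follows. This is only a minimal illustration using the standard library: the function name `to_class_labels` and the bin edges are hypothetical choices, not something from the xtal2png codebase, and the edges would need tuning to the actual formation-energy distribution.

```python
from bisect import bisect_right


def to_class_labels(values, edges):
    """Map each continuous target to the index of the bin it falls in.

    With sorted edges [e0, e1, ..., ek], class 0 is values < e0,
    class i is e(i-1) <= value < e(i), and class k+1 is values >= ek.
    """
    return [bisect_right(edges, v) for v in values]


# Hypothetical bin edges (same units as the regression target);
# the 0 to 0.05 bin mirrors the example in the comment above.
edges = [0.0, 0.05, 0.1]

# Placeholder formation energies, not real data.
e_form = [-0.3, 0.02, 0.07, 0.5]

labels = to_class_labels(e_form, edges)
print(labels)
```

The resulting integer labels can be passed to a categorical classifier as-is. Note that this throws away the ordering between bins, which is the trade-off mentioned above versus ordinal classification.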
The task is to use a CNN model for a Matbench submission on regressing formation energy using the xtal2png representation (as an image and/or as an array would be fine). This will help with knowing how "good" the xtal2png representation is from a model accuracy perspective, though I don't expect this to set new benchmarks necessarily.
This might look like using `skorch` with some type of PyTorch CNN module (e.g. ResNetUNet, Net) and an MSE loss function. This skorch tutorial looks like it might help with loading images, though this SO answer is probably better for making the actual dataset to pass to `skorch`.

If regression is too much of a pain (CNNs aren't used as often for property regression in the image-processing domain), an easy fallback is to do the `mp_is_metal` binary classification task instead of the `e_form` regression task.

Related:
Maybe Faris is interested in working on this, given that he'll be doing some image processing.