The loss becomes nan. #80
Comments
Hey @a1840436478, it looks like you are overfitting to your training data. Edoardo
Yes, I'm using the DFDC dataset. First I run "index_dfdc.py" to generate a pkl file, then "extract_faces.py" to extract the face images, and finally "train_binclass.py", with the parameters taken from the train_all.sh file. Because I use Windows, I run everything from PyCharm. Every time training reaches "validation_routine", the loss value increases rapidly. Here is a record of the loss values I printed: [119.45784385909792 -> 19070101454848.348 -> 5.573101113846284e+24 -> nan]. One more thing: when I remove "net.eval()", the loss value no longer looks abnormal, but that doesn't solve the problem, and it is really bothering me.
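For anyone debugging something similar, here is a minimal sketch of a check that flags the first step where a printed loss stops being trustworthy. It is not part of this repository: the helper name first_unstable_step and the growth_limit threshold are assumptions made for illustration, and the input is just the sequence of values quoted above.

```python
import math

def first_unstable_step(losses, growth_limit=1e3):
    """Return the index of the first loss that is non-finite (nan/inf) or
    more than `growth_limit` times larger than the previous value.
    Returns None if no such step exists.
    Hypothetical helper; the threshold is an arbitrary assumption."""
    previous = None
    for step, value in enumerate(losses):
        if not math.isfinite(value):
            return step
        if previous is not None and previous > 0 and value / previous > growth_limit:
            return step
        previous = value
    return None

# Loss values copied from the record in the comment above.
recorded = [119.45784385909792, 19070101454848.348, 5.573101113846284e+24, float("nan")]
print(first_unstable_step(recorded))  # prints 1
```

On the recorded values the check already fires at index 1, so the blow-up starts well before the nan shows up. In a PyTorch loop the same check could be fed with loss.item() after each batch, which would make it possible to stop and inspect the offending batch.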
My dataset was downloaded directly from "https://www.kaggle.com/competitions/deepfake-detection-challenge/data" as the file "all.zip (471.84 GB)", which contains the folders "dfdc_train_part_0" --> "dfdc_train_part_49".
Hey @a1840436478, this looks pretty strange TBH.
You should not remove the [...] We never tested our code on a Windows server, but it should not be a problem.
Okay, but I have one last question. When I extract the faces, the images I get with the model loaded on the GPU are different from the ones I get with the CPU: the images produced on the CPU are correct, while the images produced on the GPU are completely wrong. Do you know why?
Do you mean that the [...]
Hello, when training reaches validation_routine, the loss value increases quickly and then becomes nan. Do you know why?