Parsing conceptual caption does not function properly as it removes some images and replaces them with zero tensor. #60

Open
mmderakhshani opened this issue Dec 4, 2022 · 1 comment

mmderakhshani commented Dec 4, 2022

Hi, I tried to retrain your model on the Conceptual Captions dataset, following this part of your README.

I found two problems with how the Conceptual Captions dataset is parsed.

  1. In this line, if an exception occurs while loading an image, the image variable is set to 0; this happens whenever an image file is malformed.

  2. When I use the provided script to download the images, most of the downloaded images are corrupted.

A model that receives many zero images and learns to map them to captions may behave undesirably. I would greatly appreciate it if you could tell me whether you see the same issue or whether it only affects my setup.
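To make the concern concrete, here is a minimal sketch of one possible alternative: dropping samples whose image fails to load instead of substituting a zero image. This is not the repository's code. The function name `filter_loadable` and the `load_image` callable are hypothetical, and the sketch assumes samples are available as `(path, caption)` pairs.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def filter_loadable(
    samples: Iterable[Tuple[str, str]],
    load_image: Callable[[str], T],
) -> Tuple[List[Tuple[T, str]], List[str]]:
    """Load every (path, caption) pair and drop pairs whose image fails to load.

    `load_image` is a hypothetical loader supplied by the caller; it is
    expected to raise an exception for malformed or missing files.
    Returns the successfully loaded (image, caption) pairs and the list of
    paths that failed, so the failure rate can be inspected.
    """
    loaded: List[Tuple[T, str]] = []
    failed: List[str] = []
    for path, caption in samples:
        try:
            image = load_image(path)
        except Exception:
            # Skip the sample entirely rather than pairing a zero image
            # with a real caption.
            failed.append(path)
            continue
        loaded.append((image, caption))
    return loaded, failed
```

Returning the failed paths alongside the loaded pairs makes it easy to check how much of the download is actually corrupted before training.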

Thanks.

@mmderakhshani mmderakhshani changed the title Parsing conceptual caption does not function perfectly as it removes some images and replaces them with zero tensor. Parsing conceptual caption does not function properly as it removes some images and replaces them with zero tensor. Dec 4, 2022

mlching commented Jun 26, 2023

Hi @mmderakhshani, may I ask whether you have found a solution?
