This repository has been archived by the owner on Apr 16, 2023. It is now read-only.

R image processing #107

Open · wants to merge 11 commits into master

Conversation

@mfidino (Collaborator) commented Aug 20, 2019

This PR is a quality-of-life improvement for R users of autofocus. The old example assumed that images had already been preprocessed and/or zipped together. This is not the case when someone has a whole batch of raw camera trap images in hand.

This new example, process_predict_example.R, contains a suite of functions that can be used to:

  1. Collect the file names of images that you want to process (via Dan Acheson)
  2. Process the images in a way similar to process_raw.py. We remove the bottom 198 pixels and then reduce to 760x512 pixels.
  3. Zip images together into 'bundles' of 10.
  4. Post the zip files to autofocus, which goes much faster than posting single images.
  5. Process the output from autofocus to generate the 'most likely' estimate in a photo (i.e., going from the many probability statements to the maximum probability statement).
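Step 2 could be sketched along these lines in R (a hypothetical sketch using the magick package; the actual helper names and details in the PR's script may differ):

```r
library(magick)

# Crop off the bottom 198 pixels (the camera-trap info bar), then
# force-resize to 760x512, mirroring what process_raw.py does.
process_image <- function(in_path, out_path) {
  img  <- image_read(in_path)
  info <- image_info(img)
  img  <- image_crop(
    img,
    geometry = sprintf("%dx%d+0+0", info$width, info$height - 198)
  )
  img <- image_resize(img, "760x512!")  # "!" ignores aspect ratio
  image_write(img, out_path)
}
```

Steps 3 and 4 could then be handled with base `utils::zip()` to bundle the processed files and `httr::POST()` with `upload_file()` to send each zip to the autofocus endpoint.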

In R, you then end up with a svelte data.frame that contains the original file name and the most likely species in that photo. As an example:

best_ids
# A tibble: 7 x 3
  file                                          species probability
  <chr>                                         <chr>         <dbl>
1 C:/Users/mfidino/Documents/GitHub/autofocus/~ squirr~       0.584
2 C:/Users/mfidino/Documents/GitHub/autofocus/~ bird          0.986
3 C:/Users/mfidino/Documents/GitHub/autofocus/~ raccoon       0.960
4 C:/Users/mfidino/Documents/GitHub/autofocus/~ rabbit        0.934
5 C:/Users/mfidino/Documents/GitHub/autofocus/~ raccoon       0.997
6 C:/Users/mfidino/Documents/GitHub/autofocus/~ skunk         0.999
7 C:/Users/mfidino/Documents/GitHub/autofocus/~ deer          1.000

Finally, the images (and associated zip files) that get processed are treated as temporary files, so you don't have to create a secondary batch of images. This is most useful for the prediction side of autofocus (for training we'd probably want to retain the processed images).

@mfidino mfidino requested a review from gsganden as a code owner August 20, 2019 21:02
@gsganden (Collaborator) commented Aug 20, 2019

Got it, will review when I get a chance.

@jameslamb jameslamb self-requested a review August 22, 2019 14:36
jameslamb previously approved these changes Aug 22, 2019
@jameslamb (Collaborator) left a comment


Looks cool!

I left some generic R style and best-practice comments; I'd be happy to take another look at what the code is actually doing later on.

Review comments on autofocus/predict/process__predict_example.R (most marked outdated/resolved)
@gsganden (Collaborator) left a comment


Nice! @jameslamb knows R better than I do, so I'm going to defer to him on the details.

One limitation of the approach taken here is that the model probabilities for the different categories are independent, so an image containing both a human and a dog should get high probabilities for both. As a result, choosing the highest-probability label will not be the best approach for some applications. (For instance, to evaluate the model's performance on, say, the human label, you may want to look at the probability for human on every image, regardless of whether that probability is the highest.)

At some point it might be worthwhile to create a library or otherwise make this code more modular and reusable, but I'm fine with having it all in one script as a first pass.

@@ -0,0 +1,230 @@
#Examples of how to make requests against the image classification endpoints
Collaborator:

Is the double underscore in the filename intentional?

Collaborator (Author):

Nope, that is definitely a typo

Collaborator (Author):

Regarding the model probabilities: that is something I was unaware of until you brought it up today (that they are independent). I'm thinking that if we are trying to select the 'best' one, it may make the most sense to divide each probability by the sum of all the probabilities.

For example, if we have

{
  "raccoon": 0.80,
  "rabbit": 0.20,
  "coyote": 0.75
}

Then we would divide each of those elements by 0.80 + 0.20 + 0.75. That would at least ensure that the relative probabilities sum to 1. On top of this, an image with multiple 'high' probabilities for different classifications (as in the above example) would get down-weighted a bit, while those with a single 'high' probability would be penalized less.
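For concreteness, the proposed normalization amounts to (using the hypothetical values from the example above, not output from autofocus):

```r
probs <- c(raccoon = 0.80, rabbit = 0.20, coyote = 0.75)
rel_probs <- probs / sum(probs)  # divide each element by 0.80 + 0.20 + 0.75
round(rel_probs, 3)
#> raccoon  rabbit  coyote
#>   0.457   0.114   0.429
```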

Finally, I do agree that this could be made MUCH more modular and a library would be a great way to do that.

@gsganden (Collaborator) commented Aug 22, 2019

Then we would divide each of those elements by 0.80 + 0.20 + 0.75. That would at least ensure that the relative probabilities sum to 1.

Why would we want that? The idea behind letting them be independent is that the categories are not actually mutually exclusive, so we should not force them to sum to 1. E.g. {'human': 1, 'dog': 1, ...} is just the right result for an image that contains both humans and dogs.

@mfidino (Collaborator, Author) commented Aug 23, 2019

While that is true in this specific case, what do you do with an image that is {'raccoon': 0.99, 'coyote': 0.99}? In our decade of camera trapping we've only gotten one image with both a coyote and a raccoon, so assuming that both are in the image is a little suspect. Aside from human and dog, the likelihood of getting two unique species in one image is quite low. However, if we allow the probabilities to sum to one, you could just add together the human and dog probabilities (making a human AND dog classifier). At the end of the day, though, these types of summaries can be done after the fact (post autofocus), and we can then make some comparisons about which approach performs better.

@gsganden (Collaborator) commented Aug 23, 2019

While that is true in this specific case, what do you do with an image that is {'raccoon': 0.99, 'coyote': 0.99}?

If the model gives that result and the image doesn't contain both a raccoon and a coyote, then the model got it wrong. Cases where it is that badly wrong should be extremely rare. At this point if I saw {'raccoon': 0.99, 'coyote': 0.99} from the model I would be inclined to believe that the image does contain both a raccoon and a coyote, although my prior probability for that scenario is low.

However, if we allow the probabilities to sum to one you could just add together the human and dog probabilities (making a human AND dog classifier).

If the app returns {"human": 1, "dog": 1} (with zeros for other categories), that means that the model is confident that the image contains both a human and a dog. If it returns {"human": .5, "dog": .5} (with zeros for other categories), that means that the model is maximally uncertain about whether the image contains a human and about whether it contains a dog. This is a very important distinction, and we would lose it if we made the numbers sum to one after the fact.

If we trained a multiclass rather than multilabel model so that {"human": 1, "dog": 1} was impossible, then it is hard to say what the model would do on an image that contains both humans and dogs. For instance, in that case {"human": .5, "dog": .5} could mean that the image contains both humans and dogs but the model doesn't have the means to represent that fact, or it could mean that the model is confident that the image contains something but not whether that thing is a human or a dog.

The categories are not mutually exclusive, so treating them as separate labels is the right approach in principle. We can revisit the approach if we find that it doesn't work well in practice, but so far I don't see any reason to think that it wouldn't.
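The multilabel vs. multiclass distinction above can be illustrated with made-up scores (hypothetical values, not autofocus output): a multilabel model applies a sigmoid to each label independently, while a multiclass model applies a softmax across all labels.

```r
# Hypothetical raw scores for one image containing both a human and a dog
scores <- c(human = 4, dog = 4, raccoon = -4)

sigmoid <- 1 / (1 + exp(-scores))           # multilabel: independent per label
softmax <- exp(scores) / sum(exp(scores))   # multiclass: forced to sum to 1

round(sigmoid, 2)  # human and dog can both be ~0.98
round(softmax, 2)  # human and dog must split the mass: ~0.5 each
```

Under the softmax, {"human": .5, "dog": .5} is indistinguishable from genuine uncertainty, which is exactly the distinction the comment above describes losing.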

Collaborator:

Let's meet up or jump on a call if what I'm saying isn't clear or if it is clear but you disagree.

3 participants