R image processing #107
Conversation
Got it, will review when I get a chance.
Looks cool!
I left some generic R style and best-practice comments; I'd be happy to take another look at what the code is actually doing later on.
Nice! @jameslamb knows R better than I do, so I'm going to defer to him on the details.
One limitation of the approach taken here is that the model probabilities for the different categories are independent, so that e.g. an image containing both a human and a dog should get high probabilities for both. As a result, choosing the highest-probability label will not be the best approach for some applications. (For instance, to evaluate the model's performance for the `human` label, you may want to look at the probability for `human` on every image, regardless of whether that probability is the highest or not.)
At some point it might be worthwhile to create a library or otherwise make this code more modular and reusable, but I'm fine with having it all in one script as a first pass.
@@ -0,0 +1,230 @@
#Examples of how to make requests against the image classification endpoints
Is the double underscore in the filename intentional?
Nope, that is definitely a typo
Regarding the model probabilities: that is something I was unaware of until you brought it up today (that they are independent). I'm thinking that if we are trying to select the 'best' one, it may make the most sense to divide each probability by the sum of all the probabilities.
For example, if we have

```
{
  "raccoon": 0.80,
  "rabbit": 0.20,
  "coyote": 0.75
}
```

then we would divide each of those elements by 0.80 + 0.20 + 0.75 = 1.75. That would at least ensure that the relative probabilities sum to 1. On top of this, an image with multiple 'high' probabilities for different classifications (as in the above example) would get down-weighted a bit, while those with a single 'high' probability would be penalized less.
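The division scheme above can be sketched in a few lines (a Python illustration of the proposal; the `normalize` name and the example scores are just the numbers from this thread, not code from the PR):

```python
def normalize(probs):
    """Divide each independent class probability by the sum of all of them,
    so the rescaled values sum to 1 (the scheme proposed above)."""
    total = sum(probs.values())
    return {label: p / total for label, p in probs.items()}

scores = {"raccoon": 0.80, "rabbit": 0.20, "coyote": 0.75}
normed = normalize(scores)
# raccoon drops from 0.80 to 0.80 / 1.75 ≈ 0.457, because the competing
# high coyote score pulls it down; the ranking of labels is unchanged.
```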
Finally, I do agree that this could be made MUCH more modular and a library would be a great way to do that.
> Then we would divide each of those elements by 0.80+0.20+0.75. That would at least ensure that the relative probabilities sum to 1.

Why would we want that? The idea behind letting them be independent is that the categories are not actually mutually exclusive, so we should not force them to sum to 1. E.g. `{'human': 1, 'dog': 1, ...}` is just the right result for an image that contains both humans and dogs.
While that is true in this specific case, what do you do with an image that is `{'raccoon': 0.99, 'coyote': 0.99}`? In our decade of camera trapping we've only gotten one image of a coyote and a raccoon together, so assuming that both are in the image is a little suspect. Aside from human and dog, the likelihood of getting two unique species is quite low. However, if we allow the probabilities to sum to one, you could just add together the human and dog probabilities (making a human-AND-dog classifier). At the end of the day, though, these types of summaries can be done after the fact (post-autofocus), and we can then make some comparisons about which way performs better.
> While that is true in this specific case, what do you do with an image that is `{'raccoon': 0.99, 'coyote': 0.99}`?

If the model gives that result and the image doesn't contain both a raccoon and a coyote, then the model got it wrong. Cases where it is that badly wrong should be extremely rare. At this point, if I saw `{'raccoon': 0.99, 'coyote': 0.99}` from the model, I would be inclined to believe that the image does contain both a raccoon and a coyote, although my prior probability for that scenario is low.
> However, if we allow the probabilities to sum to one you could just add together the human and dog probabilities (making a human AND dog classifier).

If the app returns `{"human": 1, "dog": 1}` (with zeros for other categories), that means that the model is confident that the image contains both a human and a dog. If it returns `{"human": .5, "dog": .5}` (with zeros for other categories), that means that the model is maximally uncertain about whether the image contains a human and about whether it contains a dog. This is a very important distinction, and we would lose it if we made the numbers sum to one after the fact.
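The information loss described above is easy to demonstrate (a Python sketch of the argument, not code from the PR; two-category dicts stand in for the full label set):

```python
def normalize(probs):
    """Rescale independent per-label probabilities so they sum to 1."""
    total = sum(probs.values())
    return {label: p / total for label, p in probs.items()}

confident = {"human": 1.0, "dog": 1.0}   # model is sure both are present
uncertain = {"human": 0.5, "dog": 0.5}   # model maximally unsure about each

# After normalization both collapse to {"human": 0.5, "dog": 0.5}:
# the distinction between "confident in both" and "unsure about each" is gone.
print(normalize(confident) == normalize(uncertain))  # True
```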
If we trained a multiclass rather than multilabel model so that `{"human": 1, "dog": 1}` was impossible, then it is hard to say what the model would do on an image that contains both humans and dogs. For instance, in that case `{"human": .5, "dog": .5}` could mean that the image contains both humans and dogs but the model doesn't have the means to represent that fact, or it could mean that the model is confident that the image contains something but not whether that thing is a human or a dog.
The categories are not mutually exclusive, so treating them as separate labels is the right approach in principle. We can revisit the approach if we find that it doesn't work well in practice, but so far I don't see any reason to think that it wouldn't.
Let's meet up or jump on a call if what I'm saying isn't clear or if it is clear but you disagree.
This PR is a quality-of-life improvement for R users of autofocus. The old example assumed that images had already been preprocessed and/or zipped together. This is not the case when someone has a whole bunch of camera trap images in hand.
This new example, `process_predict_example.R`, contains a suite of functions that preprocess images the same way as `process_raw.py`: we remove the bottom 198 pixels and then reduce to 760x512 pixels. In R, you then end up with a svelte data.frame that contains the original file name and the most likely species in that photo.
Finally, the images (and associated zip files) that get processed are treated as temporary files, so you don't have to create a secondary batch of images. This is most useful for the prediction side of autofocus (for training we'd probably want to retain the processed images).
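The crop-then-resize geometry described above can be sketched as follows (a Python analogue of the R preprocessing; the function name and the 2048x1536 example frame size are illustrative assumptions, while the 198-pixel bottom strip and the 760x512 target come from the description):

```python
def preprocess_geometry(width, height, strip_px=198, target=(760, 512)):
    """Return the (left, upper, right, lower) crop box that removes the
    bottom strip_px rows (the camera-trap info bar), plus the final resize
    target. With an imaging library such as Pillow, the actual work would
    then be img.crop(box).resize(target)."""
    box = (0, 0, width, height - strip_px)
    return box, target

# Hypothetical 2048x1536 camera-trap frame:
box, target = preprocess_geometry(2048, 1536)
# box == (0, 0, 2048, 1338); target == (760, 512)
```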