This repository has been archived by the owner on Apr 16, 2023. It is now read-only.
R image processing #107
Open
mfidino wants to merge 11 commits into master from R_image_processing
Changes from all commits (11 commits)
Commits (all by mfidino):

- 06ba7f6 Getting some R stuff on .gitignore
- 26fd36c Create process__predict_example.R
- 3128dd8 Some more assumptions added to the top
- 41dccce made extensions a constant in `find_image_files`
- b3bdbc3 dropped `= NULL` from `proces_images` argument
- 2fefc07 Hugged if statements
- ab19940 Calling functions via their libraries
- 05b93f7 corrupt image fix, referencing issues
- 8491404 tmp_name to tmp_names
- 551b73c two lines between funcitions
- e8dd854 Slayed `NULL` defaults
.gitignore:
@@ -93,3 +93,7 @@ ENV/

# Image files
**/*.JPG

# Some R stuff
.Rproj.user
*.Rproj
process__predict_example.R (new file):
@@ -0,0 +1,230 @@
# Examples of how to make requests against the image classification endpoints
# Note:
#   1. This assumes that the image_classifier_api is running
#      (i.e., using docker run -p 8000:8000 gaganden/autofocus_serve)
#   2. It also assumes that the api address is at 127.0.0.1
#      (which should be the case)
#   3. Assumes that your current working directory is
#      './GitHub/autofocus/autofocus/predict'
#   4. Assumes that the images you are going to send to autofocus have
#      not been preprocessed at all.

# Library requirements:
#   RCurl, jsonlite, dplyr, magick, zip, progress, stringr

library(RCurl)
library(jsonlite)
library(magick)
library(zip)
library(progress)
library(dplyr)

find_image_files <- function(search_dir){
  # Utility function to recursively find all image files
  # starting from a directory

  # Args:
  #   search_dir(character): the starting directory path from which to search

  # Returns:
  #   image_files(list): list containing the paths of all image files found.
  #   Each element in this list is a vector of at most 10 images. This split
  #   is done so that the images can be zipped and sent to autofocus.

  valid_extensions <- c("jpeg", "jpg", "bmp", "png")
  valid_extensions <- c(valid_extensions, toupper(valid_extensions))

  file_list <- list.files(search_dir, recursive = TRUE, full.names = TRUE)
  # anchor the pattern so that only the file extension is matched
  image_files <- file_list[grep(paste0("\\.(",
                                       paste(valid_extensions, collapse = "|"),
                                       ")$"), file_list)]
  image_files <- normalizePath(image_files, winslash = "/")
  # normalize the path, then split into groups of at most 10 images
  n_groups <- ceiling(length(image_files) / 10)
  image_files <- split(image_files,
                       sort(rep_len(1:n_groups, length(image_files))))

  return(image_files)
}

process_images <- function(image_files){
  # Utility function to preprocess images to be sent to autofocus

  # Args:
  #   image_files(list): the output object from find_image_files()

  # Returns:
  #   a list: This list has two elements:
  #   1. zip(character): A vector of the temporary zip files to be sent to
  #      autofocus.
  #   2. dict(named character): a key-value pair that links the temporary
  #      image file to the actual file. The elements in this vector are
  #      the names of the temporary files while the names are the full
  #      paths to the file names.

  if(!is(image_files, 'list')){
    stop('image_files must be a list.')
  }

  if(any(sapply(image_files, length) > 10)){
    stop('One of the elements in image_files has > 10 images.')
  }
  dict_list <- vector('list', length = length(image_files))
  zip_vector <- rep(NA, length(image_files))

  cat(paste('Processing', length(unlist(image_files)), 'images...\n'))

  pb <- progress::progress_bar$new(
    format = "Images processed [:bar] :elapsed | eta: :eta",
    total = length(unlist(image_files)),
    width = 60
  )

  for(photo_group in seq.int(length(image_files))){

    # get paths to files
    image_file_names <- image_files[[photo_group]]

    # files with 0 kb are corrupt, remove them
    file_sizes <- file.size(image_file_names)
    if(any(file_sizes == 0)){
      image_file_names <- image_file_names[-which(file_sizes == 0)]
    }

    # count number of files
    num_files <- length(image_file_names)

    file_pattern <- paste0("file_",
                           stringr::str_pad(1:num_files, width = 2, pad = "0"),
                           "_")
    # make some temporary file names
    tmp_names <- tempfile(pattern = file_pattern,
                          fileext = rep('.jpg', num_files))

    # sort them
    tmp_names <- sort(tmp_names)

    # line up temps to actual photo names
    dict <- sapply(strsplit(tmp_names, "\\\\|/"), function(x) x[length(x)])
    names(dict) <- image_file_names

    # Read in image, crop 198 pixels from the bottom, resize to 512 pixels
    # tall, then save as a temporary image.
    for(image in seq.int(num_files)){
      pb$tick()
      magick::image_read(image_file_names[image]) %>%
        magick::image_crop(., paste0(magick::image_info(.)$width,
                                     "x",
                                     magick::image_info(.)$height - 198)) %>%
        magick::image_resize(., '760x512!') %>%
        magick::image_write(., tmp_names[image])
    }

    # zip the temporary files together
    tmp_zip <- tempfile(fileext = ".zip")
    zip::zipr(tmp_zip, tmp_names)
    dict_list[[photo_group]] <- dict
    zip_vector[photo_group] <- tmp_zip
    if(file.exists(tmp_zip)){
      unlink(tmp_names)
    }
  }

  # return the dictionary and the name of the zipped file.
  return(list(zip = zip_vector, dict = dict_list))
}

post_zips <- function(processed_images,
                      uri = "http://localhost:8000/predict_zip"){
  # send the zip files to autofocus

  # Args:
  #   processed_images(list): the output from process_images()
  #   uri(character): the location where autofocus is running

  # Returns:
  #   response(tibble): A tibble of guesses for each image supplied to
  #   autofocus. The columns, save for the last one, have species names
  #   and represent the likelihood that this species is in the image.
  #   The last column is the file name of the image.
  cat(paste('Posting', length(processed_images$zip),
            'zip file(s) to autofocus...\n'))

  pb <- progress::progress_bar$new(
    format = "Files processed [:bar] :elapsed | eta: :eta",
    total = length(unlist(processed_images$zip)),
    width = 60
  )
  # the object that initially contains the autofocus json
  response <- vector('list', length(processed_images$zip))
  for(zippy in seq.int(length(processed_images$zip))){
    pb$tick()
    # post to autofocus
    response[[zippy]] <- jsonlite::fromJSON(RCurl::postForm(uri,
      file = RCurl::fileUpload(processed_images$zip[zippy]),
      .checkParams = FALSE))

    # get the file names from autofocus
    file_names <- strsplit(names(response[[zippy]]), "/")
    file_names <- sapply(file_names, function(x) x[length(x)])
    file_names <- strsplit(file_names, "_")
    file_names <- as.numeric(sapply(file_names, '[[', 2))
    # and line it up with what we did during image processing
    OG_file_names <- names(processed_images$dict[[zippy]])[file_names]

    # provide a warning just in case autofocus did not ID a specific image
    if(!length(OG_file_names) == length(processed_images$dict[[zippy]])){
      warning(paste('Autofocus did not ID all images in zip file number',
                    zippy))
    }
    # put the file name into each nested list object
    for(image in seq.int(length(response[[zippy]]))){
      response[[zippy]][[image]]$file <- OG_file_names[image]
    }
  }
  # bind the list of lists, then bind the list of tibbles
  response <- lapply(response, dplyr::bind_rows) %>% dplyr::bind_rows()
  return(response)
}

most_likely <- function(response_frame){
  # Utility function that provides the best guess from each classification

  # Args:
  #   response_frame(tibble): the output from post_zips()

  # Returns:
  #   A tibble that has three columns:
  #   1) file: the file name
  #   2) species: the species most likely to be in the image
  #   3) probability: autofocus's confidence in this classification

  # Find which column has the highest likelihood
  best_guess <- apply(response_frame[, -grep('file', colnames(response_frame))],
                      1, which.max)
  # Grab the highest likelihood
  best_prob <- apply(response_frame[, -grep('file', colnames(response_frame))],
                     1, max)
  # Correspond the highest likelihood to a species name
  species_name <- colnames(response_frame)[best_guess]

  # the object to return
  to_return <- dplyr::tibble(file = response_frame$file,
                             species = species_name,
                             probability = best_prob)
  return(to_return)
}

# where the photos are located
search_dir <- "./images/"

all_images <- find_image_files(search_dir)

processed_images <- process_images(all_images)

my_ids <- post_zips(processed_images)

best_ids <- most_likely(my_ids)
Review comments

Comment: Is the double underscore in the filename intentional?

Reply: Nope, that is definitely a typo.
Comment: Regarding the model probabilities: that is something that I was unaware of until you brought it up today (that they were independent). I'm thinking that if we are trying to select the 'best' one, it may make the most sense to divide each probability by the sum of all the probabilities. For example, if we had the probabilities 0.80, 0.20, and 0.75, then we would divide each of those elements by 0.80 + 0.20 + 0.75. That would at least ensure that the relative probabilities sum to 1. On top of this, an image with multiple 'high' probabilities for different classifications (as in the above example) would get down-weighted a bit while those with a single 'high' probability would be penalized less. Finally, I do agree that this could be made MUCH more modular and a library would be a great way to do that.
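The normalization being proposed can be sketched in a few lines of base R. The three probabilities are taken from the example above; the species labels attached to them are made up purely for illustration:

```r
# Independent per-class probabilities from the example above
# (the species names here are hypothetical labels, not model output)
probs <- c(raccoon = 0.80, opossum = 0.20, coyote = 0.75)

# Divide each element by the sum so the relative probabilities sum to 1
normalized <- probs / sum(probs)

round(normalized, 3)
# raccoon: 0.457, opossum: 0.114, coyote: 0.429
```

Note how the two 'high' probabilities (raccoon and coyote) are both pulled well below their original values, which is the down-weighting described above.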
Reply: Why would we want that? The idea behind letting them be independent is that the categories are not actually mutually exclusive, so we should not force them to sum to 1. E.g. {'human': 1, 'dog': 1, ...} is just the right result for an image that contains both humans and dogs.

Reply: While that is true in this specific case, what do you do with an image that is {'raccoon': 0.99, 'coyote': 0.99}? In our decade of camera trapping we've only gotten one image of a coyote and a raccoon, so assuming that both are in the image is a little suspect. Aside from human and dog, the likelihood of getting two unique species is quite low. However, if we allow the probabilities to sum to one, you could just add together the human and dog probabilities (making a human AND dog classifier). At the end of the day, though, these types of summaries can be done after the fact (post autofocus) and we can then make some comparisons about which way performs better.
Reply: If the model gives that result and the image doesn't contain both a raccoon and a coyote, then the model got it wrong. Cases where it is that badly wrong should be extremely rare. At this point, if I saw {'raccoon': 0.99, 'coyote': 0.99} from the model, I would be inclined to believe that the image does contain both a raccoon and a coyote, although my prior probability for that scenario is low.

If the app returns {"human": 1, "dog": 1} (with zeros for other categories), that means that the model is confident that the image contains both a human and a dog. If it returns {"human": .5, "dog": .5} (with zeros for other categories), that means that the model is maximally uncertain about whether the image contains a human and about whether it contains a dog. This is a very important distinction, and we would lose it if we made the numbers sum to one after the fact.

If we trained a multiclass rather than multilabel model so that {"human": 1, "dog": 1} was impossible, then it is hard to say what the model would do on an image that contains both humans and dogs. For instance, in that case {"human": .5, "dog": .5} could mean that the image contains both humans and dogs but the model doesn't have the means to represent that fact, or it could mean that the model is confident that the image contains something but not whether that thing is a human or a dog.

The categories are not mutually exclusive, so treating them as separate labels is the right approach in principle. We can revisit the approach if we find that it doesn't work well in practice, but so far I don't see any reason to think that it wouldn't.
Reply: Let's meet up or jump on a call if what I'm saying isn't clear, or if it is clear but you disagree.