Add hyperparameter tuning plot functionality (and maybe other plots) #122

BTopcuoglu · 2020-07-22T20:40:40Z

As @zenalapp suggested, it would be a good idea to export a figure that shows users whether they have explored a wide enough range of hyperparameter values (e.g. trying cost values until we see a global maximum for AUROC).
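
A minimal sketch of what such a figure could look like, not the package's implementation. It assumes the tuning results sit in a data frame with hypothetical columns `cost`, `seed`, and `AUROC` (one row per hyperparameter value per seed); the numbers are simulated for illustration only.

```r
library(ggplot2)

# Simulated tuning results (illustrative only): 5 cost values x 10 seeds.
set.seed(2020)
hp_results <- expand.grid(cost = 10^(-2:2), seed = 1:10)
hp_results$AUROC <- 0.75 - 0.03 * (log10(hp_results$cost) - 1)^2 +
  rnorm(nrow(hp_results), sd = 0.01)

# Mean and standard deviation of AUROC for each hyperparameter value.
mean_AUROC <- tapply(hp_results$AUROC, hp_results$cost, mean)
sd_AUROC <- tapply(hp_results$AUROC, hp_results$cost, sd)
hp_summary <- data.frame(
  cost = as.numeric(names(mean_AUROC)),
  mean_AUROC = as.numeric(mean_AUROC),
  sd_AUROC = as.numeric(sd_AUROC)
)

# Line plot of mean performance against the hyperparameter on a log scale.
hp_plot <- ggplot(hp_summary, aes(x = cost, y = mean_AUROC)) +
  geom_line() +
  geom_point() +
  geom_errorbar(aes(ymin = mean_AUROC - sd_AUROC,
                    ymax = mean_AUROC + sd_AUROC), width = 0.1) +
  scale_x_log10() +
  labs(x = "cost (log scale)", y = "Mean AUROC") +
  theme_bw()

# TRUE if the best mean sits at either end of the range that was tried,
# which would suggest the range should be extended.
best_at_edge <- which.max(hp_summary$mean_AUROC) %in% c(1, nrow(hp_summary))
```

If the curve is still rising at the smallest or largest value tried, the search has probably not covered enough of the range.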

zenalapp · 2020-09-09T17:26:41Z

@kelly-sovacool and I discussed having plots in the package that plot the output of multiple ML runs. Current ideas:

Definitely a hyperparameter tuning plot (1 hyperparameter and 2 hyperparameters).
Maybe a boxplot of AUROCs.
Maybe a boxplot of AUPRCs.
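
A rough sketch of the boxplot idea, assuming performance from multiple runs is stored in long format with hypothetical columns `seed`, `metric`, and `value`. The values are simulated, not real results.

```r
library(ggplot2)

# Simulated performance from 20 ML runs (illustrative only):
# one row per seed per metric.
set.seed(1)
perf <- data.frame(
  seed = rep(1:20, times = 2),
  metric = rep(c("AUROC", "AUPRC"), each = 20),
  value = c(rnorm(20, mean = 0.80, sd = 0.03),
            rnorm(20, mean = 0.65, sd = 0.05))
)

# One box per metric; coord_cartesian() zooms without dropping any rows.
perf_boxplot <- ggplot(perf, aes(x = metric, y = value)) +
  geom_boxplot() +
  coord_cartesian(ylim = c(0, 1)) +
  labs(x = NULL, y = "Performance across runs") +
  theme_bw()
```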

BTopcuoglu · 2020-09-09T17:37:21Z

I have some ggplot code to make nice dotplots with mean/median stats for AUPRC and AUROC values that we can implement if we want.
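
For readers following along, here is one way such a dotplot might be built with `stat_summary()`. This is an assumed sketch on simulated data, not the code referred to above; the column names `metric` and `value` are hypothetical.

```r
library(ggplot2)

# Simulated performance from 25 runs per metric (illustrative only).
set.seed(42)
perf <- data.frame(
  metric = rep(c("AUROC", "AUPRC"), each = 25),
  value = c(rnorm(25, mean = 0.80, sd = 0.03),
            rnorm(25, mean = 0.65, sd = 0.05))
)

perf_dotplot <- ggplot(perf, aes(x = metric, y = value)) +
  # One dot per run, jittered horizontally only so y values stay exact.
  geom_jitter(width = 0.15, height = 0, alpha = 0.6) +
  # Horizontal bar at the mean of each metric.
  stat_summary(fun = mean, fun.min = mean, fun.max = mean,
               geom = "crossbar", width = 0.4, colour = "steelblue") +
  # Diamond at the median of each metric.
  stat_summary(fun = median, geom = "point",
               shape = 18, size = 4, colour = "firebrick") +
  labs(x = NULL, y = "Performance across runs") +
  theme_bw()
```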

kelly-sovacool · 2020-09-09T18:02:36Z

@BTopcuoglu do you want to get started on making a dotplot function based on your code then?

BTopcuoglu · 2020-09-13T22:29:41Z

Now that I've thought about this a little, this might be a better fit for the Snakemake workflow, because the tuning results would not mean much if they come from only 1 seed. The best hp you get in 1 datasplit might not be the same in another. Similarly, the AUROC plots would make sense for averages/medians over 100 datasplits but not for 1 datasplit.
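
To illustrate the point about seeds, a small base R sketch that tabulates which hyperparameter value wins in each datasplit. The data frame, its column names (`seed`, `cost`, `AUROC`), and the numbers are all made up for the example.

```r
# Made-up tuning results: 3 seeds x 3 values of a hypothetical `cost`.
tuning <- data.frame(
  seed  = rep(1:3, each = 3),
  cost  = rep(c(0.1, 1, 10), times = 3),
  AUROC = c(0.70, 0.74, 0.72,   # seed 1
            0.71, 0.73, 0.75,   # seed 2
            0.69, 0.76, 0.74)   # seed 3
)

# For each seed, keep the row with the highest AUROC.
best_per_seed <- do.call(rbind, lapply(split(tuning, tuning$seed), function(d) {
  d[which.max(d$AUROC), ]
}))

# Count how often each value was selected as the best.
best_counts <- table(best_per_seed$cost)
```

In this toy data the winning value is not the same in every seed, which is why a tuning plot from a single datasplit can be misleading.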

kelly-sovacool · 2020-09-14T13:35:11Z

Any plots which are better for multiple seeds should take a dataframe with each row as the result from one seed. We should probably include a function to merge results like the merge_results rule in https://github.com/SchlossLab/mikRopML-snakemake-workflow.
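
A sketch of what such a merge function might look like in base R. The name `combine_results()` and the columns are hypothetical, and this is not the workflow's actual merge_results rule; it only assumes every per-seed result is a data frame with the same columns.

```r
# Hypothetical helper: stack per-seed results into one data frame,
# one row per seed. Assumes all elements share the same columns.
combine_results <- function(results_list) {
  combined <- do.call(rbind, results_list)
  rownames(combined) <- NULL
  combined
}

# Made-up per-seed results, one single-row data frame per seed.
results_list <- lapply(1:3, function(s) {
  data.frame(seed = s,
             AUROC = c(0.78, 0.81, 0.80)[s],
             AUPRC = c(0.61, 0.66, 0.63)[s])
})

merged <- combine_results(results_list)
```

Plotting functions could then take `merged` directly, so they work the same whether the results came from one seed or a hundred.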

kelly-sovacool · 2020-09-22T13:37:52Z

@BTopcuoglu have you pushed the progress you've made?

BTopcuoglu · 2020-09-23T19:07:24Z

I do have some code for hyperparameter tuning too, but it looks pretty bad right now :) https://github.com/SchlossLab/Topcuoglu_ML_mBio_2020/blob/master/code/learning/FigureS2.R

kelly-sovacool · 2020-10-09T14:49:41Z

Does anyone have example code for feature importance plots? Would be nice to show an example in the Snakemake workflow, regardless of whether we include it in the package.

BTopcuoglu · 2020-10-09T17:16:43Z

library(dplyr)
library(ggplot2)

# `data` has a `names` column with the name of each feature (or group of features).
# `data` has an `auc_diff` column: real AUC minus permuted AUC for each datasplit.

perm_top10 <- data %>%
  group_by(names) %>%
  summarise(median = median(auc_diff),
            iqr_AUC = IQR(auc_diff),
            mean = mean(auc_diff),
            se = sd(auc_diff) / sqrt(n())) %>%
  # Sort from the largest median difference to the smallest
  arrange(desc(median)) %>%
  # Keep only the 10 features with the largest median difference
  head(n = 10)

######################################################################
# Plot the feature importances based on permutation importance      #
######################################################################

# ggplot2 bar plot (named perm_plot to avoid masking base::plot)
perm_plot <- ggplot(perm_top10, aes(reorder(names, mean), mean)) +
  geom_col(width = 0.25, fill = "steelblue") +
  geom_hline(yintercept = 0, color = "black") +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se), width = 0.2) +
  xlab("Features") +
  ylab("Mean difference between test and permuted AUROC") +
  coord_flip() +
  # Apply the complete theme first so the tweaks below are not overwritten
  theme_bw() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.text.x = element_text(size = 10, colour = "black"),
        axis.text.y = element_text(size = 10, colour = "black"),
        axis.title.x = element_text(size = 12, vjust = 0),
        axis.title.y = element_text(size = 12, vjust = 0.5),
        legend.text = element_text(size = 13))

kelly-sovacool · 2020-10-21T01:36:41Z

@BTopcuoglu: @pschloss was asking about when we might have plots for hyperparameter tuning incorporated. Would be helpful for @courtneyarmour's project.

I know we'll also need to document tuning better (#201).

zenalapp · 2020-10-21T16:02:11Z

Made a draft in branch iss-122_hp-plot. @BTopcuoglu feel free to modify if you'd like!

BTopcuoglu · 2020-10-21T17:00:05Z

> Made a draft in branch iss-122_hp-plot. @BTopcuoglu feel free to modify if you'd like!

Working on it now.

zenalapp changed the title ~~Add hyperparameter tuning plot functionality~~ Add hyperparameter tuning plot functionality (and maybe other plots) Sep 9, 2020

BTopcuoglu self-assigned this Sep 10, 2020

kelly-sovacool mentioned this issue Sep 30, 2020

Plot performance #183

Merged

kelly-sovacool added the feature label (A new feature request or enhancement) Oct 16, 2020

kelly-sovacool mentioned this issue Oct 21, 2020

Add feature importance plot #210

Closed

kelly-sovacool closed this as completed in df4942c Oct 21, 2020
