
Regression performance #4

Open
orelgueta opened this issue Jan 4, 2021 · 12 comments

@orelgueta (Collaborator)

Performance isn't great at the moment. At low energies the prediction is quite poor. At higher energies we see some improvement, but it still might not be good enough (see plots below).
Questions/ideas:

  • Search for more useful variables.
  • Would feature selection in each energy bin improve things (I doubt it)?
  • Why do we have a bias in our predictions? We seem to consistently predict a better PSF than the true one. Such a consistent bias points to a problem in the logic or a bug, no?
  • Would more events for training or using diffuse gamma improve things?

Plots below are for gamma_onSource.S.3HB9-FD_ID0.eff-0.root.
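One generic way to quantify the bias mentioned above is the median of (predicted − true) angular error in each energy bin. This is a minimal sketch, not the repository's code; the column names (`log_energy`, `ang_error_true`, `ang_error_pred`) are placeholders, not the actual variable names used in the analysis.

```python
import numpy as np
import pandas as pd

def bias_per_energy_bin(df, n_bins=10):
    """Median (predicted - true) angular error per log-energy bin.

    Column names are placeholders, not the actual variable names
    used in the analysis.
    """
    edges = np.linspace(df["log_energy"].min(), df["log_energy"].max(), n_bins + 1)
    centres = 0.5 * (edges[:-1] + edges[1:])
    groups = pd.cut(df["log_energy"], bins=edges, labels=centres)
    residual = df["ang_error_pred"] - df["ang_error_true"]
    # A consistently negative median (prediction smaller than truth)
    # would reproduce the bias described above.
    return residual.groupby(groups, observed=True).median()
```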

[image]

[image: MLP_small]

@orelgueta (Collaborator, Author)

Adding more variables already improved these results significantly. See new plots below (again, for gamma_onSource.S.3HB9-FD_ID0.eff-0.root). We now need to study which variables led to this improvement and whether feature selection can improve it further.
I think the aforementioned bias is still there, though, in some of the energy bins.
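One generic way to study which variables drive the improvement is permutation importance on a held-out set. This is a hedged sketch on synthetic data, not the repository's code, and the MLP architecture shown is a guess rather than MLP_small's actual configuration.

```python
from sklearn.datasets import make_regression
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real feature table.
X, y = make_regression(n_samples=500, n_features=6, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(36, 6), max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)

# Shuffle each feature in turn and measure the drop in the test score.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
ranking = result.importances_mean.argsort()[::-1]  # most important feature first
```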

[image: compare_scores]

[image: MLP_small]

@orelgueta (Collaborator, Author)

Same plot as in the previous comment, but showing more regressors (MLP_small is still best).
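For reference, the kind of regressor comparison behind such a plot can be sketched with cross-validation on synthetic data; the models and hyper-parameters here are illustrative only, not the actual ones in the repository.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=400, n_features=8, noise=5.0, random_state=0)

regressors = {
    "MLP_small": make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(36, 6), max_iter=2000, random_state=0),
    ),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "linear": make_pipeline(StandardScaler(), Ridge()),
}

# Mean R^2 over 5 cross-validation folds for each model.
scores = {name: cross_val_score(reg, X, y, cv=5).mean() for name, reg in regressors.items()}
```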

[image: compare_scores]

@TarekHC (Collaborator) commented Jan 7, 2021

This is very cool Orel!! It's funny you were able to improve performance so quickly!

Regarding the bias: weird! Although to tell you the truth, I'm not so worried about it, as what we want to do is rank the events, so biases will not really be a problem. Still, let's keep an eye on it, as it could be the symptom of a mistake somewhere...

As soon as I'm back from vacation I will start playing around with the new code, and let you know if I'm able to come up with anything relevant.

@orelgueta (Collaborator, Author)

> Regarding the bias: weird! Although to tell you the truth, I'm not so worried about it, as what we want to do is rank the events... So biases will not really be a problem. Although let's keep an eye on it, as it could be the cause of some mistake somewhere...

I think we can clearly see this bias in the confusion matrix in #2 (pasted below for convenience). The events in the lower left cells are a direct result of this bias. It would be great to figure out why it's there and fix it.

[image]

@TarekHC (Collaborator) commented Jan 28, 2021

Are you sure this is a result of the bias?

From a very naive point of view: if you increase that bias by a factor of 10 but keep the shape of the "blob" in the confusion matrix identical, then we would get exactly the same event selection (when you rank the events, the only thing that matters is the shape of that blob, not where it sits with respect to the dashed line you show).

Or perhaps I'm not properly understanding the effect of the bias? (you have thought more about this than me!)
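The argument above can be checked numerically: shifting all predictions by a constant leaves their ranking untouched, so any rank-based event-type definition is unaffected. A small self-contained demonstration with synthetic numbers, not project data:

```python
import numpy as np

rng = np.random.default_rng(42)
pred = rng.uniform(0.02, 0.3, size=1000)  # predicted angular error (deg)
biased = pred + 0.05                      # same predictions with a constant bias

# Identical ordering of events, so any rank-based event-type
# definition is unaffected by the constant offset.
assert np.array_equal(np.argsort(pred), np.argsort(biased))
```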

@orelgueta (Collaborator, Author)

Oh, now I realise the mistake. I defined the event_type bins based on the true angular error, but I should actually define them based on the reconstructed angular error. Defining them based on the true value results in the bias translating into that lower-edge cell. If we define them from the reco value (which is what we should do!), then I think you are right that the bias will not have an effect.
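For illustration, defining the event types from the reconstructed (predicted) angular error with equal-statistics bins could look like this; the column name, bin count, and numbers are placeholders, not the project's actual code.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({"ang_error_pred": rng.lognormal(-2.5, 0.5, 3000)})

# Three event types with equal statistics, defined from the *predicted*
# angular error; type 1 holds the best-reconstructed events.
df["event_type"] = pd.qcut(df["ang_error_pred"], q=3, labels=[1, 2, 3])
```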

@orelgueta (Collaborator, Author) commented Feb 2, 2021

Calculated the regression performance with the Prod5 sample, using the baseline south array, on source gamma, pointing north, with minimum multiplicity of 2 telescopes. Full file name is
'/lustre/fs22/group/cta/users/maierg/analysis/AnalysisData/prod5-Paranal-20deg-sq08-LL/EffectiveAreas/EffectiveArea-50h-ID0-NIM2LST2MST2SST2SCMST2-g20210921-V3/BDT.DL2.50h-V3.g20210921/gamma_onSource.S.BL-4LSTs25MSTs70SSTs-MSTF_ID0.eff-0.root'

Regression results are below for the MLP_small model (didn't try any others for now). The scores plot compares the results for all variables and for cases where we exclude the two new "promising" variables. I had hoped that those variables would improve performance, but they don't. However, we nevertheless get a significant boost in performance with this new sample. Possible reasons for this are:

  • The minimum telescope multiplicity is now two, which means there is a larger spread in the angular error.
  • The increase in the total number of events (almost a factor 10) could mean the training works better (extra motivation to study Impact of training statistics on performance #11).

In particular, I am happy to see that we don't have a bias anymore. Not sure whether the increase in statistics is the reason, but maybe.
The confusion matrix plots will be uploaded to #2.
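On the statistics question (#11), a learning curve is the standard way to check whether more training events would help. This is a generic sklearn sketch on synthetic data, not the project's actual training code.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import learning_curve
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=600, n_features=6, noise=10.0, random_state=0)
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(36, 6), max_iter=1000, random_state=0),
)

# Cross-validated scores at increasing fractions of the training set;
# a test score still rising at full size suggests statistics are limiting.
sizes, train_scores, test_scores = learning_curve(
    model, X, y, train_sizes=np.linspace(0.1, 1.0, 5), cv=3
)
```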

[image: scores_features_1]

[image: All_predict_dist]

@TarekHC (Collaborator) commented Feb 3, 2021

Hi Orel,

This really looks cool! Statistics definitely seem to play a role here... I do have access to the path you linked, so I can try to copy these files and play around with them.

One question: When you say statistics are larger, is it because more MCs were produced? Or because when including multiplicity 2 the statistics went up?

Should we try to move to generating IRFs? I don't mind playing around with that... I guess Max will be happy to help!

Best,
Tarek

@orelgueta (Collaborator, Author)

> One question: When you say statistics are larger, is it because more MCs were produced? Or because when including multiplicity 2 the statistics went up?

Not entirely sure. Without digging into the previous file and testing the cuts, I can't answer whether that file already included only >4-telescope multiplicity or not. That might be the case, because the Prod3b file is significantly smaller (3.3 GB vs. 14 GB).
The new files are the same size regardless of the multiplicity cut, since they contain the same events, just with different cut values (as far as I can tell).

> Should we try to move to generating IRFs? I don't mind playing around with that... I guess Max will be happy to help!

Yes, producing IRFs would be good. It would also help answer the question regarding the number of events. You can start with that if you wish, and I will do the 2-3 things still missing in the regression/classification.

@orelgueta (Collaborator, Author)

Added a few more variables to improve performance. We have several variables that are given per telescope, like size, distance, crossing, asymmetry, etc. So far we only used the average of those values across telescopes, but the shape of each distribution carries information too. To add some of that information, I also included the median and the standard deviation of each per-telescope distribution (sizes, distances, crossings, etc.). This resulted in a significant improvement in performance.
(I tried adding the entire set of per-telescope values as well, but that reduces performance; I'll explain if anyone is actually interested.)
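The feature-building step described above can be sketched as follows; the variable names and numbers are illustrative, not the actual branch names in the files.

```python
import numpy as np
import pandas as pd

def distribution_features(per_tel):
    """Summarise each per-telescope variable's distribution across the
    triggered telescopes with its mean, median and standard deviation."""
    feats = {}
    for name, values in per_tel.items():
        arr = np.asarray(values, dtype=float)
        feats[f"{name}_mean"] = arr.mean()
        feats[f"{name}_median"] = np.median(arr)
        feats[f"{name}_std"] = arr.std()
    return feats

# One event seen by three telescopes (values are made up).
event = {"size": [1200.0, 800.0, 950.0], "dist": [0.4, 0.6, 0.5]}
row = pd.Series(distribution_features(event))
```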

Below I put plots for the comparison of scores of each model,

  • old_features: before adding these variables (essentially only the averages of the distributions).
  • dist_characteristics: all characteristics (average, median and std).
  • no_average: same as dist_characteristics without the average.
  • no_me: same as dist_characteristics without the median.
  • no_std: same as dist_characteristics without the std.

[image: scores_features_1]

The model with all characteristics is clearly the best, so we will stick with it.
The distributions of the true angular error vs. the predicted one are shown below.

[image: dist_characteristics_predict_dist]

The confusion matrix is also significantly better, and I think we can now clearly support 3 event types.

[image: dist_characteristics_confusion_matrix_n_types_3]
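A confusion matrix of this kind can be reproduced schematically: bin the true and the predicted angular errors into equal-statistics classes and cross-tabulate. All numbers below are synthetic; this is a sketch, not the repository's plotting code.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
true_err = rng.lognormal(-2.5, 0.5, 5000)            # true angular error
pred_err = true_err * rng.lognormal(0.0, 0.2, 5000)  # imperfect regressor

# Equal-statistics tertiles for both axes, then a row-normalised matrix.
true_type = pd.qcut(true_err, q=3, labels=[1, 2, 3])
pred_type = pd.qcut(pred_err, q=3, labels=[1, 2, 3])
cm = pd.crosstab(true_type, pred_type, normalize="index")
```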

@TarekHC (Collaborator) commented May 27, 2022

Just for future reference: the plots shown from Feb 11, 2021 were artificially good, because we were adding too much information to the training (if I remember correctly, we were using point-like gammas for training? Can't remember...).

We will update these plots and perhaps close the issue.

@orelgueta (Collaborator, Author)

That is true, the plots here were made using point-like gammas. I am not sure I would call this adding too much information, but indeed we most likely cannot reach this level of performance with diffuse gammas.
