[Bug, Sim plots] Plots don't work when Gaussian Process and > 200 training samples #1268

Open
trentmc opened this issue Jun 21, 2024 · 2 comments
Assignees
Labels
Priority: Mid Type: Bug Something isn't working

Member

trentmc commented Jun 21, 2024

The bug / how to reproduce

Set up a simulation run, where:

  • aimodel_ss.approach: ClassifGaussianProcess or RegrGaussianProcess
  • aimodel_data_ss.autoregressive_n: 2, and
  • aimodel_data_ss.max_n_train: 500

Then run the simulation: pdr sim my_ppss.yaml. It runs successfully.

Then, in a separate console, run the plots: pdr sim_plots. It runs and the browser window pops open with "Updating" in the tab title bar, but no plots ever populate. This is the problem.

However, if I set max_n_train: 100, the plots work. I can inspect the model response plots, and they clearly show nonlinear response surfaces (good).

The same problem occurs for approach: ClassifXgboost or RegrXgboost.

Towards a solution

Datapoint: Both Gaussian Process and Xgboost models take up significant memory: 100x+ more than the linear models that we've been using so far.

Datapoint: The sim engine pickles the model. The plots then load it for the model response surface plots.
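A back-of-envelope sketch of why the footprint might grow so quickly with max_n_train. It assumes the Gaussian Process keeps a dense n_train x n_train matrix of 64-bit floats (common in GP implementations, but not measured on this codebase), and compares it with a linear model's coefficient vector. The function names and the feature count of 10 are hypothetical, chosen only for illustration.

```python
def gp_matrix_bytes(n_train: int, bytes_per_float: int = 8) -> int:
    """Bytes for one dense n_train x n_train float matrix (assumed GP storage)."""
    return n_train * n_train * bytes_per_float


def linear_coef_bytes(n_features: int, bytes_per_float: int = 8) -> int:
    """Bytes for n_features coefficients plus one intercept."""
    return (n_features + 1) * bytes_per_float


# Compare the assumed GP footprint at the sample counts mentioned in this issue.
for n_train in (100, 200, 500):
    ratio = gp_matrix_bytes(n_train) / linear_coef_bytes(10)  # 10 features: hypothetical
    print(f"n_train={n_train}: {gp_matrix_bytes(n_train)} bytes, ~{ratio:.0f}x linear")
```

Under these assumptions the matrix grows quadratically, so going from 100 to 500 training samples multiplies its size by 25.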

Hypothesis: the model's size footprint is too large for the plots

Where is the problem occurring? Possibilities include:

  • Cand A: The model isn't pickled properly (due to size); un-pickling fails and everything freezes
  • Cand B: Too much bandwidth is needed to transmit the model data from the pdr sim_plots server to the Plotly / Dash process running in the browser
  • Cand C: The model made it to the browser, but it takes too much memory for a browser process to handle
  • Cand D: something else?
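One way to start narrowing down the candidates, Cand A in particular, is to measure the pickled size and the dump/load times of the model outside the plotting app. The sketch below uses only the standard library; pickle_roundtrip_stats is a hypothetical helper, and the nested list is just a stand-in for the real model object, which would have to be loaded from wherever the sim engine writes it.

```python
import pickle
import time


def pickle_roundtrip_stats(obj):
    """Pickle and un-pickle obj; return (stats dict, restored object)."""
    t0 = time.perf_counter()
    blob = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    t1 = time.perf_counter()
    restored = pickle.loads(blob)
    t2 = time.perf_counter()
    stats = {"n_bytes": len(blob), "dump_s": t1 - t0, "load_s": t2 - t1}
    return stats, restored


# Stand-in payload (NOT a real model): a 500 x 500 grid of floats.
stand_in = [[float(i * j) for j in range(500)] for i in range(500)]
stats, restored = pickle_roundtrip_stats(stand_in)
print(stats)
```

If the round trip is fast and the size is modest for the real model, Cand A becomes less likely and attention can shift to Cand B through Cand D.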

TODO

Either fix the problem, or have a workaround for when it happens.

  • If a workaround, the outcome would be: all plots render properly, except that the model response is static (vs. interactive, based on clicking the var impacts bars)
@trentmc trentmc added Priority: Mid Type: Bug Something isn't working labels Jun 21, 2024
@calina-c calina-c self-assigned this Jul 1, 2024
Contributor

calina-c commented Jul 2, 2024

An interesting update: I ran the process with n_iter=50. The plots take a long time to appear, but they do eventually appear. I haven't identified the source of this lag yet, but I would like to do some benchmarking and investigating before today's sprint planning.

Member Author

trentmc commented Jul 2, 2024

If I were to fix this issue, I'd do something like:

If the model is xgboost or gaussian process, then don't pass it on to the plotter, because that asks the browser VM to hold far more in memory.
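A minimal sketch of that guard. The four approach names come from this issue; HEAVY_APPROACHES, model_for_plots, and "SomeLinearApproach" are hypothetical names for illustration, not existing pdr-backend code.

```python
# Approaches named in this issue as too heavy to hand to the plots.
HEAVY_APPROACHES = {
    "ClassifGaussianProcess",
    "RegrGaussianProcess",
    "ClassifXgboost",
    "RegrXgboost",
}


def model_for_plots(approach: str, model):
    """Return the model to pass to the plotter, or None to withhold it."""
    if approach in HEAVY_APPROACHES:
        return None  # plotter would fall back to a static model response
    return model


dummy_model = object()  # stand-in for a trained model
print(model_for_plots("RegrGaussianProcess", dummy_model))  # None
print(model_for_plots("SomeLinearApproach", dummy_model) is dummy_model)  # True
```

The plotting side would then need to treat None as "no interactive model response".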

As for showing sweeps info, we have these options:

  • Idea A: if xgboost or gaussian process, don't show sweeps
  • Idea B: always pre-calculate all sweeps for all 1d and 2d variable pairs. But that gets very expensive, and is rarely needed
  • Idea C: add new parameter aimodel_ss.show_sweeps.
    • If False: sweeps aren't shown
    • If True:
      • If not xgboost or gaussian process, then calculate and show sweeps on the fly, like status quo
      • If xgboost or gaussian process, then pre-calculate the sweeps for each 1d and 2d variable pair
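To make the options concrete, here is a hedged sketch of the decision logic in Idea C, plus a count of how many sweeps Idea B would pre-calculate. SweepMode, sweep_mode, n_sweeps, and "SomeLinearApproach" are hypothetical names; only aimodel_ss.show_sweeps and the four approach names come from this thread.

```python
from enum import Enum
from math import comb

HEAVY_APPROACHES = {
    "ClassifGaussianProcess",
    "RegrGaussianProcess",
    "ClassifXgboost",
    "RegrXgboost",
}


class SweepMode(Enum):
    HIDDEN = "hidden"
    ON_THE_FLY = "on_the_fly"
    PRECALCULATED = "precalculated"


def sweep_mode(approach: str, show_sweeps: bool) -> SweepMode:
    """Idea C: pick how sweeps are handled for a given approach."""
    if not show_sweeps:
        return SweepMode.HIDDEN
    if approach in HEAVY_APPROACHES:
        return SweepMode.PRECALCULATED
    return SweepMode.ON_THE_FLY  # status quo


def n_sweeps(n_vars: int) -> int:
    """Idea B cost: one 1d sweep per variable plus one 2d sweep per pair."""
    return n_vars + comb(n_vars, 2)


print(sweep_mode("RegrXgboost", show_sweeps=True))
print(n_sweeps(4))  # 4 one-dimensional + 6 two-dimensional sweeps
```

The sweep count grows quadratically with the number of input variables, which is the expense Idea B refers to.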

I recommend (A) for now. (B) is not good. And it's not worth the effort & extra complexity to do (C), given that xgboost and gaussian processes aren't doing that well yet.
