[BUG] Job metrics are appended into a list instead of overwritten #163

Open
shazraz opened this issue May 21, 2020 · 13 comments
Labels
bug Something isn't working

Comments

@shazraz

shazraz commented May 21, 2020

When logging metrics using foundations.log_metric(), metrics logged for the same job across epochs are appended to a list instead of overwriting the previous value.

This is caused by JobMetricsConsumer, which pushes every logged value onto a Redis list:

    job_id = message['job_id']
    key = 'jobs:{}:metrics'.format(job_id)
    metric_key = message['key']
    metric_value = message['value']
    value = (timestamp, metric_key, metric_value)
    self._redis.rpush(key, serialize(value))

This makes the metrics very hard to read on the UI.

[screenshot of the jobs table showing metric values rendered as long lists]

The user should see only the latest metric in the UI. @Rayhane-mamah can you please comment?
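
For contrast, a minimal sketch of an overwriting store (hypothetical, not the current implementation): using a Redis hash keyed by metric name means a repeated log_metric call replaces the previous value instead of appending to a list.

    # Hypothetical variant of the consumer body above:
    job_id = message['job_id']
    key = 'jobs:{}:latest_metrics'.format(job_id)   # illustrative key name
    metric_key = message['key']
    metric_value = message['value']
    # hset overwrites the field if it already exists, so only the latest value is kept
    self._redis.hset(key, metric_key, serialize((timestamp, metric_value)))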

@shazraz added the bug label on May 21, 2020
@ekhl
Contributor

ekhl commented May 21, 2020

Proposal:
Add a default parameter that indicates whether the function appends to the existing value, i.e. change from log_metric(key, value) to log_metric(key, value, overwrite=False).

Pros: backwards compatible, and the user gets to choose the behaviour if they want the existing value to be overwritten.
Con: potentially not the default behaviour users want (do we have any other user feedback?)

Alternative considerations:

  1. using an alternative default parameter like "append=True" or "replace=False"
  2. make the default behaviour to overwrite existing values, with append as the non-default behaviour (requires minor version change/breaks backwards compatibility)
  3. don't change the signature and do not offer control to the user; make the function always overwrite
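
A rough sketch of the main proposal (hypothetical; the real log_metric internals differ and the helper names below are placeholders):

    def log_metric(key, value, overwrite=False):
        # overwrite=False keeps today's behaviour: values accumulate in a list.
        # overwrite=True would replace any previously logged value for this key.
        message = {
            'job_id': _current_job_id(),      # placeholder for however the job id is resolved
            'key': key,
            'value': value,
            'overwrite': overwrite,
        }
        _publish_metric_message(message)      # placeholder; the consumer decides append vs. replace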

@kyledef
Contributor

kyledef commented May 22, 2020

I'm in favour of keeping backward compatibility, as we haven't yet figured out a way to communicate changes between versions to users; keeping the existing behaviour avoids adding to that communication burden.

Two considerations:

  • I like overwrite=False
  • I also like the reverse ordering, i.e. array.insert(0, values), so the latest value comes first (see the sketch below)
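
A minimal sketch of the reverse-ordering idea (hypothetical): prepending on the Redis side keeps the newest value at the head of the list, so a reader that takes only the first element sees the latest metric.

    # Hypothetical one-line change in JobMetricsConsumer:
    # lpush instead of rpush puts the newest (timestamp, key, value) tuple first.
    self._redis.lpush(key, serialize(value))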

@Dessa-ColeClifford

I am also in favor of keeping backwards compatibility. In certain situations, having the ability to see all historical values would be valuable, so removing the functionality may just force us to bring it back later after a user complaint.

As a thought, would it be possible to store all values in the backend but serve them to the user based on a frontend setting? Storing all values on the backend would also allow visualization of training and validation curves across models later on.
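
As a rough illustration of that idea (hypothetical helper, not existing code): keep the full history in the backend and reduce it to the latest value per metric key only when the frontend asks for the "latest only" view.

    def latest_metrics(entries):
        """Reduce stored (timestamp, metric_key, metric_value) tuples to the
        most recent value per metric key."""
        latest = {}
        for timestamp, metric_key, metric_value in entries:
            previous = latest.get(metric_key)
            if previous is None or timestamp >= previous[0]:
                latest[metric_key] = (timestamp, metric_value)
        return {k: v for k, (_, v) in latest.items()}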

@ekhl
Contributor

ekhl commented May 22, 2020

@Dessa-ColeClifford I like that idea. We don't have to modify the SDK in that case. Were you thinking of a toggle of sorts on a per-column basis for metrics?

@Dessa-ColeClifford

@ekhl A per-column toggle would be the quickest implementation that delivers all required functionality to close this bug. However, a toggle only hides, rather than fixes, the complaint "This makes the metrics very hard to read on the UI". In the toggle state that shows the list, we should still re-style the list popup.

I can imagine a more complicated scenario where a cell with "complex data" (which would still need to be clearly defined) has an on-hover and/or on-click component that allows for visualizations (e.g. value over time) or further data analysis. But this shouldn't be treated as part of the described bug, due to the size of the work and the design required.

@shazraz
Author

shazraz commented May 25, 2020

As a thought, would it be possible to store all values in the backend but serve them to the user based on a frontend setting? Storing all values on the backend would also allow visualization of training and validation curves across models later on.

How about if we continue to store all metrics in the backend but only render the most recent metric in the UI? I think this would be the least-effort path that maintains backwards compatibility.

If the user wants access to all of the metrics for plotting purposes, a safe assumption is that they would want programmatic access. In this case, they can retrieve those metrics using get_metrics_for_all_jobs or get_metric. Was there a use-case defined for rendering all values in the UI when we originally went down this path?
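
For illustration, programmatic retrieval might look like this (assuming get_metrics_for_all_jobs behaves as named above and returns a tabular result; exact signature and columns may differ):

    import foundations

    # 'my_project' and 'my_job_id' are placeholders.
    all_metrics = foundations.get_metrics_for_all_jobs('my_project')
    job_metrics = all_metrics[all_metrics['job_id'] == 'my_job_id']  # assumes a pandas-style frame
    print(job_metrics)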

@Dessa-ColeClifford

How about if we continue to store all metrics in the backend but only render the most recent metric in the UI? I think this would be the least-effort path that maintains backwards compatibility.

I agree with this. A toggle can be added in later.

The use-case is as simple as knowing the history of how the model did over time. The debate over what we displayed went back and forth between "it's too much information" and "but we need to know historic values".

@shazraz
Author

shazraz commented May 25, 2020

I'm actually opposed to adding a toggle on a per-column basis because it may clutter the UI and add additional complexity. If a user wants to view historical values, does it make sense to limit this to retrieving them via the SDK, as I described above?

My thought is that providing a list of metrics in the UI adds little value other than being able to quickly eyeball the metrics. I would love some user feedback on this from the ML team, @Rayhane-mamah.

@Dessa-ColeClifford

The requirement to easily view this information was something that was user-validated and discussed at length before implementation.

If we want to limit this to the SDK, I am fine with that, as long as we consult more than those immediately involved (e.g. the MLEs on the Risk project), so as not to create a localized regression in functionality.

Sorry for pushing on this; it just feels like we are overfitting at the moment, and I want to make sure we validate that the general sentiment has truly changed before updating something.

@shazraz
Author

shazraz commented May 25, 2020

This makes sense. Let me ping the ML team with this ticket and see if we can get some feedback.

@mohammedri

mohammedri commented May 25, 2020

Excuse the awful drawing, but is this what you were thinking @Dessa-ColeClifford ?

[hand-drawn mockup of the proposed metric display]

If so, I agree with Cole: this was something we discussed heavily before and it was always favoured. If we give the ability to expand the column view, which epoch run shows up in the graph above?

@SebastianGergen

I think a simple solution would be a (single) checkbox in the GUI to "show only latest value".

If the values/metrics are also tracked with TensorBoard, the user can visualize the evolution of a metric there; however, in Atlas there is no option to plot this evolution of a metric yet, am I right? (The plot in Atlas lets me compare model parameters and final, i.e. single-valued, metric results, but does not show/visualize all appended values of a metric, right?) If I were not using TensorFlow/TensorBoard, I would miss such a plot of the evolution of a metric.

Thinking about it now, I would also consider an additional tab on the "Details For Job" page (currently: Logs, Artifacts) called "Metric plots", where lists of metric values are plotted. If a user wants a metric to be plotted, that could be indicated by a flag in the foundations.log_metric call.

If I understand it correctly, this is what @Dessa-ColeClifford and @mohammedri already indicated here as well.

@mohammedri

With #165 we made the decision to only show the latest metric in the UI column, based on @pippinlee's research; I am curious whether we can easily show all the metrics recorded so far in the detailed results page (where the artifacts and logs are as well).
