[BUG] Job metrics are appended into a list instead of overwritten #163
Proposal:

Pros: backwards compatible; the user gets to choose the behaviour if they want the existing value to be overwritten.

Alternative considerations:
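The backwards-compatible option could be sketched as below. Note that the `overwrite` keyword and the in-memory `store` dict are hypothetical illustrations of the idea, not part of the current `foundations` API:

```python
# Hypothetical sketch: an opt-in overwrite flag on metric logging.
# `store` stands in for the real metrics backend; `overwrite` is an
# assumed parameter name, not an existing foundations API.
store = {}

def log_metric(key, value, overwrite=False):
    if overwrite:
        store[key] = [value]                     # replace the existing value(s)
    else:
        store.setdefault(key, []).append(value)  # current append behaviour

log_metric("accuracy", 0.8)
log_metric("accuracy", 0.9)                   # appended: [0.8, 0.9]
log_metric("accuracy", 0.95, overwrite=True)  # replaced: [0.95]
print(store["accuracy"])
```

Defaulting `overwrite=False` keeps today's append semantics for existing users, which is what makes the proposal backwards compatible.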
I'm in favour of keeping backward compatibility, as we haven't yet figured out a way to communicate changes between versions to users. This would reduce the need to communicate changes to end users. Two considerations:
I am also in favor of keeping backwards compatibility. In certain situations, having the ability to see all historical values would be valuable, so removing the functionality may just force us to bring it back later after a user complaint. As a thought, would it be possible to store all values in the backend but serve them to the user based on a frontend setting? Storing all values on the backend would also allow visualization of training and validation curves across models later on.
@Dessa-ColeClifford I like that idea. We don't have to modify the SDK in that case. Were you thinking of a toggle of sorts on a per-column basis for metrics?
@ekhl A toggle per column would be the quickest implementation that delivers all required functionality to close this bug. However, this toggle would only hide, not fix, the problem noted in the comment "This makes the metrics very hard to read on the UI": in the toggle state that shows the list, we should still re-style the list popup. I can imagine a more complicated scenario where a cell with "complex data" (which would still need to be clearly defined) has an on-hover and/or on-click component that allows for visualizations (e.g. value over time) or further data analysis. But this shouldn't be treated as part of the described bug, due to the size of the work and the design required.
How about if we continue to store all metrics in the backend but only render the most recent metric in the UI? I think this would be the least-effort path that maintains backwards compatibility. If the user wants access to all of the metrics for plotting purposes, a safe assumption is that they would want programmatic access. In this case, they can retrieve those metrics via the SDK.
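That least-effort path could look like the sketch below. The in-memory `history` dict and the `latest_metrics` helper are hypothetical stand-ins for the real metrics backend and UI layer:

```python
# Backend keeps the full history; the UI renders only the last value.
history = {"val_loss": [0.5, 0.4, 0.3], "accuracy": [0.7, 0.8]}

def latest_metrics(metrics):
    """Project each metric's history down to its most recent value."""
    return {name: values[-1] for name, values in metrics.items()}

print(latest_metrics(history))  # {'val_loss': 0.3, 'accuracy': 0.8}
```

Nothing is deleted from the backend, so programmatic access to the full lists remains possible.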
I agree with this. A toggle can be added later. The use case is as simple as knowing the history of how the model did over time. The debate over what we displayed went back and forth between "it's too much information" and "but we need to know historic values".
I'm actually opposed to adding a toggle on a per-column basis, because it may clutter the UI and add complexity. If a user wants to view historical values, does it make sense to limit this to retrieving them via the SDK, as I described above? My thought is that providing a list of metrics in the UI adds little value other than being able to quickly eyeball the metrics. Would love some user feedback on this from the ML team @Rayhane-mamah
The requirement to easily view this information was something that was user-validated and discussed at length before implementation. If we want to limit it to the SDK, I am fine with that as long as we hear from more people than those immediately involved (e.g. MLEs on the Risk project), so as not to create a localized regression in functionality. Sorry for pushing on this; it just feels that we are overfitting at the moment, and I want to make sure the general sentiment has truly changed before updating something.
This makes sense. Let me ping the ML team with this ticket and see if we can get some feedback. |
Excuse the awful drawing, but is this what you were thinking, @Dessa-ColeClifford? If so, I agree with Cole: this was something we discussed heavily before, and it was always favoured. If we give the ability to expand the column view, which epoch's run shows up in the graph above?
I think a simple solution would be a (single) checkbox in the GUI to "show only latest value".

If the values/metrics are also tracked with TensorBoard, the user can visualize the evolution of a metric there; however, in Atlas there is no option to plot this evolution of a metric yet, am I right? (The plot in Atlas allows me to compare model parameters and final, i.e. single-valued, metric results, but does not show/visualize all appended values of a metric, right?) If I did not use TensorFlow (TensorBoard), I would miss such a plot of the evolution of a metric.

Thinking about it now, I would also consider an additional tab in the "Details For Job" page (currently: Logs, Artifacts): a "metric plots" tab where lists of metric values are plotted. If a user wants a metric to be plotted, that could be indicated by a flag in the foundations.log_metric call. If I understand correctly, this is what you @Dessa-ColeClifford and @mohammedri already indicated here as well.
With #165 we made the decision to only show the latest metric in the UI column, based on @pippinlee's research. I am curious whether we can easily show all the metrics recorded so far on the detailed results page (where the artifacts and logs are as well).
When logging metrics using `foundations.log_metric()`, metrics for the same job across epochs are appended into a list instead of being overwritten. This is because of the JobMetricsConsumer.
This makes the metrics very hard to read on the UI.
The user should see only the latest metric in the UI. @Rayhane-mamah can you please comment?
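For reference, a minimal sketch of the reported behaviour. The dict-based `store` is a simplified stand-in for what JobMetricsConsumer does, not its actual implementation:

```python
store = {}

def log_metric(job_id, key, value):
    # Each call appends to a list keyed by (job, metric) instead of
    # replacing the previous value -- the behaviour reported above.
    store.setdefault((job_id, key), []).append(value)

for loss in [0.5, 0.4, 0.3]:          # one value logged per epoch
    log_metric("job-1", "val_loss", loss)

print(store[("job-1", "val_loss")])   # [0.5, 0.4, 0.3] rather than just 0.3
```

Under the proposed fix, the UI would show only the final element (`0.3`) while the full list stays available.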