-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a section to show model average score for all eval data #4050
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is cool! Can you also include a screenshot of what this looks like?
isExpanded: boolean; | ||
goldenStatVars: string[]; | ||
overrideStatVars: EmbeddingObject[]; | ||
setScore: (modelName: string, sentence: string, score: number) => void; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add a comment about what score we're setting here? maybe also rename to onScoreUpdated
(here and everywhere else)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
static/js/apps/nl_eval/app.tsx
Outdated
|
||
import { stringifyFn } from "../../utils/axios"; | ||
import { QuerySection } from "./query_section"; | ||
import { EmbeddingObject } from "./util"; | ||
|
||
function OverallScoreTable(props: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have a separate file for this component?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
static/js/apps/nl_eval/app.tsx
Outdated
|
||
import { stringifyFn } from "../../utils/axios"; | ||
import { QuerySection } from "./query_section"; | ||
import { EmbeddingObject } from "./util"; | ||
|
||
function OverallScoreTable(props: { | ||
data: Record<string, Record<string, number>>; | ||
count: number; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a comment about what this is a count of? or have a better name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
static/js/apps/nl_eval/app.tsx
Outdated
|
||
import { stringifyFn } from "../../utils/axios"; | ||
import { QuerySection } from "./query_section"; | ||
import { EmbeddingObject } from "./util"; | ||
|
||
function OverallScoreTable(props: { | ||
data: Record<string, Record<string, number>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment what the keys and values are?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
static/js/apps/nl_eval/app.tsx
Outdated
useEffect(() => { | ||
// Check if data is complete. If so, set mark to an empty string so that no | ||
// new render is triggered. | ||
if (Object.values(Object.values(props.data)[0]).length === props.count) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is Object.values(props.data)
always going to be non empty? If it can be empty, should check before accessing the first element
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed anymore as the logic is moved to App where each model should have the same length of values as the eval dataset.
static/js/apps/nl_eval/app.tsx
Outdated
return; | ||
} | ||
// Update mark to be a new timestamp frequently if data is not complete | ||
const timerId = setInterval(() => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of the mark and timer, could this work:
- add a new
completedOverallScore
state in App component - in
handleUpdateOverallScore
, check if scores are complete. If complete, updatecompletedOverallScore
- pass
completedOverallScore
toOverallScoreTable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a really good suggestion. Works perfect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review, it's very helpful!
static/js/apps/nl_eval/app.tsx
Outdated
|
||
import { stringifyFn } from "../../utils/axios"; | ||
import { QuerySection } from "./query_section"; | ||
import { EmbeddingObject } from "./util"; | ||
|
||
function OverallScoreTable(props: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
static/js/apps/nl_eval/app.tsx
Outdated
|
||
import { stringifyFn } from "../../utils/axios"; | ||
import { QuerySection } from "./query_section"; | ||
import { EmbeddingObject } from "./util"; | ||
|
||
function OverallScoreTable(props: { | ||
data: Record<string, Record<string, number>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
static/js/apps/nl_eval/app.tsx
Outdated
|
||
import { stringifyFn } from "../../utils/axios"; | ||
import { QuerySection } from "./query_section"; | ||
import { EmbeddingObject } from "./util"; | ||
|
||
function OverallScoreTable(props: { | ||
data: Record<string, Record<string, number>>; | ||
count: number; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
isExpanded: boolean; | ||
goldenStatVars: string[]; | ||
overrideStatVars: EmbeddingObject[]; | ||
setScore: (modelName: string, sentence: string, score: number) => void; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
static/js/apps/nl_eval/app.tsx
Outdated
return; | ||
} | ||
// Update mark to be a new timestamp frequently if data is not complete | ||
const timerId = setInterval(() => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a really good suggestion. Works perfect.
static/js/apps/nl_eval/app.tsx
Outdated
useEffect(() => { | ||
// Check if data is complete. If so, set mark to an empty string so that no | ||
// new render is triggered. | ||
if (Object.values(Object.values(props.data)[0]).length === props.count) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed anymore as the logic is moved to App where each model should have the same length of values as the eval dataset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the changes!
Score for each model is passed from child component to the App. Use state in the parent component is very slow due to the large number of children components. Here uses a ref object to collect and a timer to check if all scores have been collected.
Also