Create map for conversation count #135

Open
hvgazula opened this issue Dec 25, 2022 · 3 comments
Comments

@hvgazula
Collaborator

@VeritasJoker How do you propose we handle this condition:

elif args.subject == "676" and "blenderbot" in args.embedding_type:
in

CONVERSATIONS_MAP = {
    "podcast": dict.fromkeys(
        SUBJECTS["podcast"],
        1,
    ),
    "tfs": dict(zip(["625", "676", "7170", "798"], [54, 78, 24, 15])),
}

if we go the route of https://github.com/hassonlab/247-pickling/blob/main/scripts/tfspkl_config.py#L11-L30

@VeritasJoker
Contributor

Huh, good question. Is it possible to store two values for one patient key in the dict? I think the 2 convos for 676 fail only for specific groups of models (seq-2-seq and per-utterance MLM models), so could we set that up specifically?
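One way to get "two values for one key" is to keep a default count per subject and a separate table of overrides keyed by subject and embedding type. The following is a minimal sketch, not the project's code: `expected_conversations`, `OVERRIDES`, the embedding-type strings, and the override value `76` are all hypothetical placeholders; only the subject IDs and default counts come from the map quoted above.

```python
# Subject IDs and default counts as quoted in the thread.
TFS_SUBJECTS = ["625", "676", "7170", "798"]
TFS_CONV_COUNTS = [54, 78, 24, 15]

# zip pairs each subject with its own count.
DEFAULT_CONVS = dict(zip(TFS_SUBJECTS, TFS_CONV_COUNTS))

# Hypothetical overrides: (subject, substring of embedding type) -> count.
# The value 76 is a placeholder for illustration, not a number from the thread.
OVERRIDES = {("676", "blenderbot"): 76}


def expected_conversations(subject, embedding_type, overrides=OVERRIDES):
    """Return the expected conversation count, preferring a matching override."""
    for (subj, pattern), count in overrides.items():
        if subj == subject and pattern in embedding_type:
            return count
    return DEFAULT_CONVS[subject]
```

With this layout the special case lives in data rather than in an `elif` branch, so adding another model group means adding one entry to the overrides table.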

Another, simpler method I can think of is to not assert the number of conversations at all: if the conversation count does not match the patient, print a message but still do the concatenation anyway.

@hvgazula
Collaborator Author

I'm not too fond of the latter as it will not tell us if, say, some conversations (jobs) were still pending (worst-case).
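A possible middle ground between the two positions is a check that raises by default but can be relaxed to a warning explicitly. This is only a sketch under assumptions: `check_conversation_count` and its `strict` flag are hypothetical names, not part of the existing scripts.

```python
import warnings


def check_conversation_count(n_found, n_expected, strict=True):
    """Return True if the counts match; otherwise raise (strict) or warn."""
    if n_found == n_expected:
        return True
    msg = f"expected {n_expected} conversations, found {n_found}"
    if strict:
        # Default: fail loudly, e.g. when some jobs are still pending.
        raise RuntimeError(msg)
    # Opt-in relaxed mode: report the mismatch and let the caller concatenate.
    warnings.warn(msg)
    return False
```

Keeping `strict=True` as the default means a shortfall from pending jobs still stops the run unless someone deliberately opts out.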

@zkokaja
Contributor

zkokaja commented Jan 5, 2023

Seems like we can do away with the num_convs check now that we are indexing. In addition, we can perform a "merge test" on the concatenated embeddings and base_df to ensure that every embedding aligns with a row in base_df, and then print a warning if any conversations don't have embeddings.
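A rough sketch of what such a merge test could look like with pandas. The function name `merge_test` and the column names `conversation_id` and `index` are assumptions for illustration; the real frames may use different keys.

```python
import warnings

import pandas as pd


def merge_test(base_df, emb_df, conv_col="conversation_id", idx_col="index"):
    """Raise if any embedding row has no match in base_df; warn about
    conversations that have no embeddings and return their ids."""
    keys = [conv_col, idx_col]
    # Left-join embeddings onto the base keys; `_merge` records the match status.
    merged = emb_df[keys].merge(
        base_df[keys].drop_duplicates(), on=keys, how="left", indicator=True
    )
    unmatched = merged[merged["_merge"] == "left_only"]
    if not unmatched.empty:
        raise ValueError(f"{len(unmatched)} embedding rows do not align with base_df")

    # Conversations present in the base frame but absent from the embeddings.
    missing = sorted(set(base_df[conv_col]) - set(emb_df[conv_col]))
    if missing:
        warnings.warn(f"conversations without embeddings: {missing}")
    return missing


# Toy data: conversation 3 has no embeddings, so it is reported as missing.
base_df = pd.DataFrame({"conversation_id": [1, 1, 2, 3], "index": [0, 1, 0, 0]})
emb_df = pd.DataFrame({"conversation_id": [1, 1, 2], "index": [0, 1, 0]})
missing = merge_test(base_df, emb_df)
```

Misaligned embeddings are treated as a hard error here, while conversations without embeddings only trigger a warning, which matches the split suggested in the comment above.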

@zkokaja zkokaja pinned this issue Feb 2, 2023
@zkokaja zkokaja assigned hvgazula and VeritasJoker and unassigned VeritasJoker Feb 2, 2023
3 participants