Matching Engine Analytics: Percentage of users with availability data #311

vingkan · 2022-07-22T19:48:09Z

Goal

Calculate what percentage of users in a community have filled out their availability.

This will help us track whether people are eligible for schedule matches at all. Users currently have no way to edit their schedule, so there is no real data to test this feature with. That means you will need to write good unit tests to ensure that it works correctly when the real data arrives.

Definition of Done

Write a function that takes in a list of User objects and returns a tuple of two counts
- The first count in the tuple is the number of users who have availability data
- The second count in the tuple is the total number of users
Write unit tests for the above function
- Why a tuple? It will be more reliable to test the numerator and denominator, rather than the percentage
Integrate the above function into the matching pipeline
- Call your function in the display_internal_matching_metrics() task, which is used in the matching flow
- Add a line to log the percentage of users who have filled out their availability, using the tuple output
- Run the matching flow with run flow matching to make sure that it succeeds and logs the correct information
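
On the tuple question above, here is a small illustration of why comparing counts is a stricter check than comparing a percentage. The numbers are made up for the example and do not come from real data.

```python
# Hypothetical counts: the expected result and a buggy result that
# accidentally dropped most of the users before counting.
expected_counts = (50, 100)
wrong_counts = (1, 2)

# Both reduce to the same ratio, so a test on the percentage alone would pass.
same_ratio = wrong_counts[0] / wrong_counts[1] == expected_counts[0] / expected_counts[1]
assert same_ratio

# Comparing the tuples directly catches the difference.
assert wrong_counts != expected_counts
```

Integer tuples also compare exactly, so the tests do not need floating point tolerances.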

Code Pointers

Implementation and Tests

Write your own function (no need for a class) to calculate this tuple
You can create a new file pipeline/transform/schedule.py for your function
You can create a new file pipeline/transform/schedule_test.py for your unit tests
Similar to your core project, read examples from the codebase to see how to use functions, tests, and types

User Availability Data

Just like in your schedule match generator, you can check whether a user has availability data by accessing the schedule field of the User class:

butterfly/pipeline/types/user.py

Line 26 in f21cdc8

schedule: List[Availability] = field(default_factory=list)
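
A minimal sketch of how the function and its tests could look. The `User` and `Availability` classes here are stand-ins so the example runs on its own (the real ones live in the pipeline types), the name `count_users_with_availability` is only a suggestion, and it assumes that "has availability data" means a non-empty `schedule` list.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Availability:
    """Stand-in for the real Availability type; this field is an assumption."""
    day: str = "Monday"


@dataclass
class User:
    """Stand-in for the real User class; only the schedule field is modeled."""
    schedule: List[Availability] = field(default_factory=list)


def count_users_with_availability(users: List[User]) -> Tuple[int, int]:
    """Return (number of users with availability data, total number of users)."""
    # Assumption: a user has availability data when their schedule is non-empty.
    n_with_availability = sum(1 for user in users if len(user.schedule) > 0)
    return n_with_availability, len(users)


def test_no_users():
    assert count_users_with_availability([]) == (0, 0)


def test_no_users_have_availability():
    assert count_users_with_availability([User(), User()]) == (0, 2)


def test_some_users_have_availability():
    users = [User(), User(schedule=[Availability()]), User()]
    assert count_users_with_availability(users) == (1, 3)
```

The empty list case is worth keeping as its own test, since it is the one that produces a zero denominator later on.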

Pipeline Integration

This function display_internal_matching_metrics() is a Prefect task, which runs as part of the matching flow
It takes calculated metrics and then displays them by writing them to the pipeline logs.
You can get the list of users in the community from the MatchingOutput parameter by accessing output.users
Call your function, then log the percentage of users who have availability data
If you want, you can add a helper function to format a tuple into a percentage

butterfly/pipeline/load/display_metrics.py

Lines 27 to 38 in f21cdc8

@task
def display_internal_matching_metrics(
    output: MatchingOutput, metrics: MatchingMetrics
):
    """Task to display matching engine metrics."""
    logger = prefect.context.get("logger")
    proposed_matches_per_user = render_counts_per_user(
        output.users, metrics.n_proposed_matches_per_user
    )
    matched_user_emails = render_user_emails(output.users)
    logger.info(f"\nMatches Proposed per User:\n{proposed_matches_per_user}")
    logger.info(f"\nEmails of Matched Users:\n{matched_user_emails}")
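
For the optional helper mentioned above, one possible shape is sketched below. The name `format_percentage`, the output format, and the log message wording are all assumptions. The standard library logger stands in for the Prefect logger so the example runs on its own, and the hard-coded tuple stands in for the result of calling your function on output.users.

```python
import logging
from typing import Tuple


def format_percentage(counts: Tuple[int, int]) -> str:
    """Format a (numerator, denominator) tuple as a percentage string."""
    numerator, denominator = counts
    if denominator == 0:
        # Avoid dividing by zero when the community has no users.
        return "n/a (0 of 0)"
    return f"{100 * numerator / denominator:.1f}% ({numerator} of {denominator})"


# Stand-in for prefect.context.get("logger") inside the task.
logger = logging.getLogger("matching")

# Stand-in for the tuple returned by your function for output.users.
availability_counts = (1, 3)
logger.info(f"\nUsers with Availability Data:\n{format_percentage(availability_counts)}")
```

Keeping the raw counts in the log line next to the percentage makes it easier to sanity check the output when running the flow.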

rbrooks6 · 2022-07-29T13:29:44Z

@vingkan Emma is done with her core project now and is ready to start the stretch assignment! 🎉 What did you have in mind for this assignment in terms of details? Could you please add details or explain it to me so that I can add details to the issue?

vingkan · 2022-07-30T03:37:50Z

@rbrooks6 Thanks for the reminder! I updated this issue with more details. The instructions for #312 will be very similar, so if @emmadiamon finishes this task quickly, she can follow a similar approach for that task.

vingkan added the enhancement (New feature or request), pipeline (Related to the offline pipelines), and data (Related to data or types) labels Jul 22, 2022

vingkan assigned emmadiamon Jul 22, 2022

vingkan mentioned this issue Jul 22, 2022

Matching Engine Analytics: Top 3 days/times when users in the community are available #312

Open

vingkan changed the title ~~Matching Engine Analytics: Percentage of users with schedule data~~ Matching Engine Analytics: Percentage of users with availability data Jul 22, 2022
