Add a heuristic to the working-on ping #179

Open
TheOnlyMrCat opened this issue Nov 13, 2023 · 6 comments
Labels
enhancement update an existing command or cog for some new functionality

Comments

@TheOnlyMrCat
Member

TheOnlyMrCat commented Nov 13, 2023

What is the status quo?

Currently, the bot pings two random server members at 5PM asking them what they're working on in an attempt to foster discussion and increase the activity of members.

In practice, this has seen mixed results: most of the time, the pings go completely unnoticed or are ignored. Sometimes they land on an active server member who excitedly takes the opportunity to share one of their current projects, and occasionally they land on a non-active member who brings a novel and interesting project to the table.

Unfortunately there's no algorithm we can write to tell if any non-active members have interesting projects they're likely to share, but we might be able to increase the response rate with heuristics somewhat.

What do we want out of a heuristic?

  • We want to continue the module's original purpose of bringing inactive members back to the discussion, even if only for a short amount of time
  • We want to improve the chance that one of the people pinged has something interesting to share
  • We want to avoid pinging the same few people over and over again, at least not without a significant gap in between pings

Some heuristics that have come up in discussion on Discord:

  • Choose one member randomly as we're already doing, and have a heuristic only for the other slot
  • Choose a member who has been active in the past week and not been chosen by the Ping for at least a month
  • Choose a member who has been active in the past 3 months, but has not been active in the past 30 days
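
To make the first two suggestions concrete, here is a minimal sketch that fills one slot uniformly at random and the other from members active in the past week who have not been pinged for a month. The function names, the `last_active` and `last_pinged` dictionaries, and the 7-day and 30-day windows are assumptions for illustration, not the bot's existing code.

```python
import random
from datetime import datetime, timedelta


def heuristic_candidates(members, last_active, last_pinged, now):
    """Members active in the past 7 days and not pinged in the past 30 days.

    last_active / last_pinged are hypothetical dicts of member id -> datetime.
    A member with no recorded ping counts as eligible.
    """
    week_ago = now - timedelta(days=7)
    month_ago = now - timedelta(days=30)
    return [
        m for m in members
        if m in last_active
        and last_active[m] >= week_ago
        and (m not in last_pinged or last_pinged[m] <= month_ago)
    ]


def pick_ping_targets(members, last_active, last_pinged, now, rng=random):
    """Return up to two distinct member ids: one random, one heuristic."""
    if not members:
        return []
    random_pick = rng.choice(members)
    pool = [
        m for m in heuristic_candidates(members, last_active, last_pinged, now)
        if m != random_pick
    ]
    if not pool:
        # Fall back to a second random member when the heuristic finds nobody.
        pool = [m for m in members if m != random_pick]
    return [random_pick] + ([rng.choice(pool)] if pool else [])
```

Keeping the candidate filter separate from the random choice makes it easy to swap in a different heuristic for the second slot later.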

Prior discussion:

@andrewj-brown added the enhancement label on Nov 14, 2023
@JamesDearlove
Member

I had a brief discussion last year in the Discord about this as well, link to UQCS Discord message

@bradleysigma
Contributor

So I've been thinking about how to go about this, specifically for how to keep track of who's been recently active, without putting a whole lot of extra stress on the bot. Here's what I've come up with:

  • In the database, have a table of users (by id) and months, with each entry representing that user being active in that month.
  • When the bot starts up, load the data from the table, with one set for the current month, and one set for the three? previous months. This will give a time range of three to four months, depending on how far through the month we currently are. Also create an empty set for newly seen members.
  • Whenever somebody posts a message, see if they are in the current month or newly seen sets. If not, add them to the newly seen set.
  • Every day at 5:00pm, choose a random member to ping from the current month intersect previous months sets. This will mean that the member has posted at least twice, and on at least two different days.
  • Every day at, say, 3:00am, add everyone from the newly seen set to the database and to the current month set, then reset it to a new empty set. If it is the first day of the month, also clear out all entries from the database that are now four months old.
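
The steps above could be sketched roughly as follows. This is an illustration under assumptions, not an implementation for the bot: the class name and the `save_rows` callback (standing in for the database write) are hypothetical, and the first-of-the-month rollover is left out for brevity.

```python
class MonthlyActivityTracker:
    """In-memory view of who has posted this month and in previous months."""

    def __init__(self, current_month, previous_months):
        # Both arguments are iterables of user ids loaded from the database.
        self.current_month = set(current_month)
        self.previous_months = set(previous_months)
        self.newly_seen = set()

    def on_message(self, user_id):
        # Set lookups only; nothing touches the database here.
        if user_id not in self.current_month and user_id not in self.newly_seen:
            self.newly_seen.add(user_id)

    def ping_candidates(self):
        # Active this month AND in an earlier month.
        return self.current_month & self.previous_months

    def nightly_flush(self, save_rows):
        # save_rows is a hypothetical callable that persists a list of ids
        # against the current month.
        save_rows(sorted(self.newly_seen))
        self.current_month |= self.newly_seen
        self.newly_seen = set()
```

Note that a member first seen today only becomes a ping candidate after the next nightly flush, which matches the "at least two different days" property described above.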

@andrewj-brown
Member

A three-set system seems vastly more complicated than just storing (member, last_seen) and filtering for members whose last_seen falls between one and three months ago. If you are attached to the heuristic requiring multiple days, you could store (member, last_seen, days_posted), and maintain it by only incrementing days_posted if the last_seen day is earlier than the current day. Are there other advantages to the three-set system that I haven't thought of?
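
As a rough sketch of the (member, last_seen, days_posted) idea, assuming the records live in a plain dictionary and that "one month" and "three months" mean 30 and 90 days (all names here are hypothetical):

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ActivityRecord:
    last_seen: date
    days_posted: int = 1


def record_message(records, member_id, today):
    """Update a member's record; days_posted only grows once per calendar day."""
    rec = records.get(member_id)
    if rec is None:
        records[member_id] = ActivityRecord(last_seen=today)
    elif rec.last_seen < today:
        rec.last_seen = today
        rec.days_posted += 1


def lapsed_members(records, today, min_days_posted=2):
    """Members last seen between 30 and 90 days ago who posted on several days."""
    newest = today - timedelta(days=30)
    oldest = today - timedelta(days=90)
    return [
        member_id for member_id, rec in records.items()
        if oldest <= rec.last_seen <= newest
        and rec.days_posted >= min_days_posted
    ]
```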

Additionally, I'm unsure if 3-to-1-month-activity is the best idea for a heuristic anyway. Someone who's been inactive for a month is likely to either 1. already not see the ping or 2. be specifically busy with Life:tm: and therefore not want to be pinged.

I think the 2nd discussed heuristic (pick someone active in the last week who hasn't been pinged for at least a month) would work better, but I'm open to arguments for and against.

@bradleysigma
Contributor

The main reason I went with this method was to avoid the bot writing to the database for every single message sent, even if the same person sent a message mere minutes ago. I assume that would be computationally expensive. How expensive would it actually be?

@JamesDearlove
Member

JamesDearlove commented Nov 20, 2023

Probably more expensive than our db can handle long term, scaling-wise. The potential crashes could also be horrific if the db connection locks up.

So realistically these would have to be implemented with in-memory sets that are periodically flushed to the db (probably every hour), and on shutdown for when the bot needs to restart.
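
A minimal sketch of that buffering pattern, assuming a single-threaded event loop and a hypothetical `write_batch` callable that performs the actual database write. The hourly schedule and the shutdown hook are left to whatever the bot already uses; both would simply call `flush()`.

```python
class BufferedActivityLog:
    """Collects active user ids in memory and writes them out in batches."""

    def __init__(self, write_batch):
        # write_batch is a hypothetical callable taking a sorted list of ids.
        self._write_batch = write_batch
        self._pending = set()

    def mark_active(self, user_id):
        # Called per message; no database access happens here.
        self._pending.add(user_id)

    def flush(self):
        """Write pending ids in one batch and return how many were written."""
        batch, self._pending = self._pending, set()
        if not batch:
            return 0
        try:
            self._write_batch(sorted(batch))
        except Exception:
            # Keep the ids so the next flush can retry them.
            self._pending |= batch
            raise
        return len(batch)
```

Restoring the batch on failure means a locked-up connection delays the write instead of losing the data, at the cost of the set growing until the database recovers.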

@andrewj-brown
Member

I was thinking you'd only store last_seen with a granularity of one day, because that's the only relevant timescale for the rest of the code. You'd only be reading per-message, not writing, although I don't have access to server insights to know how many reads that would actually end up being.
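
One way to picture the day-granularity idea: each message triggers only a lookup, and a write happens at most once per member per day. In this sketch the lookup hits an in-memory copy rather than the database, which is an extra assumption on top of the comment above; `store` is a hypothetical dict standing in for the table.

```python
from datetime import date


class LastSeenCache:
    """Tracks last_seen per member at one-day granularity."""

    def __init__(self, store):
        # store: hypothetical dict-like mapping of user id -> date,
        # standing in for the database table.
        self._store = store
        self._cache = dict(store)
        self.writes = 0  # counts writes, for illustration only

    def on_message(self, user_id, today):
        """Return True if this message caused a write, False otherwise."""
        if self._cache.get(user_id) == today:
            return False  # already recorded today: read only
        self._cache[user_id] = today
        self._store[user_id] = today
        self.writes += 1
        return True
```

With this shape, the number of writes per day is bounded by the number of distinct members who post that day, regardless of message volume.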


4 participants