add support for enabling Mastodon 4.2 search indexing #656

Open
wants to merge 2 commits into
base: main

osmaa
Contributor

@osmaa osmaa commented Nov 16, 2023

adds the opt-in attribute which enables Mastodon 4.2 to index toots from an account

Member

@andrewgodwin andrewgodwin left a comment

I was actually considering just tying this to "search_enabled", since we already have that feature built-in on the profile level?

@osmaa
Contributor Author

osmaa commented Nov 16, 2023

I was considering that, too, but it seemed to me that it was conceptually a little different. search_enabled is a feature flag for (a partially implemented?) local search of local accounts, while indexable is an AP Actor level flag for opt-in to being indexed on remote servers.

Had it been the other way around, i.e. had indexable come first, it would have been obvious to implement search_enabled on top of it.

Heads up though! The migration appears to have created this as a non-nullable column, and I missed at least one code path which leaves the attribute null during fetch/create. Will review.

@osmaa
Contributor Author

osmaa commented Nov 16, 2023

Hmm, my lack of experience with Django shows again. As far as I can see, indexable is defined to default to false everywhere, but somehow it's still passed as null into a database insert here:
https://github.com/osmaa/takahe/blob/883c607468252fdfbf107cfe5c35ca86d6afc70c/users/models/identity.py#L452
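
One way a null can reach an insert despite a field default, sketched in plain Python: a model-level default only applies when the caller omits the value, so a lookup such as `actor.get("indexable")` on a remote actor that lacks the key yields `None`, and passing that through explicitly bypasses the default. Whether this is what happens at the linked line is not confirmed here. `parse_indexable` and the sample actors are hypothetical, not Takahē code.

```python
from typing import Any, Mapping


def parse_indexable(actor: Mapping[str, Any]) -> bool:
    """Return the actor's opt-in flag, treating a missing or null value as False."""
    value = actor.get("indexable")
    # Only an explicit JSON true counts as opting in; None, strings
    # and numbers are all treated as "not opted in".
    return value is True


# Hypothetical remote actor documents, reduced to the relevant keys.
remote_actor_without_flag = {"type": "Person", "discoverable": True}
remote_actor_opted_in = {"type": "Person", "indexable": True}
remote_actor_null = {"type": "Person", "indexable": None}
```

Coercing at the parsing boundary like this would keep a non-nullable column safe regardless of which code path builds the row.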

@andrewgodwin
Member

I agree they are different meanings conceptually, but I still would like to combine the meanings now - two search options just seems too many, and I don't see a lot of cases where people would enable it locally but not remotely and vice-versa. It's a little bit of an expectation-breaking change, but I am alright with it in this instance.

@osmaa
Contributor Author

osmaa commented Nov 16, 2023

Fully agree that the privacy-related settings in Mastodon are too many. I've been meaning to outline a matrix of all of the possible combinations to see which of them even make sense. I don't know what to make of the existence of these technically valid combos, for example:

discoverable=false, indexable=true, toot=public (it's not listed on Mastodon's local timeline, but can be found by text search)
discoverable=true, indexable=false, toot=public (is listed, but not search indexed)
discoverable=true, indexable=true, toot=unlisted (not listed nor searchable)
discoverable=true, indexable=true, noindex=true (opted in to be indexed by everyone but web search engines)
discoverable=false, indexable=false, noindex=false (opted out of being found on Mastodon, while allowing web search crawlers)
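
The matrix mentioned above could be generated rather than written out by hand. This is a rough sketch that assumes only the three account-level flags and the two post visibilities named in the list; Mastodon has further visibility levels that are left out here, and `settings_matrix` is a made-up name.

```python
from itertools import product

ACCOUNT_FLAGS = ("discoverable", "indexable", "noindex")
POST_VISIBILITIES = ("public", "unlisted")  # assumption: only the two levels listed above


def settings_matrix():
    """Yield every combination of the three account flags and a post visibility."""
    for flags in product((False, True), repeat=len(ACCOUNT_FLAGS)):
        for visibility in POST_VISIBILITIES:
            yield dict(zip(ACCOUNT_FLAGS, flags), toot=visibility)


matrix = list(settings_matrix())
for row in matrix:
    print(row)
```

Each row could then be annotated by hand with whether the combination makes sense.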

It's a mess. Is it a mess that can be cleaned up? If it was just me, I'd just merge all three account level settings to one (values: promote, search, unlisted), and disallow use of "public" toot level on unlisted accounts.
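
As an illustration only, the merged setting could look something like the sketch below. The three values come from the comment above; the meaning attached to each value in the comments, and the names `AccountVisibility` and `can_post`, are assumptions.

```python
from enum import Enum


class AccountVisibility(Enum):
    PROMOTE = "promote"    # assumed: may be listed, promoted and searched
    SEARCH = "search"      # assumed: findable by search, not promoted
    UNLISTED = "unlisted"  # assumed: neither listed nor searchable


def can_post(account: AccountVisibility, toot_level: str) -> bool:
    """Disallow the "public" toot level on unlisted accounts; allow everything else."""
    if account is AccountVisibility.UNLISTED and toot_level == "public":
        return False
    return True
```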

@andrewgodwin
Member

Right, it being a bit of a mess was kind of the thing I wanted to avoid. I do think that in Takahē's case, with just two options - "discoverable" and "search_enabled" - we end up with only three sensible configurations:

  • Discoverable and searchable: Where most people probably end up
  • Discoverable but not searchable: Maybe you're trying to avoid harassment enabled via search
  • Not discoverable but searchable: Should not be allowed, makes no sense
  • Not discoverable or searchable: Traditional privacy-focused stance

I'm not sure how sensible it might be to make the UI switch search off if you flip discoverable off, but it feels like it should.
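
The "switch search off if you flip discoverable off" behaviour could be expressed as a small normalization step. This is a sketch under assumptions, not Takahē's actual code: `SearchSettings` and `normalize` are hypothetical names, and only the two flag names come from the discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchSettings:
    discoverable: bool
    search_enabled: bool


def normalize(settings: SearchSettings) -> SearchSettings:
    """Force search off whenever discoverable is off.

    This collapses the disallowed "not discoverable but searchable"
    combination into "not discoverable or searchable".
    """
    if not settings.discoverable:
        return SearchSettings(discoverable=False, search_enabled=False)
    return settings
```

Applied on save, this leaves exactly the three configurations described above as reachable states.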

@osmaa
Contributor Author

osmaa commented Nov 17, 2023

I would argue that:

Discoverable but not searchable: Maybe you're trying to avoid harassment enabled via search

is superfluous and should be instead delivered by automatic pruning of old toots from both timelines and search indices. "Allow my toots to be discovered but only for X days/weeks".
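
A minimal sketch of what such a time window might check, assuming each toot carries a timezone-aware publication timestamp. `within_discovery_window` is a hypothetical helper; nothing like it is claimed to exist in Takahē or Mastodon.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def within_discovery_window(
    published: datetime,
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> bool:
    """Return True while a toot is still young enough to be listed or indexed."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - published <= max_age
```

A timeline or search indexer would skip, or drop, any toot for which this returns False.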

As for your:

Not discoverable but searchable: Should not be allowed, makes no sense

That would be someone who opts in to be found by explicit search, but wouldn't want to be shown in trending lists or being algorithmically promoted.

I didn't even include that Mastodon further complicates this by having different logic for hashtags. Again, if it was just me, I'd say that hashtags should be restricted to public toots only. Yes, there are nuances like being generally unsearchable but opening tiny windows into discovery on very specific topics only, but the complexities around documenting that kind of behavior make it into a trap.

So the question really is, how much does it make sense to try to do things different to Mastodon, which has evolved into a weird legacy of incompatible layers, but is the dominant source and consumer of ActivityPub content. Plus, if you still have plans to explore AT proto PDS functionality, that'll map differently: mostly just 100% public, with no control over third-party indexing.

@andrewgodwin
Member

Well, automatic pruning of local things from searches would be nice, but that's a separate feature so I'm not going to say we should do that now.

In general I want to keep Takahē relatively low on options and complexity - so I think just tying Mastodon's indexable property to "search enabled" and changing its help text to say that it enables you to be searched locally and remotely would be the way to go here.

@alphatownsman
Contributor

This seems like quite an important feature for users like mine, whether or not it is a separate option. What's the best next step to get it merged?

@andrewgodwin
Member

I'm willing to accept a PR for this that just does this flag based off of our existing search_enabled and discoverable flags, where you get marked as having search indexing allowed if they're both true.
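
In outline, the rule described here might look like the sketch below. The function names are hypothetical, and the fragment shows only the two actor keys discussed in this thread; a real actor document carries many more fields plus a JSON-LD context.

```python
def indexable_for(discoverable: bool, search_enabled: bool) -> bool:
    """Search indexing is allowed only if both existing flags are true."""
    return discoverable and search_enabled


def actor_fragment(discoverable: bool, search_enabled: bool) -> dict:
    """Build the discovery-related part of an actor document (illustrative only)."""
    return {
        "discoverable": discoverable,
        "indexable": indexable_for(discoverable, search_enabled),
    }
```

Deriving the value at serialization time would avoid adding a new database column, and with it the null-handling problem raised earlier in the thread.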

@alphatownsman
Contributor

alphatownsman commented Dec 6, 2023

does this flag based off of our existing search_enabled and discoverable flags

There's no perfect solution and I can totally live with this.

how much does it make sense to try to do things different to Mastodon

@osmaa I agree with you that this is a real concern if Mastodon exposes these searchable options separately via the API, but right now I believe they are only changeable in the UI, so I'm OK with Andrew's suggestion above.

@alphatownsman
Contributor

@osmaa @AstraLuma this is an absolutely great feature. Any chance of getting this updated and merged? Happy to do anything I can to help.


3 participants