Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: filters v2 server-side warning/hiding #2793

Merged
merged 13 commits into from
May 6, 2024

Conversation

VyrCossont
Copy link
Contributor

@VyrCossont VyrCossont commented Apr 1, 2024

Description

This PR implements Mastodon-4.0-compatible server-side filtering, aka v2 filtering. Statuses matching a filter with a warn action will have a filtered field added to them so that clients can display a click-through warning with context, or hide them entirely depending on user preferences. (Clients not aware of this field can still use v1 client-side filtering.) Statuses matching a filter with a hide action will not be sent to the client at all, which works for all clients regardless of supported filter APIs.

This PR doesn't implement the CRUD API for v2 filters, or allow the creation of hide filters through the v1 API. I'll do that in a followup PR. Our internal filter storage is already in the v2 format, so the CRUD API for v1 filters can be used to manage warn filters meanwhile.

Impact

Enables filtering on Phanpy and other clients that support v2 filters but not v1 filters. (Note that Phanpy doesn't have UI for managing filters, so they have to be created through another client using the v1 API.)

Doesn't affect clients that only support v1 filters (Semaphore, Feditext).

Implementation

Extends typeutils.StatusToAPIStatus to take a filter context (public or tag timeline, home or list timeline, notifications, thread, account's statuses, or none) and the filter list from the requesting user, if there is one. About half of StatusToAPIStatus calls are from a context where we would apply filters, so adding filters to that function makes sense to me (I could be convinced otherwise if this goes beyond the traditional scope of typeutils). Similarly extends typeutils.NotificationToAPINotification to take a filter list (the filter context is implicitly notifications).

Applying the filters generates either:

  • no match: the Status is not modified by filtering.
  • for a matching filter with the warn filter action, a FilterResult, which contains a filter that matched (even if multiple filters match, it can only hold one), and the list of the actual keywords and status IDs within that filter that matched. This gets attached to the returned Status so clients can put it behind a warning or hide it themselves.
  • for a matching filter with the hide filter action, an ErrHideStatus error, which signals to StatusToAPIStatus callers that the Status should not be included in the results at all.

A matching hide filter takes action priority over a matching warn filter, so if we match a warn filter, we will record the match but keep going through the filter list in case there are any matching hide filters. Matching a hide filter exits early and doesn't bother checking the rest of the filters.

Filters don't apply to a user's own posts.

I've extended StatusToAPIStatus callers to:

  • include the filter context if applicable: APIs like the user's bookmarks list don't filter, and use a named constant for the empty string to indicate no context.
  • fetch the requesting user's filters if applicable: a few APIs don't always require auth, so sometimes there's no user.
  • handle ErrHideStatus by returning early with no error (when adding a single status to a timeline) or by skipping a status (when returning several statuses).

Performance

Inherently slower since we're doing some more work server-side. How much slower depends on how many filters a user has.

Filters are fetched from the existing filter cache if possible instead of fetching from the DB, like most DB objects. However, the cache/DB representation isn't quite what the filtering code uses. We could store precompiled regexps and status ID sets per filter context, as well as lookup maps for generating the filtered attribute, at the complexity cost of adding another kind of cache and having to invalidate all of it for a user every time any of their filters change.

Alternatively, instead of caching regexps and status IDs across requests, we could fetch the user's cached DB filters per request (as this PR already does), compile them to regexps and status IDs for a given filter context, and tweak StatusToAPIStatus to take those regexps and status IDs instead of a filter list. This would save a little work when filtering multiple statuses at once, for example, when returning a timeline page, and isn't as complex as the previous option since there's no second cache.

Either of these could be done in a followup PR.

Expired filters are not purged automatically, since Mastodon intentionally doesn't do that in case the user wants to renew an expired filter later. We could add a recurring task to purge very old ones automatically in a followup PR if user filter list size becomes a problem due to expired filters specifically.

Checklist

  • I/we have read the GoToSocial contribution guidelines.
  • I/we have discussed the proposed changes already, either in an issue on the repository, or in the Matrix chat.
  • I/we have not leveraged AI to create the proposed changes.
  • I/we have performed a self-review of added code.
  • I/we have written code that is legible and maintainable by others.
  • I/we have commented the added code, particularly in hard-to-understand areas.
  • I/we have made any necessary changes to documentation.
  • I/we have added tests that cover new code.
  • I/we have run tests and they pass locally with the changes.
  • I/we have run go fmt ./... and golangci-lint run.

@VyrCossont
Copy link
Contributor Author

VyrCossont commented Apr 1, 2024

One more TODO:

  • when filtering a status's content, we should first convert the HTML to plain text so filters don't match HTML tags by accident.

@tsmethurst
Copy link
Contributor

Will get to this as soon as 0.15.0 is out and other pending PRs are squerged :)

@tsmethurst
Copy link
Contributor

Apologies for leaving this in review request limbo for a while, got caught up in a necessary refactor of the frontend stuff so I haven't got around to looking at it yet!

Copy link
Member

@daenney daenney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly this seem alright to me. The tests pass, as do the new ones too so that's also a good sign (the tests fail on the Swagger gen test, it just needs regenerating).

I've left some minor comments, mostly a few places where bare empty strings are being passed in instead of the FilterContextNone.

internal/api/model/filterv2.go Show resolved Hide resolved
// FilterActionHide filters will remove this status from API results.
FilterActionHide FilterAction = "hide"

FilterActionNumValues = 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's currently unused, so I'll remove it for now., and add it back in with the rest of the public v2 filter API.

@@ -184,95 +184,14 @@ func (p *Processor) GetAPIStatus(
apiStatus *apimodel.Status,
errWithCode gtserror.WithCode,
) {
apiStatus, err := p.converter.StatusToAPIStatus(ctx, target, requester)
apiStatus, err := p.converter.StatusToAPIStatus(ctx, target, requester, "", nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
apiStatus, err := p.converter.StatusToAPIStatus(ctx, target, requester, "", nil)
apiStatus, err := p.converter.StatusToAPIStatus(ctx, target, requester, custom.FilterContextNone, nil)

@@ -113,7 +113,7 @@ func (p *Processor) packageStatuses(
continue
}

apiStatus, err := p.converter.StatusToAPIStatus(ctx, status, requestingAccount)
apiStatus, err := p.converter.StatusToAPIStatus(ctx, status, requestingAccount, "", nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
apiStatus, err := p.converter.StatusToAPIStatus(ctx, status, requestingAccount, "", nil)
apiStatus, err := p.converter.StatusToAPIStatus(ctx, status, requestingAccount, custom.FilterContextNone, nil)

@@ -1446,7 +1606,7 @@ func (c *Converter) ReportToAdminAPIReport(ctx context.Context, r *gtsmodel.Repo
}
}
for _, s := range r.Statuses {
status, err := c.StatusToAPIStatus(ctx, s, requestingAccount)
status, err := c.StatusToAPIStatus(ctx, s, requestingAccount, "", nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
status, err := c.StatusToAPIStatus(ctx, s, requestingAccount, "", nil)
status, err := c.StatusToAPIStatus(ctx, s, requestingAccount, custom.FilterContextNone, nil)

ExpiresAt: expiresAt,
Irreversible: filter.Action == gtsmodel.FilterActionHide,
}, nil
return ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice to have a apimodel.FitlerActionNone here and return that from the default in the switch statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

@@ -427,7 +428,7 @@ func (suite *InternalToFrontendTestSuite) TestLocalInstanceAccountToFrontendBloc
func (suite *InternalToFrontendTestSuite) TestStatusToFrontend() {
testStatus := suite.testStatuses["admin_account_status_1"]
requestingAccount := suite.testAccounts["local_account_1"]
apiStatus, err := suite.typeconverter.StatusToAPIStatus(context.Background(), testStatus, requestingAccount)
apiStatus, err := suite.typeconverter.StatusToAPIStatus(context.Background(), testStatus, requestingAccount, "", nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
apiStatus, err := suite.typeconverter.StatusToAPIStatus(context.Background(), testStatus, requestingAccount, "", nil)
apiStatus, err := suite.typeconverter.StatusToAPIStatus(context.Background(), testStatus, requestingAccount, custom.FilterContextNone, nil)

func (suite *InternalToFrontendTestSuite) TestStatusToFrontendUnknownAttachments() {
testStatus := suite.testStatuses["remote_account_2_status_1"]
requestingAccount := suite.testAccounts["admin_account"]

apiStatus, err := suite.typeconverter.StatusToAPIStatus(context.Background(), testStatus, requestingAccount)
apiStatus, err := suite.typeconverter.StatusToAPIStatus(context.Background(), testStatus, requestingAccount, "", nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
apiStatus, err := suite.typeconverter.StatusToAPIStatus(context.Background(), testStatus, requestingAccount, "", nil)
apiStatus, err := suite.typeconverter.StatusToAPIStatus(context.Background(), testStatus, requestingAccount, custom.FilterContextNone, nil)

@@ -774,7 +950,7 @@ func (suite *InternalToFrontendTestSuite) TestStatusToFrontendUnknownLanguage()
*testStatus = *suite.testStatuses["admin_account_status_1"]
testStatus.Language = ""
requestingAccount := suite.testAccounts["local_account_1"]
apiStatus, err := suite.typeconverter.StatusToAPIStatus(context.Background(), testStatus, requestingAccount)
apiStatus, err := suite.typeconverter.StatusToAPIStatus(context.Background(), testStatus, requestingAccount, "", nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
apiStatus, err := suite.typeconverter.StatusToAPIStatus(context.Background(), testStatus, requestingAccount, "", nil)
apiStatus, err := suite.typeconverter.StatusToAPIStatus(context.Background(), testStatus, requestingAccount, custom.FilterContextNone, nil)

// along with this program. If not, see <http://www.gnu.org/licenses/>.

// Package custom represents custom filters managed by the user through the API.
package custom
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably rename this package to user instead of custom, since they're user filters. I think it's implicitly understood they're kinda custom/unique to a user. It reads a bit odd in other import paths to see stuff like custom.FilterContextNone, user.FilterContextNone makes it a bit more obvious for me as to what kind of filter this is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, no problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

user sounds like it could be a filter for users as well as a filter managed by a user, so we ended up going with status, imported as statusfilter to avoid conflicts with variables named status.

@tsmethurst
Copy link
Contributor

tsmethurst commented May 2, 2024

Once I've finished fiddling with this bugfix I'm working on, I'll push to this to bring it up to date with main to save you the trouble @VyrCossont :)

@VyrCossont
Copy link
Contributor Author

Ready for any necessary further review.

@NyaaaWhatsUpDoc
Copy link
Member

great work! thank you so much @VyrCossont !

@NyaaaWhatsUpDoc NyaaaWhatsUpDoc merged commit 45f4afe into superseriousbusiness:main May 6, 2024
2 checks passed
@VyrCossont VyrCossont deleted the filters-v2 branch May 15, 2024 21:50
@VyrCossont VyrCossont mentioned this pull request May 27, 2024
10 tasks
nyarla pushed a commit to nyarla/gotosocial-modded that referenced this pull request Jun 19, 2024
…2793)

* Remove dead code

* Filter statuses when converting to frontend representation

* status.filtered is an array

* Make matching case-insensitive

* Remove TODOs that don't need to be done now

* Add missing filter check for notification

* lint: rename ErrHideStatus

* APIFilterActionToFilterAction not used yet

* swaggerino docseroni

* Address review comments

* Add apimodel.FilterActionNone

---------

Co-authored-by: tobi <[email protected]>
Co-authored-by: tobi <[email protected]>
nyarla pushed a commit to nyarla/gotosocial-modded that referenced this pull request Jun 19, 2024
…2793)

* Remove dead code

* Filter statuses when converting to frontend representation

* status.filtered is an array

* Make matching case-insensitive

* Remove TODOs that don't need to be done now

* Add missing filter check for notification

* lint: rename ErrHideStatus

* APIFilterActionToFilterAction not used yet

* swaggerino docseroni

* Address review comments

* Add apimodel.FilterActionNone

---------

Co-authored-by: tobi <[email protected]>
Co-authored-by: tobi <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants