Regex filter not activating #1519

Closed
toonn opened this issue Feb 24, 2025 · 2 comments · Fixed by #1521

Comments

toonn commented Feb 24, 2025

This might be user error rather than a bug, but I can't get an exclusion filter to work.

Last Week Tonight posts videos on YouTube that are geolocked. Conveniently, the titles follow the common convention of labeling episodes with the season and episode number, Sxx Eyy.

I've tried various formulations of what I understand a regex field filter to be. I started with !title:/^S[0-9]+ E[0-9]+:/, which I expected to be the correct filter to exclude these videos. I've tried escaping various things, like the + metacharacters, but haven't gotten it to work. The problem with testing is that I assume the filter only applies to new entries, and I don't know when the next episode will be posted, so I don't find out that the filter doesn't work until another of these geolocked episodes shows up in the feed.

I assume the title field filter tries to match the string in the following span:

<span class="entry-title-link" aria-expanded="false" aria-current="true" role="link" tabindex="0">S12 E02: DOGE, National Parks &amp; Content Moderation: 2/23/25: Last Week Tonight with John Oliver</span>
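
As a quick sanity check of the pattern itself, independent of selfoss, the regex can be tried against the title text in Python. This is only a sketch: the `excluded` helper is hypothetical, the second title is made up for contrast, and it assumes the leading `!` means "drop items whose title matches". It says nothing about how selfoss (which is written in PHP) evaluates the filter internally.

```python
import re

# The pattern from the filter expression, without the surrounding slashes.
PATTERN = re.compile(r"^S[0-9]+ E[0-9]+:")


def excluded(title: str) -> bool:
    """Hypothetical helper: True if a `!title:/.../` filter would drop this title."""
    return PATTERN.search(title) is not None


# Title text from the span above, with the HTML entity decoded.
geolocked = (
    "S12 E02: DOGE, National Parks & Content Moderation: "
    "2/23/25: Last Week Tonight with John Oliver"
)
# Made-up title without the season/episode prefix, for contrast.
other = "Some web exclusive: Last Week Tonight with John Oliver"

print(excluded(geolocked))  # True
print(excluded(other))      # False
```

If both lines print as expected, the pattern is fine and the problem is more likely in how or whether the filter is applied.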

jtojnar (Member) commented Feb 25, 2025

Thanks for reporting.

Your filter expression looks okay to me and works for me in tests:

--- a/tests/Helpers/FilterTest.php
+++ b/tests/Helpers/FilterTest.php
@@ -492,6 +492,15 @@ final class FilterTest extends TestCase {
             ),
             true,
         ];
+
+        yield 'Not(Title): Match' => [
+            '!title:/^S[0-9]+ E[0-9]+:/',
+            self::mkItem(
+                /* title: */ 'S12 E02: DOGE, National Parks &amp; Content Moderation: 2/23/25: Last Week Tonight with John Oliver',
+                /* content: */ ''
+            ),
+            false,
+        ];
     }
 
     /**

What selfoss version are you using? This filter expression syntax is only available in 2.20 preview builds, not in 2.19.

The only other reason I can see for it not working is extra spaces at the beginning of the title string.
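
To illustrate the leading-space case, here is a small Python sketch of how the `^` anchor behaves. The `tolerant` variant with `\s*` is a hypothetical workaround, not something taken from the selfoss docs, and whether selfoss trims titles before matching is not established in this thread.

```python
import re

# The pattern from the filter, anchored at the very start of the string.
anchored = re.compile(r"^S[0-9]+ E[0-9]+:")
# Hypothetical variant that tolerates leading whitespace before the prefix.
tolerant = re.compile(r"^\s*S[0-9]+ E[0-9]+:")

title = "  S12 E02: DOGE, National Parks"  # note the two leading spaces

print(bool(anchored.search(title)))  # False
print(bool(tolerant.search(title)))  # True
```

Because `^` only matches at position 0, any whitespace in front of `S12` makes the anchored pattern miss the title.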

As for testing, I have a preview feature on my to-do list. It is already implemented as a CLI script, but I still need to design a UI. For now, there is a (somewhat tedious) workaround: create a new source, delete its items in the database before each filter change, and refresh.

toonn (Author) commented Feb 26, 2025

That must be my problem. I noticed the filter field and read these docs on it. Maybe the docs could have a version indicator or selector of some sort, or at least a warning that they don't apply to the latest stable release?
