Add spam classifier with google gemini flash 2.0 experimental and check for spam earlier #58

nonprofittechy · 2025-01-03T21:22:15Z

This adds an optional, essentially free spam filter to the feedback form, driven by Google Gemini.

If the user's message passes the keyword filter, it is sent to Google Gemini 2.0 Flash Experimental for additional filtering. As of January 3, 2025, the free tier has a limit of 1,500 queries/day, plenty to handle the small volume of feedback form spam we've been dealing with (a dozen a month in some cases).
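
As a rough illustration of the two-stage flow described above (not the code in this PR), the sketch below runs a cheap keyword check first and only calls the LLM stage when the message passes it. The names `SPAM_KEYWORDS`, `passes_keyword_filter`, `is_likely_spam`, and `llm_says_spam` are hypothetical, and the keyword list is made up for the example.

```python
from typing import Callable, Iterable, Optional

# Hypothetical keyword list; the package's real list is not shown in this PR.
SPAM_KEYWORDS = ("seo services", "crypto", "backlinks")


def passes_keyword_filter(message: str, keywords: Iterable[str] = SPAM_KEYWORDS) -> bool:
    """Return True when none of the keywords appear in the message."""
    lowered = message.lower()
    return not any(keyword in lowered for keyword in keywords)


def is_likely_spam(
    message: str,
    llm_says_spam: Optional[Callable[[str], bool]] = None,
) -> bool:
    """Run the keyword check first; call the LLM only if the message passes it."""
    if not passes_keyword_filter(message):
        return True
    if llm_says_spam is None:
        # No classifier supplied (e.g. no API key configured): the LLM stage is optional.
        return False
    return llm_says_spam(message)
```

Passing the classifier in as a callable keeps the LLM stage optional and means the free-tier quota is only spent on messages the keyword filter did not already reject.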

To use it, a Google Gemini API key must be added to the global configuration.

I also noticed that the existing spam filtering wasn't being used except when the form fell back to delivering an email. I'm not sure why that was the case, but this change should fix that as well.

aryy-suffolk · 2025-01-07T00:28:51Z

docassemble/GithubFeedbackForm/data/questions/feedback.yml

+            log(f"~~~USER FEEDBACK~~~ {github_repo} -{issue_template.subject_as_html(trim=True)} - {issue_template.content_as_html(trim=True)}")
+      mark_task_as_performed('issue noted', persistent=True)
+    else:
+      log("Already sent feedback to github from a feedback interview, not going to send again")


I think we can likely simplify the nested ifs much more. Current structure, as I understand it:

    if feedback looks like spam:
        log
        mark task as done
        set some values
    else:
        if task not yet performed:
            prepare saved_uuid
            if user should be added:
                add user
            if feedback should be sent to github:
                prepare
                if url:
                    if saved_uuid:
                        link
                else:
                    log
                    if error email AND not spam:
                        log
                        send
                    else:
                        log
        else:
            log
    set note_issue to true

What we could do, to reduce code duplication and the logic branches:

    if feedback looks like spam:
        log
        mark as done
        save values
        return
    if task already performed:
        log
        return
    if should add user to panel:
        add user
    if send feedback to github:
        create issue
        if issue_url and saved_uuid:
            link to issue
        else:
            log error
            if error email configured:
                send
            else:
                log
    mark as done
    set note_issue to true

nonprofittechy · 2025-01-08

We can't use a return statement here in a Docassemble code block, unfortunately! I can take another look at simplifying this. I just wanted to be careful to keep the scope of my change as small as possible.
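
One possible way to get guard clauses despite that constraint, offered only as a sketch: move the branching into a function in a Python module, where early returns are allowed, and have the YAML code block call it. The function name `process_feedback`, its parameters, and the status strings are all hypothetical and much simpler than the real interview logic.

```python
from typing import Callable, Optional


def process_feedback(
    is_spam: bool,
    already_performed: bool,
    create_issue: Callable[[], Optional[str]],
    log: Callable[[str], None] = print,
) -> str:
    """Return a short status string; each guard clause exits early."""
    if is_spam:
        log("Feedback looks like spam, not sending")
        return "spam"
    if already_performed:
        log("Already sent feedback, not going to send again")
        return "duplicate"
    # create_issue is assumed to return the issue URL, or None on failure.
    issue_url = create_issue()
    if issue_url is None:
        log("Could not create the GitHub issue")
        return "error"
    return "sent"
```

The code block would then only need to call the function and act on the returned status, which keeps the nesting in the YAML shallow.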

aryy-suffolk · 2025-01-07T00:36:07Z

docassemble/GithubFeedbackForm/github_issue.py

+    if not context:
+        context = "a guided interview in the legal context"
+


To improve readability when setting defaults, we can use the `or` operator. For example:

    context = context or "a guided interview in the legal context"
    gemini_api_key = gemini_api_key or get_config("google gemini api key")
    ... etc ...
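
A small self-contained sketch of the `or` idiom, to show that it behaves the same as the `if not context:` check it would replace. The helper `resolve_context` and the constant `DEFAULT_CONTEXT` are names invented for this example.

```python
from typing import Optional

DEFAULT_CONTEXT = "a guided interview in the legal context"


def resolve_context(context: Optional[str] = None) -> str:
    """Fall back to the default when context is None or empty."""
    # `or` yields the right-hand side for any falsy left-hand value
    # (None, "", 0, ...), which matches `if not context:`.
    return context or DEFAULT_CONTEXT
```

Because `or` treats every falsy value as missing, an explicitly passed empty string is also replaced by the default, exactly as with the original check.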

aryy-suffolk · 2025-01-07T00:42:37Z

docassemble/GithubFeedbackForm/github_issue.py

+    try:
+        response = model.generate_content(body)
+        if response.text.strip() == "spam":
+            return True


I get the sense that this would be more readable if we folded it into the other try. I'm not sure there's a need to keep them distinct. We can use specific exception types to do this. The structure would change to something like this:

    try:
        attempt configuration
        generate the response
    except UseANameException as e:
        log error configuring
        return False
    except Exception as e:
        log generic error
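
A runnable sketch of that single-try shape, under stated assumptions: `SpamModelConfigError` is a hypothetical stand-in for whatever specific exception the configuration step raises, `configure` is assumed to return a callable that maps the prompt text to the model's reply, and returning False from the generic handler is a choice made here, not something stated in the outline above.

```python
import logging


class SpamModelConfigError(Exception):
    """Hypothetical error raised when the model cannot be configured."""


def classify_with_model(body, configure, log=logging.warning):
    """Return True only if the model answers "spam"; False on any failure."""
    try:
        model = configure()  # may raise SpamModelConfigError
        response_text = model(body)
        return response_text.strip() == "spam"
    except SpamModelConfigError as e:
        log(f"Error configuring the spam model: {e}")
        return False
    except Exception as e:
        # Assumption: treat any other failure as "not spam" so feedback is not lost.
        log(f"Error classifying feedback: {e}")
        return False
```

Listing the specific exception first matters, since Python tries `except` clauses in order and the generic `Exception` handler would otherwise catch the configuration error too.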

nonprofittechy added 4 commits January 3, 2025 15:07

16f6591 Fix #54 - add spam classifier with google gemini flash 2.0 experimental - also check for spam in more situations

95227a9 edit type ignore list

72c01b3 Typo in docstring

9b9c0a5 Make model more easily configurable

nonprofittechy requested review from samglover and aryy-suffolk January 3, 2025 21:22

aryy-suffolk reviewed Jan 7, 2025

nonprofittechy commented Jan 3, 2025

aryy-suffolk Jan 7, 2025

nonprofittechy Jan 8, 2025

aryy-suffolk Jan 7, 2025

aryy-suffolk Jan 7, 2025
