Add spam classifier with google gemini flash 2.0 experimental and check for spam earlier #58
base: main
…al - also check for spam in more situations
    log(f"~~~USER FEEDBACK~~~ {github_repo} -{issue_template.subject_as_html(trim=True)} - {issue_template.content_as_html(trim=True)}")
    mark_task_as_performed('issue noted', persistent=True)
else:
    log("Already sent feedback to github from a feedback interview, not going to send again")
I think we can likely simplify the nested ifs much more. Current structure, as I understand it:
if feedback looks like spam:
    log
    mark task as done
    set some values
else:
    if task not yet performed:
        prepare saved_uuid
        if user should be added:
            add user
        if feedback should be sent to github:
            prepare
            if url:
                if saved_uuid:
                    link
                else:
                    log
            if error email AND not spam:
                log
                send
            else:
                log
    else:
        log
    set note_issue to true
What we could do, to reduce code duplication and the logic branches:
if feedback looks like spam:
    log
    mark as done
    save values
    return
if task already performed:
    log
    return
if should add user to panel:
    add user
if send feedback to github:
    create issue
    if issue_url and saved_uuid:
        link to issue
    else:
        log error
        if error email configured:
            send
        else:
            log
mark as done
set note_issue to true
We can't use a `return` statement here in a Docassemble code block, unfortunately! I can take another look at simplifying this. I just wanted to keep the scope of my change as small as possible.
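One possible way to get the flat, early-return shape despite that limitation is to move the branching into a function in a Python module and call it from the code block, since `return` is valid inside a function. The sketch below only illustrates the control flow: `handle_feedback`, its parameters, and the action strings are hypothetical and not taken from this PR, and the nesting of the email fallback is an assumption.

```python
from typing import List, Optional


def handle_feedback(
    looks_like_spam: bool,
    already_performed: bool,
    add_to_panel: bool,
    send_to_github: bool,
    issue_url: Optional[str] = None,
    saved_uuid: Optional[str] = None,
    error_email: Optional[str] = None,
) -> List[str]:
    """Return the ordered actions to take (hypothetical helper for illustration)."""
    # Guard clauses: each early exit removes one level of nesting.
    if looks_like_spam:
        return ["log spam", "mark as done"]
    if already_performed:
        return ["log already sent"]

    actions: List[str] = []
    if add_to_panel:
        actions.append("add user")
    if send_to_github:
        if issue_url and saved_uuid:
            actions.append("link to issue")
        else:
            actions.append("log error")
            # Assumed fallback: email only when the issue could not be linked.
            if error_email:
                actions.append("send email")
            else:
                actions.append("log feedback")
    actions.append("mark as done")
    return actions
```

The code block in the interview would then only call the helper and set `note_issue`, so it never needs a `return` of its own.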
if not context:
    context = "a guided interview in the legal context"
To improve readability when setting defaults, we can use `or`. For example:
context = context or "a guided interview in the legal context"
gemini_api_key = gemini_api_key or get_config("google gemini api key")
... etc ...
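A runnable sketch of that idiom follows. The `get_config` defined here is a local stand-in (an assumption) for the real configuration lookup, and `resolve_settings` is a hypothetical wrapper. One caveat: `or` replaces any falsy value, so an explicitly passed empty string also falls back to the default, which is the same behavior as the `if not context:` check it replaces.

```python
from typing import Optional, Tuple

# Stand-in configuration for this sketch; the value is a placeholder.
_FAKE_CONFIG = {"google gemini api key": "key-from-config"}


def get_config(name: str) -> Optional[str]:
    """Local stand-in for the configuration lookup (assumption for this sketch)."""
    return _FAKE_CONFIG.get(name)


def resolve_settings(
    context: Optional[str] = None,
    gemini_api_key: Optional[str] = None,
) -> Tuple[str, Optional[str]]:
    # `x or default` keeps x when it is truthy, otherwise uses the default.
    context = context or "a guided interview in the legal context"
    gemini_api_key = gemini_api_key or get_config("google gemini api key")
    return context, gemini_api_key
```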
try:
    response = model.generate_content(body)
    if response.text.strip() == "spam":
        return True
I get the sense that this would be more readable if we folded it into the other try. I'm not sure there's a need to keep them distinct. We can use specific exception types to do this. The structure would change to something like this:
try:
    attempt configuration
    generate the response
except UseANameException as e:
    log error configuring
    return False
except Exception as e:
    log generic error
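As a concrete but hedged sketch of that single-try structure: `ClassifierConfigError` and the `build_model` callable are hypothetical names introduced for illustration, and the model is assumed to expose `generate_content(body)` returning an object with a `.text` attribute, as in the snippet under review. Returning False from the generic handler is also an assumption, chosen so that a classifier failure never blocks feedback.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


class ClassifierConfigError(Exception):
    """Hypothetical error raised when the classifier cannot be configured."""


def is_spam(body: str, build_model: Callable[[], Any]) -> bool:
    """Classify `body`, treating any failure as 'not spam'."""
    try:
        model = build_model()  # configuration step; may raise ClassifierConfigError
        response = model.generate_content(body)
        return response.text.strip() == "spam"
    except ClassifierConfigError as e:
        # The specific handler comes first so configuration problems get their own message.
        logger.error("Could not configure the spam classifier: %s", e)
        return False
    except Exception as e:
        logger.error("Spam classification failed: %s", e)
        return False
```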
Fix #54
This adds a basically free and optional spam filter to the feedback form, driven by Google Gemini.
If the user's message passes the keyword filter, it will be sent to Google Gemini flash 2.0 experimental for additional filtering. As of 1/3/2025, the free tier has a limit of 1,500 queries/day, plenty to handle the small volume of feedback form spam we've been dealing with (a dozen a month in some cases).
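The two-stage flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the keyword list, `passes_keyword_filter`, and `looks_like_spam` are hypothetical, and the Gemini call is represented by an injected `llm_classifier` callable so the sketch runs without an API key.

```python
from typing import Callable, Optional, Sequence

# Hypothetical keywords; the real filter's list is not shown in this PR.
SPAM_KEYWORDS = ("backlinks", "casino", "crypto")


def passes_keyword_filter(message: str, keywords: Sequence[str] = SPAM_KEYWORDS) -> bool:
    """Return True when none of the keywords appear in the message."""
    lowered = message.lower()
    return not any(keyword in lowered for keyword in keywords)


def looks_like_spam(
    message: str,
    llm_classifier: Optional[Callable[[str], bool]] = None,
) -> bool:
    # Stage 1: the cheap local check; a hit never reaches the API.
    if not passes_keyword_filter(message):
        return True
    # Stage 2 is optional: with no classifier configured, accept the message.
    if llm_classifier is None:
        return False
    return llm_classifier(message)
```

Running the keyword check first means only messages that survive it count against the daily query limit.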
To use it, a `google gemini api key` must be added to the global configuration.
I also noticed that the existing spam filtering wasn't being used except when the form fell back to delivering an email. I'm not sure why that was the case, but this should also solve that problem.