reCaptcha not working from China #2993

Closed
ajnyga opened this issue Oct 31, 2017 · 29 comments
Labels
Hosting Bug reports and feature requests from Publishing Services's hosted clients.

Comments

@ajnyga
Collaborator

ajnyga commented Oct 31, 2017

Hi,

I do not have first-hand knowledge of this, but one of our journals reports that an author in China is unable to register with OJS because the reCaptcha is not visible.

This is probably due to local internet policies, and it has been noticed elsewhere as well: stellar-deprecated/stellar-client#1162

Should OJS have a secondary system for preventing spam that would not have the limitations of reCaptcha?

@mfelczak
Member

mfelczak commented Nov 6, 2017

Confirming that some of our hosted clients have also run into this issue.

@asmecher
Member

asmecher commented Nov 6, 2017

@mfelczak, do you have any preference from that client? I suspect we'd need to look at a built-in service again, though I really don't like maintaining our own. I suspect there's a third-party library we could use.

@mfelczak
Member

mfelczak commented Nov 7, 2017

Hi @asmecher, I've checked with the client and they couldn't provide any alternatives. Digging around a bit, BotDetect Captcha looks promising for a possible 3rd-party integration: https://captcha.com/php-captcha.html

@asmecher
Member

asmecher commented Nov 7, 2017

https://github.com/gregwar/captcha also appears to be heavily used, and it's in Composer with very few dependencies...

@mfelczak mfelczak added the Hosting Bug reports and feature requests from Publishing Services's hosted clients. label Jan 19, 2018
@mfelczak mfelczak added this to the OJS/OMP 3.1.1 milestone Jan 19, 2018
@asmecher asmecher removed this from the OJS/OMP 3.1.1 milestone Jan 29, 2018
@mfelczak mfelczak modified the milestones: OJS/OMP 3.1.1, OJS/OMP 3.2 Jan 29, 2018
@jmacgreg
Contributor

Hi all, see also https://pad.foebud.org/google-alternatives for some alternatives. The Recaptcha-specific section is quoted below:

http://textcaptcha.com/

https://www.scorchsoft.com/blog/recaptcha-alternative-honeypot-spam-prevention/ honeypot forms

checking IP addresses against RBLs

https://captcha.com/ - BotDetect, a CAPTCHA implementation

https://akismet.com/ Akismet, known spammer API -> https://de.wikipedia.org/wiki/Akismet#Datenschutzprobleme_in_Deutschland,_%C3%96sterreich_und_der_Schweiz

https://www.drupal.org/project/botcha Botcha "anti-captcha" (the technique used is very easy to implement elsewhere but also very effective) 
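For the "checking IP addresses against RBLs" item, the usual DNS blocklist convention is to reverse the IPv4 octets and look the result up under the blocklist's zone. A minimal sketch of building that lookup name, assuming IPv4 only; the function name `dnsbl_query_name` and the zone `dnsbl.example.org` are hypothetical placeholders, and no real blocklist is queried here:

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name used to look up an IPv4 address in a blocklist zone."""
    # Validate the input; raises ValueError for anything that is not IPv4.
    address = ipaddress.IPv4Address(ip)
    # Reverse the octets: 192.0.2.10 becomes 10.2.0.192
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


# Hypothetical zone name, for illustration only.
print(dnsbl_query_name("192.0.2.10", "dnsbl.example.org"))
```

A DNS A-record lookup on the resulting name would then tell the application whether the address is listed; how to treat a listing (reject, flag for moderation) is a separate policy decision.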

@GrazingScientist

Although we are not in China, my public institute in Germany also needs a reCaptcha alternative, and we would be very thankful for an out-of-the-box plugin.

@carzamora
Contributor

I want to propose a very simple idea that (I think) can be implemented even with reCaptcha enabled, namely a honeypot like this: http://tidyrepo.com/registration-honeypot/

Taken from the link above:

Registration Honeypot works by adding a hidden text field to your registration form labeled “Only fill in if you are not human.” Users will not even be able to see the text field, since it is hidden, let alone fill it out, so registration process will go unaffected. Spambots, on the other hand, will fill it out automatically, and will be kicked out of the registration process and redirected to an error page that will not let them continue.
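The server-side check behind that description could look roughly like the sketch below. It is not the code of the linked plugin; the field name `website_url` and the function `is_probable_bot` are hypothetical, and the form is assumed to arrive as a plain dictionary of submitted values.

```python
from __future__ import annotations

# Hypothetical honeypot field name. The field is rendered in the form but
# hidden from human visitors, so only automated submissions fill it in.
HONEYPOT_FIELD = "website_url"


def is_probable_bot(form_data: dict[str, str]) -> bool:
    """Return True when the hidden honeypot field contains any text."""
    value = form_data.get(HONEYPOT_FIELD, "")
    return value.strip() != ""


human = {"username": "author1", "email": "a@example.org", HONEYPOT_FIELD: ""}
bot = {"username": "x", "email": "x@example.org", HONEYPOT_FIELD: "http://spam.example"}
print(is_probable_bot(human), is_probable_bot(bot))
```

Because the check only inspects a field that humans never see, it can run alongside reCaptcha without changing the registration flow for real users.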

@jonasraoni
Contributor

Hi guys, resurrecting this old issue =]

  • I took a look at the available solutions and didn't find anything great (works offline + few installation requirements + open source/compatible license).
  • About the reCaptcha, some people are proxying the Google scripts as a workaround, but well, that's prone to fail.
  • There are plenty of online solutions, which could be implemented as plugins, but since we're trying to solve a connectivity problem due to external dependencies, I'd rather start with the offline solution.

So, I vote for the https://github.com/gregwar/captcha!

p.s.: There's another popular captcha written in PHP (https://github.com/mewebstudio/captcha), but it was built for Laravel and has more dependencies.


About the implementation, two things came to mind...

  • As a user-friendly measure, we could check on the client side whether Google is unresponsive and, optionally, fall back to the offline captcha (I know this is flawed from a security perspective, as an attacker could easily switch to the easier challenge, but if the intention is just to keep out generic spammers, I think it's OK).
  • To improve accessibility, we could use the "Web Speech API" to read the text aloud.

@asmecher
Member

asmecher commented Oct 3, 2019

I think these distorted-letters-based CAPTCHA tests are widely considered to be easily crackable, and neither library includes any accessible alternatives. I wonder whether there's not a wholly different approach with decent support that's possible to run locally (or via a defined proxy) -- scanning the suggestions e.g. on https://www.w3.org/TR/turingtest/. @NateWr, any thoughts on this? How about e.g. @jmacgreg on accessibility?

As much as I don't like relying entirely on Google for this, I suspect proxying the scripts would be more effective than a distorted-letters implementation, and would include accessibility support.

(We used to have our own homemade distorted-letters generator, but I happily got rid of it a few years back.)

@jmacgreg
Contributor

jmacgreg commented Oct 3, 2019

I won't have much useful to say on this, but @israelcefrin will!

@jonasraoni
Contributor

@asmecher about proxying: the OJS instance itself might have limited external access, so I think we'll need a fallback anyway.

I personally like random logic challenges such as: mark the first and last checkboxes; does cat/snake refer to an animal or a tool; leave the field empty if you're not a robot; write the result of 2*1. I also like tricks to deceive the bot (fake/duplicated fields). But they are all weak...

Is OJS a direct target of spammers, or are we just trying to defend against generic bots?

Let's see what the other guys have to say, I might research alternatives later as well =]

@NateWr
Contributor

NateWr commented Oct 4, 2019

The link that Alec shared is good at outlining the pros and cons of different approaches. I don't think that any technique in the "Interactive Stand-Alone Approaches" category is going to be accessible. That includes captcha, logic games, etc.

Of the non-interactive approaches, I think a honeypot is our best bet. And we should probably consider using this as a standard part of the application, not just a plugin that gets added on. Honeypots are very effective at defending against generic bots that aren't specifically targeting OJS, and will likely be sufficient to cut down on the majority of spam problems faced.

For OJS instances that need to be hardened -- usually because they are a direct target, as jonas mentioned -- we should consider the "multi-party" approaches in the document that Alec linked. These include Google's ReCaptcha for journals that aren't concerned about accessibility. But there are also third-party services like Akismet, which supports non-WordPress uses. I think the forum thread that Alec linked before had a plugin with Akismet support.

@israelcefrin
Collaborator

Hi all, just a thought on accessibility. Currently Google ReCaptcha is concerned with this issue (accessible forms) and they even have a section explaining how accessible their solution is. However, tests with real users with disabilities have shown that it is not fully accessible/usable and it becomes a barrier for users with different devices to pass the "captcha" test.

I agree with @NateWr:

Of the non-interactive approaches, I think a honeypot is our best bet.

And Pitt has the two plugins from the post that @asmecher shared in the meeting:
https://github.com/ulsdevteam/pkp-akismet
https://github.com/ulsdevteam/pkp-formHoneypot

I've talked to Clinton, and he told me that they could contribute both of these plugins to the OJS plugin gallery.

@jonasraoni
Contributor

Oh, I didn't know the term honeypot, but that's what I meant with "tricks to deceive the bot" 😁

I'm just curious about how inaccessible a logic puzzle is; something with a question and a simple answer seems no more complex to me than filling in the form itself 🤔
I'll try to find some research on it when I get home, to satisfy my curiosity.

I just read the honeypot plugin's description and it seems to be ok assuming that we just need to get rid of generic bots.

@israelcefrin
Collaborator

I'm just curious about how inaccessible a logic puzzle is; something with a question and a simple answer seems no more complex to me than filling in the form itself

According to the W3C document shared by @asmecher, this falls under the Understandable principle of accessibility. To make a logic puzzle accessible, we would need to address language, learning and cognitive issues. It is not impossible, but to make it work, a comprehensive round of testing with different users is recommended.

A honeypot approach wouldn't add any extra work for users filling in a form, only for bots.

@jonasraoni
Contributor

@israelcefrin I just read the W3C link... Looks like after all these years everybody is still in the same boat (while the bots are almost beating us).

I like the honeypot solution, and if it's not enough, we can extend it (e.g. monitor if a given user is triggering too many actions in a small period of time).
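To make the "too many actions in a small period of time" idea concrete, here is a minimal sliding-window sketch. It is only an illustration of the technique, not anything that exists in OJS: the class name `ActionRateLimiter` and its parameters are hypothetical, state is kept in memory, and the caller passes in the current time in seconds.

```python
from collections import defaultdict, deque


class ActionRateLimiter:
    """Allow at most max_actions per user within a sliding time window."""

    def __init__(self, max_actions: int, window_seconds: float) -> None:
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        # Per-user queue of timestamps for recently allowed actions.
        self._history = defaultdict(deque)

    def allow(self, user_id: str, now: float) -> bool:
        timestamps = self._history[user_id]
        # Discard timestamps that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_actions:
            return False
        timestamps.append(now)
        return True


limiter = ActionRateLimiter(max_actions=3, window_seconds=60.0)
print([limiter.allow("user-1", t) for t in (0.0, 1.0, 2.0, 3.0)])
```

A real deployment would need shared storage (database or cache) instead of process memory, and a decision on what counts as one "action".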

@NateWr
Contributor

NateWr commented Oct 7, 2019

To make a logic puzzle accessible we need to work with language, learning and cognitive issues

Yeah, the main issue is coming up with something that works across all the different languages/cultures that our product is used in.

I like the honeypot solution, and if it's not enough, we can extend it

💯 It's far easier to build something smarter on a case-by-case basis, for the 1-2% of cases when the honeypot isn't enough, than to try to devise a single solution that works everywhere.

@jonasraoni
Contributor

With the release of the honeypot plugin into the gallery I guess this issue can be closed, right?

@NateWr
Contributor

NateWr commented Oct 14, 2019

@mfelczak are you happy on the PS side for this issue to be closed?

@mfelczak
Member

Thanks @NateWr and @jonasraoni. Yes, this should suffice -- we'll test with a few hosted journals.

@jnugent
Member

jnugent commented Jul 10, 2020

Hi folks,

We've had requests from hosted journals who have users in China that are unable to sign up due to google.com being unavailable in China. Our workaround thus far has been to edit classes/form/validation/FormValidatorReCaptcha.inc.php and classes/template/PKPTemplateManager.inc.php and switch the www.google.com references to www.recaptcha.net which works in China. If this was permanently added to pkp-lib the only other modification would be to switch enable_cdn to off. The recaptcha.net domain is owned by Google so it probably isn't going anywhere. I can provide links to journals that are using the alternative URL with no problems if it is helpful. (And apologies, @asmecher, for the dupe post earlier)
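The substitution itself is small. The sketch below only illustrates the host swap and is not the pkp-lib change (which lives in the PHP files named above); the function name `use_recaptcha_net` is hypothetical, and it assumes the reCAPTCHA URLs all sit under the `/recaptcha/` path on `www.google.com`.

```python
from urllib.parse import urlsplit, urlunsplit

# Alternative host for reCAPTCHA, as described in the comment above.
RECAPTCHA_HOST = "www.recaptcha.net"


def use_recaptcha_net(url: str) -> str:
    """Rewrite a www.google.com reCAPTCHA URL to the recaptcha.net host."""
    parts = urlsplit(url)
    # Only touch reCAPTCHA URLs; leave any other URL unchanged.
    if parts.netloc == "www.google.com" and parts.path.startswith("/recaptcha/"):
        parts = parts._replace(netloc=RECAPTCHA_HOST)
    return urlunsplit(parts)


print(use_recaptcha_net("https://www.google.com/recaptcha/api.js"))
print(use_recaptcha_net("https://www.google.com/recaptcha/api/siteverify"))
```

Both the client-side script URL and the server-side verification URL need the same host, which is why two files had to be edited in the workaround.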

@asmecher
Member

@jnugent (and others), is there any downside in just using www.recaptcha.net as proposed above?

@mfelczak
Member

A couple of notes to add to Jason's summary above. Adding support for recaptcha.net, either via a new toggle in config.inc.php or as the default replacing the existing google.com implementation, would expand the anti-spam toolset available to OJS users. There will be journals that can't subscribe to Akismet or that still want a reCaptcha solution that works for all visitors. At the moment the only free alternative to the default reCaptcha is the honeypot plugin.

@NateWr
Contributor

NateWr commented Jul 13, 2020

Is there a downside to just switching the hostname to recaptcha.net? Would any existing installs be affected? Perhaps if servers have whitelisted domains that are permitted to make external requests...

@asmecher
Member

👍 OK, I fully support this proposal. @jnugent, could you open a PR for it?

@jnugent
Member

jnugent commented Jul 18, 2020

I will! @asmecher, what did we decide on? Changing the URLs in the various classes, or adding a config.inc.php option that lets people toggle between the two?

@asmecher
Member

I think just universally using the recaptcha.net domain is best/simplest!

@asmecher
Member

Implemented at #6114!

@asmecher asmecher reopened this Jul 23, 2020