Reduce Typosquatting Harm via Social Distancing for Top PyPI Packages #9527
Labels
feature request
malware-detection
Issues related to automated malware detection.
security
Security-related issues and pull requests
What's the problem this feature will solve?
Reduce the total harm typosquatting causes to PyPI users.
Describe the solution you'd like
Block users from uploading new packages with a similar name to any of the current top packages.
Additional context
While similar solutions have been proposed before (see below), this particular solution is not a malware check or a predictive model per se. It is a proposal for a simple rule: users cannot upload any NEW package whose name is similar to that of a top package. Full stop. This will not stop all typosquatters, but it will likely reduce harm because typosquatting the top packages becomes harder. Typosquatters would be left with two options: target less popular packages, and therefore harm fewer users, or use names at a greater edit distance from the top packages, which fewer users are likely to mistype, and so also likely harm fewer users.
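To make the rule concrete, here is a minimal sketch of what such a check could look like. It is not Warehouse code, and this issue does not prescribe a similarity measure: the use of Levenshtein edit distance, the `max_distance` threshold, the sample `TOP_PACKAGES` set, and the function names are all assumptions for illustration. Only the name normalization follows an existing convention (PEP 503).

```python
import re
from typing import Iterable, Optional

# Hypothetical sample. A real list would be derived from download statistics.
TOP_PACKAGES = {"requests", "numpy", "urllib3", "python-dateutil"}


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def blocked_by(candidate: str,
               top_packages: Iterable[str] = TOP_PACKAGES,
               max_distance: int = 1) -> Optional[str]:
    """Return the top package that `candidate` is too close to, or None.

    A name that normalizes to exactly a top package's name is skipped here:
    that is the same project name, not a look-alike, and this sketch assumes
    the index's ordinary name-collision handling covers it.
    """
    name = normalize(candidate)
    for top in sorted(top_packages):  # sorted for deterministic results
        top_name = normalize(top)
        if name != top_name and edit_distance(name, top_name) <= max_distance:
            return top
    return None
```

With the default threshold of 1, `blocked_by("numpi")` returns `"numpy"` while `blocked_by("flask")` returns `None`. A transposition such as `"reqeusts"` counts as two edits under plain Levenshtein distance, so it is only caught with `max_distance=2`. That shows how much the choice of measure and threshold matters, both for coverage and for how many legitimate names get blocked.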
I should also mention that this proposed feature is not meant to replace the other ongoing, related efforts to reduce the harm caused by malware on PyPI. Finally, like all approaches to reducing harm from malware on PyPI, this one has pros and cons. Debate, critique, and suggested revisions are all welcome.
Some parties I know will be interested: @di, @ewdurbin, @xmunoz, @benjaoming, @hannob
Some parties who could be interested: @ewjoachim, @brainwane, @pradyunsg, @ncoghlan, @dstufft
Relevant issues and PRs:
Implement a More Robust Malware Detector - Issue #7748
Detect Packages Being Published with Typo-ish Names - Issue #4998
@brainwane rightly pointed out that if this approach were part of a malware check, it would be virtually guaranteed to generate many false positives. This proposal is distinct: it simply calls for a rule restricting package name selection, which might technically be called “preclusive namespacing” and informally “social distancing for top PyPI packages.” PyPI administrators can therefore avoid adjudicating whether a given package is malware. PyPI would simply block the upload and, ideally, explain to the user that the package name is not allowed in the interest of reducing aggregate typosquatting harm.
Post-registration Alerts for Packages with Similar Names (Typosquatting) - Issue #2268
Monitor New Packages that Might be Typosquats - PR #5001
PSF Fundables - Productionize Malware Detection - Issue #38