Add proxy rule for robots.txt #380

zwolf · 2024-09-27T17:16:25Z

Proxy www.zooniverse.org/robots.txt to static.zooniverse.org/robots.txt. This allows the www robots.txt to be permissive, while the robots.txt served by the FEM apps can remain restrictive to prevent crawling of other subdomains.
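
As a rough illustration of the difference between the two policies, the sketch below uses Python's standard `urllib.robotparser` to evaluate a hypothetical permissive and a hypothetical restrictive robots.txt. The actual file contents are not shown in this PR, so the rule bodies, the `can_crawl` helper, and the example path are assumptions made for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file contents; the real robots.txt files are not shown in this PR.
PERMISSIVE = "User-agent: *\nAllow: /\n"
RESTRICTIVE = "User-agent: *\nDisallow: /\n"


def can_crawl(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "https://www.zooniverse.org/projects"  # example path, assumed
    print("permissive:", can_crawl(PERMISSIVE, url))
    print("restrictive:", can_crawl(RESTRICTIVE, url))
```

Under these assumed rules, a crawler that honours robots.txt would fetch the page when served the permissive file and skip it when served the restrictive one.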

eatyourgreens · 2024-09-28T09:14:16Z

This adds a permissive robots exclusion rule to both www.zooniverse.org and static.zooniverse.org, by applying the same /robots.txt to both domains.

If you set up nginx to return 404 for www.zooniverse.org/robots.txt, that would allow bots to crawl www.zooniverse.org, without being tied to the exclusion rules for static.zooniverse.org.

static.zooniverse.org is already indexed by Google, so that difference is probably academic.
https://www.google.com/gasearch?q=site:static.zooniverse.org
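
The 404 suggestion relies on how crawlers treat a missing robots.txt. A minimal sketch of that behaviour, loosely modelled on RFC 9309 (4xx means "unavailable", so crawling is allowed; 5xx means "unreachable", so crawling is assumed disallowed), might look like the following. The `crawl_allowed` helper and the example inputs are hypothetical, redirects are ignored, and real crawlers may differ in detail.

```python
from urllib.robotparser import RobotFileParser


def crawl_allowed(status: int, body: str, url: str) -> bool:
    """Simplified model of a crawler's decision for one robots.txt response."""
    if 400 <= status < 500:
        # robots.txt "unavailable": the crawler may access any resource.
        return True
    if status >= 500:
        # robots.txt "unreachable": assume everything is disallowed.
        return False
    # Otherwise parse the returned rules (redirect handling is omitted here).
    parser = RobotFileParser()
    parser.parse(body.splitlines())
    return parser.can_fetch("*", url)


if __name__ == "__main__":
    # www returns 404 for /robots.txt, so nothing is excluded there.
    print(crawl_allowed(404, "", "https://www.zooniverse.org/projects"))
    # A host serving an assumed restrictive file stays blocked.
    print(crawl_allowed(200, "User-agent: *\nDisallow: /\n",
                        "https://static.zooniverse.org/robots.txt"))
```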

Add proxy rule for robots.txt

08a27d8

zwolf requested review from lcjohnso and yuenmichelle1 September 27, 2024 17:22

This was referenced Sep 27, 2024

app-root: Remove stub routes zooniverse/front-end-monorepo#6340

Merged

Entire Zooniverse web site blocked from search engines zooniverse/front-end-monorepo#6331

Closed

fix(app-root): allow search engines in production zooniverse/front-end-monorepo#6341

Closed

lcjohnso approved these changes Sep 27, 2024

yuenmichelle1 approved these changes Sep 27, 2024

Store robots.txt in PFE folder

2021fff

zwolf merged commit 73b13ff into master Oct 2, 2024
2 checks passed

zwolf deleted the www-robots-txt branch October 2, 2024 18:21
