This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Add X-Robots-Tag header to stop crawlers from indexing media (#8887)
Fixes / related to: #6533

This should have essentially the same effect as a robots.txt file that tells crawlers not to index the media repo. See https://developers.google.com/search/reference/robots_meta_tag
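
For illustration only, here is a rough sketch of how a crawler might interpret the header value this commit sends. The helper names `parse_x_robots_tag` and `may_index` are hypothetical and are not part of Synapse or of any crawler; the sketch also ignores the optional user-agent prefix (for example `googlebot: noindex`) that the header format allows.

```python
def parse_x_robots_tag(value):
    """Split an X-Robots-Tag value into a set of lower-cased directives.

    Hypothetical helper: assumes a plain comma-separated list with no
    user-agent prefix.
    """
    return {part.strip().lower() for part in value.split(",") if part.strip()}


def may_index(value):
    """Return False when the directives forbid indexing.

    "none" is treated as shorthand for "noindex, nofollow".
    """
    directives = parse_x_robots_tag(value)
    return not (directives & {"noindex", "none"})


if __name__ == "__main__":
    header = "noindex, nofollow, noarchive, noimageindex"
    print(sorted(parse_x_robots_tag(header)))
    print(may_index(header))
```

Whether a given crawler honours each directive is up to that crawler; the header is a request, not an access control.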

Signed-off-by: Aaron Raimist <[email protected]>
aaronraimist authored Dec 8, 2020
1 parent ab7a24c commit cd9e72b
Showing 3 changed files with 19 additions and 0 deletions.
1 change: 1 addition & 0 deletions changelog.d/8887.feature
@@ -0,0 +1 @@
Add `X-Robots-Tag` header to stop web crawlers from indexing media.
5 changes: 5 additions & 0 deletions synapse/rest/media/v1/_base.py
@@ -155,6 +155,11 @@ def _quote(x):
    request.setHeader(b"Cache-Control", b"public,max-age=86400,s-maxage=86400")
    request.setHeader(b"Content-Length", b"%d" % (file_size,))

    # Tell web crawlers to not index, archive, or follow links in media. This
    # should help to prevent things in the media repo from showing up in web
    # search results.
    request.setHeader(b"X-Robots-Tag", "noindex, nofollow, noarchive, noimageindex")


# separators as defined in RFC2616. SP and HT are handled separately.
# see _can_encode_filename_as_token.
13 changes: 13 additions & 0 deletions tests/rest/media/v1/test_media_storage.py
@@ -362,3 +362,16 @@ def _test_thumbnail(self, method, expected_body, expected_found):
                "error": "Not found [b'example.com', b'12345']",
            },
        )

    def test_x_robots_tag_header(self):
        """
        Tests that the `X-Robots-Tag` header is present, which informs web crawlers
        to not index, archive, or follow links in media.
        """
        channel = self._req(b"inline; filename=out" + self.test_image.extension)

        headers = channel.headers
        self.assertEqual(
            headers.getRawHeaders(b"X-Robots-Tag"),
            [b"noindex, nofollow, noarchive, noimageindex"],
        )
