This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Fetch images when previewing Twitter #11985

Merged: 6 commits, Feb 22, 2022
Changes from 4 commits
1 change: 1 addition & 0 deletions changelog.d/11985.misc
@@ -0,0 +1 @@
Use a bot User-Agent for URL preview queries.
Review comment (Member):

Thinking through this more, I think we should make this a .feature file which says "Fetch images when previewing Twitter URLs."

You can also credit yourself if you would like, e.g. "Contributed by (your name or @AndrewRyanChama)."

4 changes: 1 addition & 3 deletions synapse/res/providers.json
@@ -5,13 +5,11 @@
         "endpoints": [
             {
                 "schemes": [
-                    "https://twitter.com/*/status/*",
-                    "https://*.twitter.com/*/status/*",
                     "https://twitter.com/*/moments/*",
                     "https://*.twitter.com/*/moments/*"
Review comment (Member) on lines 8 to 9:

Weirdly these URLs don't seem to be used by Twitter anymore? We can leave them for now, but it is a bit annoying they changed their URL scheme at some point...

                 ],
                 "url": "https://publish.twitter.com/oembed"
             }
         ]
     }
-]
+]
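
To make the effect of the scheme change concrete, here is a minimal sketch of how glob-style oEmbed schemes could be matched against a URL. It assumes the two `status` schemes are the lines this hunk removes (consistent with the hunk header shrinking from 13 to 11 lines). The helpers `scheme_to_regex` and `matches_any` are hypothetical names for illustration; Synapse's actual matching code may differ.

```python
import re
from typing import List, Pattern

# Schemes left for Twitter after this change (copied from the diff above).
TWITTER_SCHEMES = [
    "https://twitter.com/*/moments/*",
    "https://*.twitter.com/*/moments/*",
]


def scheme_to_regex(scheme: str) -> Pattern[str]:
    """Turn an oEmbed scheme glob into an anchored regex (hypothetical helper)."""
    # Escape the literal pieces, then let each "*" match any run of characters.
    parts = [re.escape(part) for part in scheme.split("*")]
    return re.compile("^" + ".*".join(parts) + "$")


def matches_any(url: str, schemes: List[str]) -> bool:
    """Return True if the URL matches at least one scheme glob."""
    return any(scheme_to_regex(s).match(url) for s in schemes)


# A "moments" URL still matches the oEmbed endpoint's schemes...
print(matches_any("https://twitter.com/i/moments/123", TWITTER_SCHEMES))
# ...while a tweet ("status") URL no longer does.
print(matches_any("https://twitter.com/user/status/123", TWITTER_SCHEMES))
```

Under that assumption, a tweet URL would no longer be routed to `https://publish.twitter.com/oembed` and would presumably be previewed by fetching the page itself, which is where the User-Agent change in the next file comes in.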
10 changes: 9 additions & 1 deletion synapse/rest/media/v1/preview_url_resource.py
@@ -402,7 +402,15 @@ async def _download_url(self, url: str, output_stream: BinaryIO) -> DownloadResu
                     url,
                     output_stream=output_stream,
                     max_size=self.max_spider_size,
-                    headers={"Accept-Language": self.url_preview_accept_language},
+                    headers={
+                        b"Accept-Language": self.url_preview_accept_language,
+                        # Use a custom user agent for the preview because some sites will only return
+                        # Open Graph metadata to crawler user agents. Omit the Synapse version
+                        # string to avoid leaking information.
+                        b"User-Agent": [
+                            "Synapse (bot; +https://github.com/matrix-org/synapse)"
+                        ],
+                    },
                     is_allowed_content_type=_is_previewable,
                 )
             except SynapseError:
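
For readers who want to try the header change outside Synapse, the following is a minimal sketch that builds (but does not send) a request carrying the same two headers. It uses the standard library's `urllib.request` rather than Synapse's own HTTP client, and `build_preview_request` and the `example.com` URL are hypothetical placeholders, not part of the PR.

```python
from urllib.request import Request

# Version-free bot User-Agent string, copied from the diff above.
PREVIEW_USER_AGENT = "Synapse (bot; +https://github.com/matrix-org/synapse)"


def build_preview_request(url: str, accept_language: str) -> Request:
    """Build, without sending, a GET request with the preview headers."""
    return Request(
        url,
        headers={
            "Accept-Language": accept_language,
            "User-Agent": PREVIEW_USER_AGENT,
        },
    )


req = build_preview_request("https://example.com/article", "en")
# urllib stores header names via str.capitalize(), hence "User-agent".
print(req.get_header("User-agent"))
print(req.get_header("Accept-language"))
```

Passing the request to `urllib.request.urlopen` would perform the fetch; comparing the response with and without the bot User-Agent is one way to check whether a site serves Open Graph metadata only to crawlers.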