-
Notifications
You must be signed in to change notification settings - Fork 248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support unfurling more websites #3530
Conversation
Pull Request Checklist
|
Jenkins BuildsClick to see older builds (27)
|
3a5cb0b
to
f555a48
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments, looks great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments, looks great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments, looks great!
015d88d
to
effd6b9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lovely work @ilmotta.
Just a minor question, what impact if any does this functionality have on protocol/urls/urls.go?
disabledStubs := false | ||
if disabledStubs { | ||
expected.Thumbnail.Width = 1280 | ||
expected.Thumbnail.Height = 720 | ||
expected.Thumbnail.DataURI = "" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
disabledStubs
is always false, the if statement can never be true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
url := "https://www.youtube.com/watch?v=lE4UXdJSJM4" | ||
thumbnailURL := "https://i.ytimg.com/vi/lE4UXdJSJM4/maxresdefault.jpg" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do these tests need to consider segregating getting data from parsing data?
See #3529
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question @Samyoul, but I'm inferring a little bit what you mean, please correct me if anything is off. So far, I've focused on testing only the public function UnfurlURLs
, but because it does basically everything under the hood, its tests are more sociable than solitary.
I think I could do a better job of testing some of the private functions too, since they have a narrower responsibility, and it would be easier to test this segregation of data fetching vs parsing you mention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slightly off topic, I saw many examples in the code about using a mock server, but in retrospect, I don't remember anymore why I didn't go that route, which in my experience (in other languages) works really well too. In the end, I opted for direct control over a RoundTripper to give me more control (?). Maybe overkill? I must say I've had good results with both approaches.
And your comments on the issue about the flaky tests are on point :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You know, I really have enjoyed reviewing this PR. Testing http requests are always tricky because the http client interface is generally embedded and segregating the getting concerns from the parsing concerns isn't a focus.
A mock client request verses a mock server response is basically removing the need for outbound calls. I've always focused on making the server mockable so that the client behaves as normally as possible during tests. But in your case I think that the server is not required because you pass in (inject) the client dependency and so you only really test the data parsing concern of your functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I may have just become a convert to client centric testing.
No impact in theory @Samyoul. In the first PR for the new unfurling implementation, I decided to not touch the |
effd6b9
to
20ca457
Compare
// RoundTrip returns a stubbed response if any matcher returns a non-nil | ||
// http.Response. If no matcher is found and fallbackToDefaultTransport is true, | ||
// then it executes the HTTP request using the default http transport. | ||
// | ||
// If StubTransport#disabledStubs is true, the default http transport is used. | ||
func (t *StubTransport) RoundTrip(req *http.Request) (*http.Response, error) { | ||
if t.disabledStubs { | ||
return http.DefaultTransport.RoundTrip(req) | ||
} | ||
|
||
for _, matcher := range t.matchers { | ||
res := matcher(req) | ||
if res != nil { | ||
return res, nil | ||
} | ||
} | ||
|
||
if t.fallbackToDefaultTransport { | ||
return http.DefaultTransport.RoundTrip(req) | ||
} | ||
|
||
return nil, fmt.Errorf("no HTTP matcher found") | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
823487b
to
540e4ae
Compare
540e4ae
to
6481196
Compare
Bumps status-go's version to point to the status-go branch in PR status-im/status-go#3530. Fixes #15918
Summary
This PR adds support for unfurling a wider range of websites. Most code changes are related to the implementation of a new
Unfurler
, anOEmbedUnfurler
, which is necessary to get metadata for Reddit URLs using oEmbed, since Reddit does not support OpenGraph meta tags. The new unfurler will also be useful for other websites, like Twitter. Also the user agent was changed, and now more websites consider status-go reasonably human.Example hostnames that are now unfurleable:
reddit.com
,open.spotify.com
,music.youtube.com
No breaking changes in this PR The PR in status-mobile will only bump the status-go version (not yet created).
Other improvements
UnfurlErr
.http.Client
instance and by customizing itsTransport
field (except for some failing conditions where it's even good to hit the real servers). I think the solution is pretty decent and highly flexible (although I hardcoded one or two things for the scope of this PR). Do take a look at theStubTransport
type in the tests and its usages. The goal was to help our test suite not get worse than it already is in terms of reliability and speed. I know there are other places in status-go doing something similar, and I've been wondering if a more unified HTTP stubbing solution would benefit us all. Feedback is more than welcomed! 🌟For the future
There is a lot more to be done, like better timeout handling, performance improvements when unfurling multiple URLs. There are even scenarios where websites such as Reddit return 429s if more than one request is made in a 2s window, which will happen if users try to unfurl more than one Reddit URL in the same message. Edge case or not, unfurling URLs is like a can of worms.
Sooner or later we'll need to add support for even more websites, like Twitter, maybe Amazon, etc. In this PR I found out that Amazon sucks (why am I surprised?). They don't support OpenGraph, nor oEmbed. They really just want product sellers to paste their own auto-generated embeddable links. So for the moment, I ignored support for Amazon. Twitter is different and requires more handling behind the scenes, even though it supports OpenGraph and oEmbed. Reddit I only added basic unfurling support, because they don't offer any thumbnail URL in their oEmbed response and they do not support OpenGraph.