[extractor/tv4] Fix tv4 extraction #5649
changed json api url
Closes #5535
yt_dlp/extractor/tv4.py
Outdated
@@ -73,7 +73,7 @@ def _real_extract(self, url):
         video_id = self._match_id(url)

         info = self._download_json(
-            'https://playback-api.b17g.net/asset/%s' % video_id,
+            'https://playback2.a2d.tv/asset/%s' % video_id,
Use f-strings for both API URLs for consistency and readability:
-            'https://playback2.a2d.tv/asset/%s' % video_id,
+            f'https://playback2.a2d.tv/asset/{video_id}',
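For reference, the three string-building styles that appear in this PR produce identical URLs, so the choice between them is purely about readability and Python-version compatibility. A minimal check, assuming a hypothetical video ID of `12345`:

```python
video_id = '12345'  # hypothetical ID, for illustration only

percent_style = 'https://playback2.a2d.tv/asset/%s' % video_id
fstring_style = f'https://playback2.a2d.tv/asset/{video_id}'
concat_style = 'https://playback2.a2d.tv/asset/' + video_id

# All three spellings build the same string.
same = percent_style == fstring_style == concat_style
print(same)
```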
Definitely make them consistent, but if it's done using f-strings yt-dl will just have to change it to any of the other myriad string syntaxes (I know that's our burden).
Maybe something like this:
    def _call_api(self, endpoint, video_id, query=None, **kwargs):
        return self._download_json(
            '/'.join(('https://playback2.a2d.tv', endpoint, video_id)),
            video_id, endpoint.join(('Downloading video ', ' JSON')),
            query=update_url_query({
                'service': 'tv4',
                'device': 'browser',
                'protocol': 'hls,dash',
            }, query), **kwargs)
    ...
        info = self._call_api('asset', video_id, query={
            'drm': 'widevine',
        })['metadata']
    ...
        manifest_url = self._call_api('play', video_id, expected_status=401)
        # don't crash during error handling
        err = traverse_obj(manifest_url, 'errorCode')
    ...
        # then as in https://github.com/yt-dlp/yt-dlp/pull/5649#issuecomment-1337847672 below
Related to the geo-restriction handling, in the class:
    # XFF is not effective
    _GEO_BYPASS = False
and perhaps specialise the msg for raise_geo_restricted() if, as I believe, a logged-in user can access the site from outside SE/EU:
            self.raise_geo_restricted(
                'This video is not available from your location due to geo restriction, or not being authenticated.',
                countries=['SE'])
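The URL and query handling of the `_call_api` idea above can be tried outside yt-dlp. What follows is only a standalone sketch, not the code that was merged: `build_api_request`, `API_BASE` and `DEFAULT_QUERY` are hypothetical names, and the video ID is made up. One caveat about the snippet in the comment: as far as I know, yt-dlp's `update_url_query()` expects a URL string as its first argument, so this sketch merges the default and per-call parameters as plain dicts instead.

```python
from urllib.parse import urlencode

API_BASE = 'https://playback2.a2d.tv'  # base URL taken from the diff in this PR

# Parameters sent with every request, as listed in the suggested helper.
DEFAULT_QUERY = {
    'service': 'tv4',
    'device': 'browser',
    'protocol': 'hls,dash',
}


def build_api_request(endpoint, video_id, query=None):
    """Return (url, note) for one playback API call (hypothetical helper)."""
    # Per-call parameters override the defaults on key collisions.
    merged = {**DEFAULT_QUERY, **(query or {})}
    url = '/'.join((API_BASE, endpoint, video_id))
    note = f'Downloading video {endpoint} JSON'
    return f'{url}?{urlencode(merged)}', note


asset_url, asset_note = build_api_request('asset', '12345', {'drm': 'widevine'})
play_url, play_note = build_api_request('play', '12345')
print(asset_url)
print(play_note)
```

Keeping the base URL and the shared parameters in one place is the point made in the thread: a future domain change then touches a single line.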
> Definitely make them consistent, but if it's done using f-strings yt-dl will just have to change it to any of the other myriad string syntaxes (I know that's our burden).

Sorry, but if we discourage the use of newer Python features, then there was no point in deprecating older versions. In most cases (including here) f-strings simply read much better than the alternatives.
You could encourage OP to make a PR against youtube-dl and get it merged there first, which would reverse our responsibilities; i.e. the burden of code review would then mainly fall on you, and the burden of porting (dealing with deprecated imports/functions) on me.
Fair point, but in this case it is at least possible to have just the one instance, and also avoid replicating the base API URL.
Oh, I wasn't commenting on the whole thing, just on the quoted part. Having a _call_api does look good.
And in any case, https://gist.github.com/dirkf/26c1dc0e0f29e2663c8cfd58f932aca6.
yt_dlp/extractor/tv4.py
Outdated
@@ -84,7 +84,7 @@ def _real_extract(self, url):
         title = info['title']

         manifest_url = self._download_json(
-            'https://playback-api.b17g.net/media/' + video_id,
+            'https://playback2.a2d.tv/play/' + video_id,
-            'https://playback2.a2d.tv/play/' + video_id,
+            f'https://playback2.a2d.tv/play/{video_id}',
This is a good idea, since it seems like all tv4 content is geo-restricted now. But the diff posted above conflates the […]
A working version could look like this (using this PR's initial commit as baseline for the diff):
-        manifest_url = self._download_json(
-            'https://playback2.a2d.tv/play/' + video_id,
-            video_id, query={
+        manifest_info = self._download_json(
+            f'https://playback2.a2d.tv/play/{video_id}',
+            video_id, 'Downloading manifest info JSON', query={
                 'service': 'tv4',
                 'device': 'browser',
                 'protocol': 'hls',
-            })['playbackItem']['manifestUrl']
+            }, expected_status=401)
+        err = manifest_info.get('errorCode')
+        if err:
+            msg = manifest_info.get('message') or err
+            if 'GEO_LOCATION' in err:
+                self.raise_geo_restricted(msg, countries=['SE'])
+            raise ExtractorError(f'HTTP Error 401: Unauthorized; {msg}', video_id=video_id)
+
+        manifest_url = manifest_info['playbackItem']['manifestUrl']
+
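The branching in the diff above can be exercised without network access. The sketch below is a standalone approximation: the two exception classes are local stand-ins for yt-dlp's own, `manifest_url_or_raise` is a hypothetical name, and the sample payloads (including the `GEO_LOCATION_DENIED` and `NOT_AUTHENTICATED` codes) are invented for illustration. The thread only establishes that the error code contains `GEO_LOCATION` in the geo-restricted case.

```python
class ExtractorError(Exception):
    """Local stand-in for yt_dlp.utils.ExtractorError."""


class GeoRestrictedError(ExtractorError):
    """Local stand-in for what raise_geo_restricted() would raise."""


def manifest_url_or_raise(manifest_info, video_id):
    """Return the manifest URL, or raise based on the API's error payload."""
    err = manifest_info.get('errorCode')
    if err:
        # Prefer the human-readable message, fall back to the bare code.
        msg = manifest_info.get('message') or err
        if 'GEO_LOCATION' in err:
            raise GeoRestrictedError(msg)
        raise ExtractorError(f'HTTP Error 401: Unauthorized; {msg} ({video_id})')
    return manifest_info['playbackItem']['manifestUrl']


# Invented payloads covering the three paths.
ok = {'playbackItem': {'manifestUrl': 'https://example.invalid/master.m3u8'}}
geo = {'errorCode': 'GEO_LOCATION_DENIED', 'message': 'Not available in your country'}
auth = {'errorCode': 'NOT_AUTHENTICATED'}

results = []
for payload in (ok, geo, auth):
    try:
        results.append(manifest_url_or_raise(payload, '12345'))
    except GeoRestrictedError as e:
        results.append(f'geo: {e}')
    except ExtractorError as e:
        results.append(f'error: {e}')
print(results)
```

Checking for the geo case first matters because `GeoRestrictedError` is a subclass of the generic error here, just as the diff raises the specific error before falling through to the generic one.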
Does this still work? Please update the tests.
yt_dlp/extractor/tv4.py
Outdated
@@ -84,7 +84,7 @@ def _real_extract(self, url):
         title = info['title']

         manifest_url = self._download_json(
This step (manifest download) requires geo-verification outside Sweden. The following update would solve that part:
        manifest_url = self._download_json(
            'https://playback2.a2d.tv/play/%s' % video_id,
            video_id, query={
                'service': 'tv4',
                'device': 'browser',
                'browser': 'GoogleChrome',
                'protocol': 'hls,dash',
                'drm': 'widevine',
                'capabilities': 'live-drm-adstitch-2,expired_assets',
            },
            headers=self.geo_verification_headers(),
        )['playbackItem']['manifestUrl']
ee280c7 to 7aeda6c
@@ -20,19 +23,25 @@ class TV4IE(InfoExtractor):
                             sport/|
                         )
                     )(?P<id>[0-9]+)'''
    _GEO_COUNTRIES = ['SE']
    _GEO_BYPASS = False
Why?
Per dirkf's comment here, #5649 (comment), that XFF is futile.
Then why are we passing geo verification headers? Am I misunderstanding something?
_GEO_BYPASS is for --xff, which has no effect.
geo_verification_headers() are for --geo-verification-proxy, which does work.
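The distinction can be shown with a simplified model of the two mechanisms. This is an illustration, not yt-dlp's actual code: both functions are standalone stand-ins, the `Ytdl-request-proxy` header name is written from memory of yt-dlp's source and should be treated as an assumption, and the IP and proxy values are made up.

```python
def xff_headers(geo_bypass, fake_ip):
    """Model of --xff: spoof the client IP with a request header.

    The server is free to ignore this header, which is why setting
    _GEO_BYPASS = False costs nothing for this site.
    """
    if geo_bypass and fake_ip:
        return {'X-Forwarded-For': fake_ip}
    return {}


def geo_verification_headers(params):
    """Model of --geo-verification-proxy: mark a request for proxying.

    The header is an internal marker telling the downloader to send this
    one request through the given proxy, so the server sees a real
    in-country IP address.
    """
    headers = {}
    proxy = params.get('geo_verification_proxy')
    if proxy:
        headers['Ytdl-request-proxy'] = proxy  # header name is an assumption
    return headers


no_bypass = xff_headers(False, '192.0.2.10')
with_proxy = geo_verification_headers({'geo_verification_proxy': 'socks5://127.0.0.1:1080'})
without_proxy = geo_verification_headers({})
print(no_bypass, with_proxy, without_proxy)
```

With no proxy configured the second function returns an empty dict, so passing `headers=self.geo_verification_headers()` is harmless for users inside Sweden.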
Closes yt-dlp#5535 Authored by: TxI5, dirkf
TV4 has changed its API domain. This PR updates the URL used for JSON metadata extraction.
Fixes #5535