[YouTube] Workaround HTTP 403s on streaming URLs of WEB client (after some time or instantly for some JavaScript players), update clients info #1197

Merged

Conversation

AudricV
Member

@AudricV AudricV commented Jul 19, 2024

This PR works around an anti-bot token whose requirement is being A/B tested on the WEB client. In this test, streaming URLs of this client start returning HTTP 403 errors after some time if the token is not provided.

It also makes it possible to skip fetching the JavaScript player for non-age-restricted videos, reducing data usage. The TVHTML5 embed client is now also fetched only for age-restricted videos.

⚠️ The methods forceFetchAndroidClient and forceFetchIosClient of YoutubeStreamExtractor have been removed, as they are no longer needed. This is a breaking change for users of these methods.

The PR changes also break the extraction of appropriate error states for private and deleted videos and invalid video IDs (an invalid player response message is thrown instead of a content not available one). This should be fixed before this PR is merged, but I can't find a solution, so I will need help with that.

Client info has been updated, which means that mocks have to be updated, and several tests unrelated to my changes are failing. I fixed one of them in this PR, YoutubeSearchExtractorTest.CrisisResources (at least for the connection I used to run the tests).

Fixes TeamNewPipe/NewPipe#11191.

@AudricV AudricV added bug Issue is related to a bug enhancement New feature or request ASAP Issue needs to be fixed as soon as possible youtube service, https://www.youtube.com/ tests Issues and PR related to unit tests labels Jul 19, 2024
@Stypox
Member

Stypox commented Jul 21, 2024

> The PR changes also break the extraction of appropriate error states for private and deleted videos and invalid video IDs (an invalid player response message is thrown instead of a content not available one). This should be fixed before this PR is merged, but I can't find a solution, so I will need help with that.

Can't you just use checkPlayabilityStatus() like before? This is a response for a private video, which contains information about which error to throw:

{playabilityStatus={contextParams=..., messages=[This is a private video. Please sign in to verify that you may see it.], status=LOGIN_REQUIRED, errorScreen={playerErrorMessageRenderer={reason={simpleText=Private video}, thumbnail=..., icon={iconType=ERROR_OUTLINE}, subreason={simpleText=Sign in if you've been granted access to this video}, ...

This is the response for an invalid id:

{playabilityStatus={reason=Video unavailable, contextParams=..., status=ERROR, errorScreen={playerErrorMessageRenderer={reason={simpleText=Video unavailable}, thumbnail=..., icon={iconType=ERROR_OUTLINE}}}}}

I don't have a deleted video to test.
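As an illustration of the suggestion above, here is a minimal sketch of how the two `playabilityStatus` responses shown could be mapped to different error kinds. It is not the actual `checkPlayabilityStatus()` implementation: the class name, the `ErrorKind` enum and the `classify` method are hypothetical, the response is modelled as a plain `Map` rather than a parsed JSON object, and matching on the English message text is an assumption that would not hold for localized responses.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

class PlayabilityStatusSketch {

    /** Hypothetical error categories; not the extractor's real exception classes. */
    enum ErrorKind { NONE, PRIVATE_CONTENT, CONTENT_NOT_AVAILABLE }

    /** Classifies a playabilityStatus object, modelled here as a plain map. */
    static ErrorKind classify(Map<String, Object> playabilityStatus) {
        Object statusObj = playabilityStatus.get("status");
        String status = statusObj instanceof String ? (String) statusObj : "";

        if ("OK".equals(status)) {
            return ErrorKind.NONE;
        }

        if ("LOGIN_REQUIRED".equals(status)) {
            // Assumption: a private video is recognizable by its message text.
            Object messages = playabilityStatus.get("messages");
            if (messages instanceof List) {
                for (Object message : (List<?>) messages) {
                    String text = String.valueOf(message).toLowerCase(Locale.ROOT);
                    if (text.contains("private")) {
                        return ErrorKind.PRIVATE_CONTENT;
                    }
                }
            }
        }

        // Any other non-OK status (for example ERROR for an invalid ID).
        return ErrorKind.CONTENT_NOT_AVAILABLE;
    }

    public static void main(String[] args) {
        Map<String, Object> privateVideo = Map.of(
                "status", "LOGIN_REQUIRED",
                "messages", List.of(
                        "This is a private video. Please sign in to verify that you may see it."));
        Map<String, Object> invalidId = Map.of(
                "status", "ERROR",
                "reason", "Video unavailable");

        System.out.println(classify(privateVideo));
        System.out.println(classify(invalidId));
    }
}
```

A real implementation would throw the matching exception type instead of returning an enum value. A deleted video is not covered here, since no sample response for one is given in this conversation.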

@AudricV AudricV marked this pull request as ready for review July 23, 2024 17:42
@AudricV AudricV force-pushed the yt_innertube-clients-changes-for-streams branch from f22ed6d to e80515c Compare July 23, 2024 17:42
@AudricV AudricV changed the title [YouTube] Workaround HTTP error 403s on streaming URLs of WEB client after some time, update clients info [YouTube] Workaround HTTP 403s on streaming URLs of WEB client (after some time or instantly for some JavaScript players), update clients info Jul 23, 2024
AudricV added 7 commits July 23, 2024 20:43
…time

These changes work around an anti-bot token whose requirement is being A/B
tested on the WEB client. In this test, streaming URLs of this client start
returning HTTP 403 errors after some time if the token is not provided.

They also make it possible to skip fetching the JavaScript player for
non-age-restricted videos, reducing data usage.

The TVHTML5 embed client is now only fetched for age-restricted videos.

The methods forceFetchAndroidClient and forceFetchIosClient of
YoutubeStreamExtractor have been removed, as they are no longer needed.

These changes also break the extraction of appropriate error states for private
and deleted videos and invalid video IDs.

The "blue whale" search query does not return a crisis resource panel anymore,
so it was changed to a different query, "suicide".

- Fix typo in folder name of DescriptionTestPewdiepie test;
- Fix constant usage of DownloaderTestImpl as download implementation for
  UnlistedTest and CCLicensed tests.

This commit fixes extraction of the name of the function decoding the n
parameter of HTML5 clients' streaming URLs for YouTube base JavaScript player
3400486c.

Two new regexes have been added to the existing ones. All regexes and what they
extract have been documented.

This parameter used to throttle the bandwidth of streaming URLs carrying it
when the correct value was not provided. This is no longer the case: such
streaming URLs now return an HTTP 403 response code.
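
The commit message above mentions adding new regexes to the existing ones for finding the function name. A minimal sketch of that general fallback technique follows: try an ordered list of patterns and return the first capture that matches. The class name, the method name, both patterns and the sample JavaScript strings are made up for illustration. They are not the regexes added in this PR and do not reflect the real player code.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FunctionNameFallbackSketch {

    // Hypothetical patterns, tried in order; each captures a function name in group 1.
    private static final List<Pattern> PATTERNS = List.of(
            // Matches e.g.: url.get("n");x=abc(x)
            Pattern.compile("\\.get\\(\"n\"\\);\\w+=([A-Za-z0-9_$]+)\\("),
            // Matches e.g.: n=xyz(n)
            Pattern.compile("\\bn=([A-Za-z0-9_$]+)\\(n\\)"));

    /** Returns the first function name captured by any pattern, if there is one. */
    static Optional<String> findFunctionName(String playerCode) {
        for (Pattern pattern : PATTERNS) {
            Matcher matcher = pattern.matcher(playerCode);
            if (matcher.find()) {
                return Optional.of(matcher.group(1));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(findFunctionName("var x=url.get(\"n\");x=abc(x);"));
        System.out.println(findFunctionName("if(n){n=xyz(n);}"));
        System.out.println(findFunctionName("no match here"));
    }
}
```

Keeping the older patterns in the list means previously supported player versions can still match, while a new player layout only needs one more entry.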
@AudricV AudricV force-pushed the yt_innertube-clients-changes-for-streams branch from e80515c to d73de6b Compare July 23, 2024 19:12
@Stypox Stypox merged commit 2d36945 into TeamNewPipe:dev Jul 24, 2024
3 of 4 checks passed
@AudricV AudricV deleted the yt_innertube-clients-changes-for-streams branch July 24, 2024 14:42
Successfully merging this pull request may close these issues.

[YouTube] HTTP error 403 around 1 minute mark
4 participants