[YouTube] Use the new internal API in NewPipe Extractor #604
@TiA4f8R Can you merge the change in more incrementally instead of all at once? Basically, finish up the current channel, search, and playlist extractors and get those merged, or even split them up into separate PRs.
Then later, all the related PRs for this change can be linked to each other in the OP in a bullet list to make it easier for future devs to navigate.
0e69831 to 8022275
b61d54f to 59e20e0
3eb2e9a to d607fe9
@XiangRongLin I think I fixed all the tests, because the CI passed. What do you think?
From the mock standpoint, if the tests are passing, then it should be fine.
47dd511 to f7bca36
…deos + update clients version

Here are the requests now made by the `onFetchPage` method of `YoutubeStreamExtractor`:
- the desktop API is fetched. If there is no streaming data, the desktop player API with the embed client screen is fetched (along with the player code), then the Android mobile API.
- if there is still no streaming data, a `ContentNotAvailableException` is thrown with the message provided in the playability status.

If the video is age-restricted, a request is sent to the next endpoint of the desktop player with the embed client screen. Otherwise, the next endpoint is fetched normally, if the content is available.

If the video is not age-restricted, a request is made to the player endpoint of the Android mobile API. We can get more streams by using the Android mobile API, but some streams may not be available on this API, so the streaming data of the Android mobile API is used first to get itags, and then the streaming data of the desktop internal API is used. If parsing the Android mobile API response goes wrong, only the streams of the desktop API are used.
Other code changes:
- `prepareJsonBuilder` in `YoutubeParsingHelper` was renamed to `prepareDesktopJsonBuilder`
- `prepareMobileJsonBuilder` in `YoutubeParsingHelper` was renamed to `prepareAndroidMobileJsonBuilder`
- two new methods were added to `YoutubeParsingHelper`: `prepareDesktopEmbedVideoJsonBuilder` and `prepareAndroidMobileEmbedVideoJsonBuilder`
- `createPlayerBodyWithSts` is now public and was moved to `YoutubeParsingHelper`
- a new method was added to `YoutubeJavaScriptExtractor`: `resetJavaScriptCode`, which was needed for the method `resetDebofuscationCode` of `YoutubeStreamExtractor`
- `areHardcodedClientVersionAndKeyValid` in `YoutubeParsingHelper` now returns a `boolean` instead of an `Optional<Boolean>`
- the `fetchVideoInfoPage` method of `YoutubeStreamExtractor` was removed because YouTube now returns 404 for every client on the `get_video_info` page
- some unused objects in `YoutubeStreamExtractor` were removed and some warnings were fixed

Co-authored-by: TiA4f8R <[email protected]>
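The request order described in this commit message can be read as a simple fallback chain. The following is only a minimal sketch of that idea, not the code from this PR: the class name, the `firstStreamingData` helper, the use of `Supplier` for each client request, and the local `ContentNotAvailableException` stand-in are all hypothetical and exist only for illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names come from NewPipe Extractor,
// except that the nested exception mimics the name used in the commit message.
final class StreamingDataFallbackSketch {

    /** Local stand-in for the extractor's exception of the same name. */
    static final class ContentNotAvailableException extends Exception {
        ContentNotAvailableException(final String message) {
            super(message);
        }
    }

    /**
     * Tries each client in order (for example desktop, desktop embed, Android)
     * and returns the first streaming data found.
     *
     * @param clients           one supplier per client request, in fallback order
     * @param playabilityReason message to report when every client came back empty
     */
    static String firstStreamingData(final List<Supplier<Optional<String>>> clients,
                                     final String playabilityReason)
            throws ContentNotAvailableException {
        for (final Supplier<Optional<String>> client : clients) {
            final Optional<String> streamingData = client.get();
            if (streamingData.isPresent()) {
                return streamingData.get();
            }
        }
        // No client returned streaming data: surface the playability status message.
        throw new ContentNotAvailableException(playabilityReason);
    }
}
```

Because each client is a `Supplier`, a later request is only made when the earlier ones returned nothing, which matches the order given above.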
Migrate YouTube comments to the desktop version by using the `next` endpoint of the InnerTube internal API.

With the desktop version, we can get the exact like count of YouTube comments by parsing the accessibility data (the current extraction is used as a fallback). We can also now tell whether the uploader of a comment is verified.

Co-authored-by: TiA4f8R <[email protected]>
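As a rough illustration of reading an exact like count out of an accessibility label, here is a minimal sketch. The label wording used in the comments, the class name, and the `parseLikeCount` helper are assumptions made for this example; the actual parsing in this PR may differ.

```java
// Hypothetical sketch: the label format is an assumption, not taken from the PR.
final class CommentLikeCountSketch {

    /**
     * Extracts the digits from an accessibility label such as
     * "Like this comment along with 1,234 other people" (assumed wording).
     * Returns the fallback when the label is null, has no digits, or overflows.
     * Note: a label containing several numbers would have its digits concatenated.
     */
    static long parseLikeCount(final String accessibilityLabel, final long fallback) {
        if (accessibilityLabel == null) {
            return fallback;
        }
        // Drop everything that is not an ASCII digit (separators, words, spaces).
        final String digits = accessibilityLabel.replaceAll("\\D+", "");
        if (digits.isEmpty()) {
            return fallback;
        }
        try {
            return Long.parseLong(digits);
        } catch (final NumberFormatException e) {
            return fallback;
        }
    }
}
```

The `fallback` parameter mirrors the idea in the commit message that the previous extraction stays available when the accessibility data cannot be used.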
…ions of YouTube Music search results

The clickTrackingParams of YouTube Music search results are not needed to get continuations. This commit removes their use, which may improve privacy.
…ctorTest

Without removing the RunWith and SuiteClasses annotations (and the corresponding imports) in YoutubePlaylistExtractorTest and YoutubeMixPlaylistExtractorTest, some mocks cannot be generated, so the CI fails because of the missing mocks. Mocks of working tests have also been updated.
Looks good to me, finally! I'm gonna test some things and if I see nothing wrong I'll merge this and open a PR for the hotfix. Are you all ok with this?
final byte[] body = JsonWriter.string(prepareDesktopJsonBuilder(localization,
        getExtractorContentCountry())
        .value("browseId", "VL" + getId())
        .value("params", "wgYCCAA%3D") // Show unavailable videos
Are we sure about this? What is the purpose of showing unavailable videos in playlists? YouTube does not show them normally, and in NewPipe they would just create problems. Anyway, we'll think about this later.
Before a YouTube update, they were always shown, so I thought it might be useful for some users (premieres, temporary geo-restrictions (think of music releases), ...).
There was a button on YouTube to show them manually if you were logged in iirc.
On the technical side, this should just be base64, URL-encoded protobuf.
You don't have to be logged in to show them manually: it also works if you are a guest.
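To make the encoding of the `params` value in this thread concrete, the sketch below percent-decodes `wgYCCAA%3D`, base64-decodes the result, and reads the first protobuf tag. The class and method names are hypothetical. The meaning of the decoded field numbers is not documented in this thread, so the code only reports the raw bytes and the first tag.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch: names are illustrative and not part of NewPipe Extractor.
final class PlaylistParamsDecoderSketch {

    /** Percent-decodes the value, then decodes it as URL-safe base64. */
    static byte[] decodeParams(final String params) {
        final String base64 = URLDecoder.decode(params, StandardCharsets.UTF_8);
        return Base64.getUrlDecoder().decode(base64);
    }

    /** Formats bytes as lowercase hex pairs separated by spaces. */
    static String toHex(final byte[] bytes) {
        final StringBuilder hex = new StringBuilder();
        for (final byte b : bytes) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02x", b & 0xFF));
        }
        return hex.toString();
    }

    /** Reads the first protobuf tag (a varint) and returns {fieldNumber, wireType}. */
    static int[] readFirstTag(final byte[] bytes) {
        int value = 0;
        int shift = 0;
        for (final byte raw : bytes) {
            final int b = raw & 0xFF;
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // last byte of the varint
            }
            shift += 7;
        }
        return new int[] {value >>> 3, value & 0x7};
    }

    public static void main(final String[] args) {
        final byte[] decoded = decodeParams("wgYCCAA%3D");
        System.out.println(toHex(decoded)); // prints "c2 06 02 08 00"
        final int[] tag = readFirstTag(decoded);
        // prints "field 104, wire type 2" (wire type 2 is length-delimited)
        System.out.println("field " + tag[0] + ", wire type " + tag[1]);
    }
}
```

The remaining bytes `02 08 00` are a length of 2 followed by a nested field 1 with varint value 0, which is consistent with the value being a small protobuf message.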
Ok, I tested all of the features I could think of as thoroughly as I could. It works well. Thank you @TiA4f8R and @FireMasterK for your hard work :-)
Use the new internal API called `innertube` or `youtubei` (`https://www.youtube.com/youtubei/v1/endpoint?key=INNERTUBE_API_KEY`) to fetch information about YouTube content instead of using the `pbj` JSON (`https://www.youtube.com/webpage_endpoint?pbj=1`; for YouTube Music, nothing was changed because the search already uses the InnerTube API). Responses are pretty similar (most of the time, only the order of the objects changes), so this should not be as much work as the 2020 migration from the old HTML YouTube pages to the desktop Polymer version and its `pbj` JSON.

This `pbj` seems to be deprecated: the desktop website of YouTube only uses it for video comments right now (and there are A/B tests with the `next` endpoint).

The changes in this PR need testing for exceptions caused by heavy traffic. If the API returns 429/Too Many Requests, support for this in the extractor needs to be checked (sending cookies generated by a captcha with the request under high network traffic might bypass this error code).
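For readers unfamiliar with the URL pattern above, here is a minimal sketch of how such a request could be assembled. The class and method names are hypothetical, and the body layout (a `context.client` object with `clientName`, `clientVersion`, `hl` and `gl`) is an assumption for illustration; the extractor builds its real bodies with helpers such as `prepareDesktopJsonBuilder`.

```java
// Hypothetical sketch: not the extractor's code. The JSON field layout is an assumption.
final class InnerTubeRequestSketch {

    private static final String BASE_URL = "https://www.youtube.com/youtubei/v1/";

    /** Builds the endpoint URL, e.g. for the "next" or "player" endpoint. */
    static String buildUrl(final String endpoint, final String apiKey) {
        return BASE_URL + endpoint + "?key=" + apiKey;
    }

    /**
     * Builds a minimal JSON body describing a desktop ("WEB") client.
     * Values are inserted as-is, so they must not contain quotes or backslashes.
     */
    static String buildBody(final String clientVersion,
                            final String languageCode,
                            final String countryCode) {
        return "{\"context\":{\"client\":{"
                + "\"clientName\":\"WEB\","
                + "\"clientVersion\":\"" + clientVersion + "\","
                + "\"hl\":\"" + languageCode + "\","
                + "\"gl\":\"" + countryCode + "\""
                + "}}}";
    }
}
```

Endpoint-specific values (such as the `browseId` and `params` fields seen in the playlist request of this PR) would be added next to `context` in the same object.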
Extraction of comments is fixed with this PR, like the extraction of embeddable age-restricted videos.
Improvements made in this PR:
- `Page` class, see e0011de
- `guide` endpoint, which returns the menu items of the website
- `get_search_suggestions` endpoint with an empty string suggestion, used by the website when it is loaded for the first time in a session.

Screenshot of a 403/Forbidden error message (this should only happen if your IP was banned by Google):
Endpoints changed:
TO DO:
- add better spoofing of the mobile API (by analyzing the requests made by the Android client, which uses protobuf). It should be done in a separate PR.
- update mocks and client version when the PR is approved.

I also reformatted some code to fit within the 100-character line limit and used final where possible, in the files that I changed.
This will close #568 (even though I still use the desktop version of the new internal API instead of the mobile version, except for videos, where I use the Android API if a video is protected by `signatureCiphers`).

Thanks to @FireMasterK for his findings.
APK for testing
See AudricV/NewPipe#1 for an up to date debug APK.
(Sorry in advance for my English.)