Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for extracting multiple tabs from a channel #279

Closed
wants to merge 17 commits into from
Closed

Add support for extracting multiple tabs from a channel #279

wants to merge 17 commits into from

Conversation

wb9688
Copy link
Contributor

@wb9688 wb9688 commented Mar 3, 2020

  • I carefully read the contribution guidelines and agree to them.
  • I did test the API against NewPipe.
  • I agree to ASAP create a PULL request for NewPipe for making in compatible when I changed the api.

To do:

  • Add support for different tabs in a channel
  • Add support for YouTube's playlists tab (partially closes Support for Channel Tabs NewPipe#2414)
  • Add support for SoundCloud's popular tracks tab
  • Add support for SoundCloud's albums tab
  • Add support for SoundCloud's playlists tab
  • Make the tests not fail (something with most channels not having a next page of playlists…)
  • Make it possible to have both streams and playlists in a tab
  • Add support for SoundCloud's reposts tab
  • Actually test it in NewPipe (e.g. I think I'm missing tabs stuff in ChannelInfo?)

Are there any more tabs I should add?

@wb9688 wb9688 added the enhancement New feature or request label Mar 3, 2020
@Stypox
Copy link
Member

Stypox commented Mar 3, 2020

Maybe a tab with channel description and links

@wb9688
Copy link
Contributor Author

wb9688 commented Mar 3, 2020

@Stypox: That's not possible, a 'tab' here has to be a list of streams, playlists or channels. Making a separate tab for the description and stuff would just be UI stuff in NewPipe.

@Stypox
Copy link
Member

Stypox commented Mar 3, 2020

You're correct, sorry ;-)

@wb9688 wb9688 mentioned this pull request Mar 4, 2020
@wb9688 wb9688 marked this pull request as ready for review March 4, 2020 16:03
@wb9688
Copy link
Contributor Author

wb9688 commented Mar 8, 2020

@TobiGr and @mauriciocolli: I think this PR is finished now, so I'd appreciate it if you review it (once you have time for that).

I have one point for discussion though. When should we add a tab? E.g. should we always add all tabs? In some cases it's only possible to know a tab is empty once we've requested the initial page.

Don't merge this until I've the NewPipe part ready though.

@wb9688 wb9688 mentioned this pull request Mar 8, 2020
10 tasks
@PeterHindes
Copy link

Hey, refer to my issue requesting the adding of support for playlist tabs. Music channels often have an albums category tag that you can select for on their page. This will need to be supported in the ui and in the extractor.

@PeterHindes
Copy link

PeterHindes commented Mar 12, 2020

Video Example since the links have broken in my old issue.

Copy link
Contributor

@mauriciocolli mauriciocolli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When should we add a tab? E.g. should we always add all tabs? In some cases it's only possible to know a tab is empty once we've requested the initial page.

I think that's not a problem and adding that there is a tab is enough, let the client lazy-load it when needed, otherwise it may be too much for some services.

Dynamically adding seems to require some amount of work right now because by the looks of YouTube channel options, the owner can customize its tabs.

But in the cases where the tab is the default one (or comes with the default request), like you did, I think it's pretty much welcome to parse it already.

No time to review it further right now, but covered some points.


JsonObject sectionListContinuation = ajaxJson.getObject(1).getObject("response")
.getObject("continuationContents").getObject("gridContinuation");
public List<ChannelTabExtractor> getTabs() throws ParsingException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More information about the tab will be needed probably, for example, which tab will the feed use (how will it select which one)? We may need to add a type field to the tab, and even the order that the items are (related to the feature that @PeterHindes mentioned).

But I think we can get by for now by assuming that all tabs have the same order (new → old).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The feed will use the first tab (which is the reason for a279797) and we should document that somewhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it good enough though, don't a279797 just prove it that it isn't?

I think that by relying on the position, we just introduce an anti-pattern to the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this isn't the best solution, but I think it's fine if we clearly document it, unless you have a better suggestion.


import static org.schabi.newpipe.extractor.services.youtube.linkHandler.YoutubeParsingHelper.getJsonResponse;

public class YoutubeChannelVideosExtractor extends ChannelTabExtractor {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The id is being extracted using the default way (ListLinkHandler) which returns a string starting in user/ or channel/. Was this intentional?

But I guess it doesn't matter that much anyway because, currently, a tab can only be created by having the ChannelInfo first, let's see how the sort order can be implemented later.

Also, I see that the name is being used as a sort of an id right now.

Copy link
Contributor Author

@wb9688 wb9688 Mar 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I just couldn't think of any other way, as a LinkHandler isn't really needed, since it could only be get through the ChannelInfo/ChannelExtractor, which is intentional.

However, now I think of it we should support URLs like https://soundcloud.com/thatrickaz/reposts, and automatically open the right tab.

Comment on lines +55 to +61
public ChannelTabExtractor getChannelTabExtractor() {
return channelTabExtractor;
}

public void setChannelTabExtractor(ChannelTabExtractor channelTabExtractor) {
this.channelTabExtractor = channelTabExtractor;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that Info objects were supposed to be plain java/data objects? Storing a extractor in it feels definitely wrong (not serializable as well).

Would like your opinion/suggestions on this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, that's why the channelTabExtractor variable is transient. This is needed to be able to get more items. The getMoreItems() function sets the channelTabExtractor again if it's not there anymore. We're doing this in CommentsInfo as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, that's why the channelTabExtractor variable is transient

I noticed it, I was talking about the use of the extractor object at all in the data class.

This is needed to be able to get more items.

Is it though? Aren't we able to just use the nextPageUrl to get the next page? We are able to that in ChannelInfo itself currently.

But now we don't even need to fetch the initial page because the continuation response already include all we need.

@wb9688
Copy link
Contributor Author

wb9688 commented Mar 18, 2020

I think that's not a problem and adding that there is a tab is enough, let the client lazy-load it when needed, otherwise it may be too much for some services.

@mauriciocolli: How are we gonna lazy-load it? That's not possible with the current architecture, as all ChannelTabInfos are put into a List in the ChannelInfo.

Also in case of YouTube, I think we can't know what tabs are exactly available, unless we make a request to both /videos (Music, More from the artist and Uploads should all be tabs) and /playlists (Created playlists and Albums should be tabs). Currently this PR only support Uploads and Created playlists, but I'll leave other tabs and detection of which ones are available to a later PR.

@mauriciocolli
Copy link
Contributor

@wb9688: How are we gonna lazy-load it? That's not possible with the current architecture, as all ChannelTabInfos are put into a List in the ChannelInfo.

Maybe we should change that then?

I think we should be able to use the same concept of something like kiosks I guess, which can be freely instantiated and fetch more items on its own as well.

As for the lists, put a loaded tab when it doesn't need to be fetched again, and some kind of placeholder when it needs to (with its tab's id/url to instantiate later).

Also in case of YouTube, I think we can't know what tabs are exactly available, unless we make a request to both /videos (Music, More from the artist and Uploads should all be tabs) and /playlists (Created playlists and Albums should be tabs).

Should the options really be separate tabs then? Seems like following YouTube's design would be better on this one. It'd also work well with lazy loading this way.

When the user changed an option, the front end would just reload the tab with the new parameters.

Currently this PR only support Uploads and Created playlists, but I'll leave other tabs and detection of which ones are available to a later PR.

👍

@wb9688
Copy link
Contributor Author

wb9688 commented Mar 19, 2020

I think we should be able to use the same concept of something like kiosks I guess, which can be freely instantiated and fetch more items on its own as well.

@mauriciocolli: I don't think so, since not all channels have the same tabs, so I think instantiating ChannelTabs from the Channels instead of the StreamingServices is the way to go. However, dynamic tabs and tabs we haven't implemented yet will be left to another PR.

I also think we should use the same approach to instantiate the comments from the streams, so we don't have to get the video page again in the comments, but could just pass the continuation and save a request (unless the user has switched to another app before requesting the comments.

I've been thinking about lazy-loading and I think we should add a function to ChannelTabInfo to populate fields other than the name and call that only once that tab is selected in NewPipe.

Should the options really be separate tabs then? Seems like following YouTube's design would be better on this one. It'd also work well with lazy loading this way.

I don't think we should implement sub-tabs, as that's just unnecessary complexity imho.

I'll rebase on dev again and implement lazy-loading today, and maybe the better LinkHandler as well.

@Stypox
Copy link
Member

Stypox commented Mar 19, 2020

Why is the extractor based on Info containers, that collect all info at once? Wouldn't it be better to dynamically load everything? This would mean adding many ReactiveX queries in the client, one for every field extracted by the Extractor, but probably improve the way errors are handled, and would show progress to the user one piece at a time, thus giving the impression of faster loading and also enabling some options/buttons some time before (like the current behaviour with comments).

@wb9688
Copy link
Contributor Author

wb9688 commented Mar 19, 2020

@Stypox: They're serializable and thus could be used to restore the state on Android.

@Stypox
Copy link
Member

Stypox commented Mar 19, 2020

That's something that should eventually be done on Android side (e.g. loading everything separately but still saving all in a Serializable variable)

List<ChannelTabInfo> tabs = new ArrayList<>();
for (int i = 0; i < extractor.getTabs().size(); i++) {
try {
ChannelTabInfo tabInfo = ChannelTabInfo.getInfo(extractor.getTabs().get(i), i == 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it hardcoded that the first tab will always be the feed one? Also, what happens if the first tab throws an error?

Copy link
Contributor Author

@wb9688 wb9688 Mar 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's hardcoded, see this conversation, and I'll change that.

List<ChannelTabExtractor> channelTabExtractors = channelExtractor.getTabs();
for (ChannelTabExtractor channelTabExtractor : channelTabExtractors) {
if (channelTabExtractor.getName().equals(tabInfo.getName())) {
tabInfo.setChannelTabExtractor(channelTabExtractor);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then break

import org.schabi.newpipe.extractor.StreamingService;
import org.schabi.newpipe.extractor.linkhandler.ListLinkHandler;

public abstract class ChannelTabExtractor extends ListExtractor<InfoItem> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't getName() represent the name of an item? In this case if I understand correctly it is being used only to pass information about the tab to the app. For that purpose using a shared enum or some integer constants would be a better fit, since those can't contain typos. So I would add a getTabType() base function here. In case that's not done, I'd suggest adding a notice here about the different usage of getName().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't e.g. "Videos" the name of the tab?

Comment on lines -19 to +24
import static org.junit.Assert.*;
import static org.schabi.newpipe.extractor.ExtractorAsserts.*;
import static org.schabi.newpipe.extractor.StreamingService.*;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNotEquals;
import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertTrue;
import static org.schabi.newpipe.extractor.ExtractorAsserts.assertEmptyErrors;
import static org.schabi.newpipe.extractor.ExtractorAsserts.assertIsSecureUrl;
import static org.schabi.newpipe.extractor.ExtractorAsserts.assertNotEmpty;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wildcard imports are ok here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My IDE changes that automatically.

import static org.schabi.newpipe.extractor.services.DefaultTests.*;
import static org.schabi.newpipe.extractor.services.DefaultTests.defaultTestGetPageInNewExtractor;
import static org.schabi.newpipe.extractor.services.DefaultTests.defaultTestMoreItems;
import static org.schabi.newpipe.extractor.services.DefaultTests.defaultTestRelatedItems;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here, wildcard imports are ok

@mauriciocolli
Copy link
Contributor

I don't think so, since not all channels have the same tabs, so I think instantiating ChannelTabs from the Channels instead of the StreamingServices is the way to go.

I've been thinking about lazy-loading and I think we should add a function to ChannelTabInfo to populate fields other than the name and call that only once that tab is selected in NewPipe.

Isn't Info objects supposed to be completed objects, I can't think of a reason to not make it just like kiosks. I don't know if we even need the LinkHandler stuff though, some other strategy seems to be the way to go.

And why store a extractor instance and even a function to load it when we could use the current architecture? Unless you want to change all that? I think another PR would be welcome to introduce such a change (the kiosks implementation is definitely not perfect).

For example, here:

public static ListExtractor.InfoItemsPage<InfoItem> getMoreItems(ChannelTabInfo tabInfo, String pageUrl) throws IOException, ExtractionException {
assureChannelTabExtractor(tabInfo);
return tabInfo.getChannelTabExtractor().getPage(pageUrl);
}

Why do we need a loaded ChannelExtractor when YouTube gives us a complete continuation response? This is up to the individual tab to decide if needs or not (if it is needed, it'd just fetch the initial page again, like it was done in the old channel extractor).

I don't think we should implement sub-tabs, as that's just unnecessary complexity imho.

That more like a filter than a sub-tab though, it'd reload just like it's done in search now.

@wb9688
Copy link
Contributor Author

wb9688 commented Mar 19, 2020

@mauriciocolli: We need the ChannelTabExtractor to load a next page. In kiosks, we also need a KioskExtractor in KioskInfo, but they're just recreated every time in getMoreItems(), which is possible there. Since ChannelTabExtractors are only retrievable through a ChannelExtractor, it wouldn't make sense that way, as we would need to fetch the channel page to know what tabs are available and then select the correct one. We don't want to fetch the channel page again every time we're scrolling down on a tab, so we store the ChannelTabExtractor. There's no other way to instantiate the ChannelTabExtractor, though we do indeed only need the nextPageUrl. Do you have any better solution that allows dynamic tabs for each channel, and passing data from the ChannelExtractor to the ChannelTabExtractor for the initial page?

Also, the Extractor class requires there to be a LinkHandler, which we're of course extending, so yes, we do need it.

@mauriciocolli
Copy link
Contributor

We need the ChannelTabExtractor to load a next page. In kiosks, we also need a KioskExtractor in KioskInfo, but they're just recreated every time in getMoreItems(), which is possible there.

Also, the Extractor class requires there to be a LinkHandler, which we're of course extending, so yes, we do need it.

Yep, that's what should be done here then.

Since ChannelTabExtractors are only retrievable through a ChannelExtractor, it wouldn't make sense that way, as we would need to fetch the channel page to know what tabs are available and then select the correct one. We don't want to fetch the channel page again every time we're scrolling down on a tab, so we store the ChannelTabExtractor. There's no other way to instantiate the ChannelTabExtractor, though we do indeed only need the nextPageUrl. Do you have any better solution that allows dynamic tabs for each channel, and passing data from the ChannelExtractor to the ChannelTabExtractor for the initial page?

I suggested a solution above, #279 (comment):

As for the lists, put a loaded tab when it doesn't need to be fetched again, and some kind of placeholder when it needs to (with its tab's id/url to instantiate later).

I did some rough tests and it worked out fine. Instead of returning a complete tab info, a placeholder would be returned instead. When the frontend checks that it is just a placeholder, it would just fetch the complete tab by using the placeholder info.

That way, we can still offer a way to not duplicate request for tabs that are available in the first request.

@mauriciocolli
Copy link
Contributor

@wb9688: About that example that you requested, I did a very rough proof of concept of what I was talking about, you can find it on this branch at my fork.

@wb9688 wb9688 added this to the 0.20.0 milestone Apr 19, 2020
@B0pol B0pol added the multiservice Issues related to multiple services label Apr 20, 2020
@B0pol B0pol mentioned this pull request Jun 4, 2020
14 tasks
@wb9688 wb9688 removed this from the 0.20.0 milestone Jul 13, 2020
@ghost
Copy link

ghost commented Aug 13, 2020

Hey @wb9688 @Stypox @TobiGr @B0pol please revive this, and include it in 0.20.0 again the great 0.20.0 release should be released with this one aswell. Honestly pls?

@B0pol
Copy link
Member

B0pol commented Aug 17, 2020

We have already big changes for 0.20 for NewPipe part (unified player, media notifications, SAF), then we don't plan to add it back in 0.20

@MD77MD
Copy link

MD77MD commented Aug 17, 2020

I agree 👍... we should save some feature for future releases... also to avoid delaing 0.20 any longer.

@MD77MD
Copy link

MD77MD commented Sep 28, 2020

I think this should be the feature highlight of newPipe 0.20.1

@Stypox
Copy link
Member

Stypox commented Sep 28, 2020

@MD77MD usually breaking changes happen in major releases, like 0.20.0 with unified ui and iirc 0.19.0 for the feed. Minor releases like 0.20.1 usually focus on bugfixing and small improvements. So this would go into 0.21.0, probably.

@MD77MD
Copy link

MD77MD commented Sep 28, 2020

I see... thank you for clarifying the difference 😄

@wb9688 wb9688 closed this Oct 24, 2020
@MD77MD
Copy link

MD77MD commented Oct 24, 2020

@wb9688 is there a reason we're closing this?

@xibr
Copy link

xibr commented Oct 24, 2020

Something wrong with @wb9688 has closed many pull requests.

@selurvedu
Copy link

wb9688 wants to merge 17 commits into TeamNewPipe:dev from unknown repository

Looks like wb's NewPipe repo got deleted and all corresponding PRs got closed.

@MD77MD
Copy link

MD77MD commented Aug 16, 2021

@Stypox @mhmdanas @TobiGr

could we still use this

@triallax
Copy link
Contributor

triallax commented Aug 16, 2021

Yes, it is possible to base a PR on another one (in fact, that is what has been done for the SAF PR). That being said, please don't ping me (or anybody really) when I am not participating unless for good reason. Other people already subscribed to this thread can answer your question too.

@TeamNewPipe TeamNewPipe locked and limited conversation to collaborators Aug 20, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request multiservice Issues related to multiple services
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for Channel Tabs
9 participants