The rank of a post in the aggregated feed should be inversely proportional to the size of the community #1026

Closed
half-adder opened this issue Jul 26, 2020 · 8 comments · Fixed by #3907
Labels
area: sorting enhancement New feature or request

Comments

@half-adder

half-adder commented Jul 26, 2020

Is your proposal related to a problem?

It's hard to see posts from smaller communities when you are also subscribed to larger communities.

Describe the solution you'd like

The weight of a post in the aggregated feed should be inversely proportional to the size of the community. This will allow posts from smaller communities (which get fewer upvotes) to float higher in the aggregated main feed, and be interspersed with posts from larger communities (which get many upvotes).

Consider a user that is subscribed to 3 communities:

C_0: 3 subscribers
C_1: 100 subscribers
C_2: 1000 subscribers

Then, an additional term could be added to the weight of the posts from each respective community:

C_0: weight * s * (1/3)
C_1: weight * s * (1/100)
C_2: weight * s * (1/1000)

weight = weight as it exists today
s = "scale factor" (i.e. how much the size of the community negatively affects the weight)
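
A minimal sketch of the scaling above, in Python for illustration only. The function name `scaled_weight` and the base weight of 10.0 are hypothetical and not taken from Lemmy:

```python
def scaled_weight(weight: float, subscribers: int, s: float = 1.0) -> float:
    """Scale an existing post weight by s * (1 / subscribers)."""
    if subscribers <= 0:
        raise ValueError("subscribers must be positive")
    return weight * s * (1.0 / subscribers)


# The three example communities, each holding a post with the same base weight.
sizes = {"C_0": 3, "C_1": 100, "C_2": 1000}
scaled = {name: scaled_weight(10.0, n) for name, n in sizes.items()}
```

With equal base weights, the post from C_0 ends up with the largest scaled weight and the post from C_2 with the smallest.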

Describe alternatives you've considered

Instead of community size, maybe other indicators could be used. Off the top of my head, perhaps the average number of upvotes in the community (or a rolling average, of say the last week).

Additional context

That's it. Thanks for all of your work, Lemmy is really cool! I would make a PR but I've never used Rust...



@half-adder half-adder added the enhancement New feature or request label Jul 26, 2020
@half-adder
Author

OK I thought about this a little more.

Here's my first pass at a function that does this mapping.

z = rank / (1 + (scale_factor * community_size))

where rank is the ranking as described in https://dev.lemmy.ml/docs/about_ranking.html (note, rank should be in the range [0,1]. I assume this is the range you get when you divide the rank described by 10k?)

community_size also needs to be in the range [0, 1]. I think the most sensible way to achieve this is to normalize the community size relative to each user. So, 0 is mapped to the # of subscribers that the user's least-subscribed community has, and 1 is mapped to the # of subscribers that the user's most-subscribed community has, using the process described here.

And here is what that mapping looks like for various scale_factor (the colors are quantized here to be able to more clearly see the contours):
[Figure: lemmyfig, quantized contour plots of z for several values of scale_factor]

As you can see, for larger communities, it takes a larger number of votes to get the equivalent z as a smaller community. The degree to which this applies is controlled by the scale_factor. So, I think this function achieves the desired result.
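
The mapping described in this comment could be sketched roughly as follows, in Python for illustration. The helper names `normalize_sizes` and `z_score` are hypothetical, min-max scaling is assumed for the per-user normalization, and the rank of 0.5 is a made-up value in [0, 1]:

```python
def normalize_sizes(sizes: dict[str, int]) -> dict[str, float]:
    """Min-max normalize subscriber counts over one user's subscriptions."""
    lo, hi = min(sizes.values()), max(sizes.values())
    if hi == lo:
        # All communities are the same size, so none is penalized.
        return {name: 0.0 for name in sizes}
    return {name: (n - lo) / (hi - lo) for name, n in sizes.items()}


def z_score(rank: float, community_size: float, scale_factor: float) -> float:
    """z = rank / (1 + scale_factor * community_size), both inputs in [0, 1]."""
    return rank / (1.0 + scale_factor * community_size)


subscribed = {"C_0": 3, "C_1": 100, "C_2": 1000}
norm = normalize_sizes(subscribed)
z = {name: z_score(0.5, norm[name], scale_factor=2.0) for name in subscribed}
```

Under these assumptions the smallest community keeps its rank unchanged, while the largest has its rank divided by 1 + scale_factor.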

@half-adder half-adder changed the title The weight of a post in the aggregated feed should be inversely proportional to the size of the community The rank of a post in the aggregated feed should be inversely proportional to the size of the community Jul 26, 2020
@guillaume-uH57J9

guillaume-uH57J9 commented Dec 26, 2022

Hi! This sounds like a good idea.
With the current ranking, very big communities seem to be over-represented on the homepage. Sometimes a couple of communities account for 80% of the homepage.
All things being equal, it does make sense for posts with lots of upvotes/comments to get a higher rank. But it would be good to display a diversity of communities on Lemmy's homepage.

@half-adder you suggest possible scaling, the graph seems like a good start. However I don't understand the need for normalization. Is this required by Lemmy's design?
When looking at Lemmy's documentation on ranking I see values as high as 600.

Normalization would probably be difficult in a fediverse setting. You'd either need to

  • Set an arbitrary maximum at which the score is capped. Arbitrary things are usually bad, and in this case posts above this max would have identical scores.
  • Or, look at all the fediverse's communities and sort them to find the largest one to get a maximum. That's a relatively complex operation just to compute a post's score. And since community sizes change often, you'd need to recompute all posts' scores continuously, or accept that scores may be outdated (i.e. normalized using an old maximum).

Scaling without normalizing would be saner IMHO: the resulting ranks are absolute, can be computed independently of other communities' sizes, and can be compared globally across the fediverse.
This could use either log() or inverse pow() functions.

For instance:
z = rank / log(1 + community_size * factor)
z = rank / (community_size^(1/factor))
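
As a rough illustration, the two candidate formulas could be written like this in Python. The function names are hypothetical, the natural logarithm is an assumption (the comment does not specify a base), and the guard against a non-positive denominator is an addition not mentioned in the thread:

```python
import math


def z_log(rank: float, community_size: float, factor: float) -> float:
    """z = rank / log(1 + community_size * factor), natural log assumed."""
    denominator = math.log(1.0 + community_size * factor)
    if denominator <= 0.0:
        # log(1 + x) is zero at x = 0, so an empty community has no finite z.
        raise ValueError("community_size * factor must be positive")
    return rank / denominator


def z_pow(rank: float, community_size: float, factor: float) -> float:
    """z = rank / community_size ** (1 / factor)."""
    if community_size <= 0.0:
        raise ValueError("community_size must be positive")
    return rank / (community_size ** (1.0 / factor))
```

Neither function needs to know the size of any other community, which is the property argued for above. For example, `z_pow(600, 1000, 3)` divides the rank by the cube root of 1000, giving roughly 60.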

@dessalines
Member

dessalines commented Dec 29, 2022

The z = rank / log(1 + community_size * factor) would be appropriate, but it should also use the monthly or weekly active users (i.e. activity), rather than community size, which is mostly useless.

Since there would then be two scale factors (one that affects the timed rank, and one that affects the community activity), they would need to be tuned in such a way that neither swamps out the other's effect.

The time influence should always be stronger.

That would make the final rank something like:

z = ScaleFactor1 * log(Max(1, 3 + Score)) / (Time + 2)^Gravity / log(1 + active_monthly_users * ScaleFactor2) or

z = log_score_factor / pow_time_decay / log_community_activity
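
A minimal sketch of this combined rank in Python, for illustration only. The function name `final_rank` is hypothetical, the default values for the scale factors and gravity are placeholders chosen for the example rather than tuned settings, time is assumed to be in hours, and the natural logarithm is assumed:

```python
import math


def final_rank(
    score: int,
    hours: float,
    active_monthly_users: int,
    scale_factor_1: float = 10000.0,  # placeholder value
    gravity: float = 1.8,             # placeholder value
    scale_factor_2: float = 1.0,      # placeholder value
) -> float:
    """z = log_score_factor / pow_time_decay / log_community_activity."""
    log_score_factor = scale_factor_1 * math.log(max(1, 3 + score))
    pow_time_decay = (hours + 2.0) ** gravity
    log_community_activity = math.log(1.0 + active_monthly_users * scale_factor_2)
    if log_community_activity <= 0.0:
        # A community with no active users would divide by zero.
        raise ValueError("active_monthly_users * scale_factor_2 must be positive")
    return log_score_factor / pow_time_decay / log_community_activity
```

With these placeholder values, a post in a community of 50 active users outranks an identical post in a community of 5000, and aging from 1 hour to 5 hours lowers the rank. Whether the time term dominates in general depends on how the two scale factors are tuned.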

@dessalines
Member

This way you would force all instances to be about all topics equally. I personally don't like to see so much Shit Reactionaries Say posts from Lemmygrad.ml on Lemmy.ml but while this would fix that, it creates a bigger problem than what it's fixing. There has to be a better way.

Block any communities you don't want to see, or use the Local or Subscribed filter to not see federated communities. This issue is completely separate from that.

What I would like is for users to be able to give communities a weight in the form of 0-100 points, represented as 0 to 5 stars, and get a number of posts from each community in their feed proportional to the weight. Otherwise, assign a weight automatically to each community based on each user's interactions with the posts in the community, as a percentage of upvotes vs downvotes.

This sounds incredibly complicated for users or admins to do, when all they want is to see posts from both smaller communities and larger ones, without having to explicitly add weighting values for each.

@guillaume-uH57J9 's solution is the best way to handle this.

@L3v3L
Contributor

L3v3L commented Jun 21, 2023

For now, could we add this to the post select (https://github.com/LemmyNet/lemmy/blob/b214d3dc00c269d7987ace7f5522e2ff406eec03/crates/db_views/src/post_view.rs#LL288C1-L288C16)?

ROW_NUMBER() OVER (PARTITION BY post.community_id ORDER BY post_aggregates.score DESC) AS community_rank

I tried with all my might to get this translated into Diesel, but it seems Rust has gotten the better of me.

Explanation: It assigns each post a rank number based on its score within its community. We then create sorts for Best Day, Best Month, etc. (I can create the sorts.)
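
To illustrate what the window function computes, here is a small Python emulation. It is only a sketch: the function name `community_rank`, the dictionary layout, and the sample posts are hypothetical, and the final sort is just one possible way such a rank could be used:

```python
from collections import defaultdict


def community_rank(posts: list[dict]) -> dict[int, int]:
    """Emulate ROW_NUMBER() OVER (PARTITION BY community_id ORDER BY score DESC)."""
    by_community = defaultdict(list)
    for post in posts:
        by_community[post["community_id"]].append(post)

    ranks = {}
    for members in by_community.values():
        # Highest score first; position 1 is the top post of its community.
        ordered = sorted(members, key=lambda p: p["score"], reverse=True)
        for position, post in enumerate(ordered, start=1):
            ranks[post["id"]] = position
    return ranks


posts = [
    {"id": 1, "community_id": "big", "score": 900},
    {"id": 2, "community_id": "big", "score": 500},
    {"id": 3, "community_id": "small", "score": 4},
    {"id": 4, "community_id": "small", "score": 2},
]
ranks = community_rank(posts)

# One possible sort: best post of every community first, then second best, etc.
feed = sorted(posts, key=lambda p: (ranks[p["id"]], -p["score"]))
```

In this example the small community's top post (score 4) lands ahead of the big community's second post (score 500), because both communities contribute their rank-1 posts first.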

@dessalines @Nutomic

@half-adder
Author

half-adder commented Jun 22, 2023

@half-adder you suggest possible scaling, the graph seems like a good start. However I don't understand the need for normalization. Is this required by Lemmy's design?

I don't think you need to normalize, but it made visual comparison of the scaling factors easier

@ghost

ghost commented Jun 27, 2023

What about balancing instances based on monthly active users instead of communities?

Request for Comments: Balance Scores Based on Monthly Active Users

@Atemu

Atemu commented Sep 7, 2023

Thank you! <3

6 participants