Search should return private rooms where the bot is already in or invited to (world_readable) #271

Open
bkil opened this issue Jun 22, 2023 · 9 comments · May be fixed by #272
Labels
T-Other Questions, user support, anything else.

Comments

@bkil

bkil commented Jun 22, 2023

I made a private room and invited the bot to it (or the bot may even have joined in the past). I expect that room search on the archive site should return this room, since the archive is able to show its chat log.


See below for some background information.

If I invite the bot to a room that is invite-only but has history visibility set to public, it would be desirable to show the chat log via the web.

You may ask why this odd combination is beneficial. It's a new anti-scammer proposal for public groups. Anything other than invite-only allows spamming and flooding of room state, so we proposed creating a CAPTCHA bot that would let you speak this way, while keeping the room readable for outsiders.

https://archive.matrix.org/r/mod-ideas:matrix.org/date/2023/06/22?at=$Y42s_vlxoOy_JRm71sKx09a8gguAgK-fKjRe29FtQXk
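To make the room setup described above concrete, here is a minimal sketch of a request body for the Matrix client-server `createRoom` endpoint that combines an invite-only join rule with `world_readable` history. The function name `buildInviteOnlyReadableRoom` is hypothetical and not part of any project mentioned here; the event types and field names are taken from the Matrix specification.

```javascript
// Hypothetical helper: builds a createRoom request body for a room that is
// invite-only to join but whose history anyone may read.
function buildInviteOnlyReadableRoom(name, archiveBotUserId) {
  return {
    name,
    // The preset sets the baseline; events in initial_state are applied after it.
    preset: 'private_chat',
    initial_state: [
      {
        type: 'm.room.join_rules',
        state_key: '',
        content: { join_rule: 'invite' },
      },
      {
        type: 'm.room.history_visibility',
        state_key: '',
        content: { history_visibility: 'world_readable' },
      },
    ],
    // Invite the archive bot up front so it is allowed to join later.
    invite: [archiveBotUserId],
  };
}

const body = buildInviteOnlyReadableRoom('Mod ideas', '@archive:matrix.org');
console.log(JSON.stringify(body, null, 2));
```

The same two state events can also be sent to an existing room; creating the room this way just avoids a window in which the history is not yet readable.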

@MadLittleMods MadLittleMods added the T-Other Questions, user support, anything else. label Jun 22, 2023
@MadLittleMods
Contributor

MadLittleMods commented Jun 22, 2023

I think the current logic would work how you want if you set your history visibility to world_readable. You could also adjust the power levels in the room so people can't send messages by default until you promote them. Is this sufficient for your use case?

https://github.com/matrix-org/matrix-public-archive/blob/e4800852fff23cd4a03ef05282c71f58516cf2a4/server/routes/room-routes.js#L831-L834

Anything else seems a bit precarious to support.

Maybe, if we support an X-Robots-Tag/<meta name="robots" content="noindex, nofollow">-like mechanism such as the one being proposed in MSC4021, we could also include an explicit index directive. It is mentioned in a few places but is not actually a specified directive and is assumed to be the default anyway (reference).
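As an illustration of the power-level suggestion earlier in this comment, the sketch below shows `m.room.power_levels` content that prevents sending by default, plus a simplified check of who may send a message. The helper `canSendMessage` and the user IDs are hypothetical; the `users`, `users_default`, `events`, and `events_default` fields and their default of 0 come from the Matrix specification.

```javascript
// Simplified check, assuming only the fields shown here matter.
function canSendMessage(powerLevels, userId) {
  const userLevel =
    powerLevels.users?.[userId] ?? powerLevels.users_default ?? 0;
  const requiredLevel =
    powerLevels.events?.['m.room.message'] ?? powerLevels.events_default ?? 0;
  return userLevel >= requiredLevel;
}

// Example content: ordinary members sit at level 0, sending requires level 10.
const mutedByDefault = {
  users_default: 0,
  events_default: 10,
  users: {
    '@moderator:example.org': 100,
    '@promoted:example.org': 10,
  },
};

console.log(canSendMessage(mutedByDefault, '@newcomer:example.org'));
console.log(canSendMessage(mutedByDefault, '@promoted:example.org'));
```

This only restricts sending events; it does nothing about who may join the room.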

@bkil
Author

bkil commented Jun 22, 2023

Muting people based on power level won't work. The moment you allow arbitrary people to join your room, it is vulnerable to uncontrolled join-flooding/brigading. This poisons the aggregate room state, which is still not dealt with properly, and makes joins and sync slower for everyone.

We've already seen instances of this attack, where attackers join thousands of users per minute into your room from various homeservers in parallel. There can be other variations of poisoning room state, such as via invites or join-leaves.

Hence I suggest not letting them join in the first place.

I set history visibility to wide-open, the room to private and invited the bot, but the bot did not join the room.

@MadLittleMods
Contributor

> I set history visibility to wide-open, the room to private and invited the bot, but the bot did not join the room.

@bkil I think if you try to visit the room in the archive, the bot user will try to join and may be successful now that you invited it.

@bkil
Author

bkil commented Jun 22, 2023

Okay, it now worked after following a direct link. So now I have a different question. Why does the room search not return rooms that are marked as private but that could still be previewed (such as due to already having invited the bot there)?

@bkil bkil changed the title from "Opting in: support previewing of private rooms" to "Search should return private rooms where the bot is already in or invited to" Jun 22, 2023
@MadLittleMods
Contributor

MadLittleMods commented Jun 22, 2023

> Why does the room search not return rooms that are marked as private but that could still be previewed (such as due to already having invited the bot there)?

We currently only show public rooms in the room directory:

https://github.com/matrix-org/matrix-public-archive/blob/e4800852fff23cd4a03ef05282c71f58516cf2a4/server/lib/matrix-utils/fetch-public-rooms.js#L43-L48

I think it's mainly the case that most non-public rooms won't be joinable and will lead people to a 403 Forbidden.

But we could change that to also show rooms which are world_readable. Mind creating a PR for that?
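A rough sketch of what such a change might look like, assuming the directory results have the shape of the `chunk` entries returned by the Matrix `/publicRooms` endpoint. The function name `selectListableRooms` and the sample data are hypothetical, and this is not the project's actual code; `join_rule` and `world_readable` are fields defined by the Matrix specification, where a missing `join_rule` is treated as public.

```javascript
// Keep rooms that are publicly joinable, plus rooms whose history is
// world_readable even if they cannot be joined freely.
function selectListableRooms(chunk) {
  return chunk.filter((room) => {
    const isPublic = (room.join_rule ?? 'public') === 'public';
    return isPublic || room.world_readable === true;
  });
}

const sampleChunk = [
  { room_id: '!a:example.org', join_rule: 'public', world_readable: false },
  { room_id: '!b:example.org', join_rule: 'invite', world_readable: true },
  { room_id: '!c:example.org', join_rule: 'invite', world_readable: false },
  { room_id: '!d:example.org', world_readable: false },
];

console.log(selectListableRooms(sampleChunk).map((room) => room.room_id));
```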

@bkil
Author

bkil commented Jun 22, 2023

It might be wise then to groom the search result page a bit. Rooms that are either publicly joinable or where the bot has already joined (or been invited) should come first. The latter part might require some shuffling with its internal state.

Then the ones with world_readable history could be listed under a different heading that explains that joining these automatically is not allowed per se, but a member is first required to invite the bot there to be able to see the log.

@MadLittleMods
Contributor

@bkil This kind of complexity isn't possible to support since the app is stateless (nor would it be efficient with the current Matrix APIs).

Those kinds of clarification notes would be nice but I don't see a way to effectively support that kind of thing.

@bkil
Author

bkil commented Jun 22, 2023

Well, we could meet in the middle then and show world_readable invite-only rooms with a different color/decoration (and at the very end of the search results under their own heading?) with a small disclaimer that clicking this link may fail unless you have invited @archive:matrix.org to your room beforehand.

Or maybe this could be shown just on-demand when getting a forbidden response.

@MadLittleMods
Contributor

@bkil I'm open to simply showing rooms that are world_readable in the list as well (no complexity needed). Since the room is world_readable, the history will be accessible regardless of whether we can join the room. But we would need to adjust the archive logic to try the Matrix APIs first and only try to join if that fails.

I've addressed this in #272

We could additionally make the 403: Forbidden error page a lot more friendly (which should be tracked separately)
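The "try the API first, join only on failure" idea from this comment can be sketched as follows. This is a simplified illustration, not the code from #272: `fetchWithJoinFallback`, `fetchMessages`, and `joinRoom` are hypothetical names, and it assumes a denied request surfaces as an error carrying `status === 403`.

```javascript
// Try reading the room first; only join (and retry once) if access is denied.
async function fetchWithJoinFallback({ fetchMessages, joinRoom }, roomId) {
  try {
    return await fetchMessages(roomId);
  } catch (err) {
    // Anything other than a 403 is not something joining can fix.
    if (err.status !== 403) {
      throw err;
    }
    await joinRoom(roomId);
    return await fetchMessages(roomId);
  }
}
```

Passing the two operations in as functions keeps the sketch independent of any particular HTTP client. For a `world_readable` room the first call would be expected to succeed, so no join is attempted.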

@MadLittleMods MadLittleMods changed the title from "Search should return private rooms where the bot is already in or invited to" to "Search should return private rooms where the bot is already in or invited to (world_readable)" Jun 23, 2023
MadLittleMods added a commit that referenced this issue Jun 30, 2023
Happens to address part of #271
but made primarily as a follow-up to #239

---

Only 42% of rooms on the `matrix.org` room directory are `world_readable`, which means we will get pages of rooms that are half-empty most of the time if we just naively fetch 9 rooms at a time.

Ideally, we would be able to just add a filter directly to `/publicRooms` in order to only grab the `world_readable` rooms and still get full pages, but the filter option doesn't allow us to slice by `world_readable` history visibility.

Instead, we have to paginate until we get a full grid of 9 rooms, then make a final `/publicRooms` request to backtrack to the exact continuation point so the next page won't skip any rooms in between.
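The paginate-then-backtrack approach described above can be sketched roughly like this. It is an illustration under stated assumptions rather than the commit's actual code: `fetchFullPage` and `makeFakeDirectory` are hypothetical names, the fake directory uses stringified offsets as pagination tokens, and the calls are synchronous for brevity where real `/publicRooms` requests would be asynchronous.

```javascript
// Hypothetical stand-in for /publicRooms: tokens are stringified offsets.
function makeFakeDirectory(rooms) {
  return function fetchPublicRooms({ since, limit }) {
    const start = since ? Number(since) : 0;
    const chunk = rooms.slice(start, start + limit);
    const end = start + chunk.length;
    return { chunk, next_batch: end < rooms.length ? String(end) : undefined };
  };
}

// Collect world_readable rooms until the grid is full, then re-request the
// last batch with a smaller limit to get a token right after the last room used.
function fetchFullPage(fetchPublicRooms, { since, pageSize = 9 } = {}) {
  const collected = [];
  let token = since;
  while (true) {
    const { chunk, next_batch } = fetchPublicRooms({ since: token, limit: pageSize });
    for (let i = 0; i < chunk.length; i++) {
      if (!chunk[i].world_readable) continue;
      collected.push(chunk[i]);
      if (collected.length === pageSize) {
        // Backtrack: same starting token, but stop exactly at room i.
        const exact = fetchPublicRooms({ since: token, limit: i + 1 });
        return { rooms: collected, nextToken: exact.next_batch };
      }
    }
    // Directory exhausted before the grid filled up.
    if (!next_batch) return { rooms: collected, nextToken: undefined };
    token = next_batch;
  }
}

// 30 rooms, every second one world_readable.
const directory = makeFakeDirectory(
  Array.from({ length: 30 }, (_, i) => ({
    room_id: `!room${i}:example.org`,
    world_readable: i % 2 === 0,
  }))
);
const firstPage = fetchFullPage(directory);
const secondPage = fetchFullPage(directory, { since: firstPage.nextToken });
console.log(firstPage.rooms.length, firstPage.nextToken, secondPage.rooms.length);
```

The extra request costs one more round trip per page, but it means the continuation token points just past the last room shown, so rooms remaining in the final batch are not skipped.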

---

We had empty spaces in the grid before because some rooms in the room directory are private, which we filtered out. But that was a much rarer experience since only 2% of rooms were private.
2 participants