Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow SQL queries #2877

Closed
Nutomic opened this issue May 24, 2023 · 81 comments
Closed

Slow SQL queries #2877

Nutomic opened this issue May 24, 2023 · 81 comments
Labels
area: database bug Something isn't working

Comments

@Nutomic
Copy link
Member

Nutomic commented May 24, 2023

There are some problems with database lockups which seem to be caused by slow queries. I set log_min_duration_statement=3000 and collecting any slow queries in this issue. These should be optimized, because in case of of db pool size 5, and 5 users triggering a slow query at the same time, all db queries would fail for the next couple of seconds.

7 seconds:
SELECT "post"."id", "post"."name", "post"."url", "post"."body", "post"."creator_id", "post"."community_id", "post"."removed", "post"."locked", "post"."published", "post"."updated", "post"."deleted", "post"."nsfw", "post"."embed_title", "post"."embed_description", "post"."embed_video_url", "post"."thumbnail_url", "post"."ap_id", "post"."local", "post"."language_id", "post"."featured_community", "post"."featured_local", "person"."id", "person"."name", "person"."display_name", "person"."avatar", "person"."banned", "person"."published", "person"."updated", "person"."actor_id", "person"."bio", "person"."local", "person"."banner", "person"."deleted", "person"."inbox_url", "person"."shared_inbox_url", "person"."matrix_user_id", "person"."admin", "person"."bot_account", "person"."ban_expires", "person"."instance_id", "community"."id", "community"."name", "community"."title", "community"."description", "community"."removed", "community"."published", "community"."updated", "community"."deleted", "community"."nsfw", "community"."actor_id", "community"."local", "community"."icon", "community"."banner", "community"."hidden", "community"."posting_restricted_to_mods", "community"."instance_id", "community_person_ban"."id", "community_person_ban"."community_id", "community_person_ban"."person_id", "community_person_ban"."published", "community_person_ban"."expires", "post_aggregates"."id", "post_aggregates"."post_id", "post_aggregates"."comments", "post_aggregates"."score", "post_aggregates"."upvotes", "post_aggregates"."downvotes", "post_aggregates"."published", "post_aggregates"."newest_comment_time_necro", "post_aggregates"."newest_comment_time", "post_aggregates"."featured_community", "post_aggregates"."featured_local", "community_follower"."id", "community_follower"."community_id", "community_follower"."person_id", "community_follower"."published", "community_follower"."pending", "post_saved"."id", "post_saved"."post_id", "post_saved"."person_id", "post_saved"."published", "post_read"."id", "post_read"."post_id", "post_read"."person_id", "post_read"."published", "person_block"."id", "person_block"."person_id", "person_block"."target_id", "person_block"."published", "post_like"."score", coalesce(("post_aggregates"."comments" - "person_post_aggregates"."read_comments"), "post_aggregates"."comments") FROM (((((((((((("post" INNER JOIN "person" ON ("post"."creator_id" = "person"."id")) INNER JOIN "community" ON ("post"."community_id" = "community"."id")) LEFT OUTER JOIN "community_person_ban" ON ((("post"."community_id" = "community_person_ban"."community_id") AND ("community_person_ban"."person_id" = "post"."creator_id")) AND (("community_person_ban"."expires" IS NULL) OR ("community_person_ban"."expires" > CURRENT_TIMESTAMP)))) INNER JOIN "post_aggregates" ON ("post_aggregates"."post_id" = "post"."id")) LEFT OUTER JOIN "community_follower" ON (("post"."community_id" = "community_follower"."community_id") AND ("community_follower"."person_id" = '33517'))) LEFT OUTER JOIN "post_saved" ON (("post"."id" = "post_saved"."post_id") AND ("post_saved"."person_id" = '33517'))) LEFT OUTER JOIN "post_read" ON (("post"."id" = "post_read"."post_id") AND ("post_read"."person_id" = '33517'))) LEFT OUTER JOIN "person_block" ON (("post"."creator_id" = "person_block"."target_id") AND ("person_block"."person_id" = '33517'))) LEFT OUTER JOIN "community_block" ON (("community"."id" = "community_block"."community_id") AND ("community_block"."person_id" = '33517'))) LEFT OUTER JOIN "post_like" ON (("post"."id" = "post_like"."post_id") AND ("post_like"."person_id" = '33517'))) LEFT OUTER JOIN "person_post_aggregates" ON (("post"."id" = "person_post_aggregates"."post_id") AND ("person_post_aggregates"."person_id" = '33517'))) LEFT OUTER JOIN "local_user_language" ON (("post"."language_id" = "local_user_language"."language_id") AND ("local_user_language"."local_user_id" = '11402'))) WHERE (((((((((("community_follower"."person_id" IS NOT NULL) AND ("post"."nsfw" = 'f')) AND ("community"."nsfw" = 'f')) AND ("local_user_language"."language_id" IS NOT NULL)) AND ("community_block"."person_id" IS NULL)) AND ("person_block"."person_id" IS NULL)) AND ("post"."removed" = 'f')) AND ("post"."deleted" = 'f')) AND ("community"."removed" = 'f')) AND ("community"."deleted" = 'f')) ORDER BY "post_aggregates"."featured_local" DESC , hot_rank("post_aggregates"."score", "post_aggregates"."newest_comment_time_necro") DESC , "post_aggregates"."newest_comment_time_necro" DESC LIMIT '40' OFFSET '0'

3.5 seconds:
SELECT "post"."id", "post"."name", "post"."url", "post"."body", "post"."creator_id", "post"."community_id", "post"."removed", "post"."locked", "post"."published", "post"."updated", "post"."deleted", "post"."nsfw", "post"."embed_title", "post"."embed_description", "post"."embed_video_url", "post"."thumbnail_url", "post"."ap_id", "post"."local", "post"."language_id", "post"."featured_community", "post"."featured_local", "person"."id", "person"."name", "person"."display_name", "person"."avatar", "person"."banned", "person"."published", "person"."updated", "person"."actor_id", "person"."bio", "person"."local", "person"."banner", "person"."deleted", "person"."inbox_url", "person"."shared_inbox_url", "person"."matrix_user_id", "person"."admin", "person"."bot_account", "person"."ban_expires", "person"."instance_id", "community"."id", "community"."name", "community"."title", "community"."description", "community"."removed", "community"."published", "community"."updated", "community"."deleted", "community"."nsfw", "community"."actor_id", "community"."local", "community"."icon", "community"."banner", "community"."hidden", "community"."posting_restricted_to_mods", "community"."instance_id", "community_person_ban"."id", "community_person_ban"."community_id", "community_person_ban"."person_id", "community_person_ban"."published", "community_person_ban"."expires", "post_aggregates"."id", "post_aggregates"."post_id", "post_aggregates"."comments", "post_aggregates"."score", "post_aggregates"."upvotes", "post_aggregates"."downvotes", "post_aggregates"."published", "post_aggregates"."newest_comment_time_necro", "post_aggregates"."newest_comment_time", "post_aggregates"."featured_community", "post_aggregates"."featured_local", "community_follower"."id", "community_follower"."community_id", "community_follower"."person_id", "community_follower"."published", "community_follower"."pending", "post_saved"."id", "post_saved"."post_id", "post_saved"."person_id", "post_saved"."published", "post_read"."id", "post_read"."post_id", "post_read"."person_id", "post_read"."published", "person_block"."id", "person_block"."person_id", "person_block"."target_id", "person_block"."published", "post_like"."score", coalesce(("post_aggregates"."comments" - "person_post_aggregates"."read_comments"), "post_aggregates"."comments") FROM (((((((((((("post" INNER JOIN "person" ON ("post"."creator_id" = "person"."id")) INNER JOIN "community" ON ("post"."community_id" = "community"."id")) LEFT OUTER JOIN "community_person_ban" ON ((("post"."community_id" = "community_person_ban"."community_id") AND ("community_person_ban"."person_id" = "post"."creator_id")) AND (("community_person_ban"."expires" IS NULL) OR ("community_person_ban"."expires" > CURRENT_TIMESTAMP)))) INNER JOIN "post_aggregates" ON ("post_aggregates"."post_id" = "post"."id")) LEFT OUTER JOIN "community_follower" ON (("post"."community_id" = "community_follower"."community_id") AND ("community_follower"."person_id" = '-1'))) LEFT OUTER JOIN "post_saved" ON (("post"."id" = "post_saved"."post_id") AND ("post_saved"."person_id" = '-1'))) LEFT OUTER JOIN "post_read" ON (("post"."id" = "post_read"."post_id") AND ("post_read"."person_id" = '-1'))) LEFT OUTER JOIN "person_block" ON (("post"."creator_id" = "person_block"."target_id") AND ("person_block"."person_id" = '-1'))) LEFT OUTER JOIN "community_block" ON (("community"."id" = "community_block"."community_id") AND ("community_block"."person_id" = '-1'))) LEFT OUTER JOIN "post_like" ON (("post"."id" = "post_like"."post_id") AND ("post_like"."person_id" = '-1'))) LEFT OUTER JOIN "person_post_aggregates" ON (("post"."id" = "person_post_aggregates"."post_id") AND ("person_post_aggregates"."person_id" = '-1'))) LEFT OUTER JOIN "local_user_language" ON (("post"."language_id" = "local_user_language"."language_id") AND ("local_user_language"."local_user_id" = '-1'))) WHERE ((((((("post"."community_id" = '16') AND ("post"."nsfw" = 'f')) AND ("community"."nsfw" = 'f')) AND ("post"."removed" = 'f')) AND ("post"."deleted" = 'f')) AND ("community"."removed" = 'f')) AND ("community"."deleted" = 'f')) ORDER BY "post_aggregates"."featured_community" DESC , hot_rank("post_aggregates"."score", "post_aggregates"."newest_comment_time_necro") DESC , "post_aggregates"."newest_comment_time_necro" DESC LIMIT '20' OFFSET '0'

3.6 seconds:
SELECT "comment_reply"."id", "comment_reply"."recipient_id", "comment_reply"."comment_id", "comment_reply"."read", "comment_reply"."published", "comment"."id", "comment"."creator_id", "comment"."post_id", "comment"."content", "comment"."removed", "comment"."published", "comment"."updated", "comment"."deleted", "comment"."ap_id", "comment"."local", "comment"."path", "comment"."distinguished", "comment"."language_id", "person"."id", "person"."name", "person"."display_name", "person"."avatar", "person"."banned", "person"."published", "person"."updated", "person"."actor_id", "person"."bio", "person"."local", "person"."banner", "person"."deleted", "person"."inbox_url", "person"."shared_inbox_url", "person"."matrix_user_id", "person"."admin", "person"."bot_account", "person"."ban_expires", "person"."instance_id", "post"."id", "post"."name", "post"."url", "post"."body", "post"."creator_id", "post"."community_id", "post"."removed", "post"."locked", "post"."published", "post"."updated", "post"."deleted", "post"."nsfw", "post"."embed_title", "post"."embed_description", "post"."embed_video_url", "post"."thumbnail_url", "post"."ap_id", "post"."local", "post"."language_id", "post"."featured_community", "post"."featured_local", "community"."id", "community"."name", "community"."title", "community"."description", "community"."removed", "community"."published", "community"."updated", "community"."deleted", "community"."nsfw", "community"."actor_id", "community"."local", "community"."icon", "community"."banner", "community"."hidden", "community"."posting_restricted_to_mods", "community"."instance_id", "person1"."id", "person1"."name", "person1"."display_name", "person1"."avatar", "person1"."banned", "person1"."published", "person1"."updated", "person1"."actor_id","person1"."bio", "person1"."local", "person1"."banner", "person1"."deleted", "person1"."inbox_url", "person1"."shared_inbox_url", "person1"."matrix_user_id", "person1"."admin", "person1"."bot_account", "person1"."ban_expires", "person1"."instance_id", "comment_aggregates"."id", "comment_aggregates"."comment_id", "comment_aggregates"."score", "comment_aggregates"."upvotes", "comment_aggregates"."downvotes", "comment_aggregates"."published", "comment_aggregates"."child_count", "community_person_ban"."id", "community_person_ban"."community_id", "community_person_ban"."person_id", "community_person_ban"."published", "community_person_ban"."expires", "community_follower"."id", "community_follower"."community_id", "community_follower"."person_id", "community_follower"."published", "community_follower"."pending", "comment_saved"."id", "comment_saved"."comment_id", "comment_saved"."person_id", "comment_saved"."published", "person_block"."id", "person_block"."person_id", "person_block"."target_id", "person_block"."published", "comment_like"."score" FROM ((((((((((("comment_reply" INNER JOIN "comment" ON ("comment_reply"."comment_id" = "comment"."id")) INNER JOIN "person" ON ("comment"."creator_id" = "person"."id")) INNER JOIN "post" ON ("comment"."post_id" = "post"."id")) INNER JOIN "community" ON ("post"."community_id" = "community"."id")) INNER JOIN "person" AS "person1" ON ("comment_reply"."recipient_id" = "person1"."id")) INNER JOIN "comment_aggregates" ON ("comment"."id" = "comment_aggregates"."comment_id")) LEFT OUTER JOIN "community_person_ban" ON ((("community"."id" = "community_person_ban"."community_id") AND ("community_person_ban"."person_id" = "comment"."creator_id")) AND (("community_person_ban"."expires" IS NULL) OR ("community_person_ban"."expires" > CURRENT_TIMESTAMP)))) LEFT OUTER JOIN "community_follower" ON (("post"."community_id" = "community_follower"."community_id") AND ("community_follower"."person_id" = '8218'))) LEFT OUTER JOIN "comment_saved" ON (("comment"."id" = "comment_saved"."comment_id") AND ("comment_saved"."person_id" = '8218'))) LEFT OUTER JOIN "person_block" ON (("comment"."creator_id" = "person_block"."target_id") AND ("person_block"."person_id" = '8218'))) LEFT OUTER JOIN "comment_like" ON (("comment"."id" = "comment_like"."comment_id") AND ("comment_like"."person_id" = '8218'))) WHERE((("comment_reply"."recipient_id" = '8218') AND ("comment_reply"."read" = 'f')) AND ("person"."bot_account" = 'f')) ORDER BY "comment"."published" DESC LIMIT '40' OFFSET '0'

4 seconds:
SELECT "post"."id", "post"."name", "post"."url", "post"."body", "post"."creator_id", "post"."community_id", "post"."removed", "post"."locked", "post"."published", "post"."updated", "post"."deleted", "post"."nsfw", "post"."embed_title", "post"."embed_description", "post"."embed_video_url", "post"."thumbnail_url", "post"."ap_id", "post"."local", "post"."language_id", "post"."featured_community", "post"."featured_local", "person"."id", "person"."name", "person"."display_name", "person"."avatar", "person"."banned", "person"."published", "person"."updated", "person"."actor_id", "person"."bio", "person"."local", "person"."banner", "person"."deleted", "person"."inbox_url", "person"."shared_inbox_url", "person"."matrix_user_id", "person"."admin", "person"."bot_account", "person"."ban_expires", "person"."instance_id", "community"."id", "community"."name", "community"."title", "community"."description", "community"."removed", "community"."published", "community"."updated", "community"."deleted", "community"."nsfw", "community"."actor_id", "community"."local", "community"."icon", "community"."banner", "community"."hidden", "community"."posting_restricted_to_mods", "community"."instance_id", "community_person_ban"."id", "community_person_ban"."community_id", "community_person_ban"."person_id", "community_person_ban"."published", "community_person_ban"."expires", "post_aggregates"."id", "post_aggregates"."post_id", "post_aggregates"."comments", "post_aggregates"."score", "post_aggregates"."upvotes", "post_aggregates"."downvotes", "post_aggregates"."published", "post_aggregates"."newest_comment_time_necro", "post_aggregates"."newest_comment_time", "post_aggregates"."featured_community", "post_aggregates"."featured_local", "community_follower"."id", "community_follower"."community_id", "community_follower"."person_id", "community_follower"."published", "community_follower"."pending", "post_saved"."id", "post_saved"."post_id", "post_saved"."person_id", "post_saved"."published", "post_read"."id", "post_read"."post_id", "post_read"."person_id", "post_read"."published", "person_block"."id", "person_block"."person_id", "person_block"."target_id", "person_block"."published", "post_like"."score", coalesce(("post_aggregates"."comments" - "person_post_aggregates"."read_comments"), "post_aggregates"."comments") FROM (((((((((((("post" INNER JOIN "person" ON ("post"."creator_id" = "person"."id")) INNER JOIN "community" ON ("post"."community_id" = "community"."id")) LEFT OUTER JOIN "community_person_ban" ON ((("post"."community_id" = "community_person_ban"."community_id") AND ("community_person_ban"."person_id" = "post"."creator_id")) AND (("community_person_ban"."expires" IS NULL) OR ("community_person_ban"."expires" > CURRENT_TIMESTAMP)))) INNER JOIN "post_aggregates" ON ("post_aggregates"."post_id" = "post"."id")) LEFT OUTER JOIN "community_follower" ON (("post"."community_id" = "community_follower"."community_id") AND ("community_follower"."person_id" = '-1'))) LEFT OUTER JOIN "post_saved" ON (("post"."id" = "post_saved"."post_id") AND ("post_saved"."person_id" = '-1'))) LEFT OUTER JOIN "post_read" ON (("post"."id" = "post_read"."post_id") AND ("post_read"."person_id" = '-1'))) LEFT OUTER JOIN "person_block" ON (("post"."creator_id" = "person_block"."target_id") AND ("person_block"."person_id" = '-1'))) LEFT OUTER JOIN "community_block" ON (("community"."id" = "community_block"."community_id") AND ("community_block"."person_id" = '-1'))) LEFT OUTER JOIN "post_like" ON (("post"."id" = "post_like"."post_id") AND ("post_like"."person_id" = '-1'))) LEFT OUTER JOIN "person_post_aggregates" ON (("post"."id" = "person_post_aggregates"."post_id") AND ("person_post_aggregates"."person_id" = '-1'))) LEFT OUTER JOIN "local_user_language" ON (("post"."language_id" = "local_user_language"."language_id") AND ("local_user_language"."local_user_id" = '-1'))) WHERE ((((((((("community"."hidden" = 'f') OR ("community_follower"."person_id" = '-1'))AND ("post"."url" = 'https://blog.fabiomanganiello.com/article/Web-3.0-and-the-undeliverable-promise-of-decentralization')) AND ("post"."nsfw" = 'f')) AND ("community"."nsfw" = 'f')) AND ("post"."removed" = 'f')) AND ("post"."deleted" = 'f')) AND ("community"."removed" = 'f')) AND ("community"."deleted" = 'f')) ORDER BY "post_aggregates"."featured_local" DESC , "post_aggregates"."score" DESC , "post_aggregates"."published" DESC LIMIT '6' OFFSET '0'

@johanndt
Copy link

johanndt commented Jun 3, 2023

I've got some SQL optimization experience. I don't have an instance with real data and real load to run these on. Any chance you can run this with:
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON) SELECT ....

We can then paste the query and plan into a visualizer like https://tatiyants.com/pev/#/plans and see what's really going on.

Also, 5 connections for anything public facing is going to have issues. I'd highly recommend upping the number of connections and also putting a connection pooler like pgbouncer in front of the db.

@Vekseid
Copy link

Vekseid commented Jun 4, 2023

Is left join really the optimal solution in Postgres compared to a NOT IN (select ban type here)... subquery?

@dessalines
Copy link
Member

@johanndt We are in desperate need of SQL experts, as my SQL skills are very mediocre. We'd love to have your expertise.

I'll post that query JSON shortly.

@Vekseid no, joins are better than in queries with potentially thousands of inserted IDs.

@Nutomic
Copy link
Member Author

Nutomic commented Jun 4, 2023

@johanndt Thanks, any help with this is very appreciated because postgres is the biggest performance bottleneck now. I ran the explain for all queries and uploaded the output to my Nextcloud: https://nextcloud.nutomic.com/index.php/s/pARP6eSCbY98QDM

@johanndt
Copy link

johanndt commented Jun 4, 2023

Thanks @Nutomic - I'll have a look tonight after work. And agree on the sentiment that IN queries are slow when there are lots of IDs to check.

@johanndt
Copy link

johanndt commented Jun 4, 2023

@Nutomic - I had a quick look (I'm only looking at Query 1). Generally it's using indexes everywhere, so let's not try to over-optimize, but the final top level NESTED LOOP is taking 47% of the time. And I believe this loop is mostly adding the columns which is used for sorting. Can you please run it again without the ORDER BY clause?

If that's faster eg 4 seconds instead of 8, add the various ORDER BYs back in to see which one is causing the slowness? I'm highly suspicious of the hot_rank() function. This still only accounts for 50% of the slowness, but we have to start somewhere.

Other quick findings... finding Post Likes and Post Reads are the other slow parts - which makes sense as I'm sure those are the biggest tables in the system.

@RocketDerp

This comment was marked as abuse.

@dessalines
Copy link
Member

The order that things are loaded in isn't too important for the DB.

What's most important, is optimizing our DB queries (and triggers), which are nearly all single fetch queries with a ton of joins to get all the relevant data. For example, a CommentView has an attached post, creator, community, aggregate counts, etc.

These joins are done in diesel in the crates/db_views_* folders, and the specific DB tables are in crates/db_schema . Some of the bigger joins and queries are in post_view.rs, and comment_view.rs.

@johanndt
Copy link

johanndt commented Jun 5, 2023

So while waiting for the results of running without sorting, I've looked more into the hot_rank() function. I found @dessalines 's DBA Stack Exchange Post talking about it.

In the execution plan I don't see any nodes using the idx_post_aggregates_active index. Not sure why that is, but might be because the scores and latest comment time of the posts are changing more quickly now, and the index can't be used because the values being searched for are not available? Sorry not sure, since you're doing some strange things with a mutable function being indexed as immutable, and the behavior is probably undefined.

I see in the code the index is recalculated every hour, to "fix" the changing values. Even if that really did fix things, rebuilding an index is very expensive and is not a scalable solution long term as the amount of posts increase.

Instead of hourly rebuilding an index, why not just write the current hot_rank into a column of post_aggregates? It will make all the queries simpler and faster. If you index that column it'll always be used and will be up to date. You can update it whenever you update the aggregates of a post. Which I imagine happens whenever a post is upvoted or commented on. So for "hot" posts that will be pretty often.

And then also do an hourly update to catch any posts which didn't have a recent update.

You can also optimize this hourly update to only work on a subset of the posts. As I understand it the hot_rank algorithm is a decaying function. So obviously after a certain number of days it will stop being hot, then it can be marked as NULL or 0 hotness, and you only have to do the hourly update on posts of the last x days.

Think of it this way. For every comment or vote which happens on a post, how many times did someone just read or list it? Obviously changes to a post would happen a lot less than people just browsing and reading posts. For every 1 update you might probably get 100s of read requests. So it would make sense to update the hot_rank only on updates. At the moment you are calculating it on every read.

And as I said, rebuilding an index every hour is going to start hurting a lot in terms of db bloat and cpu usage.

@johanndt
Copy link

johanndt commented Jun 5, 2023

Also looked at Query 4. This one seems more straight-forward. Everything is fast and indexed, except for grabbing the correct row(s) from the post table.

This WHERE clause is causing a sequential scan: ((NOT post.nsfw) AND (NOT post.removed) AND (NOT post.deleted) AND (post.url = 'https://blog.fabiomanganiello.com/article/Web-3.0-and-the-undeliverable-promise-of-decentralization'::text))

Was looking through the code, but couldn't find where indexes are defined. Is there an index on post.url? Or maybe a combination of url, nsfw, removed, deleted?

@Nutomic
Copy link
Member Author

Nutomic commented Jun 5, 2023

The database initialization is handled by migrations in the root folder of this repo. You can follow the local development guide to get a local Lemmy instance running with database, and use psql to inspect it. If you can make a pull request with the suggested changes, that would be very helpful. I will try the ORDER BY now.

Edit: you are right, the hot rank accounts for most of the slowness. Removing it reduces the query time to 10%. Here are the outputs. Timings are different because I upgraded to a faster server.

@johanndt
Copy link

johanndt commented Jun 5, 2023

@Nutomic - I was pretty confident it was the hot_rank. If it was only a query fix, I might've had some time to do a PR, but as per one of my previous messages the suggested fix is a bigger refactor of the approach, rather than just changing one query.

I'm running a company and will be out of the country for a month from next week on a work trip, so will not be able to contribute coding time. But happy to answer more questions or do some analysis of the dumps.

Nutomic added a commit that referenced this issue Jun 6, 2023
As mentioned in #2877 (comment)

Not sure if its preferable to do this, or make a combined index
which includes post.nsfw, post.removed, post.deleted
@dessalines
Copy link
Member

dessalines commented Jun 6, 2023

Instead of hourly rebuilding an index, why not just write the current hot_rank into a column of post_aggregates?

Whoa... I didn't know why I didn't think of that. And then just run a periodic job to update that hot_rank column, rather than reindex the table?

So obviously after a certain number of days it will stop being hot, then it can be marked as NULL or 0 hotness, and you only have to do the hourly update on posts of the last x days.

GOOD point. Our time decay function seems to peak out at about two days, so the last rolling week of posts, rather than all historical ones, should be good to do this.

I'm leaning towards a periodic job rather than a DB trigger, because a trigger might have to work on every single vote, which could cause a lot of unecessary calculations.

Another benefit of that, is that we can move that time decay function to rust, rather than the hacky way I've added it to postgres.

Thank you so much for this one!

@dessalines
Copy link
Member

This WHERE clause is causing a sequential scan: ((NOT post.nsfw) AND (NOT post.removed) AND (NOT post.deleted) AND (post.url = 'https://blog.fabiomanganiello.com/article/Web-3.0-and-the-undeliverable-promise-of-decentralization'::text))

I know the answer to this question is never easy, because its not always predictable which index postgres will use... but do you recommend compound indexes for our most common queries, or single column indexes?

Nutomic added a commit that referenced this issue Jun 6, 2023
As mentioned in #2877 (comment)

Not sure if its preferable to do this, or make a combined index
which includes post.nsfw, post.removed, post.deleted
Nutomic added a commit that referenced this issue Jun 6, 2023
As mentioned in #2877 (comment)

Not sure if its preferable to do this, or make a combined index
which includes post.nsfw, post.removed, post.deleted
@Two9A
Copy link

Two9A commented Jun 6, 2023

Naive outside thought: if those fields (nsfw, removed, deleted) are bools is it worth storing them in a bitfield, and translating at the ORM layer? The rest of the existing code could continue to reference those fields on the post object, while queries like "not nsfw and not removed" become attributes & 0x09 = 0 or similar.

(Bonus: adding new flags involves no schema changes.)

@Shananra
Copy link

Shananra commented Jun 6, 2023

Would it be possible to get a db dump with sample data? I could help with some of these queries.

@johanndt
Copy link

johanndt commented Jun 6, 2023

Whoa... I didn't know why I didn't think of that. And then just run a periodic job to update that hot_rank column, rather than reindex the table?

@dessalines - Haha, that's okay. Sometimes you're so close to the problem it's hard to see anything else.

I'm leaning towards a periodic job rather than a DB trigger, because a trigger might have to work on every single vote, which could cause a lot of unecessary calculations. Another benefit of that, is that we can move that time decay function to rust, rather than the hacky way I've added it to postgres.

Yes that sounds good. In one of the lemmy community threads about performance someone said that scaling will mean that some things should happen more periodically in batches. I agree, and this change is in the right direction for that kind of thinking.

I know the answer to this question is never easy, because its not always predictable which index postgres will use... but do you recommend compound indexes for our most common queries, or single column indexes?

I personally prefer single indexes. My co-founder always argue with me saying that compound ones are all you need, since Postgresql can figure it out. But there are cases where a compound index won't help you, when you're not using the first column in the index to filter by. See this good article for more info: https://www.cybertec-postgresql.com/en/combined-indexes-vs-separate-indexes-in-postgresql/

In this specific case, I think the url column is enough, because an index on the other columns won't really help you much? Those columns are boolean, meaning only 1 of 2 values. And what percentage of your posts will be deleted or removed? Probably only a very small percentage - which means 95%+ of the index will be the same value and it will have to scan through it all in any case.

Naive outside thought: if those fields (nsfw, removed, deleted) are bools is it worth storing them in a bitfield, and translating at the ORM layer?

That's a bad idea, sorry @Two9A. Imagine the case where you're trying to retrieve a single post, but it has been deleted. With the WHERE clause in the database, it will notice immediately that the there are 0 "non-deleted" rows and will stop the query and return a 0 rows result. However if you're just getting it and not checking the flags, then it will find the row and continue to do all the other 10+ joins in the database. All these queries are joining a LOT of info together. So that's a lot of extra work you're now giving the database to do which weren't necessary. Combining things into a bitfield means you can't index on it either.

Also, if down the road you want to change the type of a column, then what do you do if it's in a bitflag? Let's say instead of a binary state, deleted might have multiple options (perma-deleted, shadow-deleted, not-deleted, etc). Now you're stuck having to write more complex data migrations. Schema changes are not a big deal, so I think not having to make schema changes is not a real benefit. Bitfields are something you use for embedded C code where you're trying to save space. Internally in postgresql they probably already store booleans as just a single bit or 2.

Nutomic added a commit that referenced this issue Jun 7, 2023
As mentioned in #2877 (comment)

Not sure if its preferable to do this, or make a combined index
which includes post.nsfw, post.removed, post.deleted
Nutomic added a commit that referenced this issue Jun 7, 2023
As mentioned in #2877 (comment)

Not sure if its preferable to do this, or make a combined index
which includes post.nsfw, post.removed, post.deleted
dessalines pushed a commit that referenced this issue Jun 7, 2023
* Add db index for post.url column

As mentioned in #2877 (comment)

Not sure if its preferable to do this, or make a combined index
which includes post.nsfw, post.removed, post.deleted

* remove unique
Nutomic added a commit that referenced this issue Jun 7, 2023
* Add db index for post.url column

As mentioned in #2877 (comment)

Not sure if its preferable to do this, or make a combined index
which includes post.nsfw, post.removed, post.deleted

* remove unique
@dullbananas
Copy link
Collaborator

Related to number 4 ... @dullbananas put in a pull request #3865 yesterday that attempts to rework the logic on these joins. I've not really understood why there is a JOIN on "is null" criteria without a user-id filter, and this new code seems to put that topic front and center:

image

It does not currently compile for me, syntax issue with Diesel. Can we get the syntax right? Thank you and have a great weekend.

Now it compiles

@RocketDerp

This comment was marked as abuse.

@dullbananas
Copy link
Collaborator

@RocketDerp That would be very helpful.

The shell script should automatically start and stop the database and lemmy server, just like test.sh.

@phiresky
Copy link
Collaborator

phiresky commented Aug 16, 2023

it takes nearly 1 hour to build 60,000 posts and then 314,913 comments on those posts

Make sure you have alter system set synchronous_commit=off set (same as in production) to make sure your numbers are good. Otherwise every single statement/COMMIT will cause a wait for fsync.

For your timing benchmark you probably need to add some parallelism with e.g. Promise.all() or better something like async-pool. Because otherwise every post will wait for every other post which is not what happens usuually

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@phiresky
Copy link
Collaborator

phiresky commented Aug 18, 2023

I asked sunarus to try setting join_collapse_limit a while back and they said it made no difference (except higher planning time).

Would be interesting if you could try my PR #3872 with your test server data. (it's kinda similar to your inclusion idea). Copy-pasting the visualization I made:

@RocketDerp

This comment was marked as abuse.

@phiresky
Copy link
Collaborator

phiresky commented Aug 18, 2023

hot_rank, and my conclusion is that it decays too quickly

Yes, that's why my approach slices through the index in all dimensions it's ordered by e.g. (featured_local, hot_rank, published) (my picture is a bit simplified). I also think it should be changed to a float so ordering information between hot_rank 1 and 0 is not lost. (unrelated to perf though).

Pagination / ordering by date doesn't work because most of the sorts are pretty different from the date. It could work as a heuristic but it's not necessary by doing filtering on the exact compound value needed for the sort per sort.

(1) is maybe fixed with #3496
(2) is intentional, see #3517
(3) i'm not aware of this

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@dullbananas
Copy link
Collaborator

I find the whole AND ("community_block"."person_id" IS NULL) to be confusing AF - and not something I have seen in hand-written SQL before... it reeks of machine-generated.

This checks if a matching community_block row was found in the left join. Using any other non-null column or .* would have the same effect.

I have seen things like community_block::person_id.is_null() in the diesel queries. It would be more clear to use ::star. But this problem will go away when left joins with null checks are replaced with exists, like in #3865.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: database bug Something isn't working
Projects
None yet
Development

No branches or pull requests