Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pg_stats on interesting columns #70

Closed
petergeoghegan opened this issue Feb 26, 2014 · 2 comments
Closed

pg_stats on interesting columns #70

petergeoghegan opened this issue Feb 26, 2014 · 2 comments

Comments

@petergeoghegan
Copy link
Contributor

I'm thinking of adding something like this:

select s.*, atts.indexrelid::regclass::text as index from pg_stats s join pg_attribute a on s.attname = a.attname and s.tablename = a.attrelid::regclass::text join (select unnest(indkey) attnum, indexrelid, indrelid from pg_index i) atts on (atts.attnum = a.attnum and atts.indrelid = a.attrelid) where schemaname != 'pg_catalog' and (n_distinct != -1 or correlation < 0.95) order by schemaname, tablename, attname;

The idea here is that we see statistics on interesting columns - columns that are indexed and have a non-uniform distribution (n_distinct = -1 means that there are a number of distinct elements equal to the total number of elements....values between -1 and 0 represent a multiplier of the total number of elements, whereas positive values are absolute numbers). These values are in general more likely to matter on indexed columns. So it's a reasonable way to find interesting statistics as they may relate to problematic queries. I can think of one customer in particular that I always look at this stuff for, because their listings are particularly skewed. I think if nothing else it's useful to encourage the idea that whether or not constants appearing in your query predicate are in the most common values list matters a lot.

@deafbybeheading @fdr Would you use it if you had it? I do look at this sometimes. It is frustrating to have to work to get it, and that it isn't as accessible as I'd like. Although perhaps this isn't much more accessible.

@fdr
Copy link

fdr commented Feb 26, 2014

I might not think to look at it as I won't think to recall it.

Given that:

Where would you fold it in? What I'm wondering is how do you model someone using or, in particular, discovering this feature, however it may wind up.

You could also chuck it into the command as a dusty corner (unless it gets surprisingly popular) and build validated ideas over time by referring (or checking) qualified customers directly to it to see how often if bears out your hypothesis. I am not a maintainer of pg-extras so maybe someone has a more qualified opinion of how inclusive it is, but my last understanding was "considerably".

I have my own idle guesses, like finding a way to in-line it with another command that's popular so people somehow get it "for free". But I don't even have an inkling where.

@petergeoghegan
Copy link
Contributor Author

On Tue, Feb 25, 2014 at 10:25 PM, Daniel Farina [email protected]:

Given that:

Where would you fold it in? What I'm wondering is how do you model someone
using or, in particular, discovering this feature, however it may wind
up.

It might be more useful as a psql macro that defines a CTE, so that I can
write ad-hoc queries against that CTE...but we can't ship that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants