Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update KeysScan.md #2780

Closed
wants to merge 1 commit into from
Closed

Update KeysScan.md #2780

wants to merge 1 commit into from

Conversation

japjit
Copy link

@japjit japjit commented Aug 27, 2024

Found through trial and error that not providing the database results in a timeout.

Found through trial and error that not providing the database results in a timeout.
@mgravell
Copy link
Collaborator

mgravell commented Aug 27, 2024

Before we jump into changes, can we understand the problem? I cannot reproduce a problem here:

using StackExchange.Redis;
using System;
using System.Linq;

var muxer = ConnectionMultiplexer.Connect("127.0.0.1:6379");

var db = muxer.GetDatabase();
db.StringSet("foo", "foo");
db.StringSet("bar", "bar");
var server = muxer.GetServers().Single();


Console.WriteLine("sync...");
foreach (var key in server.Keys(pattern: "*"))
{
    Console.WriteLine(key);
}

Console.WriteLine("async...");
await foreach (var key in server.KeysAsync(pattern: "*"))
{
    Console.WriteLine(key);
}

outputs

sync...
bar
foo
async...
bar
foo

At the code level, from here, not supplying a database means the default is applied, which via GetValueOrDefault() means it should default to 0 if not already set (see also here). This means that I do not believe that the docs change shown will make any difference (unless a different non-zero DefaultDatabase has been specified in your config, in which case the example explicitly does something different, but that's due to the config change).

So: what is the context here? Can we try to get a repro? If there is a problem, I'd rather fix the code than the docs, since the docs aren't wrong. Anything relevant in the server you're connecting to here? or maybe (speculating) the problem is the pattern application itself. Note that redis does not have an efficient way of filtering keys by pattern - you're essentially walking the entire keyspace chunk by chunk, which can take time. If you're looking for something that has very few matches in a huge database, it can take a lot of round trips to find the first match, due to how SCAN works. You can trade reduced round-trips for increased server-time per trip by increasing the pageSize parameter (which defaults to 250, but you can try 1000, 5000, 10000 etc to see how it changes things; note that redis uses a default of 10 for this value - the library already uses a much larger default because 10 is usually reported as overly expensive).

I suspect that the problem here has nothing to do with specifying the database.

@mgravell
Copy link
Collaborator

mgravell commented Aug 27, 2024

The other possible cause of a timeout here is if the server is low-version (or we cannot determine the version), and we therefore use KEYS instead of SCAN, and the database is large; this is massively disruptive to the server, which is why KEYS is largely considered deprecated with SCAN replacing it.

As a side note: both KEYS and SCAN should only really be used for administrative purposes like keyspace analysis - it should not be used for routine database operations. For any scenario where a user might use KEYS/SCAN for routine operations: there are usually much better options, that might require some minor rework to how you are storing data.

@japjit
Copy link
Author

japjit commented Aug 27, 2024 via email

@mgravell
Copy link
Collaborator

mgravell commented Aug 27, 2024

Indeed it should (prefer SCAN to KEYS); IMO the first thing we should do here is check what it is issuing; any chance you can use a redis-cli session to the same instance to issue the MONITOR command? this should allow us to see what is being issued when this happens; what I expect is to see some (potentially large) number of SCAN commands being issued; if it has failed to determine the server version, we might see a single KEYS command instead.

Additionally, it would be useful to see the result of DBSIZE, so we can see what sort of database we're dealing with here - is it 200 keys, or 200 million keys.

@mgravell
Copy link
Collaborator

To echo: I'm very hesitant to assume that the problem here is the database parameter. I try not to rule out anything, but I guess I'd really like to be sure first - ideally with some kind of repro. My initial investigation suggests that omitting the database here should correctly apply the default database, which defaults to zero, so the change suggested should make no difference.

@japjit
Copy link
Author

japjit commented Aug 27, 2024 via email

@japjit
Copy link
Author

japjit commented Aug 27, 2024 via email

@mgravell
Copy link
Collaborator

If this recurs, I'd be interested in helping investigate - I'm not saying "there was never a problem" - I'm simply saying "it almost certainly wasn't anything to do with the database argument". So: if it happens again, please do get in touch in an issue, so we can try to investigate.

@NickCraver
Copy link
Collaborator

Closing this out for now to tidy up, but happy to follow-up if we see it again :)

@NickCraver NickCraver closed this Sep 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants