Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid loading DB for commands where it's not necessary. #8783

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

nicktobey
Copy link
Contributor

This PR avoids loading dolt DBs at startup, instead waiting for either DoltEnv.ReloadDB or MultiRepoEnv.ReloadDBs to be called.

Based on the specific subcommand being run:

  • MultiRepoEnv.ReloadDBs is called prior to command execution if the command will always need DB access.
  • Commands that conditionally require a DB can choose whether to call DoltEnv.ReloadDB
  • Commands that never need the DB will never load it.

Commands that make use of CliContext don't need to load the DB, as the underlying query engine will load the DB if it needs database access. As a result, commands that are implemented purely in terms of the CliContext will load the DB when Dolt is runnning an embedded SQL engine, and will skip loading the DB when Dolt is connecting to a running server.

@nicktobey nicktobey force-pushed the nicktobey/lazy-load branch from 9358401 to 5355928 Compare January 22, 2025 22:54
@nicktobey nicktobey requested a review from macneale4 January 22, 2025 22:54
@coffeegoddd
Copy link
Contributor

@nicktobey DOLT

comparing_percentages
100.000000 to 100.000000
version result total
5355928 ok 5937457
version total_tests
5355928 5937457
correctness_percentage
100.0

go/libraries/doltcore/env/environment.go Outdated Show resolved Hide resolved
// - They require that the database does not exist
// - They never access the database
// - They only ever access the database through a CliContext.
// - They always call dEnv.ReloadDB() before accessing the database for the first time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commands should avoid using dEnv at all costs. Anywhere we still use them is a place where we use them should be fixed.

Since all of these commands use the CliContext, seems like adding the ReloadDB method there is the right thing to do. For that matter, maybe the CliContext should always load the DB for the first time.

@@ -138,6 +139,10 @@ func (cmd BranchCmd) Exec(ctx context.Context, commandStr string, args []string,
case apr.Contains(showCurrentFlag):
return printCurrentBranch(sqlCtx, queryEngine)
case apr.Contains(datasetsFlag):
dEnv.ReloadDB(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a mis in the sql migration. --datasets requires local storage access, and needs to be deprecated.

@@ -158,6 +158,10 @@ func (cmd ShowCmd) Exec(ctx context.Context, commandStr string, args []string, d
cli.PrintErrln("`dolt show --no-pretty` or `dolt show (BRANCHNAME)` is not supported when using old LD_1 storage format.")
return 1
}
dEnv.ReloadDB(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll allow this only because we already discussed how busted this is.

commands.DiffCmd{},
commands.ResetCmd{},
commands.CleanCmd{},
commands.SqlCmd{VersionStr: doltversion.Version},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sql command is kind of special - and it should really have the DB if there is going to be one. The only case where we don't want one is when we are connecting over the wire.

Copy link
Contributor

@macneale4 macneale4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still a few questions. Overall, this is definitely an improvement.

Also, looks like all the tests are failing. Maybe because of the infinite recurssion?

if dEnv.doltDB == nil {
LoadDoltDB(ctx, dEnv.FS, dEnv.urlStr, dEnv)
}
return dEnv.DoltDB(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh? Why doesn't the recurrse forever? shouldn't it be return dEnv.doltDb

for _, namedEnv := range mrEnv.envs {
dEnv := namedEnv.env

if dEnv.DoltDB(ctx) == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When does this return nil? I'm not really sure what the DoltDb function is supposed to do, but my expectation would be that the LoadDoltDB call in there would succeed and the DoltDB function would return a non-nil value always

}
return fkc
func fkCollection(fks ...DoltDB(ctx).ForeignKey) *doltdb.ForeignKeyCollection {
fkc, err := doltdb.NewForeignKeyCollection(fks...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what the heck is happening with the formatter here?

vrw := types.NewValueStore(cs)
ns := tree.NewNodeStore(cs)
db = datas.NewTypesDatabase(vrw, ns)
var db datas.Database
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formatting again? Not sure why this is happening

@@ -866,7 +866,7 @@ func (d *DoltSession) ReleaseSavepoint(ctx *sql.Context, tx sql.Transaction, sav
return nil
}

// GetDoltDB returns the *DoltDB for a given database by name
// GetDoltDB returns the *doltDB for a given database by name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants