refactor: centralize git checkouts and db paths #13187

Merged
merged 1 commit into from
Dec 22, 2023
5 changes: 2 additions & 3 deletions src/cargo/core/global_cache_tracker.rs
@@ -560,11 +560,10 @@ impl GlobalCacheTracker {
     ) -> CargoResult<()> {
         let _p = crate::util::profile::start("cleaning global cache files");
         let config = clean_ctx.config;
-        let base_git_path = config.git_path().into_path_unlocked();
         let base = BasePaths {
             index: config.registry_index_path().into_path_unlocked(),
-            git_db: base_git_path.join("db"),
-            git_co: base_git_path.join("checkouts"),
+            git_db: config.git_db_path().into_path_unlocked(),
+            git_co: config.git_checkouts_path().into_path_unlocked(),
             crate_dir: config.registry_cache_path().into_path_unlocked(),
             src: config.registry_source_path().into_path_unlocked(),
         };
9 changes: 6 additions & 3 deletions src/cargo/sources/git/source.rs
@@ -248,7 +248,8 @@ impl<'cfg> Source for GitSource<'cfg> {
         // exists.
         exclude_from_backups_and_indexing(&git_path);

-        let db_path = git_path.join("db").join(&self.ident);
+        let db_path = self.config.git_db_path().join(&self.ident);
+        let db_path = db_path.into_path_unlocked();

         let db = self.remote.db_at(&db_path).ok();
         let (db, actual_rev) = match (self.locked_rev, db) {
@@ -305,10 +306,12 @@ impl<'cfg> Source for GitSource<'cfg> {
         // Check out `actual_rev` from the database to a scoped location on the
         // filesystem. This will use hard links and such to ideally make the
         // checkout operation here pretty fast.
-        let checkout_path = git_path
-            .join("checkouts")
+        let checkout_path = self
+            .config
+            .git_checkouts_path()
             .join(&self.ident)
             .join(short_id.as_str());
+        let checkout_path = checkout_path.into_path_unlocked();
         db.copy_to(actual_rev, &checkout_path, self.config)?;

         let source_id = self
12 changes: 12 additions & 0 deletions src/cargo/util/config/mod.rs
@@ -368,6 +368,18 @@ impl Config {
         self.home_path.join("git")
     }

+    /// Gets the directory of code sources Cargo checkouts from Git bare repos
+    /// (`<cargo_home>/git/checkouts`).
+    pub fn git_checkouts_path(&self) -> Filesystem {
+        self.git_path().join("checkouts")
+    }
+
+    /// Gets the directory for all Git bare repos Cargo clones
+    /// (`<cargo_home>/git/db`).
+    pub fn git_db_path(&self) -> Filesystem {
+        self.git_path().join("db")
+    }
Comment on lines +371 to +381
Contributor

How appropriate is it to join on a `Filesystem` and pass that around?

From what little I've interacted with `Filesystem`, it seems like it's meant to be the root (e.g. it's what's lockable), which makes it feel strange to be creating `Filesystem`s for directories we don't lock (instead we lock the parent).

Member, Author: @weihanglo, Dec 21, 2023

I see it as “hey, this path might be locked, so be careful when manipulating stuff under it.” `registry_index_path` is constructed in the same way, though that alone doesn't show it is correct.

Contributor

Yeah, the idea is to only interact with the filesystem using `Filesystem`, and never use a raw `Path`. That helps ensure you only access files with appropriate locking. Unfortunately, we're not very good at that. Whenever `into_path_unlocked` is used, that indicates the abstraction is leaking. It is necessary in many cases, since other libraries (like libgit2) won't take a `Filesystem`, but I think `into_path_unlocked` is used a little too often. There are also many places that don't use `Filesystem`. For example, `Layout` has its own locking mechanism, and thus doesn't need it.
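
To make the pattern described in this comment concrete, here is a minimal sketch of a path wrapper with a single explicit escape hatch. It is not Cargo's actual `Filesystem` implementation: `GuardedPath` and `ToyConfig` are hypothetical names invented for illustration, and only the method names `join`, `into_path_unlocked`, `git_path`, `git_db_path`, and `git_checkouts_path` are taken from the diff above.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a `Filesystem`-like type: a directory that
/// should only be touched while an appropriate lock is held.
#[derive(Debug, Clone, PartialEq)]
pub struct GuardedPath {
    root: PathBuf,
}

impl GuardedPath {
    pub fn new(root: impl Into<PathBuf>) -> Self {
        GuardedPath { root: root.into() }
    }

    /// Joining keeps the result wrapped, so the raw path never leaks
    /// out by accident.
    pub fn join(&self, child: impl AsRef<Path>) -> GuardedPath {
        GuardedPath {
            root: self.root.join(child),
        }
    }

    /// The explicit escape hatch: the caller takes over responsibility
    /// for holding whatever lock protects this directory.
    pub fn into_path_unlocked(self) -> PathBuf {
        self.root
    }
}

/// Hypothetical config that mirrors the shape of the helpers in this PR.
pub struct ToyConfig {
    home_path: GuardedPath,
}

impl ToyConfig {
    pub fn new(cargo_home: impl Into<PathBuf>) -> Self {
        ToyConfig {
            home_path: GuardedPath::new(cargo_home),
        }
    }

    pub fn git_path(&self) -> GuardedPath {
        self.home_path.join("git")
    }

    pub fn git_db_path(&self) -> GuardedPath {
        self.git_path().join("db")
    }

    pub fn git_checkouts_path(&self) -> GuardedPath {
        self.git_path().join("checkouts")
    }
}
```

With this shape, a caller that must hand a raw path to another library would write something like `config.git_db_path().join(ident).into_path_unlocked()`, so every place where the abstraction leaks is visible at the call site.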

Contributor

Which lock is being held for git checkouts and git dbs?

If it's not the individual dbs and checkouts, then this seems like the wrong API in the first place, as we are saying "this is lockable" when, in reality, locking it would be the wrong decision. One idea for improving this (if that is the case) is to "lock project" and allow `Filesystem.join` to carry forward the path to the lockable resource.
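
One possible reading of the "carry forward" idea, sketched under stated assumptions: a joined path remembers which ancestor directory is the lockable resource. `ScopedPath`, `lockable`, `lock_root`, and `path` are hypothetical names and do not exist in Cargo; the sketch only shows the bookkeeping and does not take any lock.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical path type that tracks both where it points and which
/// ancestor directory should be locked to protect it.
#[derive(Debug, Clone)]
pub struct ScopedPath {
    /// Directory whose lock protects everything below it.
    lock_root: PathBuf,
    /// Directory this value actually refers to.
    path: PathBuf,
}

impl ScopedPath {
    /// Creates a path that is itself the lockable resource.
    pub fn lockable(root: impl Into<PathBuf>) -> Self {
        let root = root.into();
        ScopedPath {
            lock_root: root.clone(),
            path: root,
        }
    }

    /// Joining extends `path` but carries `lock_root` forward unchanged,
    /// so a subdirectory never claims to be lockable on its own.
    pub fn join(&self, child: impl AsRef<Path>) -> Self {
        ScopedPath {
            lock_root: self.lock_root.clone(),
            path: self.path.join(child),
        }
    }

    /// The directory a caller would need to lock before touching `path`.
    pub fn lock_root(&self) -> &Path {
        &self.lock_root
    }

    pub fn path(&self) -> &Path {
        &self.path
    }
}
```

Under this design, a value such as `ScopedPath::lockable(root).join("db").join(ident)` would still report `root` from `lock_root()`, which makes "lock the parent, operate on the child" explicit in the type rather than a convention.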


     /// Gets the Cargo base directory for all registry information (`<cargo_home>/registry`).
     pub fn registry_base_path(&self) -> Filesystem {
         self.home_path.join("registry")