Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Binary dependencies #227

Merged
merged 1 commit into from
Oct 13, 2020
Merged

Support for Binary dependencies #227

merged 1 commit into from
Oct 13, 2020

Conversation

UebelAndre
Copy link
Collaborator

@UebelAndre UebelAndre commented Sep 20, 2020

This change introduces support for Binary dependencies. If you take a look at #218, an issue is outlined where cargo treats binary dependencies differently than library dependencies and therefore, targets will not be generated when using cargo raze. There is now support for a new field in Cargo.toml files called [raze.binaries]. See the following example:

[raze.binaries]
wasm-bindgen-cli = "0.2.68"

By adding this field and using genmode = "Remote" (notably not "Vendored"), Raze will now download the crate from the registry specified in the settings, generate a lock file for it, and add the dependencies and it's metadata to the active raze session and include this metadata in the resulting Bazel files. An example of this can be seen in //examples/remote/binary_dependencies/... within this pull request.

This change will also create a lockfiles directory in the workspace_path for the lockfiles of binary dependencies. The files written will be named Cargo.{crate}-{version}.lock. If one of these files exists in the lockfiles directory, whenever raze is run again it will inject a lockfile with a matching name given the aforementioned pattern even if the create has a lockfile bundled with it. If no lockfile is present in the lockfiles directory, the bundled lockfile or a newly generated one will be installed there at the end of a run.

...
Generated /cargo-raze/examples/remote/binary_dependencies/cargo/remote/wit-writer-0.2.0.BUILD.bazel successfully
Generated /cargo-raze/examples/remote/binary_dependencies/cargo/BUILD.bazel successfully
Generated /cargo-raze/examples/remote/binary_dependencies/cargo/crates.bzl successfully
Generated /cargo-raze/examples/remote/binary_dependencies/cargo/lockfiles/Cargo.wasm-bindgen-cli-0.2.68.lock successfully  <- Note this output line
Bazel version info
Build label: 3.4.1
...

Another example of this change in use can be seen in bazelbuild/rules_rust#411.

@UebelAndre
Copy link
Collaborator Author

UebelAndre commented Sep 20, 2020

@damienmg As I mentioned in other pull requests, this is the thing I was working on. Because this change includes the others, it'd be fine to ignore it for now and focus on the others. I also feel this provides functionality similar to #222. I'll need to dig deeper at what the changes are there that were waiting for the recent crates_index release.

@UebelAndre
Copy link
Collaborator Author

Just to note, only the last 3 commits of this PR are the actual change

@UebelAndre
Copy link
Collaborator Author

@damienmg Ok, this should now only contain the changes relevant to this pull request

@damienmg
Copy link
Member

This is a bit too much details for me and I would be more comfortable if @smklein or @acmcarther would review this one.

@UebelAndre
Copy link
Collaborator Author

Maybe if there's no movement on ehuss/cargo-clone-crate#1 the code could be incorporated into this project. I would like to try to avoid that if possible but I also don't feel that change will realistically be merged 😞

@UebelAndre
Copy link
Collaborator Author

UebelAndre commented Sep 22, 2020

RustSec/cargo-lock#68 published 5.0.0-rc which is better than relying on the git repo. I've updated the dependency to use this version and will update it again once the proper release is out.

edit: A 5.0.0 release has been published and this PR has been updated to use that.

@UebelAndre
Copy link
Collaborator Author

UebelAndre commented Sep 22, 2020

I think I'll need to provide a mechanism to save Cargo.lock files of [raze.binaries] targets so they can be committed and reused. Right now the change here will download a binary dependency and generate a lockfile for it which is bad for pinning. Maybe there's a new setting binary_lockfiles_dir

[raze]
binary_lockfiles_dir = "//cargo/lockfiles"

Note that this would follow the same placement rules as workspace_path.

When you run raze with the example above, you get an output

├── Cargo.lock
├── Cargo.toml
├── cargo
│   └── lockfiles
│       └── Cargo.wasm-bindgen-cli-0.2.68.lock
└── src
    └── lib.rs

If Cargo.{crate}-{version}.lock exists then that file will be used. If it's not, cargo generate-lockfile will be ran on the downloaded source and then copied to that location for future reuse.

edit: I implemented something like this as I felt it was an important enough feature for build reproducibility. I'll update the top description

@UebelAndre
Copy link
Collaborator Author

b6b8071, 4fc91f7, and bbed53a have been broken out and put into their own pull requests.

@UebelAndre
Copy link
Collaborator Author

Moving some of the context from the top description into a comment

This pull request includes #226 #225 and #224 and is the implementation of #218. There are a few outstanding things that should likely be resolved before merging this. See ehuss/cargo-clone-crate#1 and RustSec/cargo-lock#68.

@UebelAndre
Copy link
Collaborator Author

@damienmg @smklein @acmcarther I think this pull request is ready for review. The only hesitation is that #230 should probably get closed first so 27afe6f can be removed from the list of commits.

@UebelAndre
Copy link
Collaborator Author

UebelAndre commented Sep 28, 2020

@smklein @acmcarther This change is now ready for review.

Thanks @damienmg for helping me with #230

@UebelAndre
Copy link
Collaborator Author

@smklein Hey, do you know when you might get a chance to review this? 😅

@smklein
Copy link
Contributor

smklein commented Sep 29, 2020 via email

@UebelAndre
Copy link
Collaborator Author

@smklein Thank you so much! I don't mean to pressure you, I was just looking to hear from you and find out if you felt like you had the bandwidth to do a review in the nearish future 😅

I'm pretty excited about this feature though so I hope the changes are good! 😄

Copy link
Contributor

@smklein smklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@acmcarther , I still think you should take a look - this only has partial overlap with my contributions to Raze, so I think it would be good to have some of your perspective on "feature-wise, is this something we want", and I can help review the lower-level details.

Top-level comments so far:

  1. If you depend on a binary then commands are inconsistent rust-lang/cargo#8708 implies this binary crates might not be supported long-term by Cargo - specifically, the comment "I don't know if there was ever any intent to support or not support it. Given that it is mostly broken now (and has never worked), a warning would probably help those who stumble on it."

Given this tendency, IMO it makes sense to implement this binary dependency feature in the most non-invasive way possible - I've added some comments regarding refactors for clarity.

  1. Can we add some documentation for this feature, possibly in the README, under "handling unconventional crates"?

  2. Thanks for the tests!


// Note: This will need to change if the dependency version changes
let wasm_bindgen_path =
Path::new("./external/remote_binary_dependencies__wasm_bindgen_cli__0_2_68/cargo_bin_wasm_bindgen");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we parse this as an environment variable and pass it from the build system?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of an easy way off the top of my head, but would that simplify anything?

examples/remote/binary_dependencies/src/main.rs Outdated Show resolved Hide resolved
impl/src/metadata.rs Outdated Show resolved Hide resolved
impl/src/planning.rs Outdated Show resolved Hide resolved
impl/src/planning.rs Outdated Show resolved Hide resolved
impl/src/planning.rs Outdated Show resolved Hide resolved
impl/src/planning.rs Outdated Show resolved Hide resolved
impl/src/planning.rs Outdated Show resolved Hide resolved
impl/src/planning.rs Outdated Show resolved Hide resolved
impl/src/planning.rs Outdated Show resolved Hide resolved
impl/src/planning.rs Outdated Show resolved Hide resolved
@UebelAndre
Copy link
Collaborator Author

UebelAndre commented Sep 30, 2020

@smklein Addressed some feedback and asked some questions. Ready for another review 😄

Edit: I didn't see your last comment. Yes, I can definitely update the README

Copy link
Member

@acmcarther acmcarther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got a ton of nits, not really much of substance. I think I've lost the project context needed to produce a real useful comment. I'll make an effort to do so on next pass though.

impl/Cargo.toml Outdated Show resolved Hide resolved
impl/src/bazel.rs Outdated Show resolved Hide resolved
impl/src/metadata.rs Outdated Show resolved Hide resolved
impl/src/metadata.rs Outdated Show resolved Hide resolved
impl/src/metadata.rs Outdated Show resolved Hide resolved
impl/src/planning.rs Show resolved Hide resolved
impl/src/settings.rs Outdated Show resolved Hide resolved
impl/src/planning.rs Outdated Show resolved Hide resolved
impl/src/planning.rs Show resolved Hide resolved
impl/src/planning.rs Show resolved Hide resolved
@UebelAndre
Copy link
Collaborator Author

@acmcarther I primarily have questions for you but also addressed some feedback. I know your time is limited but would greatly appreciate another review so I can get this pull request closed out 😄

@acmcarther
Copy link
Member

Made a really brief pass, but I'll need to sit down and dedicate a block of time to address the other comments (and provide my promised substantive comments).

@acmcarther
Copy link
Member

(In general, despite my grumble grumble about how the abstractions are breaking down, I value new features over satisfying my personal taste in abstraction design, so please don't be too discouraged. As in my prior message, I'll set aside a block of time to get into the nuts and bolts of this PR soon.)

@UebelAndre
Copy link
Collaborator Author

@acmcarther Thanks! Looking forward to your feedback 😄

@UebelAndre
Copy link
Collaborator Author

@acmcarther @smklein Hey do you guys have time today to do another review 😅 🙏

@UebelAndre
Copy link
Collaborator Author

@acmcarther @smklein Hey, friendly ping for another review 😅

Do you know if you'll have time this week?

@UebelAndre
Copy link
Collaborator Author

I don't know what to do about the fact that this PR is still flagged as Changes requested I think I addressed all I could and had some open questions. I thought maybe resolving the open comments would transition the state of the issue... but today I learned that is not the case 😞 Sorry about that

@acmcarther
Copy link
Member

Sorry for my lack of responsiveness, I operate on Valve time (or Elon Musk time), not human time. Time apparently is very fluid for me.

You're stuck in the "changes required" stage because you need an explicit LGTM. I think addressing all the requested changes is not sufficient.

Anyway, I haven't gained any new insight in the time I spent being lazy, so I'll just review again (right now) and call it good.

@UebelAndre
Copy link
Collaborator Author

Sorry for my lack of responsiveness, I operate on Valve time (or Elon Musk time), not human time. Time apparently is very fluid for me.

Good ol' Elon time 😆

You're stuck in the "changes required" stage because you need an explicit LGTM. I think addressing all the requested changes is not sufficient.

Good to know, thanks!

Anyway, I haven't gained any new insight in the time I spent being lazy, so I'll just review again (right now) and call it good.

@acmcarther You have no idea how excited I am to get these changes through!!! I really hope they're good 🙏 😅

Copy link
Member

@acmcarther acmcarther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the effort you've put in here. I'm gonna say some really annoying stuff though.

This is kind of unprecedented for me, but I'm going to insist you split this PR up into more manageable pieces. I've got many issues with the approach here and I'm kind of overwhelmed trying to structure them in a productive way without a tiresome amount of iteration.

Here's a general framework for how this PR could be split up in a way that I could productively review it.

  1. Update the dependencies of cargo-raze. This will be an easy review and submit.
  2. Implement support for generating metadata in "in-place". I'm not certain I'll actually approve this. I might instead ask you to move the responsibility of moving stuff into tempdir outside of the metadata fetcher. Basically, your preferred behavior would be come the actual behavior, and the "bonus stuff" around tempdirs would get moved out.
  3. Extend MetadataFetcher to support loading metadata necessary for (future) planning of binary dependencies. I'm not exactly sure how this should look, but one important point here is that we need additional metadata during planning for binary dependencies that isn't currently available. We should identify what that is and add it in a discrete PR.
  4. Leverage the additional metadata to extend generation to include binary dependencies.
  5. Add the smoke-tests.

Some of the particular issues (not covered, or covered obliquely or covered directly above) that motivate this request:

  1. The smoke test files being included here are kind of overwhelming.
  2. I don't think the planner should be generating files or performing remote requests. The behavior that lives here should probably live in metadata fetcher. That class has precedent for downloading files, network requests, etc. Conceptually, that class should be a "ball of all metadata needed for planning". If planner needs to go get a bunch more stuff, then our abstractions have failed us.
  3. I'm not sure how i feel about fields being moved around between WorkspacePlanner, CrateContext, etc. I wanted to address this, but felt a bit overwhelmed by the rest of the PR so I didn't get a chance to devote a lot of thought to it yet.
  4. I raised this earlier, but I'm not really comfortable with how the "clone" and "checksum" functions which perform network requests and file operations currently live in metadata.rs, but are primarily used by planning code. With (2), metadata.rs is probably the right place, but right now it feels arbitrary.
  5. All together, I feel like these "crate fetching" and "checksum" features deserve a proper abstraction. Something like a trait which represents the "main idea" of what they do, with an implementation to back it. This enables testing of the types that depend on this functionality, without performing the side effects.
  6. This is really minor, but we've let the cyclomatic complexity nesting and general complexity of many of these functions get out of control. In produce_crate_contexts there's a filter_map with a body that's ~50 lines long, at ~5th level of indentation in the containing function.

EDIT: wordsmithing

impl/src/bin/cargo-raze.rs Outdated Show resolved Hide resolved
impl/src/metadata.rs Outdated Show resolved Hide resolved
impl/src/metadata.rs Outdated Show resolved Hide resolved
impl/src/settings.rs Outdated Show resolved Hide resolved
impl/src/metadata.rs Outdated Show resolved Hide resolved
impl/src/metadata.rs Outdated Show resolved Hide resolved
)?;

let crate_dir = dir.join(format!("{}-{}", name, version));
match crate_dir.exists() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm upgrading this from optional to Style Nit.

I kind of hate these matches-over-boolean, and there are a lot of them here. For consistency, I ask them all to be updated.

impl/src/planning.rs Show resolved Hide resolved
@UebelAndre
Copy link
Collaborator Author

@acmcarther Ok, with those PR's merged it reduces the footprint of this change a bit more. I think this is now ready to resume discussions in earlier comments.

@acmcarther
Copy link
Member

acmcarther commented Oct 8, 2020

Can we move smoke tests and examples somewhere else. That's a major gripe I have.

(i recognize that they represent the test coverage for this feature... but maybe we need some unit tests if they're load bearing)

@UebelAndre
Copy link
Collaborator Author

Can we move smoke tests and examples somewhere else. That's a major gripe I have.

Removed those commits.

@acmcarther
Copy link
Member

Holy cow you've done it. It's not 20k delta anymore! Thank you very much!

I'm just about out of time for now to review, but I'll be glad to review this in detail ASAP (hopefully later today, but maybe 10AM tomorrow)

Copy link
Member

@acmcarther acmcarther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really going to take a while

impl/src/bazel.rs Show resolved Hide resolved
impl/src/planning.rs Show resolved Hide resolved

pub fn fetch_crate_src(
registry_url: &str,
dir: &Path,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really can't tell if you've deliberately or accidentally ignored these comments, and it bothers me.

impl/src/metadata.rs Outdated Show resolved Hide resolved
let dir = tempfile::TempDir::new().unwrap();
let lockfile_dir = tempfile::TempDir::new().unwrap();

let (path, _files) = fetch_crate_src(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is important to me that we make sure no unit tests are cloning git repositories.

This will probably require you to create abstractions which can inject dependencies. Specifically, cargo_clone::Cloner probably needs to be wrapped and stubbed for testing.

The larger problem here is that you've not really created any abstractions at all.

impl/src/metadata.rs Outdated Show resolved Hide resolved
impl/src/metadata.rs Outdated Show resolved Hide resolved
@acmcarther
Copy link
Member

I might just LGTM this after a couple of cycles. Reviewing this PR has made me realize that there is a huge amount of code debt.

@UebelAndre
Copy link
Collaborator Author

#227 (comment) What comments?

@UebelAndre
Copy link
Collaborator Author

UebelAndre commented Oct 9, 2020

@acmcarther I don't understand why some comments don't have a field for me to reply to them

If possible can I ask that you always create comments that have a section for me to directly reply to. It bothers me that you think I'm deliberately ignoring your feedback. There's a lot on this issue and it's difficult for me to keep track. If I see a comment that I can resolve, It helps me track that as an action item. Once all comments are resolved, then I would request another review. I'm doing my best here and am not trying to be rude or irritate anyone

@UebelAndre
Copy link
Collaborator Author

#227 (comment) I'm gathering metadata in the same place raze used to before any of my changes. Are you asking me to make fetch_metadata do everything?

@UebelAndre
Copy link
Collaborator Author

#227 (comment) I mocked the endpoints this would attempt to reach out to. This should never be doing any git operations btw. Do you still feel https://docs.rs/cargo-clone-crate/0.1.3/cargo_clone/struct.Cloner.html should be wrapped and mocked?

Copy link
Member

@acmcarther acmcarther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, whatever. If this is idiomatic (which, I think I'm coming to believe it is), then I think I hate modern Rust.

I have lots opinions about things (pattern matching over booleans, ".1" at the end of things, weird arrangement of responsibilities, you including ridiculously huge lambdas in filter_map), but I see you're the person that prepared #212, so I expect you to be better educated and more invested on the direction of things than I am and for that reason I trust your judgment in terms of where responsibilities should be and what is idiomatic in this language.

Someday I really would like to rewrite this whole thing and fix (what I see as) cruft that has accumulated due to my lack of good discipline in reviewing, but who know when that will happen, and I expect my code would be idiosyncratic w.r.t. the way the rest of the ecosystem writes code.

That all said. please try to prepare much smaller PRs in the future.

README.md Outdated Show resolved Hide resolved
.metadata
.packages
.iter()
.map(|pkg| pkg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

???

impl/src/planning.rs Outdated Show resolved Hide resolved
impl/src/planning.rs Show resolved Hide resolved
impl/src/metadata.rs Outdated Show resolved Hide resolved
@UebelAndre
Copy link
Collaborator Author

Well, whatever. If this is idiomatic (which, I think I'm coming to believe it is), then I think I hate modern Rust.

I have lots opinions about things (pattern matching over booleans, ".1" at the end of things, weird arrangement of responsibilities, you including ridiculously huge lambdas in filter_map), but I see you're the person that prepared #212, so I expect you to be better educated and more invested on the direction of things than I am and for that reason I trust your judgment in terms of where responsibilities should be and what is idiomatic in this language.

Someday I really would like to rewrite this whole thing and fix (what I see as) cruft that has accumulated due to my lack of good discipline in reviewing, but who know when that will happen, and I expect my code would be idiosyncratic w.r.t. the way the rest of the ecosystem writes code.

That all said. please try to prepare much smaller PRs in the future.

@acmcarther I'm still pretty new to Rust and have learned a lot in contributing to this project. I definitely still have a lot to learn though. I do agree with a lot of your feedback here and I have already made a few branches with some refactoring done that is targeted at creating clear separation of responsibility among the different modules. Though that being said, I'm definitely going to be having a stronger focus on smaller/more focused pull requests. I see a lot of potential in this project and want to do what I can to improve it. Do you think you could open a draft PR or an issue ticket outlining some of the things that bother you most about the current state of the project? I'd be happy to take a look and submit some PRs 😄

@UebelAndre
Copy link
Collaborator Author

@acmcarther I've squashed the changes in this PR and rebased #240 on top of these changes.

@acmcarther
Copy link
Member

acmcarther commented Oct 12, 2020

Thank you for your continued work here and elsewhere in the repo. I really appreciate it, and I think users of this project appreciate it even more


Brain dump on future direction of the project follows. I might turn this into more specific actions and PRs I prepare, but if not, at least this is here:

The real original sin of cargo-raze was the failure to write the original code with unit tests in mind. That means dependency injection where necessary, and clearly delineated responsibilities. Instead, once the lack of unit tests because a real practical problem (because things would break), I added this "smoke-test" thing, which has become load bearing, and in some cases makes code review a major hassle.

If I were (motivated) to clean the repo up in a vacuum, the north star would be to make all of the code testable and well tested for the purpose of guiding refinement of the design. In more concrete terms this means:

  1. Survey the rust ecosystem and identify ways that developers write testable code. This probably means looking at the top crates and seeing what they do.
  2. Add the unit tests and refactor the code where necessary to facilitate this better testing approach -- this should make the smoke tests "less load bearing".
  3. Implement a test harness for the type of functional tests which are currently implemented via "smoke-test.sh", where unit tests are impractical for verifying the behavior of multiple moving parts.
  4. Port those smoke tests to the new test harness. The ideal end state is that the most common types of tests can be defined "declaratively", meaning via a cargo toml, and maybe a directory structure defined in a test configuration file. This would result in the generation of a test which can run raze and verify that the code builds.

Through this process, it would become more clear what abstractions are currently doing too much (not abiding by SRP), and I think also we would end up with an end product which is easier for developers to hack on, because they'd be able to start with TDD. They would figure out what is not being generated and add it to the functional and unit tests, then update the implementation to make it happen.


Beyond code health, I think it's really important to get this project building with bazel as well, so that this can be brought in as a regular bazel dependency, compiled, and then used as part of genrules. In the past, I explored what doing this "late generation of BUILD files" might look like, but I ran into some fundamental issues with where this generation would happen in the bazel lifecycle and I got a bit disheartened.

A long time ago, the model this project followed was generating bzl files containing the metadata required for a rules_rust genrule to generate the rust_library invocations at a later time. I departed from this for some reason I've since forgotten, but I think ultimately the best version of cargo-raze would return to that model. In part, it would be nicer because it would allow third party bazel rules to traverse the dependency tree generated by cargo-raze and perform their own types of rule generation, such as generating targets for supporting rust-analyzer, or some such.

EDIT:
Another possible angle on this would be reviewing how go's analogous generator, bazel-gazelle, does things. We could understand how their model works, and what parts of go's "dependency semantics" make that approach work for them. With that in mind, we could decide whether moving closer to their model makes sense.

@UebelAndre
Copy link
Collaborator Author

@acmcarther Thank you so much! Would you mind if I turned that message into an issue and we continued that conversation there?

Also, is there anything more I should do for this PR?

@UebelAndre
Copy link
Collaborator Author

UebelAndre commented Oct 13, 2020

I updated the dependencies one last time. https://crates.io/crates/httpmock shipped version 0.5.0 so I'm using that and not the beta. (This is only a dev dependency so should have very little impact).

@acmcarther
Copy link
Member

My comment probably represents four separate issues (which may or may not already exist)

  1. Improve unit test coverage
  2. Replace "smoke-test.sh" with proper functional tests.
  3. Evaluate writing crate metadata into bzl files for consumption by rules_rust
  4. Assess lessons from bazel_gazelle and produce a proposed set of changes for cargo-raze.

@acmcarther
Copy link
Member

Merging now. I'll cut a release that includes this and other recent changes.

@acmcarther acmcarther merged commit 2d190e1 into google:master Oct 13, 2020
@UebelAndre UebelAndre deleted the binary-dependencies branch October 13, 2020 17:54
@UebelAndre
Copy link
Collaborator Author

@acmcarther Thank you so so much!!!!

I feel like some of the issues could be cleaned up or consolidated already. It might be worth doing a pass on them 😅

@UebelAndre
Copy link
Collaborator Author

@acmcarther also, I'd like to use the changes this PR to update rules_rust but I have another PR that I've based my other branches off of that I'm hoping to get your eyes on. Could you take a quick look at bazelbuild/rules_rust#420 tomorrow? 🙏 There's one question in-particular I'm hoping to get your eyes on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants