Bundle initial binaries for offline fallback. #1638
I've been giving this a lot of thought as well. I recently added the ability to use a mirror for the primary rover binary (#1675), and since then, we've run into the same issues with the supergraph and router plugins for rover. I was originally thinking that moving their installation to the front of the job would also allow them to more easily make use of that npm config. While this would be preferable to how it is now, I think an even better solution would be for each of those plugins to be published as their own npm modules, and for the rover package to declare them as dependencies. This would allow the same code to be utilized to install them, and additionally, would allow them to be auditable by security scans from the parent project including them.
Note: I got halfway through writing a PR for the plugin install/options rust code in this repo, which would allow a definition of a binary mirror/offline cache, before stopping to discuss whether this was really the right approach. An npm package that installs a binary, which later installs other binaries, is a bit of a scary proposition security-wise, especially when we've got servers effectively proxying GitHub instead of directly utilizing GitHub (which could be used to get verified sigs and such) as part of both steps of the binary installs. I generally don't lean into heavy-handed security approaches for low-level dev tooling, but this makes even me pause.
To possibly link some issues: if you had a Docker image with the Rover binary, would that solve those offline needs, or would it still need to reach out to install more subcommands?
It is doable and workable like that, but it puts an additional maintenance and complexity requirement on a team when integrating everything with npm. While we could make that work, I'd like to help devise a solution that is easier to integrate with the node ecosystem. (I realize the move to rust enables you guys to start to approach all ecosystems.) That being said, you could just as easily make those plugins pypi modules in the same way, enabling the same ease of use with python (and so on for other languages). It would have the added benefit of not having one binary downloading another binary at a later date and time, a pattern that removes the ability for npm/pip to do first-party audits, validate hashes, and ensure the integrity of installed components. That means it fits smoothly into any enterprise-level security and auditing tooling (and any respective governance those enterprises may have), making adoption of rover and apollo tools much easier for enterprises with these types of policies and protections in place.
That being said, the immediate advantage of a docker image is that we can directly verify the integrity of the docker image in a way we currently can't. However, we'd need to make sure that the rover commands inside the docker image don't fail when they can't reach the network to check for newer versions of those plugins.
@LongLiveCHIEF @rizowski Can you try updating to the latest version of Rover? I just tried reproducing your issue, and I am not having any issues running rover commands after I yarn install and turn off wifi. I see that the … If you run rover with the …
That's because the info command doesn't use the plugins, and doesn't trigger an attempt to identify the latest plugin version or to download one of the plug-in binaries.
And again, the problem is that the binary itself is a secondary package that then attempts to install its own packages, with no way to audit or define the destination source of those packages in the way that you can for the direct package dependency itself.
The test for this is that the host machine only has internet access to specific endpoints, and all other outgoing traffic is blocked. For all intents and purposes, the machine never has had, and never will have, direct access to rover.apollo.dev, so the initial installation will never succeed. That's why I added support for a rover binary host mirror in the above referenced PR. At a bare minimum, we'd require the same level of support for the Rover plugins. However, even if that support were added, it's still a security issue with the design, since npm installs the initial binary, and that binary is the only actor with control over subsequent binary installations. This means any controls around the safety of those secondary binaries are opaque to the host systems.
@LongLiveCHIEF If you run the rover subcommand with the … flag, can that allow you to use the commands you need in an offline mode?
Honestly, this is basically standard operating practice at an enterprise level these days, and modern package managers (including cargo and npm) were built to solve these problems. Why are they not being utilized?
I have shared your feedback with the Rover team to look at. In the meantime, I am trying to help see if there is any way to unblock your team. Is this still a hard blocker that everything must be in one binary from NPM? Do you have a need to use multiple versions of Federation to compose a supergraph?
Ah... I see. I really appreciate that, but right now we're more concerned with a secure solution than an immediate solution, and I'm happy to help provide information about typical enterprise protocols and policies so that we come up with a solution that will satisfy everyone.

Right now, we're able to use the docker trick by building on a host on a different network, but that's causing some of the complexities and challenges described above, and there are times as well that the binaries just straight up fail to download, which is another reason the current technique doesn't really work. There are times it fails even when there aren't networking restrictions in the way, which is another reason more standard distribution techniques would likely improve things. (Looks like we're not the only ones here; #1253, #1136, #1471, and #1583 all seem relevant when searching issues for "download".)

I have a workaround that is ok for now, but it's not something that will work long-term, and I'd like to help contribute a solution that will benefit not just us, but all your users. I started writing code to add a flag via the existing clap crate in the plugin installation source code, to allow it to read the existing env var I added in #1675 or from a … If needed, I can pick that back up as a somewhat quick and compliant solution, but I suspect that even with that implemented, the issues linked above, and the requirements that caused @rizowski to open this specific issue to begin with, will still pop up quite often.
Oh, and nope... at least not at the moment. But the rover plugin versions don't have any relationship to the federation spec versions (I think), so I don't think that really matters for this particular concern, unless I'm missing something?
I was discussing with the Rover maintainers, and there currently are two commands that are not fully bundled with the initial binary of Rover: … Both of these commands are not bundled because they both fetch the composition binaries from our host, and there is a unique binary for each version of Federation. So yes, you are correct that Rover is not tied to Federation spec versions; it was architected this way so that you could use the same Rover binary but upgrade your Federation version to compose a new supergraph.

All the other commands (checks, publishes, contracts, etc.) should work with your Yarn cache and not require any additional downloads. They will of course connect to GraphOS to publish the data for your graph, but that would be these URLs, not rover.apollo.dev.

For enterprises, Rover is not designed to be the solution to compose your supergraph. To securely compose a new supergraph and to validate that it is not going to break any existing clients, we have built GraphOS. This allows you to check, publish, and deliver a new supergraph to your Apollo Router. Does your enterprise have a separate requirement that you need to run composition manually, or can you use the other available Rover commands to interact with GraphOS that don't require any downloaded binaries?
This is good info, let me unpack a bit here with your help.
So I'm assuming this means that when …
This is not consistent with what we've been told in marketing materials, communications, and even in the docs. I've sort of picked up on the fact that everything supergraph seems to work better if done in GraphOS, but when we've asked about schemas living next to code for testing and such, we've been told that the schemas can live next to the code. I think this may be a disconnect between sales and engineering.
I can't share this information publicly, but I can connect with you offline if you can connect with me through Vann and Joey. What I can say here is that a common pattern for graphql usage in the industry is for the schema to live next to the server code for local testing of server code/resolvers. I just assumed that whatever our requirements might be, that pattern would be supported. Regardless, with this new information, I still see a path forward. I have a few ideas, but need to grind on them a bit before I propose them. Basically, I still think there's a way these plugin dependencies can be installed at the same time, and through the same package management tooling, as rover itself.

I'd also like to expand a bit on the docker solution. After taking a long look in general at containers as cli commands, I've found this tends to not work well anywhere. Locally on dev machines, there are issues on Mac and Windows with the resulting file permissions of created files, and especially issues depending on the hypervisor used by docker machine on those docker installations. Then in CI and production/testing environments where the command is utilized, you never wind up using the cli image as either a base or a final image, because usually using that cli is just part of the process of starting or creating something else. That means you'll need to use a base more aligned with your end goal, and you wind up having to install the cli anyway... which goes right back to the package manager being the best way to utilize it, since the goal is to use it with a project that is already using a package manager for everything else. While that is a generalization of cli docker images, it's pretty spot on for rover as well. I don't see that there'd be a lot of use for a Rover docker image.
For the … Further questions can probably be better answered in person or in a meeting. We can redirect that to the Apollo GTM team to set that up, but I did want to ask more for everyone else following this issue:
That is definitely the case. Apollo and Federation make no assumptions about how you build your GraphQL schema, how your teams manage your code, or how you test internally for your teams' subgraphs and other data services. Subgraphs are the logical boundary, and to develop your schema and resolvers, no connection or plan in GraphOS is required. Once you get to testing and running your supergraph, that is where GraphOS provides tooling to make that easier. Your subgraphs are still owned and managed by your team, and their schemas are what is compiled into the supergraph, so it does not matter where those live. When you need to test and deliver a new supergraph, you can build all the tooling and infrastructure to do so, or GraphOS can provide that for you. Our goal is that you should not need any extra rover plugin dependencies to use our platform, so if you have a use case in which you do, we would like to hear more.

Thanks for sharing the info about Docker. I have also seen the same thing at other enterprises. So what might be a better fit for the Rover/Apollo team longer term is not providing for the app/language environments (NPM, Python), but instead for the CI/CD environments, like what I did here for GitHub Actions.
Even if we're using GraphOS, the ability to test local code locally is paramount, and that means utilizing rover to test changes to the schema associated with changes in resolvers. Adding a CI/CD step to publish to GraphOS for something that is a simple hot reload locally is hugely wasteful of a developer's time. It also means we have to utilize resources to run those builds/pipelines every time we have even the smallest code change, which impacts everybody that shares those resources and has a financial impact as well. This is why the only real solutions that will work in the long run are at least one of:
Hi @LongLiveCHIEF - really appreciate your thorough thoughts and explanations of your requirements here, and my hope is that we can get to a solution that works for you and your organization as quickly as possible. Looking at your three options, the one that will get you unblocked fastest is likely your suggestion to make plugin installs also respect the `APOLLO_ROVER_DOWNLOAD_HOST` environment variable. I'm pretty sure it would just be a matter of changing this bit of code to something like this:

```rust
pub fn get_tarball_url(&self) -> RoverResult<String> {
    Ok(format!(
        "{host}/tar/{name}/{target_arch}/{version}",
        host = self.get_host(),
        name = self.get_name(),
        target_arch = self.get_target_arch()?,
        version = self.get_tarball_version()
    ))
}

fn get_host(&self) -> String {
    // Prefer a configured mirror; fall back to the public download host.
    std::env::var("APOLLO_ROVER_DOWNLOAD_HOST")
        .unwrap_or_else(|_| "https://rover.apollo.dev".to_string())
}
```

Does this approach seem reasonable to you?
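For anyone evaluating the proposal, the fallback behavior can be sketched in isolation. This standalone example (the function names, plugin name, and target triple are illustrative, not Rover's actual API) shows how a mirror from the environment would take precedence over the default host:

```rust
// Illustrative stand-in for the proposed get_host(): prefer a mirror, fall
// back to the public download host. The lookup result is injected as a
// parameter so the fallback logic is easy to exercise in isolation.
fn download_host(mirror: Option<String>) -> String {
    mirror.unwrap_or_else(|| "https://rover.apollo.dev".to_string())
}

// Illustrative stand-in for get_tarball_url().
fn tarball_url(host: &str, name: &str, target_arch: &str, version: &str) -> String {
    format!("{host}/tar/{name}/{target_arch}/{version}")
}

fn main() {
    // In the real installer, the mirror would come from the environment:
    let from_env = std::env::var("APOLLO_ROVER_DOWNLOAD_HOST").ok();
    let _host = download_host(from_env);

    // Without a mirror configured, the default host is used.
    assert_eq!(download_host(None), "https://rover.apollo.dev");

    // With a mirror, every tarball URL is built against it. The plugin name,
    // target triple, and version are just example values.
    let url = tarball_url(
        &download_host(Some("https://artifacts.example.internal".to_string())),
        "supergraph",
        "x86_64-unknown-linux-gnu",
        "v2.4.7",
    );
    assert_eq!(
        url,
        "https://artifacts.example.internal/tar/supergraph/x86_64-unknown-linux-gnu/v2.4.7"
    );
    println!("ok");
}
```

With this shape, an air-gapped host only needs the env var pointed at an internal artifact server that mirrors the same URL layout.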
@EverlastingBugstopper yep! That's the gist of the code change I had made locally before deciding to see if there was another approach. That will at least get us moving. I still think we'd have long-term security concerns, but this would go a long way towards addressing them.
Totally - adding checksums to the installer logic would help a ton here as well, I think. Fwiw, the code behind …
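As a rough illustration of what checksum support in the installer could look like: this standalone sketch (the `SHA256SUMS`-style manifest format and function names are assumptions, not Rover's release layout, and computing the actual digest would use something like the `sha2` crate) looks up and compares a published digest for a downloaded artifact:

```rust
use std::collections::HashMap;

/// Parse a "<hex-digest>  <filename>" per-line manifest into a
/// filename -> digest map. The manifest format here is an assumption.
fn parse_checksums(manifest: &str) -> HashMap<String, String> {
    manifest
        .lines()
        .filter_map(|line| {
            let mut parts = line.split_whitespace();
            let digest = parts.next()?.to_ascii_lowercase();
            let name = parts.next()?.to_string();
            Some((name, digest))
        })
        .collect()
}

/// Compare the digest computed over the downloaded bytes against the manifest
/// entry, case-insensitively. Unknown files fail closed.
fn verify(manifest: &HashMap<String, String>, name: &str, computed_hex: &str) -> bool {
    manifest
        .get(name)
        .map(|expected| expected == &computed_hex.to_ascii_lowercase())
        .unwrap_or(false)
}

fn main() {
    let manifest = parse_checksums(
        "0a1b2c3d  supergraph-v2.4.7.tar.gz\nDEADBEEF  router-v1.26.0.tar.gz",
    );
    assert!(verify(&manifest, "supergraph-v2.4.7.tar.gz", "0A1B2C3D"));
    assert!(!verify(&manifest, "router-v1.26.0.tar.gz", "cafebabe"));
    assert!(!verify(&manifest, "unknown.tar.gz", "0a1b2c3d"));
    println!("ok");
}
```

The caveat raised later in this thread still applies: if the manifest is served from the same compromised host as the binaries, this only protects against corruption, not substitution.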
That's helpful. If you give me a bit, I'll submit a PR with this change, along with changes to the binstall scripts that will allow this to be supported for every install technique, instead of just npm. Looks like you can't approve your own PRs anyway, so that might make things faster. I'll add your …
Actually, scratch that: the binstallers reference GitHub directly, so the mirror setup would be different.
Adding this, in combination with the mirror support, would likely resolve the concerns. All other systems where secondary dependencies manage installation of further packages have this feature (think …).
Thanks for helping out with the code solution @EverlastingBugstopper! @LongLiveCHIEF, I apologize if there was a misunderstanding about the differences between Rover and GraphOS from marketing, sales, or engineering. Apollo uses the term GraphOS to encompass everything we provide: the Router, Studio, the Schema Registry, Rover, APIs, and clients. Apollo GraphOS is the platform; the Rover CLI is one part of that platform and provides an interface to use it. It does not provide all the platform features as a CLI locally, similar to how the AWS CLI does not allow you to run your own instance of AWS with all its services locally. Rover is an interface to trigger actions that happen on the entire GraphOS platform, with some specific commands to help you validate for local dev. Hopefully, with these changes, you can run the compose commands in an offline environment. When you are ready to run the …
Putting this here for posterity. The timing of this is a crazy coincidence: serde-rs/serde#2538
I was wondering if somebody would bring that up, though I think the situations are quite different. From my understanding, people are upset about that because … Rover and the plugins it uses both have reproducible build steps: if building Rover from source is something you want to do, you can, but we spend a decent amount of our development time supporting pre-built binaries with our own CI systems in order to reduce the amount of time it takes to adopt our feature set and integrate Apollo into your own systems. These two issues smell a little similar, but upon closer inspection, I don't think they are really the same at all - but that's just my two cents.
I agree that the situation is different, at least from a distribution perspective, but what I wanted to call out were some of the comments the community made about the security considerations:
Granted, here you guys do a decent job of making it clear when a plugin binary is being downloaded, or when the rover binary is being downloaded by npm. However, these are, I think, easy to miss on systems without restrictions when the commands …
This one was massively downvoted. Unfortunately, despite the integrity of an author, most security systems rely on the principle of zero-trust these days, and the community pushed back on the concept of just "trusting" a binary that was installed without any systems allowing it to be audited.
In this case, with cat videos being a stand-in for a privilege escalation attack, it means cat videos are not a good thing 🐱
This hits the nail on the head. Without any checks, or even the ability to intercept in a framework that has a standard for how to check signatures/sums, you have no way to validate authenticity in the event the source becomes compromised. And in a zero-trust system, you don't even trust yourself, let alone 3rd parties.
Although not all of this comment applies to the rover situation, much of it does. Basically, we need a way to guarantee that the binaries installed are in fact authored by you.
I find many of those comments relevant to this discussion, as they describe zero-trust supply chain practices being common, and they highlight the scrutiny placed on binaries vs source code. Source code can be scanned and malicious patterns identified. Binaries, however, have to be trusted. In the case of rover, we first need to trust the rover binary, and then subsequently trust that the rover binary trusts the plugin binaries. That's a lot of trust to be given in any system where zero-trust is the standard operating procedure.
Yeah - it all comes down to tradeoffs, for sure. The main reason for opting for a plugin approach is to allow for the most flexibility when using federation: it is incredibly valuable to be able to test any combination of the router (and its query planner) with any version of composition, and the way we've achieved this is through binary plugins. We could have chosen to instead distribute this functionality via dynamically linked libraries (or do it with an API call, but then you lose the air-gapped approach for dev environments). However, since we already had a binary distribution system in place, it was decided to build and distribute the plugins as binaries using our existing setup.

My problem with relying on checksums as a meaningful way of ensuring security is that, if the binary distribution system were compromised, who's to say that the checksums wouldn't also be compromised? If a malicious actor can put their own binaries in place of officially distributed ones, wouldn't they also immediately update the checksum to match said malicious binaries? I think this really comes down to "do we trust prebuilt binaries at all", and if the answer is no, building from source is really the only option that gives you full control - which, again, you can do for both Rover and the plugins, as both codebases are source available with well-documented build processes if that is your organization's need.

All this to say: we don't have immediate plans to change the underlying architecture here to, say, bundle specific versions of the router + federation in a single Rover binary, because it severely limits the flexibility of the software.
We understand the need for air-gapped solutions, which is why we shipped the plugins with the … We do upload checksums to our GitHub releases as well, for both Rover and the plugins, and would absolutely accept a contribution to validate those checksums as part of the install process - but again, I would ask if that is truly a more secure approach (if our build systems are compromised, the checksums would also be compromised). If pre-built binaries are a no-go for your organization, I highly recommend building from source.
Yep, I definitely think there's a middle ground here where we can maintain flexibility and increase our security/auditing capabilities a bit.
One thing I can think of is a … That way we can establish "trust" while maintaining flexibility. It's normally not about trusting/not-trusting binaries wholesale; it's about being able to detect supply-chain attacks. Signing should accomplish that nicely, so that even when you ship updates, we can detect that they were signed by you and that we can trust them. Basically, gpg key checking or something similar.

We can start a new thread/discussion for that though, since it sounds like we're all basically on the same page now: we agree that flexibility needs to be retained, but there could be ways to add more "trust" to the existing process without impacting existing users' processes.

I think the original "upfront upload for offline fallback" can now be accomplished by running the … @rizowski, what do you think about that? Can we call this one good?
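To make the signing idea a bit more concrete, here's a sketch (not Rover code: the gpg flags are standard, but the file names and the integration point are assumptions) of how an installer could shell out to gpg to verify a detached signature over a downloaded plugin tarball. The command is built but not executed, so it can be inspected without any key material present:

```rust
use std::process::Command;

/// Build (but don't run) the gpg invocation that would verify a detached
/// signature over a downloaded plugin tarball. A real integration would pin
/// the publisher's public key explicitly rather than using the default keyring.
fn gpg_verify_command(tarball: &str, signature: &str) -> Command {
    let mut cmd = Command::new("gpg");
    cmd.args(["--verify", signature, tarball]);
    cmd
}

fn main() {
    let cmd = gpg_verify_command("supergraph-v2.4.7.tar.gz", "supergraph-v2.4.7.tar.gz.sig");
    // Inspect the constructed invocation rather than executing it, since the
    // signature and key don't exist in this sketch.
    let args: Vec<String> = cmd
        .get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect();
    assert_eq!(
        args,
        ["--verify", "supergraph-v2.4.7.tar.gz.sig", "supergraph-v2.4.7.tar.gz"]
    );
    println!("ok");
}
```

Unlike a checksum served from the same host as the binary, a signature check fails when the distribution host is compromised but the signing key is not, which is the supply-chain property discussed above.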
I'm not really seeing how the change benefits my initial use case. My case is when there is no internet or mirror available. I can reproduce the issue with … Because I don't have internet the second time around, yarn doesn't finish setting up my other dependencies. So essentially, even though I have a cached version of the rover package, because it can't reach out or fall back to a default version, I must find internet to be able to run my application, or remove rover from my dependencies. I don't need rover to run my service, but rover prevents my other dependencies from resolving from the cache if internet is unavailable.
@rizowski have you tried setting rover as an optional dependency?
@LongLiveCHIEF Hmm, that's an interesting idea. I usually only reserve optional dependencies for package development, but that will work for me. I would probably just ask that the documentation for installing the package notes this case, or maybe recommends installing rover as an optional dependency.
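For anyone hitting the same wall, the workaround discussed above might look like the following `package.json` fragment (the version number is illustrative). With rover under `optionalDependencies`, npm and yarn will finish resolving the rest of the tree even when rover's binary download fails:

```json
{
  "optionalDependencies": {
    "@apollo/rover": "0.18.0"
  }
}
```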
Following the conversation here, is it safe to say that we are unblocked and okay to move forward?
Description
I realize that it's easier to download the rust binaries after rover is installed on someone's device, and to be able to patch them without publishing a new version. However, it really puts a damper on being able to utilize offline caching. I've been burned a couple of times by this, not even when I'm offline, but when there is some outage between me and the binaries.
Steps to reproduce
1. Add `yarn-offline-mirror "~/.yarn-offline-cache"` to `.yarnrc` and run an initial install with internet access
2. Delete `node_modules`, go offline, and run the install again from the cache
Expected result
I expect it to finish installing, so long as I have my dependencies cached on disk.
At the very least, could it bundle the last built version, attempt to fetch the dependencies, and if that fails, do so silently? That way there is at least a fallback.
Actual result
Environment
Run `rover info` and paste the results here: N/A, can't install it.
If you can't run `rover info` for some reason, please provide as much of the following info as possible:
- Rover version (from `rover --help`): 0.14.2