WIP: Rewrite git://github.com to https://github.com in package URLs. #11312

Merged: 1 commit, Nov 4, 2015

LachlanGunn
Contributor

Introduction

As git:// URLs cannot be loaded from behind proxies, the current package manager is broken for many institutional users unless they manually configure Git with the url.<base>.insteadOf workaround. This is a major usability problem, and leaves a bad first impression on new users who are used to the polished installer of MATLAB.

As discussed in #7005, the git:// protocol also provides no security against man-in-the-middle attacks, to which the current system is vulnerable due to the lack of digital signatures in the process.

The Pkg package has therefore been made to automatically change all git://github.com URLs to https://github.com. URLs pointing to other domains are unaffected. If a complete change from git:// to https:// is made, it is anticipated that the existing URLs in METADATA.jl will be changed from git:// to https://. This may render the URL normalisation part of these changes unnecessary; however, it would still be necessary to change the METADATA.jl URL to use HTTPS.

Changes

  1. The JuliaLang/METADATA.jl repository URL has been changed from git:// to https://.
  2. Generated URLs have all been changed from git:// to https://.
  3. The existing URL normalisation code in base/pkg/git.jl has been modified so that it normalises to https:// rather than git://.
  4. All URLs are now normalised before being passed to "git clone" or Git.set_remote_url.

Together, these changes allow packages to be installed without access to the git:// port.
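
As a rough illustration of changes 3 and 4, the sketch below rewrites recognised GitHub URLs to https:// and leaves every other host untouched. It reuses the regex quoted further down in this thread. The function name `normalize_github_url` is hypothetical and not taken from the patch, and the code is written for current Julia (1.x) rather than the 0.4-era code under review.

```julia
# Regex as quoted later in this PR's discussion.
const GITHUB_REGEX =
    r"^(?:git@|git://|https://(?:[\w\.\+\-]+@)?)github.com[:/](([^/].+)/(.+?))(?:\.git)?$"i

# Hypothetical helper (not the patch's actual function): rewrite any URL the
# regex recognises to an HTTPS GitHub URL; return anything else unchanged.
function normalize_github_url(url::AbstractString)
    m = match(GITHUB_REGEX, url)
    m === nothing && return String(url)
    # captures[1] is the "owner/repo" part, without a trailing ".git"
    return "https://github.com/$(m.captures[1]).git"
end

println(normalize_github_url("git://github.com/JuliaLang/Example.jl"))
println(normalize_github_url("git://example.org/foo/bar.git"))
```

Appending ".git" to the rewritten URL is a choice made for this sketch only; GitHub serves repositories with or without that suffix.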

Test procedure

  1. Compile Julia in Vagrant-setup virtual machine
  2. Run 'sudo iptables -I OUTPUT -p tcp --dport 9418 -j REJECT'
  3. Ensure that port is blocked by running 'git clone git://github.com/JuliaLang/Example.jl'.
  4. Run 'jlmake test-pkg'

@@ -148,11 +148,12 @@ function status(io::IO, pkg::AbstractString, ver::VersionNumber, fix::Bool)
end

function clone(url::AbstractString, pkg::AbstractString)
info("Cloning $pkg from $url")
+normalized_url = Git.normalize_url(url)
Member

Does this also rewrite urls when users explicitly use ssh, for example? If the user explicitly provides a url, it shouldn't be rewritten.

Member

clone is done by the user. Do not mess with the user's URL.

Contributor Author

This won't rewrite SSH URLs, only git and https, but you're right, if the user manually provides a URL to clone(), it shouldn't be rewritten. I'll remove the normalisation from this function. For reference, the regex (which I've not touched) is:

r"^(?:git@|git://|https://(?:[\w\.\+\-]+@)?)github.com[:/](([^/].+)/(.+?))(?:\.git)?$"i
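
To make it easier to see which URL forms this regex recognises, here is a small sketch for current Julia (1.x). The helper name `github_slug` is hypothetical; it simply returns the first capture group (the "owner/repo" part) or `nothing` when the URL does not match.

```julia
const GITHUB_REGEX =
    r"^(?:git@|git://|https://(?:[\w\.\+\-]+@)?)github.com[:/](([^/].+)/(.+?))(?:\.git)?$"i

# Hypothetical helper: extract "owner/repo" from a URL, or nothing if the
# regex does not match.
function github_slug(url::AbstractString)
    m = match(GITHUB_REGEX, url)
    return m === nothing ? nothing : String(m.captures[1])
end

for url in ("git://github.com/JuliaLang/Example.jl",
            "https://github.com/JuliaLang/Example.jl.git",
            "git@github.com:JuliaLang/Example.jl.git",
            "ssh://git@github.com/JuliaLang/Example.jl.git",
            "git://example.org/JuliaLang/Example.jl")
    slug = github_slug(url)
    println(url, " => ", slug === nothing ? "no match" : slug)
end
```

One thing this makes visible: the scp-style form `git@github.com:owner/repo.git` starts with the `git@` alternative and therefore matches, while an explicit `ssh://` URL and URLs on other hosts do not.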

@Keno
Member

Keno commented May 17, 2015

This used to be the default, but then got changed because it turned out more users had https blocked than git://. We really need to provide a way for the user to configure it (or try to autodetect).

@simonster
Member

I would really like to see everything (but especially METADATA.jl) use https by default. The current setup is really not very secure.

@tkelman
Contributor

tkelman commented May 17, 2015

Not to mention SSH keys are a completely foreign beast on Windows, ref #10892

@tkelman tkelman added the packages Package management and loading label May 19, 2015
@mikewl
Contributor

mikewl commented May 19, 2015

To keep everyone happy, why not modify the function arguments with a kwarg for selecting the URL type?

@LachlanGunn
Contributor Author

That seems like the best idea---my thinking is to make HTTPS the default, while allowing the user to use git:// or to disable rewriting altogether with an option to Pkg.init. Any thoughts?

@aviks
Member

aviks commented May 19, 2015

@LachlanGunn +1

@ihnorton
Member

+1

@LachlanGunn
Contributor Author

Ok, the first of these issues is dealt with---Pkg.clone() won't do any rewriting anymore. Making all rewriting optional/configurable will be a few days more.

@tkelman
Contributor

tkelman commented May 19, 2015

So is this still WIP then?

@LachlanGunn LachlanGunn changed the title from "RFC: Rewrite git://github.com to https://github.com in package URLs." to "WIP: Rewrite git://github.com to https://github.com in package URLs." May 19, 2015
@LachlanGunn
Contributor Author

Yes. I've changed the title to reflect this.

@ViralBShah
Member

I too like the idea of https by default, but perhaps Pkg can automatically fall back to git if https fails, print a warning to that effect, and finally print an error message about proxies/firewalls if git fails too.

@simonster
Member

I don't think Pkg should automatically fall back to git:// without user intervention. That has the same security issues as using git:// by default. If you don't cancel before the package build scripts run, your computer can be compromised.

@ScottPJones
Contributor

I think @simonster is correct, these days the security issues are extremely important...

@felipenoris
Contributor

I would suggest making https:// the default behavior, without fallback.
Maybe Pkg could provide a function to customize this. Something like Pkg.usegitprotocol(true), writing this setting to the user configuration file (~/.juliarc.jl or something like that).

@LachlanGunn
Contributor Author

Yes, you are correct. This change was reasonably close to being finished before I was interrupted by other things; I really must knock off the final interface.

Since to my knowledge there is no cryptographic verification of the packages beyond the checksum, I agree that this really is important.

@felipenoris
Contributor

awesome

@LachlanGunn
Contributor Author

Ok, so I have merged the changes forward and finally gotten around to making the protocol configurable---"https", "git", etc., or "" to make it not change anything. Default is HTTPS.
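
The configurable behaviour described here could look roughly like the following sketch for current Julia (1.x). It is not the code from this PR: `REWRITE_PROTOCOL` and `normalize_url` are hypothetical names, and `setprotocol!` borrows the name suggested later in review. The default is "https", and an empty string switches rewriting off.

```julia
const GITHUB_REGEX =
    r"^(?:git@|git://|https://(?:[\w\.\+\-]+@)?)github.com[:/](([^/].+)/(.+?))(?:\.git)?$"i

# Hypothetical module-level setting: the protocol to rewrite GitHub URLs to.
# An empty string means "leave URLs as they are".
const REWRITE_PROTOCOL = Ref("https")

# Hypothetical setter for the protocol ("https", "git", ... or "").
function setprotocol!(proto::AbstractString)
    REWRITE_PROTOCOL[] = String(proto)
    return nothing
end

# Rewrite recognised GitHub URLs to the configured protocol.
function normalize_url(url::AbstractString)
    proto = REWRITE_PROTOCOL[]
    isempty(proto) && return String(url)      # rewriting disabled
    m = match(GITHUB_REGEX, url)
    m === nothing && return String(url)       # not a GitHub URL
    return "$(proto)://github.com/$(m.captures[1]).git"
end

println(normalize_url("git://github.com/JuliaLang/Example.jl"))
setprotocol!("git")
println(normalize_url("https://github.com/JuliaLang/Example.jl.git"))
setprotocol!("")
println(normalize_url("git://github.com/JuliaLang/Example.jl"))
```

Keeping the setting in a single `Ref` means every caller sees the same protocol; whether the real implementation stores it this way is an assumption of the sketch.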

Moving to an apartment whose internet connection has git:// blocked gave me a bit of encouragement :)

After this, I'm planning to look at changing the git:// URLs in the build system to https://.

@LachlanGunn LachlanGunn changed the title from "WIP: Rewrite git://github.com to https://github.com in package URLs." to "RFC: Rewrite git://github.com to https://github.com in package URLs." Oct 17, 2015
@kmsquire
Member

Just in case this isn't obvious from previous conversations about git vs https, as strange as it may seem, there have been reports of the opposite problem--that https (to GitHub at least) was blocked, but that the git protocol worked. So while one might be good as a default, it should be possible to use the other.

@LachlanGunn
Contributor Author

Yes, that capability is there---I have added a new function to Pkg.Git that allows the protocol to be changed.

@kmsquire
Member

Cool, thanks for the clarification.

@felipenoris
Contributor

LibGit2?

@@ -2,7 +2,7 @@

module Cache

-import ...LibGit2, ..Dir, ...Pkg.PkgError
+import ...LibGit2, ..Dir, ...Pkg.PkgError, ..Git
Member

Do not use the Git module; it will be removed. LibGit2 has a normalization function.

@LachlanGunn
Contributor Author

I'm not sure that LibGit2 is the right place though, since it's supposed to be a wrapper library as I understand it, and adding a bunch of GitHub-specific functionality seems a bit out of place. But if that's the consensus, or I've misunderstood the purpose of LibGit2, then I'll go with the majority and put it there.

@tkelman
Contributor

tkelman commented Oct 18, 2015

Look around the LibGit2 module or the Pkg module (maybe in Pkg.Read?), there are already likely a few places that deal with urls.

@@ -112,13 +114,27 @@ function set_remote_url(url::AbstractString; remote::AbstractString="origin", di
run(`config remote.$remote.url $url`, dir=dir)
m = match(GITHUB_REGEX,url)
m === nothing && return
-push = "git@github.com:$(m.captures[1]).git"
+push = "$rewrite_url_to://git@github.com:$(m.captures[1]).git"
Member

You should add the git@ prefix only for the git protocol.

@LachlanGunn LachlanGunn reopened this Oct 27, 2015
@LachlanGunn
Contributor Author

Righto; thanks for all of your comments, I've tried to address them all in this next iteration.

The URL normalisation is now in Pkg.Cache, and the configuration function has been referenced into the top level Pkg module. What do you guys think?

@@ -105,20 +105,18 @@ function is_ancestor_of(a::AbstractString, b::AbstractString; dir="")
readchomp(`merge-base $A $b`, dir=dir) == A
end

const GITHUB_REGEX =
r"^(?:git@|git://|https://(?:[\w\.\+\-]+@)?)github.com[:/](([^/].+)/(.+?))(?:\.git)?$"i

function set_remote_url(url::AbstractString; remote::AbstractString="origin", dir="")
Contributor

this version of the function is obsolete and subject to being deleted - see the version in libgit2.jl

Member

First of all, no changes to git.jl; all modifications should go to libgit2.jl.

Contributor Author

Ok, I'll leave it in its previous state then. I was a bit antsy about the potentially inconsistent behaviour, even if it's slated for removal and not being used at the moment.

@tkelman
Contributor

tkelman commented Oct 27, 2015

On naming, I think Pkg.setprotocol! would be preferable to Pkg.set_git_proto

@LachlanGunn
Contributor Author

@tkelman Yes, you're right, in retrospect it isn't very idiomatic. I'll definitely be changing that.

@LachlanGunn
Contributor Author

I've reverted Pkg.Git to its original state now as well. Any further thoughts?

export dir, init, rm, add, available, installed, status, clone, checkout,
-update, resolve, test, build, free, pin, PkgError
+update, resolve, test, build, free, pin, PkgError, set_git_proto
Contributor

fix name here

Contributor Author

Ack, that is really embarrassingly stupid! Sorry to keep wasting your time on such things.

Contributor

no big deal, this is why code review is a thing

@LachlanGunn
Contributor Author

Righto, the proper function name has now been exported.

@LachlanGunn
Contributor Author

With those changes made, are there any further comments? Apologies for the bump.

@tkelman
Contributor

tkelman commented Nov 2, 2015

This looks good, the bump was warranted. I think we should document the new protocol setting API.

@LachlanGunn
Contributor Author

Righto, thanks---now that it's settled I'll do documentation/tests/etc. The fact that the URLs are going to be fiddled with is itself probably something to make clear in the documentation of PkgDev/METADATA as well.

@wildart
Member

wildart commented Nov 3, 2015

LGTM

@LachlanGunn
Contributor Author

If PkgDev is depending on this shall I get this squashed up and make a second pull request for the tests?

@LachlanGunn LachlanGunn closed this Nov 3, 2015
@LachlanGunn LachlanGunn reopened this Nov 3, 2015
@tkelman
Contributor

tkelman commented Nov 4, 2015

works for me

tkelman added a commit that referenced this pull request Nov 4, 2015
WIP: Rewrite git://github.com to https://github.com in package URLs.
@tkelman tkelman merged commit 8798aab into JuliaLang:master Nov 4, 2015