Proposal to add missing support for short-hash src URLs #6450

Closed
fungi opened this issue Mar 27, 2019 · 10 comments · Fixed by #13686
Labels
topic/ui Change the appearance of the Gitea UI type/enhancement An improvement of existing functionality type/proposal The new feature has not been accepted yet but needs to be discussed first.

Comments

@fungi

fungi commented Mar 27, 2019

Similar to #211, it would be convenient to support abbreviated commit IDs in file content/browsing URLs. This pattern is already supported in other Gitea URLs (commit and archive at least), and is also possible with similar interfaces like gitweb, cgit, Bitbucket and GitHub. Shorter IDs are handy for reducing URL length when linking from media where column count is at a premium and wrapping tends to be avoided when possible (mailing list threads, IRC discussions, ...).

@fungi
Author

fungi commented Mar 27, 2019

Are there any objections to this? I haven't looked closely at the code paths involved, but if it's as simple as it was to add for archive downloads then I might be able to take a stab at submitting a PR for it myself. Many thanks in advance for considering!

@techknowlogick techknowlogick added type/enhancement An improvement of existing functionality type/proposal The new feature has not been accepted yet but needs to be discussed first. topic/ui Change the appearance of the Gitea UI labels Mar 27, 2019
@mateusza

I think accessing a URL with a shortened commit ID should generate a redirect, not display the content.

Duplicate Content

@jolheiser
Member

@mateusza I agree, and there is a request for this on the PR discussion.

jeblair pushed a commit to jeblair/gitea that referenced this issue Mar 28, 2019
This supports using a git SHA of any length between 7 and 40
characters in URLs (e.g. /src/commit/SHA/...).  Previously
only commit SHAs of the full 40 character length were supported.

The RepoRefAny ref type is used in one place, in the
/api/v1/user/repo/raw API endpoint where it is used to guess whether
the remainder of the path is a ref name followed by a file path
or merely a file path.  There is no good way to guess whether
a shortened SHA is intended in that circumstance (e.g.,
/raw/beefcafe/README.txt could be /README.txt in the beefcafe{..32}
commit, or /beefcafe/README.txt on the master branch).  For this
case, we don't support shortened SHAs and only match on the full
one.

Signed-off-by: James E. Blair <[email protected]>
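
The matching rule described in this commit message can be illustrated with a minimal sketch, written here in Python rather than Gitea's own Go. The function names, the lowercase-only hex patterns, and the `branches` and `default_branch` parameters are assumptions made for illustration; this is not Gitea's actual code.

```python
import re

# Assumed patterns: lowercase hex only, lengths taken from the commit message.
SHORT_OR_FULL_SHA = re.compile(r"[0-9a-f]{7,40}")
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def split_commit_path(path):
    """Explicit commit URL (e.g. /src/commit/SHA/...): any 7-40 char SHA is accepted."""
    sha, _, tree_path = path.partition("/")
    if SHORT_OR_FULL_SHA.fullmatch(sha) is None:
        return None
    return sha, tree_path


def split_raw_path(path, branches, default_branch="master"):
    """Untyped URL (e.g. /raw/...): guess whether the first segment is a ref."""
    first, _, rest = path.partition("/")
    if first in branches:
        return "branch", first, rest
    if FULL_SHA.fullmatch(first):
        return "commit", first, rest
    # A short hex-looking segment stays ambiguous, so it is read as a
    # directory name on the default branch rather than as a commit.
    return "branch", default_branch, path
```

With these definitions, `split_commit_path("beefcafe/README.txt")` treats `beefcafe` as a commit, while `split_raw_path("beefcafe/README.txt", {"master"})` reads the same text as a file path on `master`. Branch names containing slashes are deliberately ignored to keep the sketch short.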
@fungi
Author

fungi commented Mar 28, 2019

That sounds entirely reasonable as well. Either way solves the challenge I have.

@jeblair
Contributor

jeblair commented Mar 28, 2019

I think we could support a redirection, but if we did so, we might make it harder to create shortened links in the first place. If I wanted to create a shortened link, I would first navigate to the page in the browser, then edit the URL bar to shorten the link, then hit enter to verify that I did so correctly; if the correct page reloaded, I would then copy that to my email/IRC/whatever. If we redirected there, I would end up with the same long URL again, so I would have to edit the URL externally, then paste it back into the browser to verify it worked.

Granted, it's a minor annoyance, but unless redirecting to the full SHA has other benefits, this makes me lean toward simply supporting it without a redirect.

This is not a strongly-held belief, I'm happy with either way. :)

@techknowlogick
Member

techknowlogick commented Mar 28, 2019

I'd prefer the redirect. That way we don't have multiple pages that show the same thing (confusing web crawlers on the site is one issue that I see, among others); instead, the user would always end up on the canonical page.

Edit: Also see @zeripath's response below.

@zeripath
Contributor

Short SHAs are not permanent links: any additional commit could suddenly make your link stop working. (Admittedly this is also the case with the full SHA, but at that point we're in a whole different world of trouble.)

In the early days of the Linux kernel you could get away with a 7-character SHA; now you almost always need 10 to 12 characters.
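
The redirect behaviour proposed in this thread, together with the ambiguity concern, can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not Gitea's implementation: `resolve_prefix`, `handle_src_commit`, the in-memory `known_shas` list, the simplified `/src/commit/...` path, and the choice of a temporary (302) redirect are all hypothetical.

```python
def resolve_prefix(prefix, known_shas):
    """Return the single full SHA starting with `prefix`, or None if the
    prefix matches no commit or more than one commit."""
    matches = [sha for sha in known_shas if sha.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    return None


def handle_src_commit(prefix, tree_path, known_shas):
    """Return (status, location) for a request to /src/commit/<prefix>/<tree_path>."""
    full = resolve_prefix(prefix, known_shas)
    if full is None:
        # Unknown or ambiguous abbreviation: a link that used to work can end
        # up here once a later commit happens to share the same prefix.
        return 404, None
    if full == prefix:
        # Already the canonical URL: serve the content directly.
        return 200, None
    # Abbreviated but unique: send the client to the canonical full-SHA URL.
    return 302, f"/src/commit/{full}/{tree_path}"
```

A temporary redirect is used in the sketch because the mapping from a short SHA to a full one is not guaranteed to stay unique over the life of the repository.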

@fungi
Author

fungi commented Mar 28, 2019

With 7 hex chars (3.5 bytes or 28 bits) the odds that ID will collide with another in the repository (assuming a completely even distribution and spherical cows with no wind resistance) is a bit north of 1 in 250 million.

Edit for minor clarification: you would need 250 million commits for a 1:1 chance of a collision of a given ID, though if your project has a mere million commits then the chance is something like 1 in 250. So while I agree that the odds aren't great when you have a repository in the hundreds of thousands of commits neighborhood (or perhaps even in the tens of thousands), the fact stands that 7 hex digits is what many familiar tools (including Git itself) use as a standard abbreviation length.
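
The per-link figure can be checked with a few lines of Python. This is a sketch under the same idealised assumption of uniformly distributed IDs; the function name and the `other_commits` parameter are made up for the example. For one specific 7-digit abbreviation and one million other commits, it comes out to roughly 1 in 270.

```python
import math

PREFIX_SPACE = 16 ** 7  # number of distinct 7-hex-digit abbreviations (2**28)


def given_id_collision_probability(other_commits, space=PREFIX_SPACE):
    """Chance that one specific abbreviation also matches at least one of
    `other_commits` other, uniformly distributed commit IDs."""
    # 1 - (1 - 1/space) ** other_commits, computed stably for tiny 1/space.
    return -math.expm1(other_commits * math.log1p(-1.0 / space))


p = given_id_collision_probability(1_000_000)
print(f"{p:.4%}, about 1 in {1 / p:.0f}")
```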

@mateusza

With 7 hex chars (3.5 bytes or 28 bits) the odds that ID will collide with another in the repository (assuming a completely even distribution and spherical cows with no wind resistance) is a bit north of 1 in 250 million.

Yes. But what we should look at is probability of ANY id colliding with at least one other id. And the numbers look very different here.

https://preshing.com/20110504/hash-collision-probabilities/
https://en.wikipedia.org/wiki/Birthday_problem

Edit for minor clarification: you would need 250 million commits for a 1:1 chance of a collision of a given ID,

No.

With 16^7 + 1 = 268435457 commits you would have a 100% chance that at least one 7-digit collision exists. But you never have exactly a 100% chance of a collision for any given short ID. Theoretically, there could be millions of objects and only one of them starting with the digit "0". Why not? Unlikely, but not impossible.
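
The "any collision anywhere" view can be made concrete with the standard birthday-problem approximation 1 - exp(-n(n-1)/(2N)). This is a sketch that again assumes uniformly distributed IDs; the function name is made up for the example. Under this model, around 20,000 commits already give better-than-even odds that some pair of commits shares a 7-digit prefix.

```python
import math

PREFIX_SPACE = 16 ** 7  # number of distinct 7-hex-digit abbreviations


def any_collision_probability(commits, space=PREFIX_SPACE):
    """Approximate chance that at least two of `commits` uniformly
    distributed IDs share the same abbreviation (birthday problem)."""
    if commits > space:
        return 1.0  # pigeonhole principle: a collision is certain
    pairs = commits * (commits - 1) / 2
    return -math.expm1(-pairs / space)


for n in (1_000, 10_000, 20_000, 100_000):
    print(n, f"{any_collision_probability(n):.3f}")
```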

@fungi
Author

fungi commented Aug 19, 2019

Yes. But what we should look at is probability of ANY id colliding with at least one other id.

I think it depends on what you're concerned about. If these are done as redirects to the equivalent URL with the full-length commit ID then the main risk seems to be when someone includes a link with the shortened version in, say, an archived mailing list post which can't be easily corrected later and then, at some point, that shortened ID begins to collide with another in the same repository. Not every commit is going to be linked to by someone in such a manner, so for me it comes down to the odds that a particular abbreviation collides rather than the odds that there could exist a collision of at least one abbreviation somewhere within the repository.

openstack-gerrit pushed a commit to openstack/nova that referenced this issue Aug 29, 2019
Also update some outdated URLs at the same time, e.g. defcore is now
interop.

Unfortunately unlike GitHub, gitea doesn't yet support URLs with
shortened SHA1s; however this is being worked on:

    go-gitea/gitea#6450

Change-Id: I6e6b63619f1138cc961b61be548453361d01f73c
openstack-gerrit pushed a commit to openstack/openstack that referenced this issue Aug 29, 2019
* Update nova from branch 'master'
  - Merge "Switch some GitHub URLs to point to opendev.org"
  - Switch some GitHub URLs to point to opendev.org
    
    Also update some outdated URLs at the same time, e.g. defcore is now
    interop.
    
    Unfortunately unlike GitHub, gitea doesn't yet support URLs with
    shortened SHA1s; however this is being worked on:
    
        go-gitea/gitea#6450
    
    Change-Id: I6e6b63619f1138cc961b61be548453361d01f73c
@go-gitea go-gitea locked and limited conversation to collaborators Jan 18, 2021