Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support downloading from "dumb http" with git-annex. #6

Merged
merged 4 commits into from
Sep 20, 2022

Conversation

kousu
Copy link
Member

@kousu kousu commented May 8, 2022

This makes git clone https://...... && git annex get function, so that datasets can be shared widely, without the big roadblock of necessarily having to create accounts or ssh keys for users.

Fixes #3

I don't think git-annex supports uploading over http, and
neither does this. If it turns out it does I'll revise this.

Ref: https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/
but this skip several of those instructions:

  • they assume you're deploying to a non-bare repository
  • gitea already takes care of the update-server-info hook
  • we want to enforce gitea's access control so we don't
    use core.sharedRepository

Follows #18

@kousu kousu requested a review from mguaypaq May 8, 2022 06:10
@kousu
Copy link
Member Author

kousu commented May 8, 2022

Some things to test:

Some things to nitpick:

  • my comments are too wordy

@kousu
Copy link
Member Author

kousu commented May 8, 2022

There's a build of this at https://github.com/neuropoly/gitea/releases/tag/v1.16.7-git-annex-http (rebased on v1.16.7 instead of main, to make it safer to deploy). I've installed it on our current dev server https://data.praxisinstitute.org.dev.neuropoly.org/.

Deploying it made this work on the pre-existing annexed dataset we had there:

[kousu@nigiri down]$ git clone https://data.praxisinstitute.org.dev.neuropoly.org/uofc/candice-fmri-.git
Cloning into 'candice-fmri-'...
remote: Enumerating objects: 71, done.
remote: Counting objects: 100% (71/71), done.
remote: Compressing objects: 100% (57/57), done.
remote: Total 71 (delta 21), reused 0 (delta 0)
Receiving objects: 100% (71/71), 9.72 KiB | 9.72 MiB/s, done.
Resolving deltas: 100% (21/21), done.
[kousu@nigiri down]$ cd candice-fmri-/
[kousu@nigiri candice-fmri-]$ git annex get
get sub-BAN01/anat/sub-BAN01_T1w.nii.gz (from origin...) 
ok                                   
get sub-BAN02/anat/sub-BAN02_T1w.nii.gz (from origin...) 
ok                                
get sub-BAN03/anat/sub-BAN03_T1w.nii.gz (from origin...) 
ok                                
(recording state in git...)

@kousu
Copy link
Member Author

kousu commented May 8, 2022

I set that repo back to its mixed-case name:

Screenshot 2022-05-08 at 02-46-36 CANDICE-fMRI-

and that worked too:

[kousu@nigiri down]$ git clone https://data.praxisinstitute.org.dev.neuropoly.org/UofC/CANDICE-fMRI-.git
Cloning into 'CANDICE-fMRI-'...
remote: Enumerating objects: 71, done.
remote: Counting objects: 100% (71/71), done.
remote: Compressing objects: 100% (57/57), done.
remote: Total 71 (delta 21), reused 0 (delta 0)
Receiving objects: 100% (71/71), 9.72 KiB | 4.86 MiB/s, done.
Resolving deltas: 100% (21/21), done.
[kousu@nigiri down]$ cd CANDICE-fMRI-/
[kousu@nigiri CANDICE-fMRI-]$ git annex get
get sub-BAN01/anat/sub-BAN01_T1w.nii.gz (from origin...) 
47%   2.87 MiB        895 KiB/s 3s    
ok                                    
get sub-BAN02/anat/sub-BAN02_T1w.nii.gz (from origin...) 
52%   3.31 MiB          2 MiB/s 1s
ok                                
get sub-BAN03/anat/sub-BAN03_T1w.nii.gz (from origin...) 
ok                                
(recording state in git...)

so it looks like case-sensitivity isn't a problem.

routers/web/web.go Outdated Show resolved Hide resolved
@gitea-sync gitea-sync bot force-pushed the git-annex branch 2 times, most recently from 9cd7415 to c582e59 Compare May 9, 2022 13:37
Copy link
Member

@mguaypaq mguaypaq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's great that you got this working!!! Here are my comments on the code as it stands. You can ping me again if/when you want another review.

cmd/serv.go Outdated Show resolved Hide resolved
routers/web/web.go Outdated Show resolved Hide resolved
routers/web/repo/http.go Show resolved Hide resolved
routers/web/repo/http.go Show resolved Hide resolved
routers/web/web.go Outdated Show resolved Hide resolved
routers/web/web.go Outdated Show resolved Hide resolved
routers/private/serv.go Outdated Show resolved Hide resolved
cmd/serv.go Outdated Show resolved Hide resolved
cmd/serv.go Outdated Show resolved Hide resolved
cmd/serv.go Outdated Show resolved Hide resolved
@kousu
Copy link
Member Author

kousu commented May 10, 2022

A test against Drone: https://drone.data.praxisinstitute.org.dev.neuropoly.org/nguenthe/CANDICE-fMRI-/2/1/3

It did not work:

+ git fetch --depth=1 origin git-annex:git-annex
From https://data.praxisinstitute.org.dev.neuropoly.org/nguenthe/CANDICE-fMRI-
 * [new branch]      git-annex  -> git-annex
 * [new branch]      git-annex  -> origin/git-annex
+ git annex get -J8
(scanning for unlocked files...)
fatal: could not read Username for 'https://data.praxisinstitute.org.dev.neuropoly.org': terminal prompts disabled

  Remote origin not usable by git-annex; setting annex-ignore
get sub-BAN01/anat/sub-BAN01_T1w.nii.gz (not available) 
[...]

"terminal prompts disabled" is the sort of problem you get when stdin is closed or not a tty. But this is a private repo:

Screenshot 2022-05-10 at 18-07-22 CANDICE-fMRI-

and the clone clearly worked, and even the git fetch worked, so I'll have to look into why that doesn't extend to git annex get.

Solving this is not "on the critical path", because we're planning to drop DroneCI anyway and use @mguaypaq's localhost bids-validator Gitea plugin, and our main use case for HTTP downloads is for sharing open access data, but I'd really like to not leave this half-done if I can.

cmd/serv.go Outdated Show resolved Hide resolved
@gitea-sync gitea-sync bot force-pushed the git-annex branch 3 times, most recently from a1dc221 to a49efd4 Compare May 15, 2022 13:27
@kousu kousu force-pushed the git-annex-http branch from 2ec9909 to 359a422 Compare May 16, 2022 04:24
@kousu kousu mentioned this pull request May 16, 2022
@kousu kousu force-pushed the git-annex branch 2 times, most recently from c5dfbff to f7f3c7b Compare May 16, 2022 04:49
@gitea-sync gitea-sync bot force-pushed the git-annex branch 3 times, most recently from 3444b1b to 5f1ec72 Compare May 18, 2022 13:34
@kousu kousu force-pushed the git-annex-http branch from 359a422 to 327c3c1 Compare May 19, 2022 03:13
@gitea-sync gitea-sync bot force-pushed the git-annex branch 2 times, most recently from 4026ee0 to de4addd Compare May 20, 2022 13:31
kousu added a commit that referenced this pull request Oct 19, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
gitea-sync bot pushed a commit that referenced this pull request Oct 30, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
gitea-sync bot pushed a commit that referenced this pull request Oct 31, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
gitea-sync bot pushed a commit that referenced this pull request Nov 1, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
gitea-sync bot pushed a commit that referenced this pull request Nov 2, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
kousu added a commit that referenced this pull request Nov 4, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
gitea-sync bot pushed a commit that referenced this pull request Nov 5, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
gitea-sync bot pushed a commit that referenced this pull request Nov 6, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
gitea-sync bot pushed a commit that referenced this pull request Nov 7, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
gitea-sync bot pushed a commit that referenced this pull request Nov 8, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
gitea-sync bot pushed a commit that referenced this pull request Nov 9, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
gitea-sync bot pushed a commit that referenced this pull request Nov 10, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
gitea-sync bot pushed a commit that referenced this pull request Nov 11, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
gitea-sync bot pushed a commit that referenced this pull request Nov 12, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
gitea-sync bot pushed a commit that referenced this pull request Nov 13, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
gitea-sync bot pushed a commit that referenced this pull request Nov 14, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
kousu added a commit that referenced this pull request Nov 20, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
kousu added a commit that referenced this pull request Nov 29, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants