This repository has been archived by the owner on Jan 3, 2018. It is now read-only.

Current protocol for cloning repositories is flawed #115

Closed
karthik opened this issue Oct 31, 2013 · 11 comments

Comments

Contributor

karthik commented Oct 31, 2013

The current protocol is for instructors to create a new repo under their own accounts, then clone the contents of the bc repo and push to their own GitHub account. The problem with this approach is that it creates a fairly large repo (32.5 MB with no additional content from the instructor) with a ton of material that may never be taught at a particular bootcamp. Having all the students clone it on what invariably turns out to be a slow connection becomes a bear.

I've experienced this twice, and both times I was forced to quickly throw content (e.g. data files for an exercise) onto my personal page and have students curl it down.

I understand that this approach helps maintain the connection to the original bootcamp repo and (at least theoretically) makes it easy to contribute content back, but it doesn't work so well in practice.

Contributor

ahmadia commented Oct 31, 2013

It's actually only 12 MB. You're counting both the unpacked working tree and the .git directory.

~/s/s/test ❯❯❯ hub clone swcarpentry/bc
Cloning into 'bc'...
remote: Counting objects: 3561, done.
remote: Compressing objects: 100% (2498/2498), done.
remote: Total 3561 (delta 1301), reused 3137 (delta 934)
Receiving objects: 100% (3561/3561), 11.38 MiB | 2.72 MiB/s, done.
Resolving deltas: 100% (1301/1301), done.
Checking connectivity... done
~/s/s/test ❯❯❯ du -sh bc/.git
 12M    bc/.git
~/s/s/test ❯❯❯
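
The gap between the two figures can be checked locally. The following is a minimal sketch, assuming a POSIX shell with git and du available; the throwaway repository, the file name, and the variable names are made up for illustration and have nothing to do with the bc repo itself. It measures the .git directory and the whole checkout separately, so the working tree is not counted twice.

```shell
#!/bin/sh
# Sketch: separate the size of .git from the size of the working tree.
# The repository below is a throwaway stand-in, not swcarpentry/bc.
set -e
repo=$(mktemp -d)

git init -q "$repo"
echo "exercise data" > "$repo/data.txt"
git -C "$repo" add data.txt
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "Add exercise data"

# du -sk prints "<kilobytes><TAB><path>"; keep only the number.
git_kb=$(du -sk "$repo/.git" | cut -f1)    # history and git metadata only
total_kb=$(du -sk "$repo" | cut -f1)       # .git plus the checked-out files
tree_kb=$((total_kb - git_kb))             # checked-out files only

echo ".git directory: ${git_kb} kB"
echo "working tree:   ${tree_kb} kB"
echo "total on disk:  ${total_kb} kB"
```

Only the .git figure approximates what a clone has to transfer; the working tree is rebuilt locally from it.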

Contributor

ahmadia commented Oct 31, 2013

Sorry to open with a correction @karthikram. You raise a totally valid point; sometimes it's easy to get lost in small details.

Contributor

ahmadia commented Oct 31, 2013

My current approach is not to have students clone the instructional repository, although I know that many instructors do. We're not maintaining @wking-style per-topic branches, which would allow this sort of modular assembly but come at complexity costs in maintaining a single unified stack.

@karthikram - are you aware of the --depth 1 flag to clone? It will grab gh-pages by default, so you can add a different commit, tag, or branch ref if you like with the -b flag.

This will be much faster if you want students to be able to get the current state of a repository using git, although all history will be gone since it only brings down the exact state that is committed at that snapshot. You can delete any content that's not being used to minimize used bandwidth.
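
A minimal sketch of the difference, assuming a POSIX shell with git installed. It builds a throwaway local repository with three commits on gh-pages (the file names and commit messages are invented for the demo) and clones it twice, once in full and once with --depth 1. The file:// URL is there because git ignores --depth for plain local-path clones. As far as I know, -b accepts a branch or tag name rather than an arbitrary commit hash.

```shell
#!/bin/sh
# Sketch: full clone vs. shallow clone of a throwaway local repository.
set -e
work=$(mktemp -d)

# Stand-in "upstream" with three commits on a gh-pages branch.
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/gh-pages
for i in 1 2 3; do
    echo "lesson $i" > "$work/upstream/lesson$i.txt"
    git -C "$work/upstream" add .
    git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "Add lesson $i"
done

# Full clone: brings down every commit.
git clone -q "file://$work/upstream" "$work/full"

# Shallow clone: only the tip of the named branch, no earlier history.
git clone -q --depth 1 -b gh-pages "file://$work/upstream" "$work/shallow"

full_count=$(git -C "$work/full" rev-list --count HEAD)
shallow_count=$(git -C "$work/shallow" rev-list --count HEAD)
echo "commits visible in full clone:    $full_count"
echo "commits visible in shallow clone: $shallow_count"
```

Both clones contain the same files at the tip; the shallow one reports a single commit, which is why it is quicker to fetch and why there is no history left to browse.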

Contributor Author

karthik commented Oct 31, 2013

are you aware of the --depth 1 flag to clone? It will grab gh-pages by default, so you can add a different commit, tag, or branch ref if you like with the -b flag.

@ahmadia Yep, but one of the things I try to demonstrate during my bootcamps is that we use the same tools every day (for both research and teaching). To that end, I actually show the history of the repo, how I developed the contents, and how to move back and forth through various commits. Since students will not have enough commits of their own to look through, having an older repo to examine in more detail has been helpful.

I'll just go with my own approach (start a repo from scratch with my own gh-pages), then submit any new changes through a different clone of the bc repo. That way SWC still benefits from any new curriculum development and I don't turn any new students away from using Git.

Contributor Author

karthik commented Oct 31, 2013

I'll close this issue unless someone else feels strongly enough to reopen and continue the discussion.

@karthik karthik closed this as completed Oct 31, 2013
Contributor

ahmadia commented Oct 31, 2013

@karthikram

To that end, I actually show the history of the repo, how I developed the contents, and how to move back and forth through various commits. Since students will not have enough commits of their own to look through, having an older repo to examine in more detail has been helpful.

Yes! I've been opening with history-viewing in my more recent Git tutorials, and starting with history has been really useful. I'll make the decision to use a really lightweight repository like https://github.com/octocat/Spoon-Knife or something heavier based on the speed of the network connection.

My final thought on this is that we ask students to download very big things (Anaconda, Canopy, sometimes Virtual Machines), so it isn't unreasonable to ask them to download a 15-25MB repository.

And I don't speak for Software Carpentry, so let's leave the issue open unless you're really having second thoughts about opening it :)

@ahmadia ahmadia reopened this Oct 31, 2013
Contributor

wking commented Oct 31, 2013

On Thu, Oct 31, 2013 at 03:10:21PM -0700, Aron Ahmadia wrote:

We're not maintaining @wking-style per-topic branches, which would
allow this sort of modular assembly, but comes at complexity costs
in maintaining a single unified stack.

I'm still maintaining my #102-style per-topic branches ;). It's just
that git://tremily.us/swc-boot-camp.git is probably the only
aggregator that's going to use them as submodules. Both #89 and #114
incorporate my per-topic branches via explicit merges
(subtree-style), and I think they have a fair shot at landing
upstream.

Contributor Author

karthik commented Oct 31, 2013

My final thought on this is that we ask students to download very big things (Anaconda, Canopy, sometimes Virtual Machines), so it isn't unreasonable to ask them to download a 15-25MB repository.

Sure. But those are often binaries that people who have installed Microsoft Office are familiar with. We also ask folks to do that beforehand at home or work where they have access to a high speed internet connection. Many of our bootcamp attendees have never used the shell before. Asking them to run a shell command (before they know how to cd somewhere or make a directory) without explaining what it does or why seems like the fastest way to turn someone away.

Contributor

wking commented Oct 31, 2013

On Thu, Oct 31, 2013 at 03:22:37PM -0700, Aron Ahmadia wrote:

My final thought on this is that we ask students to download very
big things (Anaconda, Canopy, sometimes Virtual Machines), …

That's just to get some sanity on wonky operating systems ;).
Debian's Git package is only 1.4 kB, and that's the only
non-standard package you need there for the basic Bash/Git/Python boot
camp.

@gvwilson
Contributor

I agree with @karthik: our learners' first encounter with Git should happen in the classroom, when we're there to help them deal with things that go wrong, and that means that the repo they clone should be less than 1MB (including the .git directory). I think this means we'll always use a separate repo (not 'bc') as a starting point for classes; the question is, should we build a script to automate its creation for each bootcamp?
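
One rough idea of what such a script could look like, offered only as a sketch under stated assumptions. The directory layout, the topic names, the `topics` and `limit_kb` variables, and the choice of a single initial commit are all hypothetical, and the lesson material is a stand-in created on the spot rather than a real checkout of bc. The script copies the chosen topics into a fresh repository, commits them, and fails if the result (including .git) exceeds a 1 MB budget.

```shell
#!/bin/sh
# Sketch: assemble a small per-bootcamp starter repository.
# All paths and topic names below are hypothetical stand-ins.
set -e
work=$(mktemp -d)

# Stand-in for a local checkout of the lesson material.
mkdir -p "$work/bc/shell" "$work/bc/git" "$work/bc/python"
echo "shell notes"  > "$work/bc/shell/README.md"
echo "git notes"    > "$work/bc/git/README.md"
echo "python notes" > "$work/bc/python/README.md"

topics="shell git"        # topics taught at this bootcamp (assumption)
starter="$work/starter"   # the repo students would clone
limit_kb=1024             # 1 MB budget, .git included

git init -q "$starter"
git -C "$starter" symbolic-ref HEAD refs/heads/gh-pages

# Copy only the selected topics into the new repository.
for topic in $topics; do
    cp -R "$work/bc/$topic" "$starter/$topic"
done

git -C "$starter" add .
git -C "$starter" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "Initial bootcamp material"

# Refuse to hand out a repository that is over budget.
size_kb=$(du -sk "$starter" | cut -f1)
if [ "$size_kb" -gt "$limit_kb" ]; then
    echo "starter repo is ${size_kb} kB, over the ${limit_kb} kB budget" >&2
    exit 1
fi
echo "starter repo is ${size_kb} kB"
```

A single initial commit keeps the repository small but leaves no history to explore, so an instructor who wants to demonstrate history-viewing would need a few meaningful commits instead.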

Contributor

ahmadia commented Nov 28, 2013

Sounds good. I'll start a new issue and close this one.
