% chap8.tex - Week 8
\cleardoublepage
%\phantomsection
\chapter{Week 8 - Patching, Bisecting, Bundling and Submodules}
\section{Day 1 - ``Give a man a patch''}
\subsection{Collaborating with outsiders}
We have spoken at great length now about rebasing and have seen that it is a very powerful tool.
It can form part of your workflow in your development cycle.
However, always heed the warning about rebasing that should set alarm bells ringing in the back of your mind.
Rebasing changes the past. Rebasing changes history.
As such, it should be used a) with caution, and b) only by people who understand exactly what they are doing.
We are going to leave rebasing for a while now, take a quick look at a feature you really should know about
and then focus on some of the more advanced features of Git.
The following situation occurs fairly regularly for some people.
\begin{trenches}
John was stroking his chin and looking pensively out of the window when Simon approached his desk.
The manager hadn't seen him yet and Simon instinctively swayed a little back and forth, trying to make himself known in as subtle a way as possible.
Klaus, who was watching from the corner of his eye, took a more direct approach.
He took the out-of-date org chart down from the office divider, screwed it up into a ball and launched it at John's head.
It struck the manager squarely in the jaw causing him to almost tip from his awkwardly balanced chair.
John noticed Simon standing there and looked a little surprised.
He then noticed Klaus and in an instant understood the chain of events that had just taken place.
``Sorry Simon,'' started John, ``I've been trying to figure out a problem all morning.''
``It's no problem.'' Simon pulled up a chair and sat down. ``I was wondering if you had a few minutes to discuss Luigi?''
\thoughtbreak
``Well as Luigi is a contractor, he's not going to get access to our repository here to perform commits directly.
And he doesn't have the capability, nor do I really want him, making our code available on the internet.
But he does have a clone of our repository from last week.'' John understood the problem.
``Right!''
``Have you heard of patching in Git?'' asked John.
Simon looked at his shoes, ``Can't say I have John, sorry.''
John smiled, ``No worries. What we can do is get Luigi to generate a patch of his changes.
We can then take that patch and apply it to our codebase. Luigi can then just reset his clone when he comes into the office.''
Simon nodded as John continued, ``Go and ask Martha about it. I think she's pretty hot on these types of things.''
Klaus giggled, ``Think she's hot eh John?''
The paper was returned.
\end{trenches}
\index{patching!process}It is a good question though. Sometimes you may have a repository that is either publicly available, or made available to a group of people.
You do not necessarily want to set up a remote tracking branch and pull changes in from every single contributor.
There are two primary reasons for this:
\begin{enumerate}
\item There are a large number of people submitting small changes to the code.
\item There are difficulties in communicating between the two repositories either for security or general reasons.
\end{enumerate}
In these cases we need another way to apply changes from one branch into another.
Many larger open source projects allow contributors to email in patches.
Git does have some rather advanced ways of dealing with these types of scenarios.
We are going to scratch the surface and look at using three commands: \texttt{git apply}, \texttt{git format-patch} and \texttt{git am}.
\index{patching!generating}First, let us find a way of generating a patch.
Let us take the example we have currently in our repository.
Imagine that the \textbf{develop} branch exists on another computer in a clone of our repository.
At some point in time, someone cloned our repository.
They have the HEAD of our repository at the same point as we do, but they have continued to do some development in a new branch called \textbf{develop}.
Now they are ready to give those changes back.
Firstly we are going to look at using the \texttt{git diff} tool to generate a patch file which we can apply.
\begin{code}
john@satsuki:~/coderepo$ git checkout develop
Already on 'develop'
john@satsuki:~/coderepo$ git diff master develop
diff --git a/newfile2 b/newfile2
index 3545c1d..ff59f55 100644
--- a/newfile2
+++ b/newfile2
@@ -1,2 +1,3 @@
Another new file
and a new awesome feature
+newer dev work
diff --git a/newfile3 b/newfile3
index 638113c..2e00739 100644
--- a/newfile3
+++ b/newfile3
@@ -1 +1,2 @@
These changes are in the origin
+new dev work
john@satsuki:~/coderepo$
\end{code}
That generates a diff of the changes needed to get from the \texttt{master} branch to the \texttt{develop} branch.
We could copy and paste that information from the terminal window into a file, but the shell offers us an easier way of doing this: redirecting the output into a file.
\begin{code}
john@satsuki:~/coderepo$ git diff master develop > our_patch.diff
john@satsuki:~/coderepo$ cat our_patch.diff
diff --git a/newfile2 b/newfile2
index 3545c1d..ff59f55 100644
--- a/newfile2
+++ b/newfile2
@@ -1,2 +1,3 @@
Another new file
and a new awesome feature
+newer dev work
diff --git a/newfile3 b/newfile3
index 638113c..2e00739 100644
--- a/newfile3
+++ b/newfile3
@@ -1 +1,2 @@
These changes are in the origin
+new dev work
john@satsuki:~/coderepo$
\end{code}
\index{patching!applying}So we can see that the file itself has the information we are looking for.
Now we can use the \indexgit{apply} tool to actually modify the files in \textbf{master} and bring in the changes that have happened in \textbf{develop}.
\begin{code}
john@satsuki:~/coderepo$ git checkout master
Switched to branch 'master'
john@satsuki:~/coderepo$ git apply our_patch.diff
john@satsuki:~/coderepo$ git diff
diff --git a/newfile2 b/newfile2
index 3545c1d..ff59f55 100644
--- a/newfile2
+++ b/newfile2
@@ -1,2 +1,3 @@
Another new file
and a new awesome feature
+newer dev work
diff --git a/newfile3 b/newfile3
index 638113c..2e00739 100644
--- a/newfile3
+++ b/newfile3
@@ -1 +1,2 @@
These changes are in the origin
+new dev work
john@satsuki:~/coderepo$ git commit -a -m 'Updated with patch'
[master 81eee9f] Updated with patch
2 files changed, 2 insertions(+), 0 deletions(-)
john@satsuki:~/coderepo$ git diff develop master
john@satsuki:~/coderepo$
\end{code}
Of course doing things this way means that we still have to commit our changes.
Plus, all of the changes that we have made in the patch are committed in one block.
Sure, we could split that using some of the techniques in the After Hours sections, but then we may not always be aware of what should be split where.
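Before applying a patch that came from someone else, it can be reassuring to preview it first.
The following is a minimal sketch rather than part of the Tamagoyaki repository: it builds a throwaway repository in a temporary directory, and the file name \texttt{notes.txt}, the user details and the commit messages are all made up for illustration.
It uses \texttt{git apply --stat} to summarise the patch and \texttt{git apply --check} to test whether it would apply cleanly, before applying it for real.
\begin{code}
#!/bin/sh
# Sketch only: a throwaway repository with made-up names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

# One commit on master, then one more on a develop branch.
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
git checkout -q -b develop
echo "new dev work" >> notes.txt
git commit -q -a -m "Some new dev work"

# Write the diff outside the working tree so it is never committed.
patch_file=$(mktemp)
git diff master develop > "$patch_file"

git checkout -q master
git apply --stat "$patch_file"    # summary only, changes nothing
git apply --check "$patch_file"   # exits non-zero if it would not apply
git apply "$patch_file"           # modifies the working tree
git commit -q -a -m "Updated with patch"

# No output here means master and develop now have identical content.
git diff develop master
\end{code}
Neither \texttt{--stat} nor \texttt{--check} modifies any files, so both can be run before deciding whether to apply a patch at all.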
\subsection{Can we have some order please?}
There is another tool that can come to our rescue here.
It is primarily used for working with \index{mbox} mailboxes, but it also has some other uses which we will describe here.
Would it not be nice to be able to have each commit that we want to use as a patch in a separate patch file?
The file \texttt{our\_patch.diff} above contained two commits' worth of data.
We have access to another tool in our fight against disparate systems.
This is the \indexgit{format-patch} command.
First we will undo the changes we made previously by resetting the \textbf{master} branch back to its older position and deleting the \texttt{our\_patch.diff} file.
\begin{code}
john@satsuki:~/coderepo$ git reflog show master -n 4
81eee9f master@{0}: commit: Updated with patch
f8d5100 master@{1}: commit: Finished new dev
1968324 master@{2}: commit: Start new dev
john@satsuki:~/coderepo$ git reset --hard f8d5100
HEAD is now at f8d5100 Finished new dev
john@satsuki:~/coderepo$ rm our_patch.diff
john@satsuki:~/coderepo$
\end{code}
We used the \texttt{git reflog} command to show up to the last four positions of the \textbf{master} branch.
Then we reset the branch back to the point before the \texttt{git apply}.
Finally we deleted the patch.
\index{patching!multiple file generation}Now let us see how to use the \texttt{git format-patch} command to create multiple patch files.
\begin{code}
john@satsuki:~/coderepo$ git format-patch master..develop
0001-Some-new-dev-work.patch
0002-More-new-deving.patch
john@satsuki:~/coderepo$
\end{code}
It would appear that the result of this command is that two files have been generated.
Let us confirm our suspicions and \texttt{cat} the contents of them to ensure that they contain the data we expect.
\begin{code}
john@satsuki:~/coderepo$ cat 0001-Some-new-dev-work.patch
From af3c6d730a8632d99b5626a7c0e921d14af21f50 Mon Sep 17 00:00:00 2001
From: John Haskins <[email protected]>
Date: Thu, 7 Jul 2011 19:01:59 +0100
Subject: [PATCH 1/2] Some new dev work
---
newfile3 | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/newfile3 b/newfile3
index 638113c..2e00739 100644
--- a/newfile3
+++ b/newfile3
@@ -1 +1,2 @@
These changes are in the origin
+new dev work
--
1.7.4.1
john@satsuki:~/coderepo$
\end{code}
Woah! Hold on a minute. This does not seem to be a normal diff file at all.
In fact, that is absolutely right. This is a patch file and the two are not the same.
The patch file contains much more information than the simple diff file.
For a start we get information about which commit this patch came from, who created it, when and a subject.
It looks almost like an email, and that is deliberate: the format is designed to be easy to send by email.
\index{patching!a range}We have specified a range of commits to the \texttt{git format-patch} command with the parameter \texttt{master..develop}.
The format of that parameter should be familiar from earlier chapters when we utilised it for commands like \texttt{git diff} and \texttt{git log}.
We could now take those files, email them to someone else and they could apply them.
Let us learn one more tool, and see how we would apply those patches when they had been received at the other end.
\begin{code}
john@satsuki:~/coderepo$ git am 0001-Some-new-dev-work.patch
Applying: Some new dev work
john@satsuki:~/coderepo$ git am 0002-More-new-deving.patch
Applying: More new deving
john@satsuki:~/coderepo$ git diff master..develop
john@satsuki:~/coderepo$
\end{code}
Of course this is just a simple example case and in actual usage there may be cases where conflicts and other complications occur.
Looking at a log output, we can see that the original dates and times of the commits are maintained and are not updated.
We can ignore this if we wish and use the \texttt{--ignore-date} parameter to use the current date when committing the patch to the repository.
\begin{code}
john@satsuki:~/coderepo$ git log -n4
commit 30900fe1b7e72411dabab8b02070f36e2431f704
Author: John Haskins <[email protected]>
Date: Thu Jul 7 19:02:15 2011 +0100
More new deving
commit a8281fb589e36389cc8cb0da7ebee225b4d1adfc
Author: John Haskins <[email protected]>
Date: Thu Jul 7 19:01:59 2011 +0100
Some new dev work
commit f8d5100142b43ffaba9bbd539ba4fd92af79bf0e
Author: John Haskins <[email protected]>
Date: Thu Jul 7 08:39:29 2011 +0100
Finished new dev
commit 1968324ce2899883fca76bc25496bcf2b15e7011
Author: John Haskins <[email protected]>
Date: Thu Jul 7 08:39:07 2011 +0100
Start new dev
john@satsuki:~/coderepo$
\end{code}
Interestingly if we use our alias for the log command we see something maybe a little unexpected.
\begin{code}
john@satsuki:~/coderepo$ git logg -n6
* 30900fe (HEAD, master) More new deving
* a8281fb Some new dev work
| * aed985c (develop) More new deving
| * af3c6d7 Some new dev work
|/
* f8d5100 Finished new dev
* 1968324 Start new dev
john@satsuki:~/coderepo$
\end{code}
Notice that the branch \textbf{master} has not simply been fast-forwarded to the commit at the tip of \textbf{develop}.
This is because we have not performed a merge; in a sense we have manually made the changes to the files and created separate commits for them.
In this way the commits \textbf{30900fe} and \textbf{a8281fb} are not the same as their \textbf{develop} counterparts.
If you intend to use this workflow, it is worth spending some time reading the man page for \texttt{git am} and \texttt{git format-patch} as both of them hold valuable information regarding the customisation and handling of patches and emails.
Tamagoyaki Inc. are not going to use this workflow often, and so just applying a few patches here and there from contractors using these methods is perfectly acceptable to them.
If you were a large open source establishment, or any company that accepts a large number of patches, you may want to take a closer look at how to work these.
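When there are more than a couple of patches, handling them one file at a time becomes tedious.
As a rough sketch, again in a throwaway repository where the file name, user details and directory are made up for illustration, the \texttt{-o} option of \texttt{git format-patch} collects the patch files in a directory of our choosing, and \texttt{git am} accepts several patch files in one go.
\begin{code}
#!/bin/sh
# Sketch only: a throwaway repository with made-up names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Two commits on develop that master does not have yet.
git checkout -q -b develop
echo "new dev work" >> notes.txt
git commit -q -a -m "Some new dev work"
echo "newer dev work" >> notes.txt
git commit -q -a -m "More new deving"

# One patch file per commit, written to a directory outside the working tree.
outbox=$(mktemp -d)
git format-patch -q -o "$outbox" master..develop

git checkout -q master
# The shell expands the glob in name order, so 0001 is applied before 0002.
git am -q "$outbox"/*.patch

# Show the two most recent commit subjects on master.
git log --format=%s -n 2
\end{code}
Should one of the patches fail to apply, \texttt{git am} stops part-way through; \texttt{git am --abort} then puts the branch back as it was before the command started.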
Now it is time to move on to some more advanced topics within Git, but first a little cleanup.
\begin{code}
john@satsuki:~/coderepo$ rm 0001-Some-new-dev-work.patch
john@satsuki:~/coderepo$ rm 0002-More-new-deving.patch
john@satsuki:~/coderepo$
\end{code}
\section{Day 2 - ``Looking for problems''}
\subsection{A problem shared is a problem bisected}
\index{bisecting}During most software development, bugs are introduced.
Sometimes these bugs are fixed immediately and sometimes they sit there in the code festering away for months on end until someone tests a specific case.
Of course it is always best to have test suites and run them regularly against the code base, but on occasions either the test case itself has a bug,
or the test case is written in such a way that a particular bug would never present itself.
Tamagoyaki Inc. have a fairly rigorous testing procedure.
Unfortunately it would seem that one particularly nasty bug has slipped through the cracks.
Cue a difficult discussion.
\begin{trenches}
``But what I don't understand John, is that you now know what happened at every step in the process.
How can something like this break and you not know about it?''
As always Markus was getting snappy and as always John was having to bite his lip.
``It's not a question about not knowing about it,'' began John, ``The difficulty is knowing what change introduced the problem.
We are on such a rapid development schedule that too many things are changing at once.''
``Well, this is one of the reasons you guys have spent the last two months getting this version control system running.''
Markus got up and opened the door. ``I suggest you fix it.''
\thoughtbreak
``Markus is blaming us for introducing a bug?'' Rob was pretty shocked as he and Simon chatted at the water cooler.
``More like, Markus believed that a version control system was going to solve all of our problems,'' replied Simon.
Rob squinted his face up as a car drove into the building's car park, showering the room with reflected sunlight.
He shielded his eyes. ``You know I heard there was a tool in Git for helping to find bugs.
Think I may take a look over lunch, you know, be a real hero.''
They both chuckled.
\end{trenches}
It is true that Git does have a very powerful tool for helping to detect revisions that introduced bugs into the system.
The tool is called \indexgit{bisect} and it is used to successively checkout revisions from the repository,
check them to see if the bug is present and then use that information to determine the revision that is most likely to have introduced the bug.
\index{bisecting!simple}Let us assume that the bug in our repository is a fairly simple one.
For some bizarre reason our codebase is broken unless the word \texttt{Addition} is present in one of the files.
If we run a simple Linux \texttt{grep} command across the files, we can see that the word we are after is not there.
However, if we go back to tag \textbf{v1.0a} and run the same command, we can see that the word is there.
\begin{code}
john@satsuki:~/coderepo$ grep "Addition" *
john@satsuki:~/coderepo$ git checkout v1.0a
Note: checking out 'v1.0a'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
git checkout -b new_branch_name
HEAD is now at a022d4d... Messed with a few files
john@satsuki:~/coderepo$ grep "Addition" *
my_third_committed_file:Addition to the line
john@satsuki:~/coderepo$
\end{code}
Notice the warning about checking out a non-branch.
This is perfectly normal and should not worry you but please be aware that it is obviously best to have a clean working directory before starting any type of \texttt{bisect} commands.
We can see that the string we are looking for is present in the file called \texttt{my\_third\_committed\_file}.
As our repository is very small, it would not take us long to go through and check each revision to see when this string was deleted.
In fact we have other tools available to search for the addition and removal of strings.
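One such tool is the \texttt{-S} option to \texttt{git log}, which lists only the commits that changed the number of times a given string appears.
The sketch below is self-contained and uses a throwaway repository; the file names and commit messages are invented and only loosely mirror our example.
\begin{code}
#!/bin/sh
# Sketch only: a throwaway repository with made-up names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# The string arrives in the first commit...
echo "Addition to the line" > third_file
git add third_file
git commit -q -m "Added third file"

# ...is untouched by the second...
echo "unrelated" > other_file
git add other_file
git commit -q -m "Added another file"

# ...and disappears in the third.
git rm -q third_file
git commit -q -m "Removed third file"

# List the subjects of the commits that added or removed the string.
git log --format=%s -S"Addition"
\end{code}
The log lists the commit that removed the string followed by the one that introduced it, and leaves out the unrelated commit in between.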
For now let us assume that the \emph{bug} is more complicated than this.
Let us go back to the facts.
\index{bisecting!set good point}\index{bisecting!set bad point}We know that the repository was \textbf{good} at tag \textbf{v1.0a}.
We also know that the repository is bad in its current state.
By feeding these details to the \texttt{git bisect} command, we can begin a search for the bug.
What will happen at each stage is that Git will checkout a revision that it wants us to test and we tell Git if we think that revision is good or bad.
\begin{code}
john@satsuki:~/coderepo$ git bisect start
Already on 'master'
john@satsuki:~/coderepo$ git bisect good v1.0a
john@satsuki:~/coderepo$ git bisect bad master
Bisecting: 9 revisions left to test after this (roughly 3 steps)
[ed2301ba223a63a5a930b536a043444e019460a7] Removed third file
john@satsuki:~/coderepo$
\end{code}
So we invoke the tool by running \texttt{git bisect start}.
After this we tell Git the things that we know. It was good at \textbf{v1.0a}, \texttt{git bisect good v1.0a}.
However, it was bad at \textbf{master}, our current revision, \texttt{git bisect bad master}.
After this, Git checks out revision \textbf{ed2301b} and tells us that \texttt{9} revisions will be left to test after this one and that it should take roughly \texttt{3} more steps to complete.
Now we run our test again.
\begin{code}
john@satsuki:~/coderepo$ grep "Addition" *
john@satsuki:~/coderepo$
\end{code}
\index{bisecting!marking result}As we have no result here, this would be classed as a bad revision and so we mark it as such.
\begin{code}
john@satsuki:~/coderepo$ git bisect bad
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[9710177657ae00665ca8f8027b17314346a5b1c4] Added another file
john@satsuki:~/coderepo$
\end{code}
Git now presents us with a new choice and you can see that the number of revisions left to check has decreased dramatically from \texttt{9} to \texttt{3}.
We continue marking our revisions as good and bad.
\begin{code}
john@satsuki:~/coderepo$ grep "Addition" *
my_third_committed_file:Addition to the line
john@satsuki:~/coderepo$ git bisect good
Bisecting: 2 revisions left to test after this (roughly 1 step)
[cfbecabb031696a217b77b0e1285f2d5fc2ea2a3] Fantastic new feature
john@satsuki:~/coderepo$ grep "Addition" *
my_third_committed_file:Addition to the line
john@satsuki:~/coderepo$ git bisect good
Bisecting: 0 revisions left to test after this (roughly 1 step)
[b119573f4508514c55e1c4e3bebec0ab3667d071] Merge branch 'wonderful'
john@satsuki:~/coderepo$ grep "Addition" *
my_third_committed_file:Addition to the line
john@satsuki:~/coderepo$ git bisect good
ed2301ba223a63a5a930b536a043444e019460a7 is the first bad commit
commit ed2301ba223a63a5a930b536a043444e019460a7
Author: John Haskins <[email protected]>
Date: Fri Apr 1 07:37:34 2011 +0100
Removed third file
:100644 000000 68365cc0e5909dc366d31febf5ba94a3268751c6 0000000000000000000000000000000000000000 D my_third_committed_file
john@satsuki:~/coderepo$
\end{code}
Oh! Something different. Git has actually finished the bisect and has suggested to us that this commit was responsible for introducing the bug in our code.
If we look at the commit message, it was in this revision that we removed a particular file.
This file was the one that contained our special \texttt{Addition} string.
Git was right! We screwed up then. At this point we can go back to our \textbf{master} branch and decide what to do about it.
\begin{code}
john@satsuki:~/coderepo$ git branch -v
* (no branch) b119573 Merge branch 'wonderful'
develop aed985c More new deving
master 30900fe More new deving
wonderful 4d91aab Updated another file again
zaney 7cc32db Made another awesome change
john@satsuki:~/coderepo$ git checkout master
Previous HEAD position was b119573... Merge branch 'wonderful'
Switched to branch 'master'
john@satsuki:~/coderepo$
\end{code}
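Notice that at the end of the bisect, Git does not return us to the \textbf{master} branch.
We are left on the last revision that was checked out for testing.
Running \texttt{git bisect reset} would also end the bisect session and return us to the commit we were on before it started.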
\subsection{Automating the process}
\index{bisecting!automation}So bisecting is a very powerful way of quickly and efficiently finding the point at which a bug or \index{regression testing}regression was introduced.
Git was spot on when it suggested that that revision was the one responsible for the mistake.
Sometimes you may not be able to test a revision that Git checks out for you for other reasons.
In this case you can always run \texttt{git bisect skip} to skip that revision.
It is all very well being able to run a test by hand at each revision Git asks us to.
To be honest though, if you have 30-40 steps to test and you have to compile code each time to see if the bug is present, it can get a little bit boring.
Git has a way of allowing us to test automatically.
The example we are going to use is obviously based on a Linux environment, but if you are a developer on a Windows platform, you should have no trouble understanding what is happening here.
We are going to create a small shell script that will automatically run our grep test.
If the string is found we will exit with a status code of \texttt{0}, indicating that it was successful and if
the string is not found, we will exit with a status code of \texttt{123}, indicating that the test was unsuccessful.
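Git will use these status codes and interpret a code of \texttt{0} as \textbf{good} and a code of \texttt{123} as \textbf{bad}.
In general \texttt{git bisect run} treats any code from \texttt{1} to \texttt{127} as \textbf{bad}, with the exception of \texttt{125}, which means that the revision cannot be tested and should be skipped.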
Below is a copy of our shell script which we have saved as \texttt{test.sh} and have given relevant permissions to allow it to run etc.
Notice we have had to exclude our \texttt{test.sh} file from the test, else the string \texttt{Addition} would have been found there which would have returned true every time.
\begin{code}
john@satsuki:~/coderepo$ cat test.sh
#!/bin/bash
if grep -q Addition * --exclude=test.sh
then echo "Good"
exit 0
else
echo "Bad"
exit 123
fi
john@satsuki:~/coderepo$
\end{code}
Now we invoke \texttt{git bisect} slightly differently by asking it to start and iterate over the revisions between \texttt{master} and \texttt{v1.0a}.
When \texttt{git bisect start} is given revisions like this, it treats the first as \textbf{bad} and the rest as \textbf{good}, so we do not need to mark them in separate steps.
\begin{code}
john@satsuki:~/coderepo$ git bisect start master v1.0a
Bisecting: 9 revisions left to test after this (roughly 3 steps)
[ed2301ba223a63a5a930b536a043444e019460a7] Removed third file
john@satsuki:~/coderepo$
\end{code}
Now we ask Git to continue testing, but to run our script at each iteration to determine the success or failure of each checked out revision.
\begin{code}
john@satsuki:~/coderepo$ git bisect run sh ./test.sh
running sh ./test.sh
Bad
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[9710177657ae00665ca8f8027b17314346a5b1c4] Added another file
running sh ./test.sh
Good
Bisecting: 2 revisions left to test after this (roughly 1 step)
[cfbecabb031696a217b77b0e1285f2d5fc2ea2a3] Fantastic new feature
running sh ./test.sh
Good
Bisecting: 0 revisions left to test after this (roughly 1 step)
[b119573f4508514c55e1c4e3bebec0ab3667d071] Merge branch 'wonderful'
running sh ./test.sh
Good
ed2301ba223a63a5a930b536a043444e019460a7 is the first bad commit
commit ed2301ba223a63a5a930b536a043444e019460a7
Author: John Haskins <[email protected]>
Date: Fri Apr 1 07:37:34 2011 +0100
Removed third file
:100644 000000 68365cc0e5909dc366d31febf5ba94a3268751c6 0000000000000000000000000000000000000000 D my_third_committed_file
bisect run success
john@satsuki:~/coderepo$
\end{code}
The parameters after the \texttt{git bisect run} tell Git which command we wish to run at each stage.
In our case it is \texttt{sh ./test.sh}.
You can see Git invoking our \texttt{test.sh} script in each case, and the result of our script, either \texttt{Good} or \texttt{Bad} depending on which was echoed from the result of the grep test.
Git has arrived at exactly the same result, but we have had to do nothing other than write a small script.
For larger tests, this would have saved us a large amount of work.
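A check script can also tell \texttt{git bisect run} that a revision is untestable, for example because it does not compile, by exiting with the status code \texttt{125}.
The sketch below illustrates this under some loud assumptions: it builds its own throwaway repository, the files \texttt{buildable}, \texttt{feature} and \texttt{counter} and the tag \texttt{known-good} are invented, and a missing \texttt{buildable} file merely stands in for a broken build.
Whether the untestable revision is actually visited depends on which revisions Git chooses to test.
\begin{code}
#!/bin/sh
# Sketch only: a throwaway repository with made-up names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Keep the check script outside the working tree, so that checking out
# old revisions never removes it.
check=$(mktemp)
cat > "$check" <<'EOF'
#!/bin/sh
# 125: this revision cannot be tested, so ask Git to skip it.
[ -f buildable ] || exit 125
if grep -q Addition feature 2>/dev/null; then
    exit 0      # good: the string is present
else
    exit 1      # bad: the string (or the whole file) is missing
fi
EOF

# Commit 1 is known to be good.
touch buildable
echo "Addition to the line" > feature
git add buildable feature
git commit -q -m "commit 1"
git tag known-good

# Commits 2 to 8: commit 3 is untestable, commit 6 introduces the bug.
i=2
while [ "$i" -le 8 ]; do
    case "$i" in
        3) git rm -q buildable ;;
        4) touch buildable; git add buildable ;;
        6) git rm -q feature ;;
    esac
    echo "$i" > counter
    git add counter
    git commit -q -m "commit $i"
    i=$((i + 1))
done

# First revision given is bad, the second is good.
git bisect start HEAD known-good
git bisect run sh "$check"

# When the bisect finishes, refs/bisect/bad points at the first bad commit.
first_bad=$(git log -1 --format=%s refs/bisect/bad)
echo "First bad commit: $first_bad"

# End the session and return to where we started.
git bisect reset
\end{code}
Keeping the check script outside the repository is a design choice made for this sketch: it avoids having to exclude the script from the search, and it cannot disappear when an older revision is checked out.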
\begin{trenches}
``Simon could I have a word?'' It was Rob and he wasn't looking happy.
Simon turned to him and grinned, ``Sure buddy what's up?'' His face dropped when he saw Rob's expression.
``I think we'd better go grab the meeting room.''
Simon looked confused.
``I used the bisect tool to find the bug. But you're not gonna like what I found.''
\thoughtbreak
``Simon how could you have done that?'' John was asking the questions and they were coming thick and fast.
``I mean changing the API key for the web service whilst developing was not a great idea to start with, but committing that to the repository was ridiculous.''
Simon sat there with his head in his hands.
``You know how secret that API key is right?'' Simon nodded.
``Simon we were supposed to be releasing this repository publicly in a few weeks but now that the API key is in there we can't do that.''
``John I'm really sorry OK.'' Simon was kicking himself for his mistake.
John sighed, he had been really angry to begin with but now he was calming down,
``It's OK Simon, we're all getting used to the repository and version control. Do you think we can fix it?''
\end{trenches}
\section{Day 3 - ``Filtered repos''}
\subsection{Looking at a repo with rose tinted glasses}
\index{filtering}It does happen. Sometimes when people are under pressure, mistakes are made, just like earlier when we accidentally deleted our branch from the repository.
This time the mistake is a little more crucial but again it does happen and it sometimes goes a long time before it is noticed.
\begin{trenches}
``So it's been in there for how long?'' asked John.
Simon looked pretty sheepish as he mouthed the words, ``Weeks.''
John bit on the end of the pen in his hand.
His teeth chewed into the plastic, deforming the blue lid.
``Did you find a way of sorting it out yet?''
``I think so. It's not ideal, but I think so.''
\end{trenches}
It would be useful if we could rewrite the history to remove the information that we no longer want in it.
As it turns out there is a tool that we can use to do this.
The \indexgit{filter-branch} tool allows us to run operations on a branch to rewrite its history.
Hopefully you remember the care we need to take when rewriting history, but sometimes there is a real need to perform some of these operations.
Let us take a look at a few examples to see how this can work.
We are going to assume that our file \texttt{newfile1} contains some very sensitive information and we wish to remove it completely from the repository.
\begin{code}
john@satsuki:~/coderepo$ git checkout master
Already on 'master'
john@satsuki:~/coderepo$ ls -la
total 40
drwxr-xr-x 3 john john 4096 2011-07-27 19:54 .
drwxr-xr-x 32 john john 4096 2011-07-27 19:00 ..
-rw-r--r-- 1 john john 35 2011-07-22 07:15 another_file
-rw-r--r-- 1 john john 25 2011-07-22 07:15 cont_dev
drwxrwxr-x 9 john john 4096 2011-07-27 19:54 .git
-rw-r--r-- 1 john john 69 2011-07-27 19:54 newfile1
-rw-r--r-- 1 john john 58 2011-07-22 07:15 newfile2
-rw-r--r-- 1 john john 45 2011-07-22 07:15 newfile3
-rw-r--r-- 1 john john 8 2011-03-31 22:15 temp_file
-rwxrwxr-x 1 john john 114 2011-07-21 21:17 test.sh
john@satsuki:~/coderepo$
\end{code}
As you can see, currently we have \texttt{newfile1} in our tree.
We can also use the \texttt{git log} tool to see each commit which has touched that path.
\begin{code}
john@satsuki:~/coderepo$ git log --pretty=oneline master -- newfile1
9cb2af2a00fd2253060e6bf8cc6c377b3d55ecea Important Update
d50ffb2fa536d869f2c4e89e8d6a48e0a29c5cc1 Merged in zaney
a27d49ef11d9f0e66edbad8f6c7806510ad5b2be Made an awesome change
cfbecabb031696a217b77b0e1285f2d5fc2ea2a3 Fantastic new feature
55fb69f4ad26fdb6b90ac6f43431be40779962dd Added two new files
john@satsuki:~/coderepo$
\end{code}
So there were five commits in the past which have touched that path.
In our example we require the removal of this path from the entire history of the repository.
As this is a destructive operation that works on the current branch, meaning it will rewrite our branch HEAD, we are first going to switch into a new branch.
\begin{code}
john@satsuki:~/coderepo$ git checkout -b remove_file
Switched to a new branch 'remove_file'
john@satsuki:~/coderepo$
\end{code}
\index{filtering!index}Now we need to run the \texttt{git filter-branch} tool.
\begin{code}
john@satsuki:~/coderepo$ git filter-branch --index-filter 'git rm --cached --ignore-unmatch newfile1' HEAD
Rewrite 55fb69f4ad26fdb6b90ac6f43431be40779962dd (6/21)rm 'newfile1'
Rewrite 9710177657ae00665ca8f8027b17314346a5b1c4 (7/21)rm 'newfile1'
Rewrite 4ac92012609cf8ed2480aa5d7f807caf2545fe2f (8/21)rm 'newfile1'
Rewrite cfbecabb031696a217b77b0e1285f2d5fc2ea2a3 (9/21)rm 'newfile1'
Rewrite b119573f4508514c55e1c4e3bebec0ab3667d071 (10/21)rm 'newfile1'
Rewrite ed2301ba223a63a5a930b536a043444e019460a7 (11/21)rm 'newfile1'
Rewrite a27d49ef11d9f0e66edbad8f6c7806510ad5b2be (12/21)rm 'newfile1'
Rewrite 7cc32dbf121f2afa8c40337db54bafb26de5b9c4 (13/21)rm 'newfile1'
Rewrite d50ffb2fa536d869f2c4e89e8d6a48e0a29c5cc1 (14/21)rm 'newfile1'
Rewrite 9cb2af2a00fd2253060e6bf8cc6c377b3d55ecea (15/21)rm 'newfile1'
Rewrite 37950f861a3cc0868c65ee9571fc6c491aa689ea (16/21)rm 'newfile1'
Rewrite 1c3206aac0fb012bfdaf5ff00e320b565bb89e7d (17/21)rm 'newfile1'
Rewrite 1968324ce2899883fca76bc25496bcf2b15e7011 (18/21)rm 'newfile1'
Rewrite f8d5100142b43ffaba9bbd539ba4fd92af79bf0e (19/21)rm 'newfile1'
Rewrite a8281fb589e36389cc8cb0da7ebee225b4d1adfc (20/21)rm 'newfile1'
Rewrite 30900fe1b7e72411dabab8b02070f36e2431f704 (21/21)rm 'newfile1'
Ref 'refs/heads/remove_file' was rewritten
john@satsuki:~/coderepo$
\end{code}
We have passed a few parameters to \texttt{git filter-branch} and we should take a few seconds to discuss this as the syntax may seem a little strange.
Firstly we are invoking the \texttt{git filter-branch} tool, that should not be anything new at all.
Next, we are passing three parameters to it.
The first of these is the type of filter we wish to use.
In our case we have used the \texttt{--index-filter} option.
More information is available in the Git manual, but in a nutshell we have asked Git to work on the \emph{index} at each commit stage.
\index{filtering!tree}There is another similar option called \texttt{--tree-filter}, however care must be taken to distinguish between the two, as \texttt{--tree-filter} checks out every commit into a temporary directory before running the command, which makes it much slower.
This may not sound like a problem, until you discover that whatever is left in that directory afterwards becomes the new commit: new files are added automatically, and ignore rules such as \texttt{.gitignore} have no effect.
The next parameter is the actual command that we wish Git to perform on each revision.
In this case we want to \texttt{git rm --cached --ignore-unmatch newfile1} each time.
We have enclosed the command we wish to run inside quotes so that the shell passes it to \texttt{filter-branch} as a single argument, and there is no confusion over which parameters are part of the \texttt{filter-branch} and which are part of the \texttt{rm}.
Using \texttt{--cached} and \texttt{--ignore-unmatch} we have asked Git to work on just the \emph{index} and not to complain if it cannot find the file to delete.
Lastly we list the commit range we wish to filter.
In this case we have specified the target revision as \texttt{HEAD}.
Git will interpret this as meaning everything up to the \texttt{HEAD} revision.
As such Git will be rewriting the entire history of the branch.
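To make the difference between the two filter types more concrete, the following is a minimal sketch that removes a file once with each of them.
It is a standalone script that works in a throwaway repository rather than in \texttt{coderepo}; the file names, the branch names and the \texttt{scratch} variable are invented for illustration, and setting \texttt{FILTER\_BRANCH\_SQUELCH\_WARNING} is only there to silence the notice that recent versions of Git print before running \texttt{filter-branch}.
\begin{code}
#!/bin/sh
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # assumption: quietens newer Git
scratch=$(mktemp -d)                     # hypothetical throwaway directory
cd "$scratch"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

# Two commits, both of which contain the file we want rid of
echo "public"  > keep_me
echo "API key" > secret
git add keep_me secret
git commit -q -m "Add two files"
echo "more" >> keep_me
git commit -q -a -m "More work"

# Two identical branches, one for each kind of filter
git branch via_index
git branch via_tree

# Index filter: edits the index of each commit, nothing is checked out
git filter-branch -f --index-filter \
    'git rm -q --cached --ignore-unmatch secret' via_index

# Tree filter: checks each commit out into a temporary directory and runs
# an ordinary shell command there; what is left becomes the new tree.
# (-f lets the second run store its backup next to the first one.)
git filter-branch -f --tree-filter 'rm -f secret' via_tree

# Neither branch should list any commit touching the file now
git log --oneline via_index -- secret
git log --oneline via_tree -- secret
\end{code}
Both branches should end up with the same contents.
The index filter never has to check anything out, which is why it is usually the better choice when all we want is to delete a path.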
Now if we list the files in our \texttt{coderepo} directory, we can see something important has happened.
The file that we wanted removed has gone and \texttt{newfile1} is no more.
\begin{code}
john@satsuki:~/coderepo$ ls -la
total 36
drwxr-xr-x 3 john john 4096 2011-07-27 19:53 .
drwxr-xr-x 32 john john 4096 2011-07-27 19:00 ..
-rw-r--r-- 1 john john 35 2011-07-22 07:15 another_file
-rw-r--r-- 1 john john 25 2011-07-22 07:15 cont_dev
drwxrwxr-x 9 john john 4096 2011-07-27 19:53 .git
-rw-r--r-- 1 john john 58 2011-07-22 07:15 newfile2
-rw-r--r-- 1 john john 45 2011-07-22 07:15 newfile3
-rw-r--r-- 1 john john 8 2011-03-31 22:15 temp_file
-rwxrwxr-x 1 john john 114 2011-07-21 21:17 test.sh
john@satsuki:~/coderepo$
\end{code}
Re-running the log command we ran earlier against our new branch confirms our operation.
However, checking out \textbf{master} also confirms that the file is still present elsewhere.
\begin{code}
john@satsuki:~/coderepo$ git log --pretty=oneline remove_file -- newfile1
john@satsuki:~/coderepo$ git checkout master
Switched to branch 'master'
john@satsuki:~/coderepo$ ls -la
total 40
drwxr-xr-x 3 john john 4096 2011-07-27 19:54 .
drwxr-xr-x 32 john john 4096 2011-07-27 19:00 ..
-rw-r--r-- 1 john john 35 2011-07-22 07:15 another_file
-rw-r--r-- 1 john john 25 2011-07-22 07:15 cont_dev
drwxrwxr-x 9 john john 4096 2011-07-27 19:54 .git
-rw-r--r-- 1 john john 69 2011-07-27 19:54 newfile1
-rw-r--r-- 1 john john 58 2011-07-22 07:15 newfile2
-rw-r--r-- 1 john john 45 2011-07-22 07:15 newfile3
-rw-r--r-- 1 john john 8 2011-03-31 22:15 temp_file
-rwxrwxr-x 1 john john 114 2011-07-21 21:17 test.sh
john@satsuki:~/coderepo$
\end{code}
It should be stressed at this point how destructive the \texttt{git filter-branch} command can be to your repository.
The \textbf{master} and \textbf{remove\_file} branches have diverged from the point where \texttt{newfile1} was first introduced.
Consequently all of our other branches, such as \textbf{zaney} and \textbf{wonderful}, are still built on the original history of \textbf{master} and not on the rewritten one.
We would have to rewrite those branches too, but because rewriting creates entirely new commit objects, we could lose the relationships between the branches and their ancestors.
In short, though it is exceedingly powerful, this type of filtering can cause huge distress to other people working on the project.
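One way to reduce this risk is to rewrite every ref in a single run.
Anything after a bare \texttt{--} is handed to \texttt{git rev-list}, so \texttt{-- --all} selects all branches and tags, and \texttt{--tag-name-filter cat} keeps the tag names unchanged.
Commits shared between branches are then rewritten only once, so the rewritten branches still meet at a common ancestor.
The script below is only a sketch: it builds its own scratch repository, and the file names, the \texttt{feature} branch, the \texttt{v0.1} tag and the shell variables are all hypothetical.
\begin{code}
#!/bin/sh
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # assumption: quietens newer Git
scratch=$(mktemp -d)                     # hypothetical throwaway directory
cd "$scratch"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

# A shared first commit that contains the sensitive file
echo "API key" > secret
echo "code" > app
git add secret app
git commit -q -m "Initial commit"
git tag v0.1
main_branch=$(git symbolic-ref --short HEAD)
git branch feature

# The two branches now diverge
echo "more code" >> app
git commit -q -a -m "Work on the main line"
git checkout -q feature
echo "feature" > feature_file
git add feature_file
git commit -q -m "Work on feature"

# Rewrite every branch and tag in one pass
git filter-branch --index-filter \
    'git rm -q --cached --ignore-unmatch secret' \
    --tag-name-filter cat -- --all

# Should print nothing. We ask for branches and tags rather than --all,
# because the backups under refs/original still contain the file.
git log --oneline --branches --tags -- secret

# The rewritten branches still share a common ancestor
git merge-base "$main_branch" feature
\end{code}
Even so, everyone else working on the project would still hold the old commits, so the warning about distress to collaborators applies just the same.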
\begin{trenches}
``So what do we do?'' asked John.
``We can't push out the repo as it is because it contains the API key.''
He massaged his forehead moving down to his eyebrows.
``But we seem to be introducing a real headache if we filter the branch. Any suggestions?''
``Well the project is going to be finished in a few weeks right?'' Simon was sitting at the end of the table.
He was ashamed and was talking through a pair of hands desperately trying to conceal his identity.
``Yeh, but what the hell has that got to do with it?'' snorted Klaus.
``I'm just thinking that we leave the repo like it is until all development has finished,'' he paused to run his hands through his hair,
``then we filter the branch just before we release it.''
He looked over at John, ``At that point there shouldn't be any test or dev branches, and we can just get everyone to clone the repo if we need to do anything else.''
John nodded. ``You know Simon I think you may have just redeemed yourself.''
\end{trenches}
\begin{callout}{Note}{Since you've been gone}
\index{filtering!purging}Even though we have rewritten our tree, the fact that another branch still has the file present means that our potentially sensitive data still exists somewhere inside the repository.
In order to truly get rid of the file we would need to not only remove the file from all branches, or delete the branches that contained the file,
but also run a few more steps if we wanted to ensure the file was gone \emph{now}.
Be aware that these steps are potentially very destructive to a repository.
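The best way to remove the file completely would be to remove ALL references to the file and then clone the repository.
Git will not transfer objects into a new repository if nothing references them, although a clone from a plain local path copies the object store wholesale, so in that case use a \texttt{file://} URL or the \texttt{--no-local} option.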
Alternatively if you absolutely must work on the current repository, you would need to do the following.
\newline
\newline
Delete the \texttt{filter-branch} backup using \index{git commands!update-ref@\texttt{update-ref}}\texttt{git update-ref -d <refname>} (see the callout on \emph{More backups}).
\newline
\newline
Expire all reflogs with \texttt{git reflog expire --expire=now --all}
\newline
\newline
Repack all of the pack files with \texttt{git repack -ad}\index{git commands!repack@\texttt{repack}}
\newline
\newline
Prune all unreachable objects with \texttt{git prune}\index{git commands!prune@\texttt{prune}}
\newline
\newline
As you can see some of these are quite scary procedures and so it is important that you understand all that you are doing before you do it.
\end{callout}
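The steps listed in the callout can be strung together as follows.
This is only a sketch, run against a throwaway repository that the script creates for itself; the file names and the \texttt{scratch}, \texttt{branch} and \texttt{blob} variables are made up for illustration.
It records the object ID of the sensitive file first, so that we can check afterwards whether Git is still able to find it.
\begin{code}
#!/bin/sh
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # assumption: quietens newer Git
scratch=$(mktemp -d)                     # hypothetical throwaway directory
cd "$scratch"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

echo "API key" > secret
echo "code" > app
git add secret app
git commit -q -m "Add files"
branch=$(git symbolic-ref --short HEAD)
blob=$(git rev-parse HEAD:secret)   # ID of the sensitive file's contents

# Rewrite the only branch so that no commit on it contains the file
git filter-branch --index-filter \
    'git rm -q --cached --ignore-unmatch secret' HEAD

# The object is still stored: the backup ref and the reflog point at it
git cat-file -e "$blob" && echo "blob is still in the repository"

# 1. delete the filter-branch backup
git update-ref -d "refs/original/refs/heads/$branch"
# 2. expire all reflogs
git reflog expire --expire=now --all
# 3. repack everything that is still reachable
git repack -a -d -q
# 4. prune whatever is no longer reachable
git prune

git cat-file -e "$blob" 2>/dev/null || echo "blob is gone"
\end{code}
Bear in mind that this only cleans the one repository it is run in.
Any clone made before the purge still contains the file.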
The idea Simon is proposing is only really viable because of Tamagoyaki's situation.
The code is due to be finished soon and once that happens, the team have decided to push a rewritten branch into the public domain and to resync all of their development repositories to this new branch.
It should be noted that the \texttt{filter-branch} tool can be used in other circumstances too.
We are going to take a look at just one of these.
However, let us first clean up our repository a little and move some things around.
\begin{code}
john@satsuki:~/coderepo$ mkdir tester
john@satsuki:~/coderepo$ ls
another_file cont_dev newfile1 newfile2 newfile3 temp_file tester test.sh
john@satsuki:~/coderepo$ mv test.sh tester/
john@satsuki:~/coderepo$ git mv newfile* tester
john@satsuki:~/coderepo$ git add tester/test.sh
john@satsuki:~/coderepo$ rm temp_file
john@satsuki:~/coderepo$ git status
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# renamed: newfile1 -> tester/newfile1
# renamed: newfile2 -> tester/newfile2
# renamed: newfile3 -> tester/newfile3
# new file: tester/test.sh
#
john@satsuki:~/coderepo$ git commit -a -m 'Moved testing suite'
[master f08ac57] Moved testing suite
4 files changed, 9 insertions(+), 0 deletions(-)
rename newfile1 => tester/newfile1 (100%)
rename newfile2 => tester/newfile2 (100%)
rename newfile3 => tester/newfile3 (100%)
create mode 100755 tester/test.sh
john@satsuki:~/coderepo$
\end{code}
We have switched back to our \textbf{master} branch and in doing so have regained \texttt{newfile1}.
We also deleted our rewritten branch (not shown above), removed \texttt{temp\_file} and moved \texttt{test.sh} along with all of the \texttt{newfile}s into a new folder called \texttt{tester}.
\section{Day 4 - ``Let's make a library''}
\subsection{Splitting the atom}
Sometimes, after a project has been running for a while, certain components turn out to be rather useful in their own right.
When this happens, people often want to move such a component outside of the original project and maintain it as a separate library.
Of course the easiest way to do this is to just copy and paste the files out of the main project and into a directory of their own.
In doing this we would lose, or at least disconnect, all of the development history of that subproject up to this point.
\index{filtering!sub-directory}Using \texttt{git filter-branch} we can actually pull out a folder and retain all of its history.
The methodology behind this is that we rewrite the history to a new branch, but we only pull across changes to a particular folder and we store those in the root of the branch.
Let us see how this works with a quick example.
Remember we created the \texttt{tester} folder?
We are going to make a few commits to the files in this folder to give it some history.
\begin{code}
john@satsuki:~/coderepo$ echo "More development work" >> tester/newfile1
john@satsuki:~/coderepo$ git commit -a -m 'Work on tester nf1'
[master 1a4956b] Work on tester nf1
1 files changed, 1 insertions(+), 0 deletions(-)
john@satsuki:~/coderepo$ echo "More dev work" >> tester/newfile2
john@satsuki:~/coderepo$ git commit -a -m 'Work on tester nf2'
[master 7156104] Work on tester nf2
1 files changed, 1 insertions(+), 0 deletions(-)
john@satsuki:~/coderepo$ echo "Even more dev work" >> tester/newfile3
john@satsuki:~/coderepo$ git commit -a -m 'Work on tester nf3'
[master 1433223] Work on tester nf3
1 files changed, 1 insertions(+), 0 deletions(-)
john@satsuki:~/coderepo$
\end{code}
Now we are going to split that off into a separate branch which we will then clone into a new Git repository.
After we have copied the history of the \texttt{tester} folder to a new branch, see if you can run through in your head the steps we would need to take to pull this branch into a new repository.
\begin{code}
john@satsuki:~/coderepo$ git checkout -b tester_split
Switched to a new branch 'tester_split'
john@satsuki:~/coderepo$ git filter-branch --subdirectory-filter tester
Rewrite 1433223d9c8a8abc35410d12cf78128c318b6e42 (4/4)
Ref 'refs/heads/tester_split' was rewritten
john@satsuki:~/coderepo$ git branch
develop
master
* tester_split
wonderful
zaney
john@satsuki:~/coderepo$ ls
newfile1 newfile2 newfile3 test.sh
john@satsuki:~/coderepo$ git checkout master
Switched to branch 'master'
john@satsuki:~/coderepo$ ls
another_file cont_dev tester
john@satsuki:~/coderepo$
\end{code}
So now the directory has been split away from the original source code into a new branch. Have a think about what steps you would take to bring this into an entirely new repository.
\begin{callout}{Note}{More backups}
\index{filtering!backup}Git likes to make things easy for you.
You may not have noticed it before, but when using the \texttt{git filter-branch} tool to rewrite a branch, Git keeps a backup of the commit the branch pointed to before you started rewriting it.
This backup is kept in \texttt{refs/original/refs/heads/<branch\_name>}.
This ref will contain a commit ID which we can use to revert our branch back to its original state, if the filter goes horribly wrong.
\end{callout}
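As a minimal sketch of how that backup might be used, the script below filters a branch and then puts it back exactly as it was.
It runs in a scratch repository of its own, and the file names and the \texttt{scratch}, \texttt{branch} and \texttt{before} variables are assumptions made for the example.
\begin{code}
#!/bin/sh
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # assumption: quietens newer Git
scratch=$(mktemp -d)                     # hypothetical throwaway directory
cd "$scratch"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

mkdir tester
echo "echo testing" > tester/test.sh
echo "code" > app
git add app tester
git commit -q -m "Add files"
branch=$(git symbolic-ref --short HEAD)
before=$(git rev-parse HEAD)         # where the branch points right now

# Rewrite the branch so that only the tester folder survives
git filter-branch --subdirectory-filter tester HEAD
ls                                   # only test.sh is left

# The backup ref still records the original commit
git for-each-ref refs/original/

# Move the branch, the index and the working tree back to the backup
git reset -q --hard "refs/original/refs/heads/$branch"
ls                                   # app and tester are back
\end{code}
Once you are happy with the result of a filter, the backup can be deleted with \texttt{git update-ref -d}.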
\begin{trenches}
``So John, I managed to split the Atom library out into a new branch like you said, but I have no idea how to pull this into a new repo.''
Jack was finally feeling like he had gotten to grips with Git, but his latest task had left him feeling a little dejected.
He idly stabbed at his leg with a pen whilst waiting for John to finish his tapping away.
John lifted his fingers from the keyboard and turned his chair.
``You really can't think of a way to copy what we have in one repo into another?''
Suddenly it was like a light bulb had exploded with light inside Jack's skull. ``CLONES!'' he shouted.
\end{trenches}
We actually have at least four methods we can use to do this.
\begin{enumerate}
\item Copy the data from one repo to another with a simple copy and paste
\item Clone our repository, delete all of the branches other than \textbf{tester\_split} and then rename it to \textbf{master}
\item Initialise a new repository, setup a remote to the original and then fetch our \textbf{tester\_split} branch
\item Create a bundle of the \textbf{tester\_split} and then clone from the bundle into a new repository
\end{enumerate}
The first of these will leave us with no history of development at all, so let us ignore it, as it is not what we require.
The second of these is trivial and should require no explanation at all.
We simply clone and then using the usual tools, we delete all unnecessary branches.
However this method does have its disadvantages, namely the fact that when we clone the repository, we take every single object from the source repository into the new one.
Whilst this is generally not a problem, it would mean that we would have to run some fairly aggressive garbage collection to remove all of these unwanted objects.
This would happen naturally over time as the objects aged and were no longer referenced, but it would result in a repository that was initially much larger than it needed to be.
The other two methods deserve a little more consideration as they both perform much better in this respect.
You should be familiar enough with the previous material to perform the third method right now.
However, using the fetch command as we have done before would again pull in many more objects than we require.
As such we are going to apply a subtle twist to this command in the following output.
\begin{code}
john@satsuki:~/coderepo$ cd ../
john@satsuki:~$ mkdir subrepo
john@satsuki:~$ cd subrepo/
john@satsuki:~/subrepo$ git init
Initialized empty Git repository in /home/john/subrepo/.git/
john@satsuki:~/subrepo$ git remote add source /home/john/coderepo
john@satsuki:~/subrepo$ git fetch source +tester_split:master
fatal: Refusing to fetch into current branch refs/heads/master of non-bare repository
john@satsuki:~/subrepo$ fatal: The remote end hung up unexpectedly
john@satsuki:~/subrepo$
\end{code}
\index{branching!fetch single branch}\index{fetching!single branch}What we have asked Git to do is to pull only the branch \textbf{tester\_split} from the remote we called \textbf{source} and place it into \textbf{master} locally.
Think of the \texttt{+<branch>:<branch>} as \texttt{+<source>:<destination>} and all will make sense.
As you can see Git is not too happy about our intentions here, as it refuses to fetch into the branch that is currently checked out in a non-bare repository, which in our freshly initialised repository is \textbf{master}.
That is OK, we have another way around this.
\begin{code}
john@satsuki:~/subrepo$ git fetch source +tester_split:tmp
remote: Counting objects: 15, done.
remote: Compressing objects: 100% (14/14), done.
remote: Total 15 (delta 3), reused 0 (delta 0)
Unpacking objects: 100% (15/15), done.
From /home/john/coderepo
* [new branch] tester_split -> tmp
john@satsuki:~/subrepo$ git branch -m tmp master
john@satsuki:~/subrepo$
\end{code}
So we have almost deceived Git a little here, but I think we can live with ourselves.
By first pulling the branch into a \textbf{tmp} branch, we were then allowed to rename it as \textbf{master}.
Notice the number of objects required for this branch: \texttt{15}.
If you remember when we cloned our repository a few \emph{weeks} ago, this value was a lot higher than this.
It was the subtle \texttt{+<source>:<destination>} which prevented us from pulling every last object from the source repository into our new slim \emph{sub}-repository.
\begin{code}
john@satsuki:~/subrepo$ ls
john@satsuki:~/subrepo$ git checkout master
Already on 'master'
john@satsuki:~/subrepo$ ls
newfile1 newfile2 newfile3 test.sh
john@satsuki:~/subrepo$
\end{code}
Notice that there are no files in the repository until we have checked out.
This is because all the fetch did was to \emph{fetch} the objects and place them in the repository object directory.
It did not place anything in the working directory.
If you remember, this is the same behaviour we saw with fetching before.
So now we have a complete copy of our \texttt{tester} component of our repository from the source into a new repository.
If we do a \texttt{git log}, we can see the history of the development.
\begin{code}
john@satsuki:~/subrepo$ git log --format=oneline
590e0eb79bc5ba0bc09f611392e643f676b00a04 Work on tester nf3
785b86d877d2a5c0679d98181a23d06ed2ba7652 Work on tester nf2
1ff89f787438f081a0d74de2d26eb2d831c9c738 Work on tester nf1
a5a0d9762dd4b50d8f3228e37b315f6056d5a034 Moved testing suite
john@satsuki:~/subrepo$
\end{code}
Unfortunately since some of our development work on these files happened outside of this directory,
this was lost when splitting and this is something to keep in mind should you ever perform this kind of operation.
\subsection{Little bundles of joy}
Git has so many ways to do things.
\index{bundling}This is in part what makes it a little daunting for those just starting, but after you have gained a little experience, you begin to understand just what is happening in the background.
When this realisation hits, you are able to almost immediately think of at least two different ways of performing the same thing.
There have been numerous examples throughout the book where there have been multiple ways to complete the same task.
Here we are going to look at just one more way that we can create a new repo from our \textbf{tester\_split} branch.
The tool we are going to introduce here is \indexgit{bundle}.
\index{bundling!creating}\index{bundling!cloning from}The \texttt{bundle} utility allows us to export a set of revisions and archive them to a file.
This file then becomes a resource that can be updated and pulled or fetched from.
This is especially useful if you have no physical connection between two computers and wish to sync some of the data from one to the other.
Let us take a quick look at how we could use the bundle tool in this case.
\begin{code}
john@satsuki:~/coderepo$ git bundle create ../tester.bundle tester_split
Counting objects: 15, done.
Compressing objects: 100% (14/14), done.
Writing objects: 100% (15/15), 1.50 KiB, done.
Total 15 (delta 3), reused 0 (delta 0)
john@satsuki:~/coderepo$ cd ..
john@satsuki:~$ git clone tester.bundle subrepo-b
Cloning into subrepo-b...
warning: remote HEAD refers to nonexistent ref, unable to checkout.
john@satsuki:~$
\end{code}
The syntax is fairly simple. The word \texttt{create} is used to tell Git to create a new bundle.
After this we specify a filename and then the tip of the branch that we want to archive. However, as can be seen above, there is a problem.
When we created the bundle, the branch which was checked out at the time was \textbf{master}.
The objects we pulled from the source repository and placed in the bundle were all from the \textbf{tester\_split} branch.
As such, at the time of the bundle creation the HEAD of the working tree pointed to an object in the \textbf{master} branch.
Obviously this object does not exist in our bundle and so Git complains.
If we had checked out \textbf{tester\_split} before creating the bundle, there would have been no complaints.
So all we have to do is to remap the HEAD of \textbf{master} to that of the HEAD of \textbf{tester\_split}.
As you can see below, it seems as if there are no branches at all and when we try to checkout master it does not exist.
What actually happened is that the objects were cloned into the repository, but as the object that the source HEAD pointed to was unavailable,
no branch was created.
With a little \texttt{git reset} trickery, we can create our \textbf{master} branch in our new repository.
\begin{code}
john@satsuki:~$ cd subrepo-b/
john@satsuki:~/subrepo-b$ git branch
john@satsuki:~/subrepo-b$ git checkout master
error: pathspec 'master' did not match any file(s) known to git.
john@satsuki:~/subrepo-b$ git reset --hard origin/tester_split
HEAD is now at 590e0eb Work on tester nf3
john@satsuki:~/subrepo-b$ git checkout master
Already on 'master'
john@satsuki:~/subrepo-b$ ls
newfile1 newfile2 newfile3 test.sh
john@satsuki:~/subrepo-b$
\end{code}
Now we have our repository complete as before and we have successfully remapped the \textbf{master} branch so that it points to \textbf{origin/tester\_split}.
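A similar result can be reached without the reset, by telling \texttt{git clone} which branch to check out.
The sketch below builds a small stand-in repository, bundles a branch called \texttt{tester\_split} and then clones the bundle with the \texttt{-b} option, so that Git checks out that branch instead of looking for the remote HEAD.
The directory names \texttt{source} and \texttt{subrepo-c}, the file contents and the \texttt{scratch} variable are invented for the example.
\begin{code}
#!/bin/sh
set -e
scratch=$(mktemp -d)                 # hypothetical throwaway directory
cd "$scratch"

# A stand-in for the source repository
git init -q source
cd source
git config user.name "Example User"
git config user.email "user@example.com"
echo "echo testing" > test.sh
git add test.sh
git commit -q -m "Moved testing suite"
git branch tester_split

# Archive just that branch and list the refs the bundle contains
git bundle create ../tester.bundle tester_split
git bundle list-heads ../tester.bundle

# Clone from the bundle, naming the branch we want checked out
cd ..
git clone -q -b tester_split tester.bundle subrepo-c
cd subrepo-c

# Rename the local branch so that the new repository has a master
git branch -m tester_split master
ls
\end{code}
The new repository keeps \textbf{origin/tester\_split} as a remote branch, so later bundles of the same branch could still be fetched into it.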
\begin{trenches}
Martha and John were sitting together in the office.