\input texinfo @c -*-texinfo-*-
@comment %**start of header
@setfilename diffutils.info
@include version.texi
@settitle Comparing and Merging Files
@syncodeindex vr cp
@setchapternewpage odd
@comment %**end of header
@copying
This manual is for GNU Diffutils
(version @value{VERSION}, @value{UPDATED}),
and documents the @acronym{GNU} @command{diff}, @command{diff3},
@command{sdiff}, and @command{cmp} commands for showing the
differences between files and the @acronym{GNU} @command{patch} command for
using their output to update files.
Copyright @copyright{} 1992-1994, 1998, 2001-2002, 2004, 2006, 2009-2015 Free
Software Foundation, Inc.
@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
A copy of the license is included in the section entitled
``@acronym{GNU} Free Documentation License.''
@end quotation
@end copying
@c Debian install-info (up through at least version 1.9.20) uses only the
@c first dircategory. Put this one first, as it is more useful in practice.
@dircategory Individual utilities
@direntry
* cmp: (diffutils)Invoking cmp. Compare 2 files byte by byte.
* diff: (diffutils)Invoking diff. Compare 2 files line by line.
* diff3: (diffutils)Invoking diff3. Compare 3 files line by line.
* patch: (diffutils)Invoking patch. Apply a patch to a file.
* sdiff: (diffutils)Invoking sdiff. Merge 2 files side-by-side.
@end direntry
@dircategory Text creation and manipulation
@direntry
* Diffutils: (diffutils). Comparing and merging files.
@end direntry
@titlepage
@title Comparing and Merging Files
@subtitle for Diffutils @value{VERSION} and @code{patch} 2.5.4
@subtitle @value{UPDATED}
@author David MacKenzie, Paul Eggert, and Richard Stallman
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage
@shortcontents
@contents
@ifnottex
@node Top
@top Comparing and Merging Files
@insertcopying
@end ifnottex
@menu
* Overview:: Preliminary information.
* Comparison:: What file comparison means.
* Output Formats:: Formats for two-way difference reports.
* Incomplete Lines:: Lines that lack trailing newlines.
* Comparing Directories:: Comparing files and directories.
* Adjusting Output:: Making @command{diff} output prettier.
* diff Performance:: Making @command{diff} smarter or faster.
* Comparing Three Files:: Formats for three-way difference reports.
* diff3 Merging:: Merging from a common ancestor.
* Interactive Merging:: Interactive merging with @command{sdiff}.
* Merging with patch:: Using @command{patch} to change old files into new ones.
* Making Patches:: Tips for making and using patch distributions.
* Invoking cmp:: Compare two files byte by byte.
* Invoking diff:: Compare two files line by line.
* Invoking diff3:: Compare three files line by line.
* Invoking patch:: Apply a diff file to an original.
* Invoking sdiff:: Side-by-side merge of file differences.
* Standards conformance:: Conformance to the @acronym{POSIX} standard.
* Projects:: If you've found a bug or other shortcoming.
* Copying This Manual:: How to make copies of this manual.
* Translations:: Available translations of this manual.
* Index:: Index.
@end menu
@node Overview
@unnumbered Overview
@cindex overview of @command{diff} and @command{patch}
Computer users often find occasion to ask how two files differ. Perhaps
one file is a newer version of the other file. Or maybe the two files
started out as identical copies but were changed by different people.
You can use the @command{diff} command to show differences between two
files, or each corresponding file in two directories. @command{diff}
outputs differences between files line by line in any of several
formats, selectable by command line options. This set of differences is
often called a @dfn{diff} or @dfn{patch}. For files that are identical,
@command{diff} normally produces no output; for binary (non-text) files,
@command{diff} normally reports only that they are different.
You can use the @command{cmp} command to show the byte and line numbers
where two files differ. @command{cmp} can also show all the bytes
that differ between the two files, side by side. A way to compare
two files character by character is the Emacs command @kbd{M-x
compare-windows}. @xref{Other Window, , Other Window, emacs, The @acronym{GNU}
Emacs Manual}, for more information on that command.
You can use the @command{diff3} command to show differences among three
files. When two people have made independent changes to a common
original, @command{diff3} can report the differences between the original
and the two changed versions, and can produce a merged file that
contains both persons' changes together with warnings about conflicts.
You can use the @command{sdiff} command to merge two files interactively.
You can use the set of differences produced by @command{diff} to distribute
updates to text files (such as program source code) to other people.
This method is especially useful when the differences are small compared
to the complete files. Given @command{diff} output, you can use the
@command{patch} program to update, or @dfn{patch}, a copy of the file. If you
think of @command{diff} as subtracting one file from another to produce
their difference, you can think of @command{patch} as adding the difference
to one file to reproduce the other.
This manual first concentrates on making diffs, and later shows how to
use diffs to update files.
@acronym{GNU} @command{diff} was written by Paul Eggert, Mike Haertel,
David Hayes, Richard Stallman, and Len Tower. Wayne Davison designed and
implemented the unified output format. The basic algorithm is described
by Eugene W. Myers in ``An O(ND) Difference Algorithm and its Variations'',
@cite{Algorithmica} Vol.@: 1, 1986, pp.@: 251--266,
@url{http://dx.doi.org/10.1007/BF01840446}; and in ``A File
Comparison Program'', Webb Miller and Eugene W. Myers,
@cite{Software---Practice and Experience} Vol.@: 15, 1985,
pp.@: 1025--1040,
@url{http://dx.doi.org/10.1002/spe.4380151102}.
@c From: "Gene Myers" <[email protected]>
@c They are about the same basic algorithm; the Algorithmica
@c paper gives a rigorous treatment and the sub-algorithm for
@c delivering scripts and should be the primary reference, but
@c both should be mentioned.
The algorithm was independently discovered as described by Esko Ukkonen in
``Algorithms for Approximate String Matching'',
@cite{Information and Control} Vol.@: 64, 1985, pp.@: 100--118,
@url{http://dx.doi.org/10.1016/S0019-9958(85)80046-2}.
@c From: "Gene Myers" <[email protected]>
@c Date: Wed, 29 Sep 1993 08:27:55 MST
@c Ukkonen should be given credit for also discovering the algorithm used
@c in GNU diff.
Related algorithms are surveyed by Alfred V. Aho in
section 6.3 of ``Algorithms for Finding Patterns in Strings'',
@cite{Handbook of Theoretical Computer Science} (Jan Van Leeuwen,
ed.), Vol.@: A, @cite{Algorithms and Complexity}, Elsevier/MIT Press,
1990, pp.@: 255--300.
@acronym{GNU} @command{diff3} was written by Randy Smith. @acronym{GNU}
@command{sdiff} was written by Thomas Lord. @acronym{GNU} @command{cmp}
was written by Torbj@"orn Granlund and David MacKenzie.
@acronym{GNU} @command{patch} was written mainly by Larry Wall and Paul Eggert;
several @acronym{GNU} enhancements were contributed by Wayne Davison and
David MacKenzie. Parts of this manual are adapted from a manual page
written by Larry Wall, with his permission.
@node Comparison
@chapter What Comparison Means
@cindex introduction
There are several ways to think about the differences between two files.
One way to think of the differences is as a series of lines that were
deleted from, inserted in, or changed in one file to produce the other
file. @command{diff} compares two files line by line, finds groups of
lines that differ, and reports each group of differing lines. It can
report the differing lines in several formats, which have different
purposes.
@acronym{GNU} @command{diff} can show whether files are different
without detailing the differences. It also provides ways to suppress
certain kinds of differences that are not important to you. Most
commonly, such differences are changes in the amount of white space
between words or lines. @command{diff} also provides ways to suppress
differences in alphabetic case or in lines that match a regular
expression that you provide. These options can accumulate; for
example, you can ignore changes in both white space and alphabetic
case.
Another way to think of the differences between two files is as a
sequence of pairs of bytes that can be either identical or
different. @command{cmp} reports the differences between two files
byte by byte, instead of line by line. As a result, it is often
more useful than @command{diff} for comparing binary files. For text
files, @command{cmp} is useful mainly when you want to know only whether
two files are identical, or whether one file is a prefix of the other.
To illustrate the effect that considering changes byte by byte
can have compared with considering them line by line, think of what
happens if a single newline character is added to the beginning of a
file. If that file is then compared with an otherwise identical file
that lacks the newline at the beginning, @command{diff} will report that a
blank line has been added to the file, while @command{cmp} will report that
almost every byte of the two files differs.
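The contrast between the two views can be sketched in Python. This is
only an illustration: it uses Python's @code{difflib} module as a
stand-in for a line-by-line comparison, not the algorithm of
@command{diff}, and the sample text and the helper names
@code{differing_bytes} and @code{line_changes} are assumptions made
for the example.

@verbatim
```python
import difflib

original = "alpha\nbeta\ngamma\n"
shifted = "\n" + original  # same text with one newline added at the front


def differing_bytes(a, b):
    """Count positions at which two byte strings differ (byte-by-byte view)."""
    return sum(x != y for x, y in zip(a, b))


def line_changes(a, b):
    """Return the non-equal opcodes of a line-by-line comparison."""
    matcher = difflib.SequenceMatcher(
        None, a.splitlines(keepends=True), b.splitlines(keepends=True)
    )
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


byte_count = differing_bytes(original.encode(), shifted.encode())
changes = line_changes(original, shifted)
```
@end verbatim

Here @code{changes} holds a single insertion of one line, while
@code{byte_count} covers most of the compared positions.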
@command{diff3} normally compares three input files line by line, finds
groups of lines that differ, and reports each group of differing lines.
Its output is designed to make it easy to inspect two different sets of
changes to the same file.
These commands compare input files without necessarily reading them.
For example, if @command{diff} is asked simply to report whether two
files differ, and it discovers that the files have different sizes, it
need not read them to do its job.
@menu
* Hunks:: Groups of differing lines.
* White Space:: Suppressing differences in white space.
* Blank Lines:: Suppressing differences whose lines are all blank.
* Specified Lines:: Suppressing differences whose lines all match a pattern.
* Case Folding:: Suppressing differences in alphabetic case.
* Brief:: Summarizing which files are different.
* Binary:: Comparing binary files or forcing text comparisons.
@end menu
@node Hunks
@section Hunks
@cindex hunks
When comparing two files, @command{diff} finds sequences of lines common to
both files, interspersed with groups of differing lines called
@dfn{hunks}. Comparing two identical files yields one sequence of
common lines and no hunks, because no lines differ. Comparing two
entirely different files yields no common lines and one large hunk that
contains all lines of both files. In general, there are many ways to
match up lines between two given files. @command{diff} tries to minimize
the total hunk size by finding large sequences of common lines
interspersed with small hunks of differing lines.
For example, suppose the file @file{F} contains the three lines
@samp{a}, @samp{b}, @samp{c}, and the file @file{G} contains the same
three lines in reverse order @samp{c}, @samp{b}, @samp{a}. If
@command{diff} finds the line @samp{c} as common, then the command
@samp{diff F G} produces this output:
@example
1,2d0
< a
< b
3a2,3
> b
> a
@end example
@noindent
But if @command{diff} notices the common line @samp{b} instead, it produces
this output:
@example
1c1
< a
---
> c
3c3
< c
---
> a
@end example
@noindent
It is also possible to find @samp{a} as the common line. @command{diff}
does not always find an optimal matching between the files; it takes
shortcuts to run faster. But its output is usually close to the
shortest possible. You can adjust this tradeoff with the
@option{--minimal} (@option{-d}) option (@pxref{diff Performance}).
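As a rough check on the example above, the following Python sketch
computes the smallest possible number of deleted plus inserted lines
from the length of a longest common subsequence. It uses a simple
quadratic table, which is not the algorithm that @command{diff}
implements, and the function names are assumptions made for the
illustration.

@verbatim
```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two lists of lines."""
    # table[i][j] holds the LCS length of a[:i] and b[:j].
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]


def minimal_changed_lines(a, b):
    """Fewest deleted plus inserted lines needed to turn a into b."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


F = ["a", "b", "c"]
G = ["c", "b", "a"]
```
@end verbatim

For @file{F} and @file{G} only one line can be kept as common, so at
least four lines must change. Both sample outputs above contain exactly
four changed lines, so in this case either choice of common line is
optimal.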
@node White Space
@section Suppressing Differences in Blank and Tab Spacing
@cindex blank and tab difference suppression
@cindex tab and blank difference suppression
The @option{--ignore-tab-expansion} (@option{-E}) option ignores the
distinction between tabs and spaces on input. A tab is considered to be
equivalent to the number of spaces to the next tab stop (@pxref{Tabs}).
The @option{--ignore-trailing-space} (@option{-Z}) option ignores white
space at line end.
The @option{--ignore-space-change} (@option{-b}) option is stronger than
@option{-E} and @option{-Z} combined.
It ignores white space at line end, and considers all other sequences of
one or more white space characters within a line to be equivalent. With this
option, @command{diff} considers the following two lines to be equivalent,
where @samp{$} denotes the line end:
@example
Here lyeth  muche rychnesse  in lytell space.   -- John Heywood$
Here lyeth muche rychnesse in lytell space. -- John Heywood   $
@end example
The @option{--ignore-all-space} (@option{-w}) option is stronger still.
It ignores differences even if one line has white space where
the other line has none. @dfn{White space} characters include
tab, vertical tab, form feed, carriage return, and space;
some locales may define additional characters to be white space.
With this option, @command{diff} considers the
following two lines to be equivalent, where @samp{$} denotes the line
end and @samp{^M} denotes a carriage return:
@example
Here lyeth muche rychnesse in lytell space.-- John Heywood$
He relyeth much erychnes seinly tells pace. --John Heywood ^M$
@end example
For many other programs newline is also a white space character, but
@command{diff} is a line-oriented program and a newline character
always ends a line. Hence the @option{-w} or
@option{--ignore-all-space} option does not ignore newline-related
changes; it ignores only other white space changes.
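One way to picture the @option{-b} and @option{-w} rules is as
comparison keys: two lines are treated as equal when their keys are
equal. The Python sketch below models that idea under stated
assumptions. It covers only the white space characters listed above
(no locale-specific additions), the function names are hypothetical,
and it does not reflect how @command{diff} is implemented.

@verbatim
```python
import re

# White space other than newline: space, tab, vertical tab,
# form feed, and carriage return.
_WS = r"[ \t\v\f\r]"


def key_ignore_space_change(line):
    """Model of -b: drop white space at line end, collapse other runs."""
    without_trailing = re.sub(_WS + "+$", "", line)
    return re.sub(_WS + "+", " ", without_trailing)


def key_ignore_all_space(line):
    """Model of -w: remove every white space character."""
    return re.sub(_WS, "", line)


b1 = "Here lyeth  muche rychnesse  in lytell space.   -- John Heywood"
b2 = "Here lyeth muche rychnesse in lytell space. -- John Heywood   "
w1 = "Here lyeth muche rychnesse in lytell space.-- John Heywood"
w2 = "He relyeth much erychnes seinly tells pace. --John Heywood \r"
```
@end verbatim

The first pair has equal keys under both functions. The second pair
has equal keys only under @code{key_ignore_all_space}, because the
lines place their white space between different characters.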
@node Blank Lines
@section Suppressing Differences Whose Lines Are All Blank
@cindex blank line difference suppression
The @option{--ignore-blank-lines} (@option{-B}) option ignores changes
that consist entirely of blank lines. With this option, for example, a
file containing
@example
1. A point is that which has no part.

2. A line is breadthless length.
-- Euclid, The Elements, I
@end example
@noindent
is considered identical to a file containing
@example
1. A point is that which has no part.
2. A line is breadthless length.


-- Euclid, The Elements, I
@end example
Normally this option affects only lines that are completely empty, but
if you also specify an option that ignores trailing spaces,
lines are also affected if they look empty but contain white space.
In other words, @option{-B} is equivalent to @samp{-I '^$'} by
default, but it is equivalent to @samp{-I '^[[:space:]]*$'} if
@option{-b}, @option{-w} or @option{-Z} is also specified.
@node Specified Lines
@section Suppressing Differences Whose Lines All Match a Regular Expression
@cindex regular expression suppression
To ignore insertions and deletions of lines that match a
@command{grep}-style regular expression, use the
@option{--ignore-matching-lines=@var{regexp}} (@option{-I @var{regexp}}) option.
You should escape
regular expressions that contain shell metacharacters to prevent the
shell from expanding them. For example, @samp{diff -I '^[[:digit:]]'} ignores
all changes to lines beginning with a digit.
However, @option{-I} ignores the insertion or deletion of lines that
match the regular expression only if every changed line in the hunk---every
insertion and every deletion---matches the regular expression. In other
words, for each nonignorable change, @command{diff} prints the complete set
of changes in its vicinity, including the ignorable ones.
You can specify more than one regular expression for lines to ignore by
using more than one @option{-I} option. @command{diff} tries to match each
line against each regular expression.
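The all-or-nothing rule for a hunk can be sketched as follows. This is
a simplified model, not the code of @command{diff}: the function name
@code{hunk_is_ignorable} is hypothetical, and Python regular
expressions differ from @command{grep}-style ones (for instance, the
sketch writes @code{[0-9]} where the text above uses a POSIX character
class).

@verbatim
```python
import re


def hunk_is_ignorable(changed_lines, patterns):
    """True if every inserted or deleted line matches at least one pattern."""
    compiled = [re.compile(p) for p in patterns]
    return all(
        any(rx.search(line) for rx in compiled) for line in changed_lines
    )
```
@end verbatim

With the single pattern @code{^[0-9]}, a hunk whose changed lines are
@samp{1 apple} and @samp{2 pears} would be ignorable, but a hunk that
also changes the line @samp{pears} would be reported in full.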
@node Case Folding
@section Suppressing Case Differences
@cindex case difference suppression
@acronym{GNU} @command{diff} can treat lower case letters as
equivalent to their upper case counterparts, so that, for example, it
considers @samp{Funky Stuff}, @samp{funky STUFF}, and @samp{fUNKy
stuFf} to all be the same. To request this, use the @option{-i} or
@option{--ignore-case} option.
@node Brief
@section Summarizing Which Files Differ
@cindex summarizing which files differ
@cindex brief difference reports
When you only want to find out whether files are different, and you
don't care what the differences are, you can use the summary output
format. In this format, instead of showing the differences between the
files, @command{diff} simply reports whether files differ. The
@option{--brief} (@option{-q}) option selects this output format.
This format is especially useful when comparing the contents of two
directories. It is also much faster than doing the normal line by line
comparisons, because @command{diff} can stop analyzing the files as soon as
it knows that there are any differences.
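The speed advantage comes from stopping early. The Python sketch below
shows the general idea under simple assumptions: it compares raw bytes
only, so it does not model options that ignore certain differences,
and the function name and the @code{chunk_size} default are
hypothetical.

@verbatim
```python
import os


def files_differ(path_a, path_b, chunk_size=65536):
    """Report whether two files differ, reading as little as possible."""
    # Different sizes already prove a difference; no data needs to be read.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return True
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            block_a = fa.read(chunk_size)
            block_b = fb.read(chunk_size)
            if block_a != block_b:
                return True   # stop at the first differing block
            if not block_a:
                return False  # both files ended without a difference
```
@end verbatim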
You can also get a brief indication of whether two files differ by using
@command{cmp}. For files that are identical, @command{cmp} produces no
output. When the files differ, by default, @command{cmp} outputs the byte
and line number where the first difference occurs, or reports that one
file is a prefix of the other. You can use
the @option{-s}, @option{--quiet}, or @option{--silent} option to
suppress that information, so that @command{cmp}
produces no output and reports whether the files differ using only its
exit status (@pxref{Invoking cmp}).
@c Fix this.
Unlike @command{diff}, @command{cmp} cannot compare directories; it can only
compare two files.
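The default report of @command{cmp} can be modeled with a short Python
sketch that finds the first differing byte and the line it falls on,
both counted from 1. The function name is an assumption made for the
illustration, and the sketch works on byte strings already in memory
rather than on files.

@verbatim
```python
def first_difference(a, b):
    """Return (byte_number, line_number) of the first difference, 1-based.

    Return None if the inputs agree over the length of the shorter one,
    which means they are identical or one is a prefix of the other.
    """
    line = 1
    for offset, (x, y) in enumerate(zip(a, b), 1):
        if x != y:
            return offset, line
        if x == 0x0A:  # newline byte: later bytes belong to the next line
            line += 1
    return None
```
@end verbatim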
@node Binary
@section Binary Files and Forcing Text Comparisons
@cindex binary file diff
@cindex text versus binary diff
If @command{diff} thinks that either of the two files it is comparing is
binary (a non-text file), it normally treats that pair of files much as
if the summary output format had been selected (@pxref{Brief}), and
reports only that the binary files are different. This is because line
by line comparisons are usually not meaningful for binary files.
This does not count as trouble, even though the resulting output does
not capture all the differences.
@command{diff} determines whether a file is text or binary by checking the
first few bytes in the file; the exact number of bytes is system
dependent, but it is typically several thousand. If every byte in
that part of the file is non-null, @command{diff} considers the file to be
text; otherwise it considers the file to be binary.
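The test described above can be modeled in a few lines of Python. The
function name and the @code{probe_size} default of 4096 bytes are
assumptions chosen for the sketch; as noted, the real number of bytes
examined is system dependent.

@verbatim
```python
def looks_binary(data, probe_size=4096):
    """A null byte within the leading probe_size bytes means binary."""
    return b"\0" in data[:probe_size]
```
@end verbatim

Under this model a file with a null byte only beyond the probed region
is still classified as text.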
Sometimes you might want to force @command{diff} to consider files to be
text. For example, you might be comparing text files that contain
null characters; @command{diff} would erroneously decide that those are
non-text files. Or you might be comparing documents that are in a
format used by a word processing system that uses null characters to
indicate special formatting. You can force @command{diff} to consider all
files to be text files, and compare them line by line, by using the
@option{--text} (@option{-a}) option. If the files you compare using this
option do not in fact contain text, they will probably contain few
newline characters, and the @command{diff} output will consist of hunks
showing differences between long lines of whatever characters the files
contain.
You can also force @command{diff} to report only whether files differ
(but not how). Use the @option{--brief} (@option{-q}) option for
this.
In operating systems that distinguish between text and binary files,
@command{diff} normally reads and writes all data as text. Use the
@option{--binary} option to force @command{diff} to read and write binary
data instead. This option has no effect on a @acronym{POSIX}-compliant system
like @acronym{GNU} or traditional Unix. However, many personal computer
operating systems represent the end of a line with a carriage return
followed by a newline. On such systems, @command{diff} normally ignores
these carriage returns on input and generates them at the end of each
output line, but with the @option{--binary} option @command{diff} treats
each carriage return as just another input character, and does not
generate a carriage return at the end of each output line. This can be
useful when dealing with non-text files that are meant to be
interchanged with @acronym{POSIX}-compliant systems.
The @option{--strip-trailing-cr} option causes @command{diff} to treat input
lines that end in carriage return followed by newline as if they end
in plain newline. This can be useful when comparing text that is
imperfectly imported from many personal computer operating systems.
This option affects how lines are read, which in turn affects how they
are compared and output.
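The effect on a single input line can be pictured with this small
Python sketch. It is a model of the behavior described above, not the
code of @command{diff}, and the function name is hypothetical.

@verbatim
```python
def strip_trailing_cr(line):
    """Treat a line ending in CR LF as if it ended in a plain newline."""
    if line.endswith(b"\r\n"):
        return line[:-2] + b"\n"
    return line
```
@end verbatim

A carriage return elsewhere in the line is left alone, so only the
line ending changes.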
If you want to compare two files byte by byte, you can use the
@command{cmp} program with the @option{--verbose} (@option{-l})
option to show the values of each differing byte in the two files.
With @acronym{GNU} @command{cmp}, you can also use the @option{-b} or
@option{--print-bytes} option to show the @acronym{ASCII} representation of
those bytes. @xref{Invoking cmp}, for more information.
If @command{diff3} thinks that any of the files it is comparing is binary
(a non-text file), it normally reports an error, because such
comparisons are usually not useful. @command{diff3} uses the same test as
@command{diff} to decide whether a file is binary. As with @command{diff}, if
the input files contain a few non-text bytes but otherwise are like
text files, you can force @command{diff3} to consider all files to be text
files and compare them line by line by using the @option{-a} or
@option{--text} option.
@node Output Formats
@chapter @command{diff} Output Formats
@cindex output formats
@cindex format of @command{diff} output
@command{diff} has several mutually exclusive options for output format.
The following sections describe each format, illustrating how
@command{diff} reports the differences between two sample input files.
@menu
* Sample diff Input:: Sample @command{diff} input files for examples.
* Context:: Showing differences with the surrounding text.
* Side by Side:: Showing differences in two columns.
* Normal:: Showing differences without surrounding text.
* Scripts:: Generating scripts for other programs.
* If-then-else:: Merging files with if-then-else.
@end menu
@node Sample diff Input
@section Two Sample Input Files
@cindex @command{diff} sample input
@cindex sample input for @command{diff}
Here are two sample files that we will use in numerous examples to
illustrate the output of @command{diff} and how various options can change
it.
This is the file @file{lao}:
@example
The Way that can be told of is not the eternal Way;
The name that can be named is not the eternal name.
The Nameless is the origin of Heaven and Earth;
The Named is the mother of all things.
Therefore let there always be non-being,
so we may see their subtlety,
And let there always be being,
so we may see their outcome.
The two are the same,
But after they are produced,
they have different names.
@end example
This is the file @file{tzu}:
@example
The Nameless is the origin of Heaven and Earth;
The named is the mother of all things.

Therefore let there always be non-being,
so we may see their subtlety,
And let there always be being,
so we may see their outcome.
The two are the same,
But after they are produced,
they have different names.
They both may be called deep and profound.
Deeper and more profound,
The door of all subtleties!
@end example
In this example, the first hunk contains just the first two lines of
@file{lao}, the second hunk contains the fourth line of @file{lao}
opposing the second and third lines of @file{tzu}, and the last hunk
contains just the last three lines of @file{tzu}.
@node Context
@section Showing Differences in Their Context
@cindex context output format
@cindex @samp{!} output format
Usually, when you are looking at the differences between files, you will
also want to see the parts of the files near the lines that differ, to
help you understand exactly what has changed. These nearby parts of the
files are called the @dfn{context}.
@acronym{GNU} @command{diff} provides two output formats that show context
around the differing lines: @dfn{context format} and @dfn{unified
format}. It can optionally show in which function or section of the
file the differing lines are found.
If you are distributing new versions of files to other people in the
form of @command{diff} output, you should use one of the output formats
that show context so that they can apply the diffs even if they have
made small changes of their own to the files. @command{patch} can apply
the diffs in this case by searching in the files for the lines of
context around the differing lines; if those lines are actually a few
lines away from where the diff says they are, @command{patch} can adjust
the line numbers accordingly and still apply the diff correctly.
@xref{Imperfect}, for more information on using @command{patch} to apply
imperfect diffs.
@menu
* Context Format:: An output format that shows surrounding lines.
* Unified Format:: A more compact output format that shows context.
* Sections:: Showing which sections of the files differences are in.
* Alternate Names:: Showing alternate file names in context headers.
@end menu
@node Context Format
@subsection Context Format
The context output format shows several lines of context around the
lines that differ. It is the standard format for distributing updates
to source code.
To select this output format, use the
@option{--context@r{[}=@var{lines}@r{]}} (@option{-C @var{lines}})
or @option{-c} option. The
argument @var{lines} that some of these options take is the number of
lines of context to show. If you do not specify @var{lines}, it
defaults to three. For proper operation, @command{patch} typically needs
at least two lines of context.
@menu
* Example Context:: Sample output in context format.
* Less Context:: Another sample with less context.
* Detailed Context:: A detailed description of the context output format.
@end menu
@node Example Context
@subsubsection An Example of Context Format
Here is the output of @samp{diff -c lao tzu} (@pxref{Sample diff Input},
for the complete contents of the two files). Notice that up to three
lines that are not different are shown around each line that is
different; they are the context lines. Also notice that the first two
hunks have run together, because their contents overlap.
@example
*** lao 2002-02-21 23:30:39.942229878 -0800
--- tzu 2002-02-21 23:30:50.442260588 -0800
***************
*** 1,7 ****
- The Way that can be told of is not the eternal Way;
- The name that can be named is not the eternal name.
  The Nameless is the origin of Heaven and Earth;
! The Named is the mother of all things.
  Therefore let there always be non-being,
  so we may see their subtlety,
  And let there always be being,
--- 1,6 ----
  The Nameless is the origin of Heaven and Earth;
! The named is the mother of all things.
! @-
  Therefore let there always be non-being,
  so we may see their subtlety,
  And let there always be being,
***************
*** 9,11 ****
--- 8,13 ----
  The two are the same,
  But after they are produced,
  they have different names.
+ They both may be called deep and profound.
+ Deeper and more profound,
+ The door of all subtleties!
@end example
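For experimenting without the command line tools, Python's standard
@code{difflib} module has a @code{context_diff} function that emits
this style of output. The sketch below feeds it the two sample files.
It is an independent implementation with its own matching algorithm,
so its hunks are not guaranteed to agree with those of @command{diff}
on other inputs, and it prints no time stamps unless they are
supplied. The wrapper name @code{context_diff_lines} is an assumption
made for the example.

@verbatim
```python
import difflib

lao = """\
The Way that can be told of is not the eternal Way;
The name that can be named is not the eternal name.
The Nameless is the origin of Heaven and Earth;
The Named is the mother of all things.
Therefore let there always be non-being,
so we may see their subtlety,
And let there always be being,
so we may see their outcome.
The two are the same,
But after they are produced,
they have different names.
""".splitlines(keepends=True)

tzu = """\
The Nameless is the origin of Heaven and Earth;
The named is the mother of all things.

Therefore let there always be non-being,
so we may see their subtlety,
And let there always be being,
so we may see their outcome.
The two are the same,
But after they are produced,
they have different names.
They both may be called deep and profound.
Deeper and more profound,
The door of all subtleties!
""".splitlines(keepends=True)


def context_diff_lines(a, b, context=3):
    """Return the context-format diff of two line lists as a list of lines."""
    return list(
        difflib.context_diff(a, b, fromfile="lao", tofile="tzu", n=context)
    )


if __name__ == "__main__":
    print("".join(context_diff_lines(lao, tzu)), end="")
```
@end verbatim

Passing a smaller @code{context} value plays the role of the
@var{lines} argument of @option{-C}.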
@node Less Context
@subsubsection An Example of Context Format with Less Context
Here is the output of @samp{diff -C 1 lao tzu} (@pxref{Sample diff
Input}, for the complete contents of the two files). Notice that at
most one context line is reported here.
@example
*** lao 2002-02-21 23:30:39.942229878 -0800
--- tzu 2002-02-21 23:30:50.442260588 -0800
***************
*** 1,5 ****
- The Way that can be told of is not the eternal Way;
- The name that can be named is not the eternal name.
  The Nameless is the origin of Heaven and Earth;
! The Named is the mother of all things.
  Therefore let there always be non-being,
--- 1,4 ----
  The Nameless is the origin of Heaven and Earth;
! The named is the mother of all things.
! @-
  Therefore let there always be non-being,
***************
*** 11 ****
--- 10,13 ----
  they have different names.
+ They both may be called deep and profound.
+ Deeper and more profound,
+ The door of all subtleties!
@end example
@node Detailed Context
@subsubsection Detailed Description of Context Format
The context output format starts with a two-line header, which looks
like this:
@example
*** @var{from-file} @var{from-file-modification-time}
--- @var{to-file} @var{to-file-modification-time}
@end example
@noindent
@vindex LC_TIME
@cindex time stamp format, context diffs
The time stamp normally looks like @samp{2002-02-21 23:30:39.942229878
-0800} to indicate the date, time with fractional seconds, and time
zone in @uref{ftp://ftp.isi.edu/in-notes/rfc2822.txt, Internet RFC
2822 format}. (The fractional seconds are omitted on hosts that do
not support fractional time stamps.) However, a traditional time
stamp like @samp{Thu Feb 21 23:30:39 2002} is used if the
@env{LC_TIME} locale category is either @samp{C} or @samp{POSIX}.
You can change the header's content with the
@option{--label=@var{label}} option; see @ref{Alternate Names}.
Next come one or more hunks of differences; each hunk shows one area
where the files differ. Context format hunks look like this:
@example
***************
*** @var{from-file-line-numbers} ****
  @var{from-file-line}
  @var{from-file-line}@dots{}
--- @var{to-file-line-numbers} ----
  @var{to-file-line}
  @var{to-file-line}@dots{}
@end example
If a hunk contains two or more lines, its line numbers look like
@samp{@var{start},@var{end}}. Otherwise only its end line number
appears. An empty hunk is considered to end at the line that precedes
the hunk.
The lines of context around the lines that differ start with two space
characters. The lines that differ between the two files start with one
of the following indicator characters, followed by a space character:
@table @samp
@item !
A line that is part of a group of one or more lines that changed between
the two files. There is a corresponding group of lines marked with
@samp{!} in the part of this hunk for the other file.
@item +
An ``inserted'' line in the second file that corresponds to nothing in
the first file.
@item -
A ``deleted'' line in the first file that corresponds to nothing in the
second file.
@end table
If all of the changes in a hunk are insertions, the lines of
@var{from-file} are omitted. If all of the changes are deletions, the
lines of @var{to-file} are omitted.
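As a quick check of this rule, the following shell session is a minimal
sketch; the file names @file{c1.txt} and @file{c2.txt} are hypothetical
scratch files chosen for illustration. The only change is an insertion,
so the @var{from-file} part of the hunk should consist of its
line-number line alone (@samp{*** 2 ****}), followed by
@samp{--- 2,3 ----} and the lines of the second file.
@example
# Two scratch files; the second one appends a single line.
printf 'one\ntwo\n' > c1.txt
printf 'one\ntwo\nthree\n' > c2.txt
# Context format with one line of context.  Exit status 1 only
# means that the files differ, so it is not treated as a failure.
diff -C 1 c1.txt c2.txt || test $? -eq 1
@end example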
@node Unified Format
@subsection Unified Format
@cindex unified output format
@cindex @samp{+-} output format
The unified output format is a variation on the context format that is
more compact because it omits redundant context lines. To select this
output format, use the
@option{--unified@r{[}=@var{lines}@r{]}} (@option{-U @var{lines}}),
or @option{-u} option.
The argument @var{lines} is the number of lines of context to show.
When it is not given, it defaults to three.
At present, only @acronym{GNU} @command{diff} can produce this format and
only @acronym{GNU} @command{patch} can automatically apply diffs in this
format. For proper operation, @command{patch} typically needs at
least three lines of context.
@menu
* Example Unified:: Sample output in unified format.
* Detailed Unified:: A detailed description of unified format.
@end menu
@node Example Unified
@subsubsection An Example of Unified Format
Here is the output of the command @samp{diff -u lao tzu}
(@pxref{Sample diff Input}, for the complete contents of the two files):
@example
--- lao 2002-02-21 23:30:39.942229878 -0800
+++ tzu 2002-02-21 23:30:50.442260588 -0800
@@@@ -1,7 +1,6 @@@@
-The Way that can be told of is not the eternal Way;
-The name that can be named is not the eternal name.
 The Nameless is the origin of Heaven and Earth;
-The Named is the mother of all things.
+The named is the mother of all things.
+
 Therefore let there always be non-being,
   so we may see their subtlety,
 And let there always be being,
@@@@ -9,3 +8,6 @@@@
 The two are the same,
 But after they are produced,
   they have different names.
+They both may be called deep and profound.
+Deeper and more profound,
+The door of all subtleties!
@end example
@node Detailed Unified
@subsubsection Detailed Description of Unified Format
The unified output format starts with a two-line header, which looks
like this:
@example
--- @var{from-file} @var{from-file-modification-time}
+++ @var{to-file} @var{to-file-modification-time}
@end example
@noindent
@cindex time stamp format, unified diffs
The time stamp looks like @samp{2002-02-21 23:30:39.942229878 -0800}
to indicate the date, time with fractional seconds, and time zone.
The fractional seconds are omitted on hosts that do not support
fractional time stamps.
You can change the header's content with the
@option{--label=@var{label}} option. @xref{Alternate Names}.
Next come one or more hunks of differences; each hunk shows one area
where the files differ. Unified format hunks look like this:
@example
@@@@ @var{from-file-line-numbers} @var{to-file-line-numbers} @@@@
 @var{line-from-either-file}
 @var{line-from-either-file}@dots{}
@end example
In the @samp{@@@@} line, the @var{from-file-line-numbers} are prefixed
with @samp{-} and the @var{to-file-line-numbers} with @samp{+}.
If a hunk and its context contain two or more lines, its
line numbers look like @samp{@var{start},@var{count}}. If it contains
just one line, only that line's number appears. An empty hunk is
considered to end at the line that precedes the hunk, and is shown
with a @var{count} of zero.
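The line-number conventions can be observed with a small experiment.
The following shell session is only a sketch; @file{a.txt} and
@file{b.txt} are hypothetical scratch files. With no context lines
(@option{-U 0}), inserting one line after line 2 should give the hunk
line @samp{@@@@ -2,0 +3 @@@@}: the empty range in the first file is
reported at the line that precedes it with a count of zero, and the
one-line range in the second file shows only its line number.
@example
# The second file has one extra line after "two".
printf 'one\ntwo\nthree\n' > a.txt
printf 'one\ntwo\ninserted\nthree\n' > b.txt
# Unified format with zero lines of context.  Exit status 1 only
# means that the files differ, so it is not treated as a failure.
diff -U 0 a.txt b.txt || test $? -eq 1
@end example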
The lines common to both files begin with a space character. The lines
that actually differ between the two files have one of the following
indicator characters in the left print column:
@table @samp
@item +
A line was added here to the first file.
@item -
A line was removed here from the first file.
@end table
@node Sections
@subsection Showing Which Sections Differences Are in
@cindex headings
@cindex section headings
Sometimes you might want to know which part of the files each change
falls in. If the files are source code, this could mean which
function was changed. If the files are documents, it could mean which
chapter or appendix was changed. @acronym{GNU} @command{diff} can
show this by displaying the nearest section heading line that precedes
the differing lines. Which lines are ``section headings'' is
determined by a regular expression.
@menu
* Specified Headings:: Showing headings that match regular expressions.
* C Function Headings:: Showing headings of C functions.
@end menu
@node Specified Headings
@subsubsection Showing Lines That Match Regular Expressions
@cindex specified headings
@cindex regular expression matching headings
To show in which sections differences occur for files that are not
source code for C or similar languages, use the
@option{--show-function-line=@var{regexp}} (@option{-F @var{regexp}}) option.
@command{diff}
considers lines that match the @command{grep}-style regular expression
@var{regexp} to be the beginning
of a section of the file. Here are suggested regular expressions for
some common languages:
@c Please add to this list, e.g. Fortran, Pascal, Perl, Python.
@table @samp
@item ^[[:alpha:]$_]
C, C++, Prolog
@item ^(
Lisp
@item ^@@node
Texinfo
@end table
This option does not automatically select an output format; in order to
use it, you must select the context format (@pxref{Context Format}) or
unified format (@pxref{Unified Format}). In other output formats it
has no effect.
The @option{--show-function-line} (@option{-F}) option finds the nearest
unchanged line that precedes each hunk of differences and matches the
given regular expression. Then it adds that line to the end of the
line of asterisks in the context format, or to the @samp{@@@@} line in
unified format. If no matching line exists, this option leaves the output for
that hunk unchanged. If that line is more than 40 characters long, it
outputs only the first 40 characters. You can specify more than one
regular expression for such lines; @command{diff} tries to match each line
against each regular expression, starting with the last one given. This
means that you can use @option{-p} and @option{-F} together, if you wish.
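As an illustration, the following shell session is a minimal sketch
that treats lines beginning with @samp{Chapter} as section headings;
the file names and the regular expression are assumptions made for
this example only. The single hunk should be headed by
@samp{@@@@ -5,2 +5,2 @@@@ Chapter One}, because @samp{Chapter One} is the
nearest preceding line that matches.
@example
# Two scratch files that differ only in their last line.
printf 'Chapter One\nalpha\nbeta\ngamma\ndelta\nepsilon\n' > old.txt
printf 'Chapter One\nalpha\nbeta\ngamma\ndelta\nEPSILON\n' > new.txt
# Unified format, one line of context, headings matched by -F.
# Exit status 1 only means that the files differ.
diff -U 1 -F '^Chapter' old.txt new.txt || test $? -eq 1
@end example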
@node C Function Headings
@subsubsection Showing C Function Headings
@cindex C function headings
@cindex function headings, C
To show in which functions differences occur for C and similar
languages, you can use the @option{--show-c-function} (@option{-p}) option.
This option automatically defaults to the context output format
(@pxref{Context Format}), with the default number of lines of context.
You can override that number with @option{-C @var{lines}} elsewhere in the
command line. You can override both the format and the number with
@option{-U @var{lines}} elsewhere in the command line.
The @option{--show-c-function} (@option{-p}) option is equivalent to
@option{-F '^[[:alpha:]$_]'} if the unified format is specified, otherwise
@option{-c -F '^[[:alpha:]$_]'} (@pxref{Specified Headings}). @acronym{GNU}
@command{diff} provides this option for the sake of convenience.
@node Alternate Names
@subsection Showing Alternate File Names
@cindex alternate file names
@cindex file name alternates
If you are comparing two files that have meaningless or uninformative
names, you might want @command{diff} to show alternate names in the header
of the context and unified output formats. To do this, use the
@option{--label=@var{label}} option. The first time
you give this option, its argument replaces the name and date of the
first file in the header; the second time, its argument replaces the
name and date of the second file. If you give this option more than
twice, @command{diff} reports an error. The @option{--label} option does not
affect the file names in the @command{pr} header when the @option{-l} or
@option{--paginate} option is used (@pxref{Pagination}).
Here are the first two lines of the output from @samp{diff -C 2
--label=original --label=modified lao tzu}:
@example
*** original
--- modified
@end example
@node Side by Side
@section Showing Differences Side by Side
@cindex side by side
@cindex two-column output
@cindex columnar output
@command{diff} can produce a side by side difference listing of two files.
The files are listed in two columns with a gutter between them. The
gutter contains one of the following markers:
@table @asis
@item white space
The corresponding lines are in common. That is, either the lines are
identical, or the difference is ignored because of one of the
@option{--ignore} options (@pxref{White Space}).
@item @samp{|}
The corresponding lines differ, and they are either both complete
or both incomplete.
@item @samp{<}
The files differ and only the first file contains the line.
@item @samp{>}
The files differ and only the second file contains the line.
@item @samp{(}
Only the first file contains the line, but the difference is ignored.
@item @samp{)}
Only the second file contains the line, but the difference is ignored.
@item @samp{\}
The corresponding lines differ, and only the first line is incomplete.
@item @samp{/}
The corresponding lines differ, and only the second line is incomplete.
@end table
Normally, an output line is incomplete if and only if the lines that it
contains are incomplete. @xref{Incomplete Lines}. However, when an
output line represents two differing lines, one might be incomplete
while the other is not. In this case, the output line is complete,
but the gutter is marked @samp{\} if the first line is incomplete, or
@samp{/} if the second line is.