Diff Command in Linux

Introduction

diff is a command-line utility that allows you to compare two files line by line, as well as the contents of directories.

It is most commonly used to create a patch containing the differences between files, which can then be applied using the patch command.

How to Use the diff Command

The syntax of the diff command is as follows:

diff [OPTION]... FILES

The diff command can display the output in a number of formats, the most common of which are normal, context, and unified. The output describes which lines in the files need to be changed to make them identical. If the files are identical, no output is generated.
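Because identical files produce no output, scripts usually check the exit status of diff instead: 0 means no differences were found, 1 means the files differ, and 2 means trouble (for example, a missing file). The sketch below shows the first two cases. The file names (a.txt, b.txt, c.txt) and the temporary directory are assumptions made for illustration only.

```shell
#!/bin/sh
# Work in a throwaway directory (hypothetical file names, for illustration only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'Ubuntu\nDebian\n' > a.txt
printf 'Ubuntu\nDebian\n' > b.txt   # same content as a.txt
printf 'Ubuntu\nFedora\n' > c.txt   # differs from a.txt on line 2

# Identical files: diff prints nothing and exits with status 0.
diff a.txt b.txt
same_status=$?

# Different files: diff prints the differences and exits with status 1.
diff a.txt c.txt > /dev/null
diff_status=$?

echo "identical: $same_status, different: $diff_status"
```

In a script, this allows a test such as `if diff a.txt b.txt > /dev/null; then ...` without parsing the output.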

Use the redirection operator to store the command output to a file:

diff file1 file2 > patch
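A saved diff is typically applied later with the patch command mentioned in the introduction. The following is a minimal sketch, assuming patch is installed (on minimal systems it may be a separate package). The file names old.txt, new.txt, and changes.patch are hypothetical.

```shell
#!/bin/sh
# Work in a throwaway directory (hypothetical file names, for illustration only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'Ubuntu\nDebian\nFedora\n' > old.txt
printf 'Ubuntu\nArch Linux\nFedora\n' > new.txt

# Store the differences in a file. diff exits with status 1 here
# because the files differ; that is expected.
diff old.txt new.txt > changes.patch

# Apply the stored differences to old.txt, if patch is available.
if command -v patch > /dev/null 2>&1; then
    patch old.txt < changes.patch
    # After patching, old.txt has the same content as new.txt,
    # so diff prints nothing and exits with status 0.
    diff old.txt new.txt && echo "old.txt now matches new.txt"
fi
```

The file to patch is named explicitly on the command line because normal-format output does not record file names.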

The following two files will be used in this tutorial to demonstrate how the diff command functions:

  • file1
Ubuntu
Arch Linux
Debian
CentOS
Fedora
  • file2
Kubuntu
Ubuntu
Debian
Arch Linux
Centos
Fedora
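
To follow along, the two files can be recreated with printf. This sketch writes them into a temporary directory (an assumption; any working directory would do) with exactly the lines listed above.

```shell
#!/bin/sh
# Work in a throwaway directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# printf repeats the '%s\n' format for every argument, one line each.
printf '%s\n' 'Ubuntu' 'Arch Linux' 'Debian' 'CentOS' 'Fedora' > file1
printf '%s\n' 'Kubuntu' 'Ubuntu' 'Debian' 'Arch Linux' 'Centos' 'Fedora' > file2

# Confirm the line counts of both files.
file1_lines=$(wc -l < file1)
file2_lines=$(wc -l < file2)
echo "file1: $file1_lines lines, file2: $file2_lines lines"
```

Files created this way have no trailing blank line. The context and unified outputs shown later in this tutorial include a final empty line, so the line ranges you see may be one lower than those printed there.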

Normal Format

When the diff command is used in its simplest form on two text files without any options, it generates an output in the normal format:

diff file1 file2

The output will resemble this:

Output

0a1
> Kubuntu
2d2
< Arch Linux
4c4,5
< CentOS
---
> Arch Linux
> Centos

The normal output format includes one or more sections describing the differences. Each section appears as follows:

change-command
< from-file-line...
---
> to-file-line...

In the example above, the change commands are 0a1, 2d2, and 4c4,5. From left to right, each change command includes the following:

  • The first file's line number or range of lines.
  • A special change character.
  • The second file's line number or range of lines.

The change character can be one of the following:

  • a: Add the lines.
  • c: Change the lines.
  • d: Delete the lines.

The change command is followed by the complete lines that are removed from the first file (<) and added to it (>).

Let us now explain the output:

  1. 0a1: Add line 1 of the second file at the start of the first file (after line 0).
  • > Kubuntu: The line from the second file that is added to the first file.

  2. 2d2: Delete line 2 in the first file. The 2 after the d indicates that, had the line not been deleted, it would have appeared after line 2 in the second file.
  • < Arch Linux: The deleted line.

  3. 4c4,5: Replace (change) line 4 in the first file with lines 4-5 from the second file.
  • < CentOS: The line to be replaced in the first file.
  • ---: Separator.
  • > Arch Linux and > Centos: The lines from the second file that replace the line in the first file.
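
Because every change command starts with a line number, while the affected lines start with <, >, or ---, the pieces of normal-format output are easy to separate with grep. The sketch below recreates the two sample files in a temporary directory (an assumption for illustration) and prints only the change commands, followed by a count of removed and added lines.

```shell
#!/bin/sh
# Recreate the tutorial's sample files in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'Ubuntu' 'Arch Linux' 'Debian' 'CentOS' 'Fedora' > file1
printf '%s\n' 'Kubuntu' 'Ubuntu' 'Debian' 'Arch Linux' 'Centos' 'Fedora' > file2

# Keep only lines shaped like a change command:
# a line number or range, one of a/c/d, then a line number or range.
commands=$(diff file1 file2 | grep -E '^[0-9]+(,[0-9]+)?[acd][0-9]+(,[0-9]+)?$')
echo "$commands"

# Count the lines removed from file1 (<) and the lines added from file2 (>).
removed=$(diff file1 file2 | grep -c '^<')
added=$(diff file1 file2 | grep -c '^>')
echo "removed: $removed, added: $added"
```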

Context Format

When using the context output format, the diff command displays a number of context lines around the lines that differ between the files.

The -c option directs diff to display output in the context format as follows:

diff -c file1 file2
Output

*** file1	2022-11-25 21:00:26.422426523 +0100
--- file2	2022-11-25 21:00:36.342231668 +0100
***************
*** 1,6 ****
  Ubuntu
- Arch Linux
  Debian
! CentOS
  Fedora
  
--- 1,7 ----
+ Kubuntu
  Ubuntu
  Debian
! Arch Linux
! Centos
  Fedora

The output begins with the names and timestamps of the compared files, followed by one or more sections describing the differences. Each part appears as follows:

***************
*** from-file-line-numbers ****
  from-file-line...
--- to-file-line-numbers ----
  to-file-line...
  • from-file-line-numbers and to-file-line-numbers: The line numbers or comma-separated range of lines in the first and second files, respectively.
  • from-file-line and to-file-line: The lines that differ and the lines of context:
  1. Lines beginning with two spaces are context lines, which are the same in both files.
  2. Lines beginning with the minus symbol (-) have no match in the second file. These lines are missing from the second file.
  3. Lines beginning with the plus symbol (+) have no match in the first file. These lines are missing from the first file.
  4. Lines beginning with the exclamation mark (!) are changed between the two files. Each group of lines beginning with ! in the first file has a corresponding group in the second file.

Let us break down the output's key components:

  • In this instance, there is just one section devoted to outlining the differences.
  • The range of the lines from the first and second files that are included in this section is indicated by the numbers *** 1,6 **** and --- 1,7 ----.
  • Ubuntu, Debian, Fedora, and the last empty line are the same in both files. These lines begin with two spaces.
  • Line - Arch Linux from the first file has no match at the same position in the second file. The line is present in the second file, but at a different position.
  • Line + Kubuntu from the second file has no match in the first file.
  • Line ! CentOS from the first file and lines ! Arch Linux and ! Centos from the second file are changed between the files.

The number of context lines is set to three by default. You can use the -C (--context) option to specify a different number:

diff -C 1 file1 file2
Output

*** file1	2019-11-25 21:00:26.422426523 +0100
--- file2	2019-11-25 21:00:36.342231668 +0100
***************
*** 1,5 ****
  Ubuntu
- Arch Linux
  Debian
! CentOS
  Fedora
--- 1,6 ----
+ Kubuntu
  Ubuntu
  Debian
! Arch Linux
! Centos
  Fedora
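
The effect of -C is easier to see on files with more unchanged lines around a single change. The sketch below uses two hypothetical nine-line files (left and right, names chosen for illustration) that differ only on line 5, and counts the context lines printed with the default setting and with -C 1.

```shell
#!/bin/sh
# Work in a throwaway directory (hypothetical file names, for illustration only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two nine-line files that differ only on line 5 (e versus E).
printf '%s\n' a b c d e f g h i > left
printf '%s\n' a b c d E f g h i > right

# In context format, context lines begin with two spaces.
# Count them for the default (three lines) and for -C 1.
default_context=$(diff -c left right | grep -c '^  ')
one_context=$(diff -C 1 left right | grep -c '^  ')

echo "default (-c): $default_context context lines"
echo "-C 1: $one_context context lines"
```

The counts include both halves of the section, since context format prints the surrounding lines once for each file.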

Unified Format

The unified output format is an improved version of the context format and produces a smaller output.

You can tell diff to print the output in unified format by using the -u option:

diff -u file1 file2
Output

--- file1	2019-11-25 21:00:26.422426523 +0100
+++ file2	2019-11-25 21:00:36.342231668 +0100
@@ -1,6 +1,7 @@
+Kubuntu
 Ubuntu
-Arch Linux
 Debian
-CentOS
+Arch Linux
+Centos
 Fedora

The output begins with the file names and timestamps, followed by one or more sections describing the discrepancies. Each part is formatted as follows:

@@ from-file-line-numbers to-file-line-numbers @@
 line-from-files...

  1. @@ from-file-line-numbers to-file-line-numbers @@: The range of lines from the first and second files that are included in this section. Each range is written as a starting line number and a line count, prefixed with - for the first file and + for the second.
  2. line-from-files: The lines of context and the lines that differ:
  • Lines beginning with a single space are context lines, which are the same in both files.
  • Lines beginning with the minus sign (-) are removed from the first file.
  • Lines beginning with the plus sign (+) are added to the first file.
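
The claim that unified output is smaller can be checked by counting the lines each format prints for the same pair of files. The sketch below recreates the two sample files in a temporary directory (an assumption for illustration) and also lists the @@ section headers of the unified output.

```shell
#!/bin/sh
# Recreate the tutorial's sample files in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'Ubuntu' 'Arch Linux' 'Debian' 'CentOS' 'Fedora' > file1
printf '%s\n' 'Kubuntu' 'Ubuntu' 'Debian' 'Arch Linux' 'Centos' 'Fedora' > file2

# Count the output lines produced by each format.
context_lines=$(diff -c file1 file2 | wc -l)
unified_lines=$(diff -u file1 file2 | wc -l)
echo "context: $context_lines lines, unified: $unified_lines lines"

# List only the section headers of the unified output.
hunks=$(diff -u file1 file2 | grep '^@@')
echo "$hunks"
```

Unified format is shorter because it prints each context line once, whereas context format repeats it for both files. Files created this way have no trailing blank line, so the header reads @@ -1,5 +1,6 @@ rather than the @@ -1,6 +1,7 @@ shown above.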

Ignore Case

The examples above show that the diff command is case-sensitive by default.

If you want diff to ignore case, use the -i option:

diff -ui file1 file2
Output

--- file1	2019-11-25 21:00:26.422426523 +0100
+++ file2	2019-11-25 21:00:36.342231668 +0100
@@ -1,6 +1,7 @@
+Kubuntu
 Ubuntu
-Arch Linux
 Debian
+Arch Linux
 CentOS
 Fedora
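
The same behavior can be confirmed through the exit status. The sketch below uses two hypothetical one-line files (upper and lower, names chosen for illustration) that differ only in letter case.

```shell
#!/bin/sh
# Work in a throwaway directory (hypothetical file names, for illustration only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two one-line files that differ only in letter case.
printf 'CentOS\n' > upper
printf 'Centos\n' > lower

# Default, case-sensitive comparison: the files differ (exit status 1).
diff upper lower > /dev/null
sensitive_status=$?

# Case-insensitive comparison with -i: no differences (exit status 0).
diff -i upper lower > /dev/null
insensitive_status=$?

echo "default: $sensitive_status, with -i: $insensitive_status"
```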

Conclusion

For Linux system administrators, comparing text files for discrepancies is one of the most frequent tasks.

The diff command compares files line by line. Type man diff in your terminal to learn more.
