Introduction
Linux and Unix systems provide many utilities for processing and filtering text files. cut is a command-line tool that extracts sections from each line of a file or of piped data and prints the result to standard output. It can select parts of a line by delimiter-separated field, byte position, or character position.
In this tutorial, we'll show you how to use the cut command through examples and detailed explanations of its most common options. We will also answer a few FAQs on the cut command in Linux.
How to Use the cut Command
The cut command has the following syntax:
cut OPTION... [FILE]...
The options that tell cut whether to select by delimiter-separated field, byte position, or character position are as follows:
-f (--fields=LIST) - Select a field, a set of fields, or a range of fields. This is by far the most commonly used option.
-b (--bytes=LIST) - Select a byte, a set of bytes, or a range of bytes.
-c (--characters=LIST) - Select a character, a set of characters, or a range of characters.
You must use exactly one of the options listed above.
Other options include:
-d (--delimiter) - Use the delimiter of your choice instead of the default TAB delimiter.
--complement - Invert the selection. With this option, cut prints all bytes, characters, or fields except the selected ones.
-s (--only-delimited) - By default, cut prints lines that contain no delimiter character unchanged. With this option, cut does not print those lines.
--output-delimiter - By default, cut uses the input delimiter as the output delimiter. This option lets you specify a different output delimiter string.
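The effect of -s is easiest to see on input where one line lacks the delimiter. The following is a minimal sketch: the three sample lines and the variable names are made up for illustration.

```shell
# Sample input (hypothetical): the second line contains no ":" delimiter.
sample='alpha:1
no delimiter here
beta:2'

# Default behaviour: the line without a delimiter is printed unchanged.
without_s=$(printf '%s\n' "$sample" | cut -d ':' -f 1)
printf '%s\n' "$without_s"

# With -s: the line without a delimiter is skipped.
with_s=$(printf '%s\n' "$sample" | cut -d ':' -s -f 1)
printf '%s\n' "$with_s"
```

The first command prints three lines, including "no delimiter here"; the second prints only "alpha" and "beta".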
The cut command accepts one or more FILE names as input. If no FILE is given, or if FILE is -, cut reads from standard input.
When used with the -f, -b, and -c options, the LIST argument can be a single integer, several integers separated by commas, a range of integers, or several ranges separated by commas. Each range can take one of the following forms:
N selects the Nth field, byte, or character, counting from 1.
N- selects from the Nth field, byte, or character to the end of the line.
N-M selects from the Nth to the Mth field, byte, or character.
-M selects from the first to the Mth field, byte, or character.
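The range forms above can be tried on a short string. This is a small sketch using -c on a made-up ASCII line, where bytes and characters coincide; the variable names are only for illustration.

```shell
# Sample line (hypothetical); plain ASCII, so bytes and characters coincide.
line='abcdefgh'

nth=$(printf '%s\n' "$line" | cut -c 3)              # N: the third character
to_end=$(printf '%s\n' "$line" | cut -c 3-)          # N-: third character to end of line
range=$(printf '%s\n' "$line" | cut -c 3-5)          # N-M: third to fifth character
from_start=$(printf '%s\n' "$line" | cut -c -3)      # -M: first to third character
combined=$(printf '%s\n' "$line" | cut -c 1,3-4,8)   # several entries separated by commas

printf '%s\n' "$nth" "$to_end" "$range" "$from_start" "$combined"
```

This prints c, cdefgh, cde, abc, and acdh, one per line.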
How to Cut by Field
The -f option specifies the fields to extract. If no delimiter is specified, the default delimiter is TAB.
The examples below use the following file, test.txt, in which the fields are separated by tabs.
245:789 4567 M:4540 Admin 01:10:1980
535:763 4987 M:3476 Sales 11:04:1978
To display the first and third fields, for example, you would type:
cut -f 1,3 test.txt
Output
245:789 M:4540
535:763 M:3476
To display the first through fourth fields, type:
cut -f -4 test.txt
Output
245:789 4567 M:4540 Admin
535:763 4987 M:3476 Sales
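To follow along, the sample file can be recreated with printf, which turns \t into a real TAB character. The sketch below writes it to a temporary directory (an assumption made here to avoid overwriting an existing test.txt) and tries two more range forms; the variable names are illustrative.

```shell
# Use a temporary directory so no existing test.txt is overwritten.
workdir=$(mktemp -d)
sample_file="$workdir/test.txt"

# Recreate the sample file; \t is a TAB character.
printf '245:789\t4567\tM:4540\tAdmin\t01:10:1980\n'  > "$sample_file"
printf '535:763\t4987\tM:3476\tSales\t11:04:1978\n' >> "$sample_file"

# Fields 2 through 4 (a closed range).
middle=$(cut -f 2-4 "$sample_file")
printf '%s\n' "$middle"

# Field 4 to the end of each line (an open range).
tail_fields=$(cut -f 4- "$sample_file")
printf '%s\n' "$tail_fields"

# Clean up the temporary directory.
rm -rf "$workdir"
```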
How to cut based on a delimiter
To cut based on a delimiter, use the -d option followed by the delimiter you want to use.
To display the first and third fields using ":" as the delimiter, type:
cut -d ':' -f 1,3 test.txt
Output
245:4540 Admin 01
535:3476 Sales 11
You can use any single character as the delimiter. The following example uses the space character as the delimiter and prints the second field:
echo "Lorem ipsum dolor sit amet" | cut -d ' ' -f 2
Output
ipsum
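One detail worth knowing when cutting on spaces: cut treats every occurrence of the delimiter as a field boundary, so consecutive delimiters produce empty fields. A common workaround, sketched here on made-up input, is to squeeze repeated delimiters with tr -s before cutting.

```shell
# Two spaces between "Lorem" and "ipsum" (hypothetical input).
text='Lorem  ipsum dolor'

# Field 2 is the empty string between the two spaces.
raw=$(printf '%s\n' "$text" | cut -d ' ' -f 2)
printf '[%s]\n' "$raw"

# tr -s ' ' squeezes each run of spaces into one, so field 2 becomes "ipsum".
squeezed=$(printf '%s\n' "$text" | tr -s ' ' | cut -d ' ' -f 2)
printf '[%s]\n' "$squeezed"
```

The first printf shows an empty pair of brackets, the second shows [ipsum].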
How to complement the selection
Use the --complement option to invert the selection. With it, cut prints only the fields that are not selected with the -f option.
The following command prints all fields except the first and third:
cut -f 1,3 --complement test.txt
Output
4567 Admin 01:10:1980
4987 Sales 11:04:1978
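The same inversion applies to byte and character selections. The sketch below assumes GNU cut, since --complement is not part of POSIX and may be missing from other implementations; the sample record is made up.

```shell
# Hypothetical sample line.
record='a:b:c:d:e'

# Everything except fields 2 and 4.
kept_fields=$(printf '%s\n' "$record" | cut -d ':' -f 2,4 --complement)
printf '%s\n' "$kept_fields"

# Everything except the first two characters.
kept_chars=$(printf '%s\n' "$record" | cut -c 1-2 --complement)
printf '%s\n' "$kept_chars"
```

This prints a:c:e and then b:c:d:e.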
How to specify an output delimiter
The --output-delimiter option specifies the output delimiter. For example, to set the output delimiter to _, type:
cut -f 1,3 --output-delimiter='_' test.txt
Output
245:789_M:4540
535:763_M:3476
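Unlike the input delimiter, the output delimiter may be a string of several characters. The following sketch assumes GNU cut (the option is not part of POSIX) and uses made-up records to convert tab-separated fields to comma-separated ones and to join fields with a longer separator.

```shell
# Hypothetical tab-separated record converted to comma-separated output.
csv=$(printf 'a\tb\tc\n' | cut -f 1-3 --output-delimiter=',')
printf '%s\n' "$csv"

# A multi-character output delimiter between the selected fields.
joined=$(printf '245:789:4567\n' | cut -d ':' -f 1,3 --output-delimiter=' | ')
printf '%s\n' "$joined"
```

This prints a,b,c and then 245 | 4567.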
How to Cut by Bytes and Characters
Before we proceed any further, it's important to understand the difference between bytes and characters.
A byte is 8 bits and can hold 256 distinct values. When the ASCII standard was created, it took into account all the letters, digits, and symbols needed to work with English. The ASCII character table has 128 characters, each represented by one byte. As computers became more widely available, technology companies introduced new character encodings for other languages. For languages with more than 256 characters, a simple one-to-one mapping between bytes and characters was not possible.
These incompatible encodings caused problems when sharing documents or browsing websites, which led to the creation of the Unicode standard, designed to handle most of the world's writing systems. UTF-8 was created to encode Unicode characters. In UTF-8, not every character is represented by a single byte: a character takes from one to four bytes.
The -b (--bytes) option tells cut to extract sections from each line at the specified byte positions.
The following examples use the ü character, which takes two bytes in UTF-8.
Choose the fifth byte:
echo 'drüberspringen' | cut -b 5
Output
b
Choose the 5th, 9th, and 13th bytes:
echo 'drüberspringen' | cut -b 5,9,13
Output
bpg
Choose the range from 1st to 5th byte:
echo 'drüberspringen' | cut -b 1-5
Output
drüb
At the time of writing, the version of cut bundled with GNU coreutils has no option for cutting character by character. With the -c option, cut behaves the same as with the -b option.
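The byte positions used in the examples above can be checked by counting bytes. In this sketch the word is written with octal escapes so that the two bytes of ü (\303\274 in UTF-8) are explicit; the variable names are illustrative.

```shell
# "drüberspringen" with ü spelled out as its two UTF-8 bytes, \303\274.
word=$(printf 'dr\303\274berspringen')

# 14 characters, but 15 bytes, because ü takes two bytes.
byte_count=$(printf '%s' "$word" | wc -c | tr -d ' ')
printf '%s\n' "$byte_count"

# Bytes 1-4 are d, r, and both bytes of ü.
first4=$(printf '%s\n' "$word" | cut -b 1-4)
printf '%s\n' "$first4"
```

A selection that ends at byte 3 would split ü in half and leave an incomplete character in the output, which is why byte ranges need care with multibyte text.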
Cut Examples
The cut command is typically used in combination with other commands through pipes. Here are a few examples:
Get a list of all users
The output of the getent passwd command is piped to cut, which prints the first field using : as the delimiter.
getent passwd | cut -d ':' -f1
A list of all system users appears in the output.
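Other fields of the same records can be selected in the same way. The sketch below uses two made-up lines in passwd format instead of real getent output, and prints the user name together with the login shell, which is the seventh field.

```shell
# Two made-up entries in passwd format (name:password:UID:GID:comment:home:shell).
entries='root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice:/home/alice:/bin/zsh'

# Print the user name (field 1) and the login shell (field 7).
name_and_shell=$(printf '%s\n' "$entries" | cut -d ':' -f 1,7)
printf '%s\n' "$name_and_shell"
```

On a real system, the printf command would be replaced by getent passwd.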
View the 10 most frequently used commands
In the following example, cut strips the first 7 characters, which hold the entry number and padding, from each line of the history command output.
history | cut -c8- | sort | uniq -c | sort -rn | head
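Because history prints nothing in a non-interactive script, the pipeline can be tried on simulated input. The sketch below assumes the usual bash layout of a 5-character entry number followed by two spaces; the sample commands and variable names are made up.

```shell
# Simulated history output (hypothetical): 5-wide number, two spaces, the command.
fake_history=$(printf '%5d  %s\n' 1 'ls' 2 'cd /tmp' 3 'ls' 4 'ls' 5 'cd /tmp')

# The same pipeline as above, applied to the simulated lines.
ranking=$(printf '%s\n' "$fake_history" | cut -c8- | sort | uniq -c | sort -rn | head)
printf '%s\n' "$ranking"

# Most frequent command, with the leading padding from uniq -c removed.
top=$(printf '%s\n' "$ranking" | head -n 1 | sed 's/^ *//')
printf '%s\n' "$top"
```

Here ls appears three times and cd /tmp twice, so ls is ranked first.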
FAQs on cut Command in Linux
How can I extract a specific column from a text file using the cut command?
To extract a specific column, use the -f option followed by the column number(s) or range(s) of columns to extract. For example, cut -f3 myfile.txt will extract the third column from the text file.
Can I extract multiple columns at once with the cut command?
Yes, you can extract multiple columns at once by specifying a list of column numbers or ranges separated by commas. For example, cut -f1,3,5 myfile.txt will extract the first, third, and fifth columns.
How do I change the output delimiter of the cut command?
To change the output delimiter, use the --output-delimiter option followed by the desired character or string. For example, cut -d',' -f1,3 --output-delimiter=';' myfile.txt will read comma-separated fields and use a semicolon as the output delimiter.
Can I use the cut command to extract characters from a specific position in each line?
Yes, you can use the -c option followed by a comma-separated list of character positions or ranges. For example, cut -c1-5 myfile.txt will extract the first five characters from each line.
How do I specify a specific range of characters to extract using the cut command?
To specify a range of characters, write it as N-M after the -c option. To select by byte position instead, use the -b option with the same range syntax. For example, cut -c3-8 myfile.txt will extract characters 3 to 8 from each line, and cut -b3-8 myfile.txt will extract bytes 3 to 8.
What if my text file contains different field lengths?
With the -f option, the cut command splits each line on the delimiter, so fields do not need to have the same length. If some lines contain no delimiter at all, cut prints them unchanged by default; use the -s option to suppress those lines.
Is it possible to count the number of fields or characters extracted using the cut command?
Yes, you can pipe the output of the cut command to the wc command. For example, cut -f3 myfile.txt | wc -l will count the number of lines that cut prints, that is, one third-column value per line, while piping to wc -m instead will count the characters in the output, newlines included.
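As a quick check of the counting approach, the sketch below runs the pipeline on three made-up tab-separated lines. The tr -d ' ' step is only there because some wc implementations pad the number with spaces.

```shell
# Hypothetical tab-separated data: three lines, three columns.
rows=$(printf 'a\tb\tc\nd\te\tf\ng\th\ti')

# cut prints one third-column value per input line, so wc -l reports 3.
line_count=$(printf '%s\n' "$rows" | cut -f 3 | wc -l | tr -d ' ')
printf '%s\n' "$line_count"
```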
Conclusion
The cut command displays selected fields, bytes, or characters from each line of the given files or from standard input.
If you have any queries, please leave a comment below and we’ll be happy to respond to them.