SHELL - Lab 2
Grep:
Grep is an acronym that stands for Global Regular Expression Print.
Grep is a Linux / Unix command-line tool used to search for a string of characters in a specified file. The text search pattern is called a regular expression. When it finds a match, it prints the line with the result. The grep command is handy when searching through large log files.
Usage:
The grep command consists of three parts in its most basic form. The first part is the command name grep, followed by the pattern that you are searching for. After the pattern comes the name of the file that grep searches through.
The simplest grep command syntax looks like this:
grep findit file
Search multiple files:
To search multiple files with the grep command, list the filenames you want to search, separated by spaces. In our case, the grep command to match the word moon in three files file1, file2, and file3 looks like this example:
grep moon file1 file2 file3
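To try the command above, the following sketch creates three small files in a temporary directory and searches them. The file contents are invented for illustration; only the grep command itself comes from the text.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample contents are assumptions made up for this demonstration.
printf 'the moon is full\nno match here\n' > file1
printf 'sun and stars\n' > file2
printf 'moonlight sonata\n' > file3

# With several files, grep prefixes each matching line with its file name.
multi_out=$(grep moon file1 file2 file3)
printf '%s\n' "$multi_out"
# file1:the moon is full
# file3:moonlight sonata
```

Note that the line in file3 also matches, because grep looks for the pattern anywhere in a line, not only as a whole word.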
Important flags for using grep:
Flag   Example                  Description
-r     grep -r text dirName     Searches recursively through the files in the specified directory and its subdirectories.
-n     grep -n text fileName    Prefixes each matching line with its line number.
-H     grep -H text fileName    Always includes the filename in the output. By default the filename is only included if several files are searched.
-c     grep -c text fileName    Counts the lines that contain the search term in each specified file.
-l     grep -l text fileName    Lists the names of the files which include the search term.
-L     grep -L text fileName    Lists the names of the files which do not include the search term.
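The flags can be compared side by side with a small experiment. This is only a sketch: the directory name logs, the two log files and their contents are made up for the demonstration.

```shell
# Throwaway directory; every name and file content below is made up.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir logs
printf 'error: disk full\nall good\nerror: timeout\n' > logs/app.log
printf 'all good\n' > logs/db.log

numbered=$(grep -n error logs/app.log)                 # 1:error: disk full and 3:error: timeout
count=$(grep -c error logs/app.log)                    # 2 (number of matching lines)
with_name=$(grep -H good logs/db.log)                  # logs/db.log:all good
containing=$(grep -l error logs/app.log logs/db.log)   # logs/app.log
lacking=$(grep -L error logs/app.log logs/db.log)      # logs/db.log
recursive=$(grep -r error logs)                        # both error lines, prefixed with logs/app.log:

printf '%s\n' "$numbered" "$count" "$with_name" "$containing" "$lacking" "$recursive"
```

Capturing each result in a variable is just a convenience for printing them together; each grep command can also be run on its own.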
Regex:
Regular expressions are patterns built from ordinary and special characters that help search data and match complex patterns. The term is often shortened to 'regexp' or 'regex'. They are used in many Linux programs such as grep, bash, rename and sed.
Basic Regular expressions:
Commands commonly used with regular expressions include sed, vi, awk and grep. Some of the basic regex symbols are listed below.
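Symbol   Description
.        Matches any single character
^        Matches the start of a line
$        Matches the end of a line
*        Matches the preceding character zero or more times
\        Escapes a special character so that it is matched literally (for example, \. matches a dot)
()       Groups regular expressions (extended syntax, grep -E)
?        Matches the preceding character zero or one time (extended syntax, grep -E)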
Example:
Suppose we need to find the lines starting with F in 'file.txt', which contains the following text:
Hi
I'm
Feeling
Fine
today
We will use the following:
grep '^F' file.txt
Output:
Feeling
Fine
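The other basic symbols can be tried on the same five lines. The sketch below recreates file.txt in a temporary directory (the path is an assumption of this example) and quotes each pattern so that the shell does not interpret the special characters itself.

```shell
workdir=$(mktemp -d)
printf "Hi\nI'm\nFeeling\nFine\ntoday\n" > "$workdir/file.txt"

# $ anchors the match at the end of the line: lines ending in "e".
ends_in_e=$(grep 'e$' "$workdir/file.txt")        # Fine

# . matches any single character: any first character, then "i".
second_is_i=$(grep '^.i' "$workdir/file.txt")     # Hi and Fine

# * repeats the preceding character zero or more times: F, any number of e's, then l.
f_e_l=$(grep 'Fe*l' "$workdir/file.txt")          # Feeling

printf '%s\n' "$ends_in_e" "$second_is_i" "$f_e_l"
```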
Interval Regular expressions:
These expressions specify how many times the preceding character must occur.
Expression   Description
{n}          Matches the preceding character repeated exactly n times.
{n,m}        Matches the preceding character repeated at least n times but not more than m times.
Because grep looks for a match anywhere in a line, a line with more than n repetitions still contains a match for {n}.
Example:
Select all lines that contain the character 'p' at least 2 times consecutively in the file 'file.txt', whose contents are shown below:
apple
appple
ale
apppple
apaaple
We will use the following:
grep -E 'p{2}' file.txt
Output:
apple
appple
apppple
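The {n,m} form can be tried on the same data. As a sketch (the temporary file location is an assumption of this example), the commands below show that an upper bound only has a visible effect when the pattern is anchored, since an unanchored pattern can match part of a longer run of 'p'.

```shell
workdir=$(mktemp -d)
printf 'apple\nappple\nale\napppple\napaaple\n' > "$workdir/file.txt"

# Unanchored: any line containing 2 to 3 consecutive p's somewhere,
# which includes the line with four p's.
unanchored=$(grep -E 'p{2,3}' "$workdir/file.txt")

# Anchored: the whole line must be "a", then 2 to 3 p's, then "le".
anchored=$(grep -E '^ap{2,3}le$' "$workdir/file.txt")

printf '%s\n' "unanchored:" "$unanchored" "anchored:" "$anchored"
# unanchored: apple, appple, apppple
# anchored:   apple, appple
```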
Extended regular expressions:
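Extended regular expressions add further repetition operators. GNU grep accepts them in basic mode when written with a backslash, as listed below; with grep -E they are written without the backslash (+ and ?).
Expression   Description
\+           Matches one or more occurrences of the previous character.
\?           Matches zero or one occurrence of the previous character.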
Example:
Suppose we want to select the lines where one or more 'a' characters come immediately before a 't' in the file 'file.txt', which contains:
hate
bat
ant
dent
ate
We will use the following:
grep "a\+t" file.txt
Output:
hate
bat
ate
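The same search can be written in extended syntax with grep -E, where the operators need no backslash. The sketch below recreates the sample file in a temporary directory (an assumption of this example) and also tries the ? operator.

```shell
workdir=$(mktemp -d)
printf 'hate\nbat\nant\ndent\nate\n' > "$workdir/file.txt"

# One or more a's directly followed by t (same result as "a\+t" in GNU grep).
one_or_more=$(grep -E 'a+t' "$workdir/file.txt")     # hate, bat, ate

# a, then an optional n, then t: matches both "at" and "ant".
optional_n=$(grep -E 'an?t' "$workdir/file.txt")     # hate, bat, ant, ate

printf '%s\n' "$one_or_more" "---" "$optional_n"
```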
Sed (Stream editor):
sed can be used at the command line, or within a shell script, to edit a file non-interactively. Its most useful feature is a 'search-and-replace' of one string with another.
Replacing or substituting a string: The sed command is mostly used to replace text in a file. The simple sed command below replaces the first occurrence of “unix” on each line of the file with “linux”.
sed 's/unix/linux/' file.txt
Replacing the nth occurrence of a pattern in a line: Use the flags /1, /2, etc. to replace the first, second, etc. occurrence of a pattern in a line. The command below replaces the second occurrence of the word “unix” with “linux” in each line.
sed 's/unix/linux/2' file.txt
Replacing all occurrences of the pattern in a line: The substitute flag /g (global replacement) tells sed to replace all occurrences of the string in the line.
sed 's/unix/linux/g' file.txt
Replacing from the nth occurrence to the last occurrence in a line: Combine /1, /2, etc. with /g to replace every occurrence from the nth onward (this combination is supported by GNU sed). The following sed command replaces the third, fourth, fifth… occurrence of “unix” with “linux” in a line.
sed 's/unix/linux/3g' file.txt
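The four substitution variants above can be compared on a single line of input. In this sketch the input line is made up and piped into sed instead of being read from file.txt; the result for the 3g flag assumes GNU sed.

```shell
line='unix unix unix unix'   # sample input, made up for this sketch

first=$(printf '%s\n' "$line" | sed 's/unix/linux/')          # linux unix unix unix
second=$(printf '%s\n' "$line" | sed 's/unix/linux/2')        # unix linux unix unix
all=$(printf '%s\n' "$line" | sed 's/unix/linux/g')           # linux linux linux linux
from_third=$(printf '%s\n' "$line" | sed 's/unix/linux/3g')   # unix unix linux linux (GNU sed)

printf '%s\n' "$first" "$second" "$all" "$from_third"
```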
Replacing a string on a specific line number: You can restrict the sed command to a specific line number. The following performs the substitution only on the 3rd line.
sed '3 s/unix/linux/' file.txt
Replacing a string on a range of lines: You can give the sed command a range of line numbers. Here the sed command performs the substitution only on lines 1 to 3.
sed '1,3 s/unix/linux/' file.txt
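Line addresses are easiest to see on a file with a few numbered lines. The following sketch uses a made-up four-line file in a temporary directory; it also shows that sed writes its result to standard output and leaves the file itself unchanged.

```shell
workdir=$(mktemp -d)
printf 'unix one\nunix two\nunix three\nunix four\n' > "$workdir/file.txt"

# Substitute on line 3 only.
only_third=$(sed '3 s/unix/linux/' "$workdir/file.txt")

# Substitute on lines 1 through 3.
first_three=$(sed '1,3 s/unix/linux/' "$workdir/file.txt")

# The file on disk still holds the original text.
unchanged=$(cat "$workdir/file.txt")

printf '%s\n' "$only_third" "---" "$first_three" "---" "$unchanged"
```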
AWK:
AWK is suitable for pattern search and processing. An AWK script scans one or more files for lines that match a pattern and, when a line matches, performs the specified action on it.
Syntax:
awk options 'selection_criteria {action}' input-file > output-file
To demonstrate AWK usage, we are going to use a text file called file.txt.

Printing specific columns:
To print the 2nd and 3rd columns, execute the command below.
awk '{print $2 "\t" $3}' file.txt
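Since the contents of file.txt are not reproduced here, the sketch below uses a hypothetical four-column file (id, name, department, city) to show what the command prints. The data is invented purely for illustration.

```shell
workdir=$(mktemp -d)
# Hypothetical four-column data: id, name, department, city.
printf '1 alice sales oslo\n2 bob ops rome\n3 carol dev lima\n' > "$workdir/file.txt"

# $2 and $3 are the second and third whitespace-separated fields of each line.
cols=$(awk '{print $2 "\t" $3}' "$workdir/file.txt")
printf '%s\n' "$cols"
# alice	sales
# bob	ops
# carol	dev
```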

Printing all lines that match a specific pattern:
If you want to print lines that match a certain pattern, the syntax is as shown:
awk '/variable_to_be_matched/ {print $0}' file.txt
For instance, to match all entries with the letter ‘o’, the syntax will be:
awk '/o/ {print $0}' file.txt

Printing columns that match a specific pattern:
When AWK locates a pattern match, it prints the whole record by default. You can change the default by giving an action that displays only certain fields.
For example:
awk '/a/ {print $3 "\t" $4}' file.txt
The above command prints the 3rd and 4th columns of every line that contains the letter ‘a’ anywhere in the line.
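A pattern such as /a/ is tested against the whole line. To test a single column instead, awk provides the match operator ~. The sketch below contrasts the two on a hypothetical four-column file (id, name, department, city) invented for this example.

```shell
workdir=$(mktemp -d)
# Hypothetical four-column data: id, name, department, city.
printf '1 alice sales oslo\n2 bob ops rome\n3 carol dev lima\n' > "$workdir/file.txt"

# /a/ is tested against the whole line: lines 1 and 3 contain an "a".
any_field=$(awk '/a/ {print $3 "\t" $4}' "$workdir/file.txt")

# $3 ~ /a/ is tested against the third column only: just "sales" qualifies.
third_only=$(awk '$3 ~ /a/ {print $3 "\t" $4}' "$workdir/file.txt")

printf '%s\n' "$any_field" "---" "$third_only"
```

The first command prints the columns from lines 1 and 3, the second from line 1 only.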
