SHELL - Lab 2
Grep:
Grep is an acronym that stands for Global Regular Expression Print.
Grep is a Linux / Unix command-line tool used to search for a string of characters in a specified file. The text search pattern is called a regular expression. When it finds a match, it prints the line with the result. The grep command is handy when searching through large log files.
Usage:
In its most basic form, the grep command consists of three parts: the command name grep, followed by the pattern that you are searching for, followed by the name of the file that grep searches through.
The simplest grep command syntax looks like this:
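The general shape is grep PATTERN FILE. A minimal sketch, assuming a hypothetical file notes.txt that exists only for this illustration:

```shell
cd "$(mktemp -d)"   # scratch directory so no real file is touched (assumption)
printf '%s\n' 'the moon is bright' 'the sun is hot' > notes.txt   # hypothetical sample file

# grep PATTERN FILE: print every line of FILE that contains PATTERN
grep 'moon' notes.txt
# prints: the moon is bright
```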
Search multiple files:
To search multiple files with the grep command, list the filenames you want to search, separated by spaces. In our case, the grep command to match the word moon in three files file1, file2, and file3 looks like this example:
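A minimal sketch of that command. The contents of the three files are hypothetical, invented only to make the example runnable:

```shell
cd "$(mktemp -d)"   # scratch directory (assumption)
printf '%s\n' 'full moon tonight' > file1          # hypothetical contents
printf '%s\n' 'no match here' > file2
printf '%s\n' 'moon landing' 'blue moon' > file3

# With several files, grep prefixes each matching line with its filename
grep moon file1 file2 file3
# prints:
# file1:full moon tonight
# file3:moon landing
# file3:blue moon
```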
Important flags for using grep:
| Flag | Example | Description |
| --- | --- | --- |
| -r | grep -r text dirName | Searches recursively through the files in the specified directory 'dirName' and its subdirectories. |
| -n | grep -n text fileName | Prefixes each matching line with its line number. |
| -H | grep -H text fileName | Always includes the filename in the output. By default the filename is only included if several files are searched. |
| -c | grep -c text fileName | Prints the number of lines that contain the search term in each specified file. |
| -l | grep -l text fileName | Lists the files that contain the search term. |
| -L | grep -L text fileName | Lists the files that do not contain the search term. |
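The flags can be tried on a small directory tree. This is a sketch only: the logs directory and its files are hypothetical, and -L is assumed to be available (it is in GNU and BSD grep but is not required by POSIX).

```shell
cd "$(mktemp -d)"   # scratch directory (assumption)
mkdir -p logs/old
printf '%s\n' 'error: disk full' 'ok' 'error: timeout' > logs/app.log   # hypothetical logs
printf '%s\n' 'ok' > logs/old/app.log

grep -n error logs/app.log   # prints 1:error: disk full and 3:error: timeout
grep -c error logs/app.log   # prints 2 (two matching lines)
grep -H ok logs/app.log      # prints logs/app.log:ok
grep -rl error logs          # prints logs/app.log
grep -rL error logs          # prints logs/old/app.log
```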
Regex:
Regular expressions are patterns built from special characters that help search data and match complex patterns. The term is often shortened to 'regexp' or 'regex'. They are used in many Linux programs, such as grep, bash, rename, and sed.
Basic Regular expressions:
Some of the commands commonly used with regular expressions are sed, vi, awk and grep. Listed below are some of the basic regex symbols.
| Symbol | Description |
| --- | --- |
| . | Matches any single character |
| ^ | Matches the start of a line |
| $ | Matches the end of a line |
| * | Matches the preceding character zero or more times |
| \ | Escapes a special character so that it is matched literally |
| () | Groups regular expressions |
| ? | Matches the preceding character zero or one time |
Example:
Suppose we need to find the lines starting with F in 'file.txt', which contains the following text:
We will use the following:
Output:
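As a minimal sketch of this example, assume file.txt holds the hypothetical lines created below (they are illustrative, not the original listing). The ^ anchor restricts the match to the start of each line:

```shell
cd "$(mktemp -d)"   # scratch directory (assumption)
printf '%s\n' 'Apple' 'Fig' 'banana Fruit' 'Fennel' > file.txt   # hypothetical contents

# ^F matches an F only at the beginning of a line
grep '^F' file.txt
# prints:
# Fig
# Fennel
```

The line "banana Fruit" is not printed because its F is not the first character.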
Interval Regular expressions:
These expressions tell us about the number of occurrences of a character in a string.
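| Expression | Description |
| --- | --- |
| {n} | Matches the preceding character appearing exactly 'n' times in a row ({n,} matches 'n' or more times). |
| {n,m} | Matches the preceding character appearing at least 'n' times but not more than 'm' times. |

With plain grep the braces are written as \{n\}; with grep -E they are written unescaped.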
Example:
Select all lines that contain the character 'p' at least 2 times consecutively in the file 'file.txt' as shown below:
We will use the following:
Output:
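A minimal sketch, assuming hypothetical contents for file.txt. Because grep looks for the pattern anywhere in the line, p{2} is enough to find every line with two or more consecutive 'p' characters:

```shell
cd "$(mktemp -d)"   # scratch directory (assumption)
printf '%s\n' 'apple' 'pear' 'hippo' 'map' > file.txt   # hypothetical contents

# -E enables extended syntax so the braces need no backslashes
grep -E 'p{2}' file.txt
# prints:
# apple
# hippo

# Equivalent form in basic syntax
grep 'p\{2\}' file.txt
```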
Extended regular expressions:
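These expressions add further repetition operators. In GNU grep's basic syntax they are written with a backslash, as listed below; with grep -E they are written as + and ? without the backslash.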
| Expression | Description |
| --- | --- |
| \+ | Matches one or more occurrences of the previous character. |
| \? | Matches zero or one occurrence of the previous character. |
Example:
Suppose we want to select the lines in which the character 'a' comes immediately before the character 't' in the file 'file.txt', as follows:
We will use the following:
Output:
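A minimal sketch, assuming hypothetical contents for file.txt. The pattern a+t means one or more 'a' characters immediately followed by 't'. The backslash form \+ is a GNU extension to basic syntax, so the portable grep -E spelling is shown first:

```shell
cd "$(mktemp -d)"   # scratch directory (assumption)
printf '%s\n' 'bat' 'ant' 'aat' 'tea' 'cart' > file.txt   # hypothetical contents

grep -E 'a+t' file.txt
# prints:
# bat
# aat

# Same match in GNU grep's basic syntax
grep 'a\+t' file.txt
```

The lines "ant" and "cart" are skipped because another character sits between the 'a' and the 't'.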
Sed (Stream editor):
sed can be used at the command line, or within a shell script, to edit a file non-interactively. Its most useful feature is search-and-replace: substituting one string for another.
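Replacing or substituting a string: The sed command is mostly used to replace text in a file. The simple sed command below replaces the word “unix” with “linux” in the file. By default it replaces only the first occurrence in each line and writes the result to standard output, leaving the file itself unchanged.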
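A minimal sketch of the substitution, assuming a hypothetical file.txt:

```shell
cd "$(mktemp -d)"   # scratch directory so no real file is touched (assumption)
printf '%s\n' 'unix is great. unix is free.' 'learn unix' > file.txt   # hypothetical contents

# s/old/new/ replaces the first match on each line
sed 's/unix/linux/' file.txt
# prints:
# linux is great. unix is free.
# learn linux
```

The second “unix” on the first line is left alone, and file.txt on disk still contains the original text.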
Replacing the nth occurrence of a pattern in a line: Append a number as a flag (/1, /2, and so on) to replace only the first, second, or later occurrence of a pattern in a line. The command below replaces the second occurrence of the word “unix” with “linux” in a line.
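A minimal sketch, feeding sed a hypothetical line through a pipe instead of a file:

```shell
# The trailing 2 selects the second match on the line
printf '%s\n' 'unix unix unix unix' | sed 's/unix/linux/2'
# prints: unix linux unix unix
```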
Replacing all occurrences of the pattern in a line: The substitute flag /g (global replacement) tells sed to replace all occurrences of the string in the line.
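A minimal sketch with the same hypothetical input line:

```shell
# The g flag replaces every match on the line
printf '%s\n' 'unix unix unix unix' | sed 's/unix/linux/g'
# prints: linux linux linux linux
```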
Replacing from the nth occurrence to all later occurrences in a line: Combine a number (/1, /2, and so on) with /g to replace every match from the nth occurrence onward. The following sed command replaces the third, fourth, fifth… “unix” word with “linux” in a line.
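A minimal sketch with a hypothetical input line. It assumes GNU sed, since combining a number with g is not guaranteed to behave the same way in other sed implementations:

```shell
# 3g: leave the first two matches, replace the third and all later ones (GNU sed)
printf '%s\n' 'unix unix unix unix' | sed 's/unix/linux/3g'
# prints: unix unix linux linux
```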
Replacing a string on a specific line number: You can restrict the sed command to replace the string on a specific line number. The following command performs the substitution only on the 3rd line.
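A minimal sketch, using four hypothetical input lines so the effect of the line address is visible:

```shell
# The leading 3 is a line address: the substitution runs on line 3 only
printf '%s\n' 'unix 1' 'unix 2' 'unix 3' 'unix 4' | sed '3 s/unix/linux/'
# prints:
# unix 1
# unix 2
# linux 3
# unix 4
```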
Replacing a string on a range of lines: You can give the sed command a range of line numbers in which to replace a string. Here the sed command performs the replacement only on lines 1 to 3.
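A minimal sketch, again with four hypothetical input lines:

```shell
# 1,3 is a line range: the substitution runs on lines 1 through 3
printf '%s\n' 'unix 1' 'unix 2' 'unix 3' 'unix 4' | sed '1,3 s/unix/linux/'
# prints:
# linux 1
# linux 2
# linux 3
# unix 4
```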
AWK:
AWK is suitable for pattern search and processing. An AWK script scans one or more files for lines that match the given patterns and, when a pattern matches, performs the associated action.
Syntax:
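The general form is awk 'pattern { action }' input-file. A minimal sketch, using hypothetical input supplied through a pipe in place of a file:

```shell
# General form: awk 'pattern { action }' input-file
# Here the pattern is /beta/ and the action prints the second field.
printf '%s\n' 'alpha 1' 'beta 2' | awk '/beta/ { print $2 }'
# prints: 2
```

If the pattern is omitted the action runs on every line; if the action is omitted, matching lines are printed whole.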
To demonstrate more about AWK usage, we are going to use the text file called file.txt.
Printing specific columns:
To print the 2nd and 3rd columns, execute the command below.
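A minimal sketch, assuming file.txt holds the hypothetical whitespace-separated records created below (they are illustrative, not the original listing):

```shell
cd "$(mktemp -d)"   # scratch directory (assumption)
printf '%s\n' '1 John Sales 3000' '2 Maria HR 3500' '3 Tom IT 4000' '4 Bob Ops 2800' > file.txt   # hypothetical contents

# $2 and $3 are the second and third fields of each line
awk '{ print $2, $3 }' file.txt
# prints:
# John Sales
# Maria HR
# Tom IT
# Bob Ops
```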
Printing all lines that match a specific pattern:
If you want to print lines that match a certain pattern, the syntax is as shown:
For instance, to match all entries with the letter ‘o’, the syntax will be:
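A minimal sketch of both forms, assuming hypothetical contents for file.txt:

```shell
cd "$(mktemp -d)"   # scratch directory (assumption)
printf '%s\n' '1 John Sales 3000' '2 Maria HR 3500' '3 Tom IT 4000' '4 Bob Ops 2800' > file.txt   # hypothetical contents

# General form: awk '/pattern/ { print $0 }' file.txt
# $0 is the whole line; /o/ selects lines containing a lowercase o
awk '/o/ { print $0 }' file.txt
# prints:
# 1 John Sales 3000
# 3 Tom IT 4000
# 4 Bob Ops 2800
```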
Printing columns that match a specific pattern:
When AWK locates a pattern match, it prints the whole record by default. You can change the default by issuing an instruction to display only certain fields.
For example:
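A minimal sketch, assuming hypothetical contents for file.txt and assuming the pattern is /a/. Note that awk tests /a/ against the whole record, so a line is selected even when the 'a' sits outside the printed fields:

```shell
cd "$(mktemp -d)"   # scratch directory (assumption)
printf '%s\n' '1 John Sales 3000' '2 Maria HR 3500' '3 Tom IT 4000' '4 Bob Ops 2800' > file.txt   # hypothetical contents

# Select lines containing a, then print only fields 3 and 4
awk '/a/ { print $3, $4 }' file.txt
# prints:
# Sales 3000
# HR 3500
```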
The above command prints the 3rd and 4th columns of each line that contains the letter ‘a’. The pattern is tested against the whole record, not only those two columns.