Linux Training : 4. grep

grep

grep finds strings within a file or a group of files. A simple example, looking for the string "foo" in the file myfile.txt, is

grep foo myfile.txt

grep has many modifier flags that narrow the search or change the output. In general, the grep command format is

grep [modifier flags] <string to find> <search path>
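
The general format can be tried against a small scratch file. The sketch below assumes a made-up myfile.txt (its contents are illustrative, not part of the training material) created in a temporary directory:

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample contents, chosen only for illustration.
printf '%s\n' 'foo at the start' 'a line with Foo' 'nothing here' 'ends with foo' > myfile.txt

# grep [modifier flags] <string to find> <search path>
plain=$(grep foo myfile.txt)         # lines containing lowercase "foo"
count=$(grep -c foo myfile.txt)      # number of lines containing "foo"
nocase=$(grep -c -i foo myfile.txt)  # same count, ignoring case

echo "$plain"
echo "matching lines: $count, ignoring case: $nocase"
```

The same file can be reused for the examples that follow.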

A few useful examples:

  • find how many lines foo appears on in myfile.txt
    • grep -c foo myfile.txt
  • find Foo or fOO or FoO in myfile.txt
    • grep -i foo myfile.txt
  • list the names of files in the current directory that contain foo
    • grep -l foo *
  • show 3 lines before and 2 lines after the line with foo
    • grep -B3 -A2 foo myfile.txt
  • show lines that don't have foo
    • grep -v foo myfile.txt
  • find lines that begin with foo
    • grep "^foo" myfile.txt
  • find lines that end with foo
    • grep "foo$" myfile.txt
  • list the names of files that contain foo, searching all sub-directories
    • grep -l -R foo *
  • show the found string from a regular expression search
    • grep -o "^[[:alpha:]]\{3,6\}" myfile.txt
      Here -o prints only the matched text: the first 3 to 6 alphabetic characters of each line that begins with at least 3 of them. grep has powerful regular expression support, which allows searching for variable strings. A single "." matches any single character.
  • search for lines that begin with foo, end with bar, and have a single alphanumeric character between
    • grep "^foo[[:alnum:]]bar$" myfile.txt
      This will find
      foo3bar
      fooabar
      
      but it will not find
      foo bar
      
      because the space is not an alphanumeric character. Other predefined character classes are [:alpha:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:], and [:xdigit:] (xdigit is a hexadecimal digit: 0-9, A-F, a-f). These names are used inside a "bracket expression" ([...]), which matches a single character from the set. To match more than one, append a repetition operator to the item:
             ?      The preceding item is optional and matched at most once.
             *      The preceding item will be matched zero or more times.
             +      The preceding item will be matched one or more times.
             {n}    The preceding item is matched exactly n times.
             {n,}   The preceding item is matched n or more times.
             {n,m}  The preceding item is matched at least n  times, but not more than m times.
      
  • To find foo followed by 1 or more alphabetic characters
    • grep "foo[[:alpha:]]\+" myfile.txt
      

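      The "+" has to be escaped with a backslash because grep uses basic regular expressions by default, in which +, ?, { and } are ordinary characters unless escaped (with grep -E they work unescaped). The "." is the opposite case: unescaped it matches any character, and "\." matches a literal period.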
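
The escaping rule can be checked directly. The sketch below assumes GNU grep (the \+ form is a GNU extension to basic regular expressions) and a made-up sample file; -E selects extended regular expressions:

```shell
# Hypothetical sample lines, chosen only for illustration.
sample=$(mktemp)
printf '%s\n' 'foo' 'foobar' 'foo+' 'a.b' 'axb' > "$sample"

# Basic regex (grep's default): a bare + is an ordinary character.
bre_literal=$(grep -c 'foo+' "$sample")               # matches only the line "foo+"
# Backslash-escaped, it means "one or more" (GNU grep extension).
bre_repeat=$(grep -c 'foo[[:alpha:]]\+' "$sample")    # matches only "foobar"
# Extended regex (-E): + is the operator without a backslash.
ere_repeat=$(grep -c -E 'foo[[:alpha:]]+' "$sample")  # matches only "foobar"

# The period is the reverse case: bare it matches any character,
# escaped it matches only a literal period.
any_char=$(grep -c 'a.b' "$sample")      # matches "a.b" and "axb"
literal_dot=$(grep -c 'a\.b' "$sample")  # matches only "a.b"

echo "$bre_literal $bre_repeat $ere_repeat $any_char $literal_dot"
rm -f "$sample"
```

The patterns are single-quoted here so the shell passes them to grep untouched; the double-quoted forms used in this section behave the same for these patterns.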

      Here is a text file for practicing grep. Copy and save it to a file called test.txt and try the following:
      1. How many lines have the string "sed"?
      2. How many lines have the string "Sed" at the beginning of the line?
      3. How many blank lines are there?
      4. What do you find searching for the string "Pro" or "pro" followed by 1 to 3 letters or digits, a single space, a word of unknown length, another space followed by an unknown number of digits?
      5. How many periods are at the end of a line?
      6. How many periods are there?




Answers below





grep answers

grep

  1. How many lines have the string "sed"?
    grep -c "sed" test.txt
    6
  2. How many lines have the string "Sed" at the beginning of the line?
    grep -c "^Sed" test.txt
    2
  3. How many blank lines are there?
    grep -c "^$" test.txt
    6
  4. What do you find searching for the string "Pro" or "pro" followed by 1 to 3 letters or digits, a single space, a word of unknown length, another space followed by an unknown number of digits?
    grep -o "[pP]ro[[:alnum:]]\{1,3\} .\+ [[:digit:]]\+" test.txt
    Proin3 vitae 12
  5. How many periods are at the end of a line?
    grep -c "\.$" test.txt
    5
  6. How many periods are there?
    grep -o "\." test.txt | grep -c "\."
    81
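
The last answer needs -o because -c counts matching lines, not individual matches. A minimal sketch with a made-up two-line file (not the attached test text) shows the difference; piping into wc -l is an equivalent alternative to the second grep -c:

```shell
# Hypothetical sample: three periods on the first line, none on the second.
sample=$(mktemp)
printf '%s\n' 'One. Two. Three.' 'No period at the end' > "$sample"

# -c counts lines that contain at least one period.
lines_with_period=$(grep -c '\.' "$sample")
# -o prints each match on its own line, so counting those lines counts periods.
periods=$(grep -o '\.' "$sample" | grep -c '\.')
# Same count using wc; tr strips the padding some wc versions add.
periods_wc=$(grep -o '\.' "$sample" | wc -l | tr -d ' ')
# Lines that end with a period.
ending=$(grep -c '\.$' "$sample")

echo "lines: $lines_with_period, periods: $periods ($periods_wc), at end of line: $ending"
rm -f "$sample"
```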

Attachments:

testtext (text/plain)