Filters
awk
Awk programs are equivalent to sed "instructions" and can be defined inline or in a program file (also "source files"). If no input files are specified awk can accept input from standard input.
# Inline
awk $OPTIONS $PROGRAM $INPUTFILES
# Program file
awk $OPTIONS -f $PROGRAMFILE $INPUTFILES

awk programs combine patterns and actions.
Patterns can be:
- regular expressions or fixed strings
- line numbers, using the builtin variable NR
- the predefined patterns BEGIN or END, whose actions are executed before and after processing any lines of the data file, respectively
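A minimal sketch of the pattern types listed above, run against a three-line sample. The sample lines and the variable names `data` and `out` are made up for illustration.

```shell
# Hypothetical sample input, created only for this illustration.
data=$(printf 'alpha\nbeta\ngamma\n')

# BEGIN runs before any input is read, NR==2 selects the second line,
# /mm/ selects lines matching a regular expression, END runs after the last line.
out=$(printf '%s\n' "$data" | awk '
  BEGIN  { print "start" }
  NR==2  { print "line 2: " $0 }
  /mm/   { print "regex: " $0 }
  END    { print "total: " NR }
')
printf '%s\n' "$out"
```

Each pattern is tested against every input line in turn, so a single line can trigger more than one action.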
Convert ":" to newlines in $PATH environment variable
echo $PATH | awk 'BEGIN {RS=":"} {print}'

Print the first field of matching lines in all files in the current directory, taking semicolon ; as the field separator and outputting filename, line number, and first field of matches, with colon : between the filename and line number
awk 'BEGIN {FS=";"} /enable/ {print FILENAME ":" FNR,$1}' *

Search for string MA in all files, outputting filename, line number, and line for matches
awk '/MA/ {OFS=" "; print FILENAME OFS FNR OFS $0}' *

Change field separator (FS) to a colon (:) and run awkscr
awk -F: -f awkscr /etc/passwd

The -f flag (read the program from a script file) also works for awk
awk -f script files

Print the first field of each line in the input file
awk '{ print $1 }' list

Equivalent to grep MA * ({print} is implied)
awk '/MA/' *
awk '/MA/ {print}' *

The -F flag is followed by the field separator
awk -F, '/MA/ { print $1 }' list

Pipe output of free to awk to get used memory and total memory
free -h | awk '/^Mem/ {print $3 "/" $2}'

Pipe output of sensors to awk to get CPU temperature
sensors | awk '/^temp1/ {print $2}'

Replace initial "fake." with "real;" in file fake_isbn (only lines where the substitution succeeds are printed)
awk 'sub(/^fake\./,"real;")' fake_isbn

Print all lines
awk '1 { print }' file

Remove file header
awk 'NR>1' file
awk 'NR>1 { print }' file

Print lines in a range
awk 'NR>1 && NR < 4' file

Remove whitespace-only lines
awk 'NF' file

Remove all blank lines
awk '1' RS='' file

Extract fields
awk '{ print $1, $3}' FS=, OFS=, file

Perform column-wise calculations
awk '{ SUM=SUM+$1 } END { print SUM }' FS=, OFS=, file

Count the number of nonempty lines
awk '/./ { COUNT+=1 } END { print COUNT }' file

Count the number of lines with at least one field (neither empty nor whitespace-only)
awk 'NF { COUNT+=1 } END { print COUNT }' file

Count the number of lines whose first field is a nonzero number
awk '+$1 { COUNT+=1 } END { print COUNT }' file

Arrays
awk '+$1 { CREDITS[$3]+=$1 } END { for (NAME in CREDITS) print NAME, CREDITS[NAME] }' FS=, file

Identify duplicate lines
awk 'a[$0]++' file

Remove duplicate lines
awk '!a[$0]++' file

Remove multiple spaces
awk '$1=$1' file

Join lines
awk '{ print $3 }' FS=, ORS=' ' file; echo

Format output with printf
awk '+$1 { SUM+=$1; NUM+=1 } END { printf("AVG=%f",SUM/NUM); }' FS=, file
awk '+$1 { SUM+=$1; NUM+=1 } END { printf("AVG=%6.1f",SUM/NUM); }' FS=, file

Convert to uppercase
awk '$3 { print toupper($0); }' file

Change part of a string
awk '{ $3 = toupper(substr($3,1,1)) substr($3,2) } $3' FS=, OFS=, file

Split the second field ("EXPDATE") by spaces, storing the result into the array DATE; then print credits ($1) and username ($3) as well as the month (DATE[2]) and year (DATE[3])
awk '+$1 { split($2, DATE, " "); print $1,$3, DATE[2], DATE[3] }' FS=, OFS=, file

Split the fourth field on ":" into the array GRP; with the regular expression /:+/ a run of colons counts as a single separator
awk '+$1 { split($4, GRP, ":"); print $3, GRP[1], GRP[2] }' FS=, file
awk '+$1 { split($4, GRP, /:+/); print $3, GRP[1], GRP[2] }' FS=, file

Search and replace: replace each run of spaces in the second field with a hyphen
awk '+$1 { gsub(/ +/, "-", $2); print }' FS=, file

Adding date
awk 'BEGIN { printf("UPDATED: "); system("date") } /^UPDATED:/ { next } 1' file

Modify a field externally
awk '+$1 { CMD | getline $5; close(CMD); print }' CMD="uuid -v4" FS=, OFS=, file

Invoke dynamically generated command
awk '+$1 { cmd = sprintf(FMT, $2); cmd | getline $2; close(cmd); print }' FMT='date -I -d "%s"' FS=, file

Join data
awk '+$1 { CMD | getline $5; print }' CMD='od -vAn -w4 -t x /dev/urandom' FS=, file

Add up the first field of every record into sum, then print that number out at the end
awk '{sum += $1} END {print sum}' file
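A small self-contained run of three of the idioms above: column sum, duplicate removal, and per-key totals with an array. The two-column sample (credits, user) and all variable names are hypothetical, chosen only so the results can be checked by hand.

```shell
# Hypothetical comma-separated sample: credits,user
tmp=$(mktemp)
printf '%s\n' '10,ann' '5,bob' '10,ann' '7,cy' > "$tmp"

# Sum the first field over all records.
total=$(awk -F, '{ sum += $1 } END { print sum }' "$tmp")

# Keep only the first occurrence of each line.
unique=$(awk '!seen[$0]++' "$tmp")

# Per-user totals with an associative array.
# The iteration order of for-in is unspecified, so the output is sorted.
per_user=$(awk -F, '{ credits[$2] += $1 }
  END { for (u in credits) print u, credits[u] }' "$tmp" | sort)

printf '%s\n' "$total" "$unique" "$per_user"
rm -f "$tmp"
```

The `!seen[$0]++` pattern is true only the first time a line is seen, because the counter is still zero when it is tested and is incremented afterwards.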
cat
cut
grep
grep -R $TEXT $DIRECTORY
head
- Print the first 8 characters (bytes) of $FILE
head -c8 $FILE
paste
Merge lines of files

Make a .csv file from two lists
paste -d ',' file1 file2

Transpose rows
paste -s file1 file2
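To make the difference between the two paste invocations concrete, here is a sketch using two throwaway three-line files. The file contents and the variable names `csv` and `serial` are assumptions for illustration.

```shell
# Two hypothetical input lists in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' a b c > "$tmpdir/file1"
printf '%s\n' 1 2 3 > "$tmpdir/file2"

# Parallel merge: line N of file1 joined to line N of file2 with a comma.
csv=$(paste -d ',' "$tmpdir/file1" "$tmpdir/file2")

# Serial merge: each file becomes one tab-separated row.
serial=$(paste -s "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$csv" "$serial"
rm -rf "$tmpdir"
```

With `-d ','` the result has three rows (`a,1`, `b,2`, `c,3`). With `-s` it has two rows, one per input file.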
sed
sed ("stream editor") is typically used for applying repetitive edits across all lines of multiple files. In particular it is, alongside awk, one of the two primary commands which accept regular expressions in Unix systems.
sed instructions can be defined inline or in a command file (i.e. script).

Inline
sed $OPTIONS $INSTRUCTION $FILE

Command file
sed $OPTIONS -f $SCRIPT $FILE

sed instructions are made of two components: addresses (i.e. patterns) and procedures (i.e. actions).

Run sed commands in $SCRIPT on $FILE
sed -f $SCRIPT $FILE

Suppress automatic printing of pattern space
sed -n # --quiet, --silent

Zero, one, or two addresses can precede a procedure. In the absence of an address, the procedure is executed over every line of input. With one address, the procedure is executed over every line of input that matches.
With two addresses, the procedure will be executed over groups of lines whereby:
- The first address selects the first line in the first group
- The second address selects the next subsequent line that it matches, which becomes the last line in the first group
- If no match for the second address is found, the group extends to the end of the file
- After the match, the selection process for the next group begins by searching for a match to the first address
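The two-address rules above can be sketched with a short input in which the second group never finds its closing line. The START and END markers and the variable names are hypothetical.

```shell
# Hypothetical input: one complete START..END group, then a START with no END.
input=$(printf '%s\n' one START a END two START b)

# Print every group that begins at a START line and ends at the next END line.
out=$(printf '%s\n' "$input" | sed -n '/START/,/END/p')
printf '%s\n' "$out"
```

The first group covers `START`, `a`, `END`. The line `two` falls outside any group. The second `START` opens a new group that runs to the end of input because no further `END` is found, so `START` and `b` are printed as well.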
Addressing can be done in one of two ways:
- Line addressing, specifying line numbers separated by a comma (e.g. 3,7p); $ represents the last line of input
- Context addressing, using a regular expression enclosed by forward slashes (e.g. /From:/p)
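A short sketch contrasting the two addressing styles on the same input. The mail-like sample lines and the variable names are invented for illustration.

```shell
# Hypothetical five-line input.
input=$(printf '%s\n' 'From: ann' 'body 1' 'From: bob' 'body 2' 'last')

# Line addressing: lines 2 through 3.
by_line=$(printf '%s\n' "$input" | sed -n '2,3p')

# Line addressing with $: only the last line.
last_line=$(printf '%s\n' "$input" | sed -n '$p')

# Context addressing: every line matching the regular expression.
by_context=$(printf '%s\n' "$input" | sed -n '/From:/p')

printf '%s\n' "$by_line" "$last_line" "$by_context"
```

The `$` is kept inside single quotes so the shell does not try to expand it.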
Edit the file in-place, but save a backup copy of the original with {suffix} appended to the filename (GNU sed; the suffix follows -i directly, with no = sign)
sed -i{suffix} # --in-place={suffix}

In some circles, sed is recommended as a replacement for other filters like head. Here, the first 10 lines of a file are displayed.
sed 10q $FILE

Display the top 10 processes by memory or cpu usage.
ps axch -o cmd,%mem --sort=-%mem | sed 11q
ps axch -o cmd:15,%cpu --sort=-%cpu | sed 11q

Replace angle brackets with their HTML codes, piped in from a heredoc:
sed -e 's/</\&lt;/g' -e 's/>/\&gt;/g' << EOF

Display first two lines of file. Without -n, each line will be printed twice
sed -n '1,2p' emp.lst

Prepending ! to the procedure reverses the sense of the command [YUG:450]
sed -n '3,$!p' emp.lst

Display a range of lines
sed -n '9,11p' emp.lst

Use the -e flag to precede multiple instructions
sed -n -e '1,2p' -e '7,9p' -e '$p' emp.lst

Delete lines

Delete second line alone
sed '2d' myfile

Delete a range of lines: from the 2nd through the 3rd
sed '2,3d' myfile

Delete a range of lines, from the first occurrence of 'second' to the line with the first occurrence of 'fourth'
sed '/second/,/fourth/d' myfile

Print all of a file except for specific lines

Suppress any line with 'test' in it
sed '/test/d' myfile

Suppress from the 3rd line to EOF
sed '3,$d' myfile

Replace the first instance of the | character on each line with : and display the first two lines [YUG:455]
sed 's/|/:/' emp.lst | head -2

Replace all instances of the | character with :, displaying the first two lines [YUG:455]
sed 's/|/:/g' emp.lst | head -2

Substitute HTML tags:
sed 's/<I>/<EM>/g'

These commands will replace "director" with "executive director"
sed 's/director/executive director/' emp.lst
sed 's/director/executive &/' emp.lst
sed '/director/s//executive &/' emp.lst

Searching for text

Equivalent to grep MA *
sed -n '/MA/p' *

Stringing sed statements together with pipe

Take lines beginning with "fake" and remove the first "fake." from each, then remove parentheses together with their content and count the lines of output (results)
sed -n '/^fake/s/fake\.//p' * | sed -nr 's/\(.*\)//p' | wc -l

Take lines of all files in CWD beginning with "fake" and remove the first "fake." from each. Then remove parentheses together with any content within them and print only the top 10 lines
sed -ne '/^fake/p' * | sed -n 's/fake\.//p' | sed -nr 's/\(.*\)//p' | sed 10q

Count the number of pipes replaced by piping output to cmp, which will use the -l option to output byte numbers of differing values, then counting the lines of output [YUG:456]
sed 's/|/:/g' emp.lst | cmp -l - emp.lst | wc -l
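A hedged sketch of the substitution behaviors described in this section: first match versus the g flag, the & back-reference, the empty pattern reusing the address regex, and the cmp -l counting trick. The pipe-delimited sample stands in for emp.lst and is purely hypothetical, as are the variable names.

```shell
# Hypothetical stand-in for emp.lst: two pipe-delimited lines, three pipes in total.
tmp=$(mktemp)
printf '%s\n' 'a|b|c' 'd|e' > "$tmp"

# Without g, only the first | on each line is replaced.
first=$(sed 's/|/:/' "$tmp")

# With g, every | is replaced.
all=$(sed 's/|/:/g' "$tmp")

# & in the replacement stands for the matched text.
amp=$(printf 'director\n' | sed 's/director/executive &/')

# An empty pattern in s// reuses the regex from the address.
reuse=$(printf 'director\n' | sed '/director/s//executive &/')

# cmp -l prints one line per differing byte, so the line count equals the
# number of replaced characters. tr strips padding that some wc versions add.
changed=$(sed 's/|/:/g' "$tmp" | cmp -l - "$tmp" | wc -l | tr -d ' ')

printf '%s\n' "$first" "$all" "$amp" "$reuse" "$changed"
rm -f "$tmp"
```

The cmp count works here only because : and | are both a single byte, so the edited stream and the original file stay the same length.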
tail
Output all lines from the 30th line to the end
tail -n +30
tail --lines=+30
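A brief sketch of how the + sign changes what tail counts from, using a five-line input made up for illustration.

```shell
# Hypothetical five-line input.
input=$(printf '%s\n' 1 2 3 4 5)

# With +, the number is a starting line counted from the top.
from_third=$(printf '%s\n' "$input" | tail -n +3)

# Without +, the number is a count of lines taken from the bottom.
last_two=$(printf '%s\n' "$input" | tail -n 2)

printf '%s\n' "$from_third" "$last_two"
```

So `tail -n +3` yields lines 3, 4, and 5, while `tail -n 2` yields lines 4 and 5.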
tr
Change the case of a string
tr '[:upper:]' '[:lower:]'

Remove a character or set of characters from a string or line of output
tr -d "text"
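A small sketch of both tr uses on sample strings chosen for illustration. Note that the argument to -d is treated as a set of individual characters, not as a substring.

```shell
# Lowercase a string; the character classes are quoted so the shell
# does not treat the brackets as a glob pattern.
lower=$(printf 'Hello World\n' | tr '[:upper:]' '[:lower:]')

# Delete every l and every o, wherever they occur.
stripped=$(printf 'hello world\n' | tr -d 'lo')

printf '%s\n' "$lower" "$stripped"
```

The second command leaves `he wrd`: each l and o is removed independently, not just the sequence "lo".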
watch
- Execute $CMD at periods of $N seconds, watching its output [CLKF]
watch -n $N $CMD
- Check memory usage in megabytes (-m) every 5 seconds [Enki]
watch -n 5 free -m