Best Linux Commands for Log Analysis

The Linux terminal is a wonderful tool for log analysis. It can very quickly comb through thousands of lines of logs, extract information, and alter text.

Here are a few commands and command-line tools that I have found useful in log analysis.

cat

cat, short for “concatenate”, prints the contents of one or more text files to standard output, which is normally the terminal. Pair it with the pipe operator ( | ) to use that output as input for the next command.

Send the Contents of a File to the Next Command

cat filename.txt | <next command>
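
As a small sketch of where the name comes from, the snippet below concatenates two files and pipes the result onward. The temporary directory, the file names a.log and b.log, and their contents are made up for illustration.

```shell
# Create two hypothetical log files in a throwaway directory.
tmp_dir=$(mktemp -d)
printf '%s\n' "first log line" > "$tmp_dir/a.log"
printf '%s\n' "second log line" > "$tmp_dir/b.log"

# cat prints the files one after another, in the order given.
combined=$(cat "$tmp_dir/a.log" "$tmp_dir/b.log")
printf '%s\n' "$combined"

# The same output can be piped into another command, here wc -l.
line_count=$(cat "$tmp_dir/a.log" "$tmp_dir/b.log" | wc -l | tr -d ' ')
printf '%s\n' "$line_count"

# Clean up the temporary files.
rm -r "$tmp_dir"
```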

grep

grep is a very powerful tool for finding patterns in a text file or files.

-o Print only the matched parts of the line instead of the entire line.
-E Treat the search term as an extended regular expression. (See Basic vs. Extended Regular Expressions).
-v Exclude all lines that match the pattern.

Search for All IPv4 Addresses in a File

grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b"
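[0-9]{1,3} Matches a run of 1 to 3 digits, each from 0 to 9,
\. followed by a literal period.
( ){3} Repeats that group three times. The last number in an IPv4 address isn’t followed by a period, so we add another [0-9]{1,3}.
\b Matches a word boundary, so the pattern can’t start or end in the middle of a longer run of digits.
Note: This extracts only the IP address. To print the full matching line, use the -E option (instead of -oE). The pattern checks only the shape of the address, so it also matches out-of-range values such as 999.999.999.999.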
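
To see the options in action, here is a minimal sketch that runs the pattern against a few made-up log lines. The sample lines and variable names are hypothetical, and the \b word boundary is assumed to be supported by your grep (it is in GNU grep).

```shell
# Hypothetical sample log lines; the addresses come from documentation ranges.
sample_log=$(printf '%s\n' \
  "192.0.2.10 - GET /index.html" \
  "198.51.100.7 - GET /health" \
  "no address on this line" \
  "192.0.2.10 - POST /login")

# -o prints only the matched text; -E enables extended regular expressions.
ips=$(printf '%s\n' "$sample_log" | grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b")
printf '%s\n' "$ips"

# -v inverts the match: keep every line that does NOT contain /health.
without_health=$(printf '%s\n' "$sample_log" | grep -v "/health")
printf '%s\n' "$without_health"
```

The first command prints one address per match, including the repeated 192.0.2.10, and the second prints the three lines that don't mention /health.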

sed

sed, short for “stream editor,” filters and transforms text. Where grep selects lines, sed can also rewrite them.

Remove All Spaces

sed 's/ //g'
's/A/B/flags' The s command (or “substitute”) replaces A with B. You can use many different flags.
g (the flag) Replaces all matches on each line (the default is to replace only the first match on each line).
Note: This removes all spaces, not blank lines or line breaks. For example, “Snow Crash” becomes “SnowCrash”.
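
A quick way to see the effect of the g flag is to run the substitution with and without it. The input strings below are just examples.

```shell
# With g, every space on the line is removed.
no_spaces=$(printf '%s\n' "Snow Crash" | sed 's/ //g')
printf '%s\n' "$no_spaces"    # SnowCrash

# Without g, only the first space on the line is removed.
first_only=$(printf '%s\n' "a b c" | sed 's/ //')
printf '%s\n' "$first_only"   # ab c
```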

sort

sort orders the contents of a file line by line. Run sort before the uniq command, because uniq only compares adjacent lines.

Sort by Number and Ignore Leading Spaces

sort -bnr
-b Ignores leading spaces
-n Sorts numerically
-r Reverses the order so the highest number is first.
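
As a rough illustration, the sketch below sorts three made-up lines that start with a padded count, the kind of output other tools often produce. The counts and labels are hypothetical.

```shell
# Hypothetical lines: a space-padded count followed by a label.
sorted=$(printf '%s\n' "  3 error" " 12 warning" "  7 info" | sort -bnr)

# The lines come back highest count first: 12, then 7, then 3.
printf '%s\n' "$sorted"
```

A plain sort would compare these lines as text rather than as numbers, which is why -n matters here.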

uniq

uniq collapses repeated lines so that each one appears once. It compares only adjacent lines, so sort the input first to catch every duplicate.

Display the Number of Times Each Line Occurs

uniq -c
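
The following sketch, using made-up log levels as input, shows why sorting first matters and how uniq -c can feed a ranked count.

```shell
# Hypothetical input: one log level per line, with non-adjacent duplicates.
levels=$(printf '%s\n' error info error warning error info)

# Without sort, no duplicates are adjacent, so uniq removes nothing.
unsorted_lines=$(printf '%s\n' "$levels" | uniq | wc -l | tr -d ' ')
printf '%s\n' "$unsorted_lines"   # 6

# Sort first, count each distinct line, then rank by count (highest first).
counted=$(printf '%s\n' "$levels" | sort | uniq -c | sort -bnr)
printf '%s\n' "$counted"
```

The ranked output lists error (3), info (2), and warning (1), each prefixed with its count.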

wc

wc, or “word count”, prints the number of lines, words, bytes, or characters in a file. It is useful for counting how many times something occurs in a log file.

Print the Number of Lines in a File

wc -l
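
For instance, pairing grep with wc -l counts the lines that contain a given string. The log lines here are invented for the example.

```shell
# Count how many hypothetical log lines contain the string ERROR.
# tr -d ' ' strips the padding that some wc implementations add.
error_count=$(printf '%s\n' \
  "ERROR disk full" \
  "INFO service started" \
  "ERROR connection timeout" | grep "ERROR" | wc -l | tr -d ' ')

printf '%s\n' "$error_count"   # 2
```

Note that this counts matching lines, not individual matches, so a line containing ERROR twice is counted once.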

cut

cut will cut and display a specific part of each line in a file.

Print the First Word of Each Line

cut -d " " -f 1

-d " "  Specifies a space as the field delimiter, or what separates each field. The default is tab.
-f 1  Cuts and displays the first field, in this case the text before the first space.
Note: We assume there are no leading spaces. If there are, the first field is empty, so strip them first, for example with sed 's/^ *//'. (sort -b only ignores leading spaces while sorting; it does not remove them.)
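
Here is a small sketch of cut on made-up request lines, including what happens when a line starts with spaces.

```shell
# Take the first space-separated field from two hypothetical request lines.
first_words=$(printf '%s\n' "GET /index.html 200" "POST /login 302" | cut -d " " -f 1)
printf '%s\n' "$first_words"

# With leading spaces, everything before the first space is empty,
# so field 1 comes back as an empty string.
padded=$(printf '%s\n' "  GET /index.html 200" | cut -d " " -f 1)

# Stripping the leading spaces with sed first restores the expected field.
stripped=$(printf '%s\n' "  GET /index.html 200" | sed 's/^ *//' | cut -d " " -f 1)
printf '%s\n' "$stripped"   # GET
```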

Example

Find The Number of Unique IP Addresses in File.txt

cat File.txt | grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" | sort -u | wc -l
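
To try the pipeline without a real log, the sketch below writes a few made-up lines to a temporary file and runs the same commands against it. As one possible extension, it also swaps sort -u | wc -l for uniq -c to find the most frequent address. The file contents and variable names are hypothetical, and \b is assumed to be supported by your grep.

```shell
# Build a throwaway log file with hypothetical entries.
log_file=$(mktemp)
printf '%s\n' \
  "192.0.2.10 GET /index.html" \
  "203.0.113.5 GET /about" \
  "192.0.2.10 POST /login" \
  "198.51.100.7 GET /health" \
  "203.0.113.5 GET /index.html" \
  "192.0.2.10 GET /logout" > "$log_file"

ip_pattern="\b([0-9]{1,3}\.){3}[0-9]{1,3}\b"

# Number of unique IP addresses (same pipeline as the example).
unique_count=$(cat "$log_file" | grep -oE "$ip_pattern" | sort -u | wc -l | tr -d ' ')
printf '%s\n' "$unique_count"   # 3

# Most frequent address: count each one, rank by count, keep the top line.
top_ip=$(cat "$log_file" | grep -oE "$ip_pattern" | sort | uniq -c | sort -bnr | head -n 1)
printf '%s\n' "$top_ip"

# Remove the temporary file.
rm -f "$log_file"
```

The last pipeline prints 192.0.2.10 with a count of 3, since it appears on three of the six lines.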