The Linux terminal is a wonderful tool for log analysis. It can very quickly comb through thousands of lines of logs, extract information, and alter text.
Here are a few commands and command-line tools that I have found useful in log analysis.
cat
cat, short for “concatenate”, will output the contents of a text file into the terminal. Pair it with the pipe operator ( | ) to use the printout as input for the next command.
Send the Contents of a File to the Next Command
cat filename.txt | <next command>
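As a quick illustration, here is a minimal sketch that pipes a file into another command. The temporary file and its two lines are invented for the example; tail -n 1 simply stands in for "the next command."

```shell
# Create a hypothetical two-line file to work with.
logfile=$(mktemp)
printf 'first line\nsecond line\n' > "$logfile"

# cat prints the file; the pipe hands that output to tail, which keeps the last line.
last_line=$(cat "$logfile" | tail -n 1)
echo "$last_line"   # second line

# Clean up the temporary file.
rm -f "$logfile"
```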
grep
grep is a very powerful tool for finding patterns in a text file or files.
-o Print only the matched parts of the line instead of the entire line.
-E Treat the search term as an extended regular expression. (See Basic vs. Extended Regular Expressions).
-v Exclude all lines that match the pattern.
Search for All IPv4 Addresses in a File
grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b"
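[0-9]{1,3} Matches a number made of 1 to 3 digits, each from 0 to 9
\. followed by a literal period.
( ){3} Repeats that group three times. The last number in an IPv4 address isn’t followed by a period, so we add another [0-9]{1,3}. The \b on each side anchors the match to a word boundary, so the address can't be part of a longer run of digits.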
Note: This extracts only the IP address. To include the full line, use the -E option (instead of -oE).
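To see the three options side by side, here is a small sketch run against a made-up log. The file contents are hypothetical (the addresses come from the ranges reserved for documentation). Keep in mind that the pattern only checks the shape of an address, so it would also match something like 999.999.999.999.

```shell
# Hypothetical sample log written to a temporary file.
logfile=$(mktemp)
cat > "$logfile" <<'EOF'
192.0.2.10 GET /index.html 200
debug: cache warmed
198.51.100.7 POST /login 403
EOF

# -oE: print only the matched addresses, one per line.
only_ips=$(grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" "$logfile")
echo "$only_ips"

# -E without -o: print every full line that contains an address.
full_lines=$(grep -E "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" "$logfile")
echo "$full_lines"

# -v: print only the lines that do NOT match the pattern.
no_debug=$(grep -v "debug" "$logfile")
echo "$no_debug"

rm -f "$logfile"
```

In this sample the last two commands happen to print the same two lines, because the only line without an address is the debug line.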
sed
sed, short for “stream editor,” will filter and transform text. Like grep, it works through its input line by line, but it can also rewrite what it matches.
Remove All Spaces
sed 's/ //g'
's/A/B/flag' The s command (or “substitute”) will replace A with B. You can use many different flags.
g (the flag) Will replace all matches (the default is to only replace the first match).
Note: This removes all spaces, not blank lines or line breaks. For example, “Snow Crash” becomes “SnowCrash”.
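The difference the g flag makes is easiest to see on a line with more than one space. A minimal sketch, using an invented input string:

```shell
# With the g flag, every space on the line is removed.
all_removed=$(echo "one two three" | sed 's/ //g')
echo "$all_removed"   # onetwothree

# Without g, only the first space on each line is removed.
first_only=$(echo "one two three" | sed 's/ //')
echo "$first_only"    # onetwo three
```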
sort
sort will sort the contents of a file line by line. You should always use sort before the uniq command.
Sort by Number and Ignore Leading Spaces
sort -bnr
-b Will ignore leading spaces
-n Sort numerically
-r and reverse the order so the highest number is first.
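Here is a short sketch of sort -bnr on input that looks like padded counts. The three lines are made up for the example.

```shell
# Three hypothetical lines, each starting with a right-aligned number.
sorted=$(printf '   3 c\n  12 a\n 100 b\n' | sort -bnr)

# The lines come back ordered by number, highest first: 100, 12, 3.
# The leading spaces are ignored for sorting but are still part of the output.
echo "$sorted"
```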
uniq
uniq prints out only the unique lines in a file. It only filters out matching lines that are adjacent, so you must use sort before uniq.
Display the Number of Times Each Line Occurs
uniq -c
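The sketch below shows why the sort step matters, using a handful of invented log levels as input. The exact padding in front of each count can vary between systems, so the comments only show the values.

```shell
# Hypothetical input in which equal lines are never next to each other.
input=$(printf 'error\ninfo\nerror\ninfo\nerror\n')

# Without sort, uniq finds no adjacent duplicates and keeps all five lines.
unsorted_count=$(printf '%s\n' "$input" | uniq | wc -l)
echo "$unsorted_count"

# With sort first, equal lines sit together and uniq -c can count them.
counts=$(printf '%s\n' "$input" | sort | uniq -c)
echo "$counts"   # 3 error, then 2 info (each count is left-padded with spaces)
```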
wc
wc or “word count” will print the number of lines, words, bytes, or characters in a file. Useful for finding the number of times something occurs in a log file.
Print the Number of Lines in a File
wc -l
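For example, wc -l can count every line in a file, or only the lines that another command lets through. This is a rough sketch; the file and its request lines are invented.

```shell
# Hypothetical access log with three requests, two of them 404s.
logfile=$(mktemp)
printf 'GET /a 200\nGET /b 404\nGET /c 404\n' > "$logfile"

# Count every line in the file.
total=$(wc -l < "$logfile")
echo "$total"

# Count only the lines that grep passes along.
not_found=$(grep " 404" "$logfile" | wc -l)
echo "$not_found"

rm -f "$logfile"
```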
cut
cut will cut and display a specific part of each line in a file.
Print the First Word of Each Line
cut -d " " -f 1
-d " " Specifies a space as the field delimiter, or what separates each field. The default is tab.
-f 1 Cuts and displays the first field, in this case the text before the first space.
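Note: We assume there are no leading spaces. If there are, the first field will be empty, so strip them first (for example with sed 's/^ *//'). Note that sort -b only ignores leading spaces while sorting; it does not remove them.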
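A small sketch of cut in action, including what happens when a line starts with a space. The input lines are invented, and the sed expression is just one possible way to strip leading spaces before cutting.

```shell
# First word of each line.
first=$(printf 'alice logged in\nbob logged out\n' | cut -d " " -f 1)
echo "$first"      # alice, then bob

# With a leading space, field 1 is the empty text before that space.
empty=$(printf ' carol logged in\n' | cut -d " " -f 1)
echo "[$empty]"    # []

# Stripping leading spaces first gives the expected word.
stripped=$(printf ' carol logged in\n' | sed 's/^ *//' | cut -d " " -f 1)
echo "$stripped"   # carol
```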
Example
Find the Number of Unique IP Addresses in File.txt
cat File.txt | grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" | sort -u | wc -l
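To try the pipeline end to end, here is a sketch that builds a throwaway File.txt first. The log lines are hypothetical. sort -u sorts and drops duplicate lines in one step, which here gives the same result as sort followed by uniq. The second pipeline is a variation that combines uniq -c and sort -bnr from the sections above to rank addresses by how often they appear.

```shell
# Build a hypothetical File.txt in a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/File.txt" <<'EOF'
192.0.2.10 GET /index.html
198.51.100.7 GET /login
192.0.2.10 GET /about
203.0.113.5 GET /index.html
192.0.2.10 GET /contact
EOF

# Number of unique IP addresses.
unique_count=$(cat "$workdir/File.txt" | grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" | sort -u | wc -l)
echo "$unique_count"

# Variation: count each address and list the most frequent first.
ranked=$(cat "$workdir/File.txt" | grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" | sort | uniq -c | sort -bnr)
echo "$ranked"

rm -rf "$workdir"
```

With this sample, the first command reports 3 unique addresses and the ranked list starts with 192.0.2.10, which appears three times.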