In-depth on commands
Here are several Linux commands that are especially useful for processing text directly from the command line. These tools can be combined (often using pipes) to perform complex text-manipulation tasks.
Basic File Viewing and Counting
cat
Display the entire content of a file.
Example: cat file.txt
head
Show the first few lines of a file (default is 10).
Example: head -n 10 file.txt
tail
Show the last few lines of a file. Useful with the -f option for live updates.
Examples:
  tail -n 10 file.txt
  tail -f logfile.txt
wc
Count lines, words, and characters in a file.
Example: wc -l file.txt   # Count only lines
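The other counts mentioned above have their own flags:
  wc -w file.txt   # Count words
  wc -m file.txt   # Count characters
  wc -c file.txt   # Count bytes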
Searching and Filtering
grep
Search for patterns in files using regular expressions.
Examples:
  grep 'pattern' file.txt
  grep -R 'pattern' /path/to/directory   # Recursive search
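A few other standard flags are useful for everyday filtering:
  grep -i 'pattern' file.txt   # Case-insensitive match
  grep -v 'pattern' file.txt   # Print only lines that do NOT match
  grep -n 'pattern' file.txt   # Show line numbers alongside matches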
egrep / grep -E
Use extended regular expressions for more complex pattern matching.
Example: egrep 'pattern1|pattern2' file.txt
Editing and Transforming
sed
A stream editor for filtering and transforming text. Commonly used for substitutions.
Example: sed 's/old/new/g' file.txt   # Replace 'old' with 'new' on every line
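sed is not limited to substitution; it can also delete or extract lines:
  sed '/pattern/d' file.txt   # Delete every line matching the pattern
  sed -n '5,10p' file.txt     # Print only lines 5 through 10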
awk
A powerful programming language for text processing, ideal for pattern scanning and reporting.
Examples:
  awk '{print $1}' file.txt       # Print the first field (column) of each line
  awk -F, '{print $2}' file.txt   # Use comma as the field separator and print the second field
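Because awk is a full scripting language, it can also accumulate values across lines, for example to total a numeric column:
  awk '{sum += $1} END {print sum}' file.txt   # Sum the values in the first column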
cut
Remove sections from each line of files by specifying a delimiter and field.
Example: cut -d',' -f1 file.txt   # Extract the first comma-separated field
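cut can also select by character position instead of delimited fields:
  cut -c1-10 file.txt   # Keep only the first ten characters of each line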
tr
Translate or delete characters.
Examples:
  tr 'a-z' 'A-Z' < file.txt   # Convert all lowercase letters to uppercase
  tr -d '\r' < file.txt       # Remove carriage return characters
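tr can also squeeze runs of a repeated character into one, which is handy for normalizing whitespace:
  tr -s ' ' < file.txt   # Collapse runs of spaces into a single space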
Sorting, Merging, and Comparing
sort
Sort lines in text files.
Examples:
  sort file.txt
  sort -r file.txt   # Reverse sort
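Two other standard options worth knowing: -n sorts numerically rather than lexicographically, and -k sorts by a specific field:
  sort -n file.txt        # Numeric sort
  sort -k2 -n file.txt    # Sort by the second whitespace-separated field, numerically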
uniq
Filter out or report repeated lines (typically used after sort to work correctly).
Examples:
  sort file.txt | uniq
  uniq -c file.txt   # Prefix each line of a sorted file with its repeat count
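uniq can also report only the duplicated or only the unique lines of sorted input:
  sort file.txt | uniq -d   # Show only lines that appear more than once
  sort file.txt | uniq -u   # Show only lines that appear exactly once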
paste
Merge lines of files side-by-side.
Example: paste file1.txt file2.txt
join
Join lines of two files on a common field (files must be sorted on the join field).
Example: join file1.txt file2.txt
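If the inputs are not already sorted, you can sort them on the fly. Note that the process-substitution syntax below is a bash/zsh feature, not plain POSIX sh:
  join <(sort file1.txt) <(sort file2.txt)   # Sort both inputs before joining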
comm
Compare two sorted files line by line and display common or unique lines.
Example: comm file1.txt file2.txt
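comm prints three columns (lines unique to file1, lines unique to file2, lines common to both); the -1, -2, and -3 flags suppress columns, so combining them isolates one set:
  comm -12 file1.txt file2.txt   # Lines present in both files
  comm -23 file1.txt file2.txt   # Lines only in file1.txt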
diff
Compare files line by line and output the differences.
Example: diff file1.txt file2.txt
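The unified format (-u) is the style used by patches and most code-review tools:
  diff -u file1.txt file2.txt   # Unified diff with a few lines of context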
Formatting and Building Command Pipelines
fmt
Reformat text paragraphs to a desired width, making it easier to read.
Example: fmt -w 80 file.txt   # Wrap text to 80 characters per line
xargs
Build and execute command lines from standard input. Useful for running a command over a list of items.
Example: find . -name "*.txt" | xargs grep 'pattern'
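When file names may contain spaces or newlines, pair find's -print0 with xargs -0 (both supported by GNU and BSD versions) so the names are passed as NUL-separated items:
  find . -name "*.txt" -print0 | xargs -0 grep 'pattern'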
Additional Tips
Combining Commands with Pipes (|):
You can chain these commands together to create powerful text-processing pipelines. For example:
  cat file.txt | grep 'error' | sort | uniq -c | sort -nr
This pipeline extracts the lines containing 'error', counts the duplicates, and lists them by frequency, most frequent first.
Scripting:
For more complex tasks, consider writing a shell script or using higher-level languages like Python or Perl directly from the command line.
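As a minimal sketch of that idea, the script below (the name top_errors.sh and its arguments are just placeholders) wraps the error-counting pipeline from above so it can be reused:
  #!/bin/sh
  # Usage: ./top_errors.sh logfile [count]
  # Print the most frequent lines containing 'error' in the given log.
  file=$1
  count=${2:-10}
  grep 'error' "$file" | sort | uniq -c | sort -nr | head -n "$count"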
These commands form a toolkit that, when combined effectively, can help you perform a wide range of text processing and data manipulation tasks in Linux. Experiment with them to become more comfortable with their syntax and capabilities!