AICollection Help

In-depth on commands

Here are several Linux commands that are especially useful for processing text directly from the command line. These tools can be combined (often using pipes) to perform complex text-manipulation tasks.

Basic File Viewing and Counting

  • cat
    Display the entire content of a file.
    Example:

    cat file.txt
  • head
    Show the first few lines of a file (default is 10).
    Example:

    head -n 10 file.txt
  • tail
    Show the last few lines of a file. Useful with the -f option for live updates.
    Example:

    tail -n 10 file.txt
    tail -f logfile.txt # Follow the file and print new lines as they are appended
  • wc
    Count lines, words, and bytes in a file (use -m to count characters).
    Example:

    wc -l file.txt # Count only lines
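
These viewers combine naturally through pipes. A quick sketch, assuming a hypothetical log file named app.log in the current directory:

    head -n 100 app.log | wc -w # Count the words in the first 100 lines
    tail -n 20 app.log | cat -n # Number the last 20 lines for quick inspection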

Searching and Filtering

  • grep
    Search for patterns in files using regular expressions.
    Example:

    grep 'pattern' file.txt
    grep -R 'pattern' /path/to/directory # Recursive search
  • egrep/grep -E
    Use extended regular expressions for more complex pattern matching; egrep is a legacy alias for grep -E.
    Example:

    egrep 'pattern1|pattern2' file.txt
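
These searches pair well with the counting tools above. A brief sketch, assuming a hypothetical server.log file:

    grep -i 'timeout' server.log | wc -l # Count matching lines, ignoring case
    grep -E '^(ERROR|WARN)' server.log # Show lines that begin with either severity level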

Editing and Transforming

  • sed
    A stream editor for filtering and transforming text. Commonly used for substitutions.
    Example:

    sed 's/old/new/g' file.txt # Replace 'old' with 'new' on every line
  • awk
    A powerful programming language for text processing, ideal for pattern scanning and reporting.
    Example:

    awk '{print $1}' file.txt # Print the first field (column) of each line
    awk -F, '{print $2}' file.txt # Use comma as field separator and print the second field
  • cut
    Extract selected fields from each line by specifying a delimiter and field number (or character positions with -c).
    Example:

    cut -d',' -f1 file.txt # Extract the first comma-separated field
  • tr
    Translate or delete characters.
    Example:

    tr 'a-z' 'A-Z' < file.txt # Convert all lowercase letters to uppercase
    tr -d '\r' < file.txt # Remove carriage return characters
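
These transformers are often chained. A rough sketch, assuming a hypothetical comma-separated users.csv whose second column holds names and whose third column is numeric:

    cut -d',' -f2 users.csv | tr 'a-z' 'A-Z' | sed 's/^ *//' # Extract names, uppercase them, strip leading spaces
    awk -F, '$3 > 100 {print $1}' users.csv # Print the first field of rows whose third field exceeds 100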

Sorting, Merging, and Comparing

  • sort
    Sort lines in text files.
    Example:

    sort file.txt
    sort -r file.txt # Reverse sort
  • uniq
    Filter out or report repeated lines; because it only detects adjacent duplicates, it is typically used after sort.
    Example:

    sort file.txt | uniq
    uniq -c file.txt # Prefix each distinct line with its count (input must already be sorted)
  • paste
    Merge lines of files side-by-side.
    Example:

    paste file1.txt file2.txt
  • join
    Join lines of two files on a common field (files must be sorted on the join field).
    Example:

    join file1.txt file2.txt
  • comm
    Compare two sorted files line by line and display common or unique lines.
    Example:

    comm file1.txt file2.txt
  • diff
    Compare files line by line and output the differences.
    Example:

    diff file1.txt file2.txt
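
A small sketch of how these fit together, assuming two hypothetical ID lists named old_ids.txt and new_ids.txt:

    sort -u old_ids.txt -o old_ids.txt # Sort and de-duplicate in place
    sort -u new_ids.txt -o new_ids.txt
    comm -13 old_ids.txt new_ids.txt # Show only the IDs present in new_ids.txt but not in old_ids.txt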

Formatting and Building Command Pipelines

  • fmt
    Reformat text paragraphs to a desired line width for easier reading.
    Example:

    fmt -w 80 file.txt # Wrap text to 80 characters per line
  • xargs
    Build and execute command lines from standard input. Useful for processing lists of items.
    Example:

    find . -name "*.txt" | xargs grep 'pattern'
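
When file names may contain spaces, a more robust variant of the xargs example above (assuming GNU find and xargs) passes null-delimited names:

    find . -name "*.txt" -print0 | xargs -0 grep -l 'pattern' # List the files that contain the pattern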

Additional Tips

  • Combining Commands with Pipes (|):
    You can chain these commands together to create powerful text-processing pipelines. For example:

    cat file.txt | grep 'error' | sort | uniq -c | sort -nr

    This pipeline extracts lines containing 'error', counts each distinct line, and lists them from most to least frequent.

  • Scripting:
    For more complex tasks, consider writing a shell script or using higher-level languages like Python or Perl directly from the command line.
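
    For instance, a minimal sketch of a portable shell script that reports how many lines containing 'error' each log file holds (the script name and the 'error' pattern are placeholders):

    #!/bin/sh
    # count_errors.sh - print each file name followed by its count of matching lines
    # Usage: sh count_errors.sh file1.log file2.log ...
    for f in "$@"; do
        printf '%s: ' "$f"
        grep -c 'error' "$f"
    done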

These commands form a toolkit that, when combined effectively, can help you perform a wide range of text processing and data manipulation tasks in Linux. Experiment with them to become more comfortable with their syntax and capabilities!

Last modified: 08 February 2025