AICollection Help

tr

The tr command (short for translate or delete characters) is a simple yet highly effective utility for performing character-level transformations on text. It reads from standard input and writes to standard output, making it ideal for use in pipelines and shell scripts. In this comprehensive guide, we’ll cover everything you need to know about tr—from its basic syntax and options to practical examples and advanced tips.

Table of Contents

  1. Introduction

  2. Basic Syntax and How tr Works

  3. Command-Line Options and Parameters

  4. Practical Examples

  5. Advanced Usage and Tips

  6. Conclusion and Further Reading

Introduction

tr is a command-line utility primarily used for translating, deleting, and squeezing characters in its input. Unlike more sophisticated text processing tools such as sed or awk, tr operates on a character-by-character basis. Its simplicity and speed make it ideal for many common tasks, such as:

  • Converting text from lowercase to uppercase (or vice versa)

  • Removing unwanted characters from input streams

  • Squeezing multiple consecutive instances of a character into a single occurrence

Because tr reads from standard input and writes to standard output, it can be easily integrated into pipelines.

Basic Syntax and How tr Works

The general syntax of the tr command is:

tr [OPTIONS] SET1 [SET2]
  • SET1: The set of characters to be translated or deleted.

  • SET2: When provided, tr translates each character in SET1 to the corresponding character in SET2.

  • OPTIONS: Modify the behavior of tr (for example, deleting characters, squeezing duplicates, or using complement sets).

Key Characteristics:

  • Standard Input / Output:


    tr does not work with files directly. Instead, you must redirect input or pipe data into it.

  • Character-by-Character Processing:


    Every character in the input is examined and transformed according to the specified rules.

Command-Line Options and Parameters

tr provides several options to refine its behavior:

Translating Characters

When both SET1 and SET2 are provided, tr replaces each character in SET1 with the corresponding character in SET2. If SET2 is shorter than SET1, the last character of SET2 is reused.

Example:

# Convert lowercase to uppercase: echo "hello world" | tr 'a-z' 'A-Z'

In this example, every lowercase letter is translated to its uppercase counterpart.

Deleting Characters

The -d option instructs tr to delete all characters in SET1 from the input.

Example:

# Remove all digits from the input: echo "User123 logged in at 10:45" | tr -d '0-9'

This command removes any numeric characters from the given text.

Squeezing Repeated Characters

The -s (or --squeeze-repeats) option compresses sequences of repeated characters into a single instance.

Example:

# Squeeze multiple spaces into a single space: echo "This is a test." | tr -s ' '

This will replace runs of consecutive spaces with a single space.

Using Complement Sets

The -c option tells tr to use the complement of SET1. That means the operation applies to all characters not in SET1.

Example:

# Delete all characters except digits: echo "Phone: 555-1234" | tr -cd '0-9'

This command retains only the digits from the input.

Practical Examples

Case Conversion

Uppercase Conversion:

echo "convert to uppercase" | tr 'a-z' 'A-Z'

Lowercase Conversion:

echo "CONVERT TO LOWERCASE" | tr 'A-Z' 'a-z'

These commands perform a straightforward case transformation on the text.

Removing Specific Characters

Remove punctuation from a string:

echo "Hello, world! Welcome to 2025." | tr -d '[:punct:]'

Here, the POSIX character class [:punct:] is used to target all punctuation characters for deletion.

Squeezing Repeated Characters Example

Squeeze multiple newlines into one:

printf "Line1\n\n\nLine2\n\nLine3\n" | tr -s '\n'

This results in each set of consecutive newlines being compressed into a single newline.

Combining with Other Commands

Using tr in a pipeline:

# Convert the output of ls to uppercase and squeeze spaces ls | tr 'a-z' 'A-Z' | tr -s ' '

This command transforms the directory listing to uppercase and removes extra spaces, demonstrating how tr can be combined with other commands to process output.

Advanced Usage and Tips

  • Character Ranges and Classes:
    You can use ranges like a-z or POSIX character classes (e.g., [:digit:], [:alpha:]) to specify sets in a flexible way.

  • Handling Non-Printable Characters:
    tr can work with non-printable characters if you represent them using escape sequences (e.g., \n for newline, \t for tab).

  • Combining Options:
    You can combine the -d and -s options to delete unwanted characters and compress others in one go. For instance, deleting all non-alphanumeric characters and squeezing spaces might look like:

    echo "Data---with,,,weird@@@characters" | tr -d -c '[:alnum:] ' | tr -s ' '
  • Locale Considerations:
    Be aware that character ranges (like a-z) can be affected by the current locale settings. For predictable results, you may need to set your locale (e.g., export LC_ALL=C).

Conclusion and Further Reading

The tr command is a straightforward yet potent utility for transforming text at the character level. Whether you need to change case, remove unwanted characters, or compress repeated characters, tr offers a fast and efficient solution that fits seamlessly into shell pipelines and scripts.

Further Reading and Resources

Experiment with tr on your own text streams to see how it can simplify and streamline your text processing tasks. Happy translating!

Last modified: 08 February 2025