tr
The tr command (short for "translate") is a small utility for performing character-level transformations on text. It reads from standard input and writes to standard output, which makes it well suited to pipelines and shell scripts. This guide covers its basic syntax and options, practical examples, and a few advanced tips.
Introduction
tr is a command-line utility primarily used for translating, deleting, and squeezing characters in its input. Unlike more sophisticated text processing tools such as sed or awk, tr operates on a character-by-character basis. Its simplicity and speed make it ideal for many common tasks, such as:
Converting text from lowercase to uppercase (or vice versa)
Removing unwanted characters from input streams
Squeezing multiple consecutive instances of a character into a single occurrence
Because tr reads from standard input and writes to standard output, it can be easily integrated into pipelines.
Basic Syntax and How tr Works
The general syntax of the tr command is:
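As a rough sketch, the synopsis below follows the form given in the GNU man page, and the sample string and sets are made up for illustration:

```shell
# Synopsis (GNU man page form): tr [OPTION]... SET1 [SET2]
# Here SET1 is 'el' and SET2 is 'ip', so e becomes i and l becomes p.
result=$(printf 'hello\n' | tr 'el' 'ip')
echo "$result"   # hippo
```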
SET1: The set of characters to be translated or deleted.
SET2: When provided, tr translates each character in SET1 to the corresponding character in SET2.
OPTIONS: Modify the behavior of tr (for example, deleting characters, squeezing duplicates, or using complement sets).
Key Characteristics:
Standard Input / Output:
tr does not work with files directly. Instead, you must redirect input or pipe data into it.
Character-by-Character Processing:
Every character in the input is examined and transformed according to the specified rules.
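Since tr takes no file name argument, a file has to arrive on standard input. A minimal sketch of the two usual ways to do that, using a throwaway temporary file whose content is invented for the example:

```shell
# Create a temporary input file (hypothetical content for illustration).
tmpfile=$(mktemp)
printf 'read me from a file\n' > "$tmpfile"

# Option 1: redirect the file into tr.
from_redirect=$(tr 'a-z' 'A-Z' < "$tmpfile")

# Option 2: pipe the file into tr.
from_pipe=$(cat "$tmpfile" | tr 'a-z' 'A-Z')

echo "$from_redirect"   # READ ME FROM A FILE
rm -f "$tmpfile"
```

Both forms produce the same result; the redirect avoids starting an extra process.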
Command-Line Options and Parameters
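tr provides several options to refine its behavior, chiefly -d (delete), -s (squeeze repeats), and -c (complement). The sections below cover plain translation first and then each of these options.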
Translating Characters
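When both SET1 and SET2 are provided, tr replaces each character in SET1 with the character at the same position in SET2. If SET2 is shorter than SET1, GNU tr reuses the last character of SET2 for the remaining characters of SET1; POSIX leaves this case unspecified, so portable scripts should keep the two sets the same length.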
Example:
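One way to write this, as a minimal sketch with a made-up sample string:

```shell
# Map each lowercase letter in SET1 to the uppercase letter at the same position in SET2.
upper=$(echo "hello world" | tr 'a-z' 'A-Z')
echo "$upper"   # HELLO WORLD
```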
In this example, every lowercase letter is translated to its uppercase counterpart.
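The case where SET2 is shorter than SET1 can be illustrated the same way. This sketch assumes GNU tr, which repeats the last character of SET2; other implementations may behave differently:

```shell
# SET1 has five characters (a-e) but SET2 has only two (x, y).
# GNU tr pads SET2 with its last character, so b, c, d and e all map to y.
padded=$(echo "abcde" | tr 'a-e' 'xy')
echo "$padded"   # xyyyy
```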
Deleting Characters
The -d option instructs tr to delete all characters in SET1 from the input.
Example:
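A minimal sketch, with a sample string chosen for illustration:

```shell
# -d deletes every character that appears in SET1 (here, the digits 0-9).
no_digits=$(echo "abc123def456" | tr -d '0-9')
echo "$no_digits"   # abcdef
```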
This command removes any numeric characters from the given text.
Squeezing Repeated Characters
The -s (or --squeeze-repeats) option replaces each run of a repeated character that appears in the given set with a single instance of that character.
Example:
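A minimal sketch, with a sample string chosen for illustration:

```shell
# -s squeezes each run of the listed character (a space) down to one.
squeezed=$(echo "too    many     spaces" | tr -s ' ')
echo "$squeezed"   # too many spaces
```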
This will replace runs of consecutive spaces with a single space.
Using Complement Sets
The -c option tells tr to use the complement of SET1. That means the operation applies to all characters not in SET1.
Example:
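A minimal sketch that pairs -c with -d, using an invented sample string:

```shell
# -c complements SET1, so -d deletes everything that is NOT a digit.
# The trailing newline is not a digit either, so tr removes it as well.
digits=$(echo "Order 66, aisle 7" | tr -cd '0-9')
echo "$digits"   # 667
```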
This command retains only the digits from the input.
Practical Examples
Case Conversion
Uppercase Conversion:
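A minimal sketch using POSIX character classes instead of ranges, with a made-up sample string:

```shell
# [:lower:] and [:upper:] pair up letter by letter when used for case conversion.
shout=$(echo "Mixed Case Text" | tr '[:lower:]' '[:upper:]')
echo "$shout"   # MIXED CASE TEXT
```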
Lowercase Conversion:
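The reverse direction, again as a sketch with a made-up sample string:

```shell
# Swap the two classes to go from uppercase to lowercase.
quiet=$(echo "Mixed Case Text" | tr '[:upper:]' '[:lower:]')
echo "$quiet"   # mixed case text
```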
These commands perform a straightforward case transformation on the text.
Removing Specific Characters
Remove punctuation from a string:
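A minimal sketch, with a sample sentence invented for the example:

```shell
# Delete every character in the [:punct:] class (comma, exclamation mark, apostrophe, ...).
plain=$(echo "Hello, world! How's it going?" | tr -d '[:punct:]')
echo "$plain"   # Hello world Hows it going
```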
Here, the POSIX character class [:punct:] is used to target all punctuation characters for deletion.
Squeezing Repeated Characters Example
Squeeze multiple newlines into one:
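A minimal sketch, using printf to build input with blank lines (the text itself is made up):

```shell
# Runs of consecutive newlines (blank lines) collapse to a single newline.
compact=$(printf 'line one\n\n\nline two\n\nline three\n' | tr -s '\n')
printf '%s\n' "$compact"
```

The result is three lines with no blank lines between them.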
This results in each set of consecutive newlines being compressed into a single newline.
Combining with Other Commands
Using tr in a pipeline:
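One plausible form of such a pipeline, as a sketch; the exact ls output depends on the directory it is run in:

```shell
# Uppercase the long listing, then squeeze the column padding down to single spaces.
listing=$(ls -l | tr 'a-z' 'A-Z' | tr -s ' ')
echo "$listing"
```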
This command transforms the directory listing to uppercase and removes extra spaces, demonstrating how tr can be combined with other commands to process output.
Advanced Usage and Tips
Character Ranges and Classes:
You can use ranges like a-z or POSIX character classes (e.g., [:digit:], [:alpha:]) to specify sets in a flexible way.
Handling Non-Printable Characters:
tr can work with non-printable characters if you represent them using escape sequences (e.g., \n for newline, \t for tab).
Combining Options:
You can combine deleting and squeezing to remove unwanted characters and compress others. For instance, deleting every character that is not alphanumeric or a space, and then squeezing spaces, might look like:
    echo "Data---with,,,weird@@@characters" | tr -d -c '[:alnum:] ' | tr -s ' '
Because the newline is not in the kept set, this also strips the trailing newline from the output.
Locale Considerations:
Be aware that character ranges (like a-z) can be affected by the current locale settings. For predictable results, you may need to set your locale (e.g., export LC_ALL=C).
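The tips above can be sketched together as follows. The sample strings are invented, and the single-invocation form with -c, -d and -s assumes GNU tr semantics (delete using SET1, then squeeze using SET2):

```shell
# Escape sequences: turn a tab into a space.
untabbed=$(printf 'name\tvalue\n' | tr '\t' ' ')
echo "$untabbed"   # name value

# One invocation instead of two: delete everything outside SET1 (alphanumerics
# and space), then squeeze repeats of the characters in SET2 (space).
# LC_ALL=C pins the locale for this one command so the classes are predictable.
cleaned=$(echo "Data---with   weird@@@characters" | LC_ALL=C tr -cds '[:alnum:] ' ' ')
echo "$cleaned"   # Datawith weirdcharacters
```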
Conclusion and Further Reading
The tr command is a straightforward yet potent utility for transforming text at the character level. Whether you need to change case, remove unwanted characters, or compress repeated characters, tr offers a fast and efficient solution that fits seamlessly into shell pipelines and scripts.
Further Reading and Resources
Manual Page:
View the detailed manual by typing: man tr
Online Documentation:
Community Examples:
Explore various usage examples on forums, blogs, and Q&A sites like Stack Overflow.
Experiment with tr on your own text streams to see how it can simplify and streamline your text processing tasks. Happy translating!