sed
- text-processing utility
- stream-oriented
- uses a simple programming language
- has support for regular expressions
- accepts inputs from files as well as pipes
- can also accept inputs from standard input streams
Some of the ways sed can be used are:
- Text substitution
- Selective printing of text files
- In-place editing of text files
- Non-interactive editing of text files
- etc.
sed has the following flow:
- read - read a line from input stream
- execute - execute sed command(s) on a line
- display - display results on an output stream
Read: sed reads a line from the input stream (a file, pipe, or stdin)
and stores it in an internal buffer called the pattern buffer.
Execute: All sed commands are applied sequentially on the pattern
buffer. By default, sed commands are applied on all lines (globally)
unless line addressing is specified.
Display: sed sends the (possibly modified) contents of the pattern buffer
to the output stream. After sending the data, sed empties the pattern
buffer.
The above process repeats until the input stream is exhausted.
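To make the three steps concrete, here is a minimal sketch. The three input
lines and the variable name cycle_out are made up for illustration; the s
(substitute) command appears in the command list later in this post.

```shell
# Three hypothetical input lines arrive through a pipe.
# For each line, sed reads it into the pattern buffer, executes the
# s command on it (replace the first "o" with "0"), and displays the result.
cycle_out=$(printf 'one\ntwo\nthree\n' | sed 's/o/0/')
printf '%s\n' "$cycle_out"
# Output:
# 0ne
# tw0
# three
```

The last line contains no "o", so the command changes nothing, yet the line
is still displayed: the display step runs once per cycle either way.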
The pattern buffer is a private, in-memory, volatile storage area used by
sed. By default, all sed commands are applied to the pattern buffer, hence
the input stream remains unchanged.
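A quick way to convince yourself of this is sketched below. The scratch file
and its two lines are hypothetical, created only for the demonstration.

```shell
# Scratch file with made-up contents, used only for this demonstration.
tmpfile=$(mktemp)
printf 'alpha\nbeta\n' > "$tmpfile"

# sed edits its pattern buffer and writes the result to standard output.
edited=$(sed 's/alpha/ALPHA/' "$tmpfile")
printf '%s\n' "$edited"
# Output:
# ALPHA
# beta

# The file on disk still holds the original text.
original=$(cat "$tmpfile")
printf '%s\n' "$original"
# Output:
# alpha
# beta

rm -f "$tmpfile"
```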
There is another memory area called the hold buffer, which is also a
private, in-memory, volatile storage area for the exclusive use of sed.
Data can be stored in the hold buffer for later retrieval. At the end of
each cycle, sed removes the contents of the pattern buffer, but the
contents of the hold buffer persist between sed cycles. However, sed
commands cannot be executed directly on the hold buffer, so sed provides
commands that move data between the hold buffer and the pattern buffer.
Initially both the pattern buffer and hold buffer are empty.
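As a small preview of how the hold buffer survives between cycles, consider
the sketch below. The input lines and the variable name hold_out are
hypothetical. It uses the -n option (suppress automatic printing), the $
address (last line), and the h, G, and p commands from the command list
later in this post.

```shell
# Line 1: h copies the pattern buffer ("first") into the hold buffer.
# Last line ($): G appends a newline plus the hold buffer to the pattern
# buffer, and p prints the result. -n suppresses all other output.
hold_out=$(printf 'first\nsecond\nthird\n' | sed -n -e '1h' -e '${G;p;}')
printf '%s\n' "$hold_out"
# Output:
# third
# first
```

The text saved during the first cycle is still available two cycles later,
which would not be possible with the pattern buffer alone.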
If no input files are provided, then sed accepts input from the standard
input stream (stdin).
If no address range is provided, then sed operates on every line by default.
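The effect of an address can be sketched as follows. The three-line input
and the variable names are made up for illustration, and the input comes in
through stdin because no file is named.

```shell
# No address: the s command runs on every line.
all_lines=$(printf 'a\nb\nc\n' | sed 's/^/> /')
printf '%s\n' "$all_lines"
# Output:
# > a
# > b
# > c

# Address 2: the s command runs on line 2 only.
one_line=$(printf 'a\nb\nc\n' | sed '2s/^/> /')
printf '%s\n' "$one_line"
# Output:
# a
# > b
# c
```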
Okay, I'll stop here with the conceptual information. I'd really like to provide some
actual examples, which will lay a good foundation for the conceptual comments, and
not the other way around.
In my opening comments, I made reference to sed commands, and there will
definitely be extensive coverage of those commands during this journey.
However, I'm going to begin with the absolute most basic command.
Note: You are strongly encouraged to run/execute every command that I
provide.
The contents of the file used for this first example are shown below:
Gorillas: 65 years
Chimpanzees: 45 years
Bonobos: 40 years
Orangutans: 40 years
Gibbons: 25 years
The syntax for the most basic command involving sed is:
$ sed '' filename
Note: '' represents 2 back-to-back single quotes - NOT a single double quote!!!
There is not a space between the single quotes, however, the behavior
of the command will NOT change if you put any amount of space between
the single quotes!!!
An actual example would be: $ sed '' ani_info
- To keep this as simple as possible, all you need to understand about
this command is that it is functionally equivalent to the following
command: $ cat ani_info
$ sed '' ani_info <==> $ cat ani_info
- provide identical output
Note: That simple sed command contains NO sed commands!!!!
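If you would like to check the equivalence rather than take my word for it,
here is one possible sketch. It recreates ani_info in a scratch directory
(the directory and the variable names are my own choices, not anything sed
requires) and compares the outputs.

```shell
# Recreate the sample file in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > ani_info <<'EOF'
Gorillas: 65 years
Chimpanzees: 45 years
Bonobos: 40 years
Orangutans: 40 years
Gibbons: 25 years
EOF

sed_out=$(sed '' ani_info)        # empty script
spaced_out=$(sed '   ' ani_info)  # spaces between the quotes
cat_out=$(cat ani_info)

# Report whether all three outputs match.
if [ "$sed_out" = "$cat_out" ] && [ "$spaced_out" = "$cat_out" ]; then
    echo "identical output"
else
    echo "outputs differ"
fi
```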
I'll close this commentary by providing a list of the commands that are
used with/by sed:
a
\text
- Append text after a line.
a text
- Append text after a line (alternative syntax).
b label
- Branch unconditionally to label. The label may be omitted,
in which case the next cycle is started.
c
\text
- Replace (change) lines with text.
c text
- Replace (change) lines with text (alternative syntax).
d
- Delete the pattern space; immediately start next cycle.
D
- If pattern space contains newlines, delete text in the
pattern space up to the first newline, and restart cycle
with the resultant pattern space, without reading a new
line of input.
- If pattern space contains no newline, start a normal new
cycle as if the d command was issued.
e
- Executes the command that is found in pattern space
and replaces the pattern space with the output; a trailing
newline is suppressed.
e command
- Executes command and sends its output to the output stream.
The command can run across multiple lines, all but the last
ending with a backslash.
F
- (filename) Print the file name of the current input file
(with a trailing newline).
g
- Replace the contents of the pattern space with the
contents of the hold space.
G
- Append a newline to the contents of the pattern space,
and then append the contents of the hold space to that of
the pattern space.
h
- (hold) Replace the contents of the hold space with the
contents of the pattern space.
H
- Append a newline to the contents of the hold space, and
then append the contents of the pattern space to that of the
hold space.
i
\text
- Insert text before a line.
i text
- Insert text before a line (alternative syntax).
l
- Print the pattern space in an unambiguous form.
n
- (next) If auto-print is not disabled, print the pattern space,
then, regardless, replace the pattern space with the next
line of input. If there is no more input then sed exits without
processing any more commands.
N
- Add a newline to the pattern space, then append the next line of
input to the pattern space. If there is no more input then sed exits
without processing any more commands.
p
- Print the pattern space.
P
- Print the pattern space, up to the first <newline>.
q[exit-code]
- (quit) Exit sed without processing any more commands
or input.
Q[exit-code]
- (quit) This command is the same as q, but will not print
the contents of pattern space. Like q, it provides the
ability to return an exit code to the caller.
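r filename
- Queue the contents of filename to be read and inserted into
the output stream at the end of the current cycle, or when the
next input line is read.
R filename
- Queue a line of filename to be read and inserted into the
output stream at the end of the current cycle, or when the
next input line is read.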
s/regexp/replacement/[flags]
- (substitute) Match the regular-expression against the content
of the pattern space. If found, replace matched string with replacement.
t label
- (test) Branch to label only if there has been a successful
substitution since the last input line was read or conditional
branch was taken. The label may be omitted, in which case
the next cycle is started.
T label
- (test) Branch to label only if there have been no successful
substitutions since the last input line was read or conditional
branch was taken. The label may be omitted, in which case
the next cycle is started.
v [version]
- (version) This command does nothing, but makes sed fail
if GNU sed extensions are not supported, or if the requested
version is not available.
w filename
- Write the pattern space to filename.
W filename
- Write to the given filename the portion of the pattern space up
to the first newline.
x
- Exchange the contents of the hold and pattern spaces.
y/source-chars/dest-chars/
- Transliterate any characters in the pattern space which
match any of the source-chars with the corresponding
character in dest-chars.
z
- (zap) This command empties the content of pattern space.
#
- A comment, until the next newline.
{ cmd ; cmd ... }
- Group several commands together.
=
- Print the current input line number (with a trailing newline).
: label
- Specify the location of label for branch commands (b, t, T).
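As a small taste of a few commands from this list, here is a hedged sketch
run against the ani_info sample file. The scratch directory and the variable
names are my own, and -n is a sed option (suppress automatic printing)
rather than a command.

```shell
# Recreate the sample file in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > ani_info <<'EOF'
Gorillas: 65 years
Chimpanzees: 45 years
Bonobos: 40 years
Orangutans: 40 years
Gibbons: 25 years
EOF

# p: with -n, print only line 2.
p_out=$(sed -n '2p' ani_info)
printf '%s\n' "$p_out"          # Chimpanzees: 45 years

# d: delete line 1, so only the remaining four lines are displayed.
d_out=$(sed '1d' ani_info)
printf '%s\n' "$d_out"

# =: print the line number of the last line ($), i.e. the line count.
count=$(sed -n '$=' ani_info)
printf '%s\n' "$count"          # 5

# y: transliterate lowercase a, e, o to uppercase A, E, O.
y_out=$(sed 'y/aeo/AEO/' ani_info)
printf '%s\n' "$y_out"          # first line: GOrillAs: 65 yEArs

# q: quit after line 2, so only the first two lines are displayed.
q_out=$(sed '2q' ani_info)
printf '%s\n' "$q_out"
```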
Over the course of this journey - and I do mean journey, and not
a sprint - I promise to provide examples involving each of these
sed commands, along with the options/switches that are available
to be used with sed.
Trevor "Red Hat Evangelist" Chandler