Vi Pages - Substitution Guide

HOWTO use the ":s" command

The command ":substitute" is very powerful. It makes use of addresses (esp. line ranges), regular expressions (aka patterns), and flags that request confirmation and other things. You will certainly use it quite often once you know how. This guide should give you a fast intro to the substitution command, so that you can hopefully make good use of it after ten minutes.

NOTE: This guide starts off with basic VI functionality, but also aims to show some of VIM's enhancements. I use VIM daily and could not live without its extra features.

Here we go:

As the substitute command is complex, so is its definition. It involves the terms range, pattern, string, and some options:

		:[range]s[ubstitute]/pattern/string/[options]

Note: The separator between 's' and the pattern is the same as the one between pattern and string, and the string must always be terminated with it. However, you can choose the character. It need not be the slash - it can also be the hash ('#'):

		:[range]s[ubstitute]#pattern#string#[options]

The substitute command consists of nine parts. You need not give all nine parameters - parameters in square brackets are optional, ie they are not required. However, the command name 's' and the separator around the pattern are always required. So the shortest substitute command is ":s///". It will be explained below (unless I forget ;-). With VIM there are three abbreviations, namely ":&", ":~", and '&' (see ":help :&", ":help :~", and ":help &").

Vi :substitute - all parts explained

Part 0 - the ex mode colon
The substitute command is a command of the underlying editor "ex". Therefore you must enter a colon first to switch to "ex mode". As the colon is required for all ex commands, it is not a part of the command per se. Therefore this is "part 0".

Part 1 - The [range] aka "scope" or "block"
The first part is the range. The square brackets around it denote the fact that this "prefix parameter" is optional. As vanilla Vi does not have online help we'll take a look at VIM's helpfile with the command ":help range":

Line numbers may be specified with:			*:range*
	{number}	an absolute line number
	.		the current line			  *:.*
	$		the last line in the file		  *:$*
	%		equal to 1,$ (the entire file)		  *:%*
	*		equal to '<,'> (the Visual area)	  *:star*
	't		position of mark t (lower case)		  *:'*
	/{pattern}[/]	the next line where {pattern} matches	  *:/*
	?{pattern}[?]	the previous line where {pattern} matches *:?*
	\/		the next line where the previously used search
			pattern matches
	\?		the previous line where the previously used search
			pattern matches
	\&		the next line where the previously used substitute
			pattern matches

Each may be followed (several times) by '+' or '-' and an optional number.
This number is added or subtracted from the preceding line number.  If the
number is omitted, 1 is used.

The "/" and "?" may be preceded with another address.  The search starts from
there.  The "/" and "?" after {pattern} are required to separate the pattern
from anything that follows.

The {number} must be between 0 and the number of lines in the file.  A 0 is
interpreted as a 1, except with the commands tag, pop and read.

Examples:
	.+3		three lines below the cursor
	/that/+1	the line below the next line containing "that"
	.,$		from current line until end of file
	0;/that		the first line containing "that"

Some commands allow for a count after the command.  This count is used as the
number of lines to be used, starting with the line given in the last line
specifier (the default is the cursor line).  The commands that accept a count
are the ones that use a range but do not have a file name argument (because
a file name can also be a number).

Examples:
	:s/x/X/g 5	substitute 'x' by 'X' in the current line and four
			following lines
	:23d 4		delete lines 23, 24, 25 and 26

A range should have the lower line number first.  If this is not the case, Vim
will ask you if it should swap the line numbers.  This is not done within the
global command ":g".
Well, I do hope that this is clear to you. If not - ask me!
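
To tie the address forms above to ":s", here is a small sketch of ranges placed in front of a substitution. The words "foo" and "bar" are placeholders, not taken from any real file. Without further options, each command changes only the first "foo" on every line in the range:

	:s/foo/bar/		the current line only (no range given)
	:1,10s/foo/bar/		lines 1 through 10
	:.,$s/foo/bar/		from the current line to the end of the file
	:%s/foo/bar/		every line in the file (same as 1,$)
	:.,+3s/foo/bar/		the current line and the three lines below it
	:'a,'bs/foo/bar/	from the line with mark a to the line with mark b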

Part 2 s[ubstitute] - The command name.
The second part is the name of the command. As you can abbreviate every command in Vi to a unique prefix, the "ubstitute" is optional: you need not type the full name, only an 's'. However, the 's' is required.

Part 3 - The "separator" (usually '/' or '#')
The "separator" is a character which separates the command name from the pattern, the pattern from the substitution string, and the substitution string from the options. You can choose any(?) character as a separator, but usually the slash (/) is used. When a substitution contains the slash as a literal character you should choose some other character that is not contained in the pattern or the substitution string. And if this is not possible then you have to "escape" it with a backslash (\): ":s/\/path\/filename/\/newpath\/newfilename/"
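
As a small sketch of why another separator helps, here is the same substitution written both ways. The paths are made up for illustration:

	:s/\/usr\/local\/bin/\/opt\/bin/	slash as separator, every literal slash escaped
	:s#/usr/local/bin#/opt/bin#		hash as separator, no escaping needed

Both commands replace the first "/usr/local/bin" on the current line with "/opt/bin". As far as I know, current Vim versions do not accept letters, digits, the backslash, the double quote, or the vertical bar as separator (see ":help pattern-delimiter"); treat that list as a hint and check the help of your own version.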

Part 4 - The pattern
A pattern is a complex construct. There is no general definition of "pattern", however; it differs depending on its use. Here we are using a "search pattern", so VIM4.5 will give this text with ":help search_pattern":
The definition of a pattern:				*search_pattern*

Patterns may contain special characters, depending on the setting of the
'magic' option.

							*/bar* */\bar*
1. A pattern is one or more branches, separated by "\|".  It matches anything
   that matches one of the branches.  Example: "foo\|beep" matches "foo" and
   "beep".

2. A branch is one or more pieces, concatenated.  It matches a match for the
   first, followed by a match for the second, etc.  Example: "foo[0-9]beep",
   first match "foo", then a digit and then "beep".

3. A piece is an atom, possibly followed by:
      magic   nomagic
							*/star* */\star*
	*	\*	matches 0 or more of the preceding atom
							*/\+*
	\+	\+	matches 1 or more of the preceding atom {not in Vi}
							*/\=*
	\=	\=	matches 0 or 1 of the preceding atom {not in Vi}

    Examples:
       .*	.\*	matches anything, also empty string
       ^.\+$	^.\+$	matches any non-empty line
       foo\=	foo\=	matches "fo" and "foo"


4. An atom can be:
   - One of these five:
      magic   nomagic
	^	^	at beginning of pattern, matches start of line	*/^*
	$	$	at end of pattern or in front of "\|",		*/$*
			matches end of line
	.	\.	matches any single character		  */.* */\.*
	\<	\<	matches the beginning of a word			*/\<*
	\>	\>	matches the end of a word			*/\>*
	\i	\i	matches any identifier character (see		*/\i*
			'isident' option) {not in Vi}
	\I	\I	like "\i", but excluding digits {not in Vi}	*/\I*
	\k	\k	matches any keyword character (see		*/\k*
			'iskeyword' option) {not in Vi}
	\K	\K	like "\k", but excluding digits {not in Vi}	*/\K*
	\f	\f	matches any file name character (see		*/\f*
			'isfname' option) {not in Vi}
	\F	\F	like "\f", but excluding digits {not in Vi}	*/\F*
	\p	\p	matches any printable character (see		*/\p*
			'isprint' option) {not in Vi}
	\P	\P	like "\p", but excluding digits {not in Vi}	*/\P*
	\e	\e	matches <Esc>					*/\e*
	\t	\t	matches <Tab>					*/\t*
	\r	\r	matches <CR>					*/\r*
	\b	\b	matches <BS>					*/\b*
	~	\~	matches the last given substitute string    */~* */\~*
	\(\)	\(\)	A pattern enclosed by escaped parentheses      */\(\)*
			(e.g., "\(^a\)") matches that pattern
	x	x	A single character, with no special meaning,
			matches itself
	\x	\x	A backslash followed by a single character,	*/\*
			with no special meaning, matches the single
			character
	[]	\[]	A range. This is a sequence of characters	*/[]*
			enclosed in "[]" or "\[]".  It matches any	*/\[]*
			single character from the sequence.  If the
			sequence begins with "^", it matches any
			single character NOT in the sequence.  If two
			characters in the sequence are separated by '-', this
			is shorthand for the full list of ASCII characters
			between them.  E.g., "[0-9]" matches any decimal
			digit.  To include a literal "]" in the sequence, make
			it the first character (following a possible "^").
			E.g., "[]xyz]" or "[^]xyz]".  To include a literal
			'-', make it the first or last character.

If the 'ignorecase' option is on, the case of letters is ignored.

It is impossible to have a pattern that contains a line break.

Examples:
^beep(			Probably the start of the C function "beep".

[a-zA-Z]$		Any alphabetic character at the end of a line.

\<\I\i		or
\(^\|[^a-zA-Z0-9_]\)[a-zA-Z_]\+[a-zA-Z0-9_]*
			A C identifier (will stop in front of it).

\(\.$\|\. \)		A period followed by end-of-line or a space.
			Note that "\(\. \|\.$\)" does not do the same,
			because '$' is not end-of-line in front of '\)'.
			This was done to remain Vi-compatible.

[.!?][])"']*\($\|[ ]\)	A search pattern that finds the end of a sentence,
			with almost the same definition as the ")" command.

Technical detail:
<Nul> characters in the file are stored as <NL> in memory.  In the display
they are shown as "^@".  The translation is done when reading and writing
files.  To match a <Nul> with a search pattern you can just enter CTRL-@ or
"CTRL-V 000".  This is probably just what you expect.  Internally the
character is replaced with a <NL> in the search pattern.  What is unusual is
that typing CTRL-V CTRL-J also inserts a <NL>, thus also searches for a
<Nul> in the file.  {Vi cannot handle <Nul> characters in the file at all}
Now, this is pretty complex, isn't it? [todo: examples]
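
A few of these pattern pieces put to work inside ":s", as a rough sketch. It assumes Vim with the 'magic' option set (the default); "foo", "beep", and "X" are placeholders:

	:s/foo\|beep/X/		replace the first "foo" or "beep" on the current line with "X"
	:s/\<vi\>/Vim/		replace "vi" only where it is a whole word, so "view" is left alone
	:%s/[0-9][0-9]*$//	on every line that ends in digits, remove those trailing digits
	:%s/ \+$//		remove trailing spaces from every line that has them ("\+" is not in Vi)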

Part 5 - The separator (again)
This separator must be the same character as the separator used before.

Part 6 - The "substitution string"
The substitution string also has meta characters, but not as many as the "pattern" used before.
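
The most commonly used ones are '&' and the back references "\1" to "\9". A minimal sketch, assuming the 'magic' option is set (the default) and using placeholder words:

	:s/foo/[&]/			'&' stands for the matched text: "foo" becomes "[foo]"
	:s/\(foo\) \(bar\)/\2 \1/	"\1" and "\2" stand for the first and second "\(\)" group: "foo bar" becomes "bar foo"
	:s/foo/\&/			an escaped '&' is taken literally: "foo" becomes "&"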

Part 7 - The separator (again)
This separator must be the same character as the separator used before.

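Part 8 - The [options]
The options are flags that modify the substitution, such as 'g' (replace every match on a line, not just the first) and 'c' (ask for confirmation before each replacement).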
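
As a rough sketch of the trailing flags in use (assuming Vim's default option settings; "foo" and "bar" are placeholders):

	:s/foo/bar/		no flag: only the first "foo" on the current line
	:s/foo/bar/g		'g': every "foo" on the current line
	:%s/foo/bar/c		'c': the first "foo" on each line of the file, asking for confirmation each time
	:%s/foo/bar/gc		both: every "foo" in the file, asking for confirmation each time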

/ - Part three: the separator. The slash separates the command name from the following pattern. It need not be a slash; a hash ('#') will do, too. But then you have to use a hash at the end of the pattern and after the substitution string, too. For now we will just use the slash.

{pattern}

:s/vi/VIM/
Substitute the first occurrence of "vi" with "VIM" on the current line.


More examples

Command: :.,.+7s/^/foo/
Read command as: from "this line" to "this line plus seven lines" substitute "^" (the beginning of each line) with "foo". Informal description: Add "foo" to the start of this and the following seven lines. Remember: The address '.' refers to the current line, but it can be left out. So the previous command can be shortened to: :,+7s/^/foo/


Mark them!

Before you apply a substitution command on some lines you usually have to figure out which lines these are so you can use their addresses in the command, eg:

	:23,42s/foo/bar/
But how do you find out that the first line is #23 and that the last line is #42? You probably have to move the cursor a lot and also write down the numbers on a piece of paper.

However, there is an easier way: Set a mark on those lines.

Setting a mark is done with the command 'm' followed by any (lowercase) letter, ie from 'a' to 'z'. The current line then is "marked" with that letter.

So, set a mark when the cursor is on the first line of the text to be operated on, and set another when the cursor is on the last line. In examples, we usually use the marks 'a' and 'b' for the first and last line, respectively. So the substitution starts on the line (marked with) 'a' and ends on the line (marked with) 'b'.

Now you can reference these addresses with 'a and 'b in the substitution command:

	:'a,'bs/foo/bar/
So - no more writing down line numbers. Good.

But this method still requires moving the cursor onto those lines to set a mark on them.
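
When the block happens to start and end with recognizable text, search patterns can serve as addresses instead, as listed in the range help quoted earlier. A minimal sketch, where "begin" and "end" are placeholder patterns for whatever delimits your block:

	:/begin/;/end/s/foo/bar/	from the next line below the cursor matching "begin" to the next line after that matching "end"
	:?begin?;/end/s/foo/bar/	the same, but searching backwards from the cursor for "begin"

The ';' between the two addresses makes the second search start from the line found by the first one; with ',' both searches would start from the cursor line.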


VIM examples

Visual mode -> line range
I almost never want to figure out the exact line range to apply a substitution on, and that's why I use Vim's visual mode and then the operator ':'. You just start selecting lines with the command 'V' and move to the last line. Then you type ':' and Vim will switch to the command line, inserting "'<,'>" as the line range. These marks stand for "first visual line" and "last visual line", respectively. Now you can type in the substitution command:
	:'<,'>s/Vi/Vim/g
Or any other command for that matter:
	:'<,'>d
But, of course, this deletion command would have been easier by using the 'd' operator in visual mode right away. ;-)