A simple introductory guide to ML/I
P.J. Brown, Computing Laboratory, The University, Canterbury, Kent
First edition published January 1970
This edition, with revisions by R.D. Eager, June 1982
ML/I is a general purpose macro processor that is available on several different computers. It has many uses: probably the most common are the extension of existing programming languages and systematic editing.
There already exists a reference manual for ML/I (ML/I User's Manual)
and a paper describing ML/I (`The ML/I macro processor',
Comm. ACM 10,10 (October 1967)).
However, the former is comparatively large and hence rather forbidding,
while the latter tries to show off some of the more advanced features of
ML/I.
This Guide provides a short and simple
introduction to ML/I. Obviously, it has been necessary to omit many
features and to over-simplify others, but nevertheless it is hoped that the
reader will get a feel of what ML/I is all about, and an idea of some of
its uses.
In order to show exactly what goes into ML/I and, as a result, what
comes out, examples in this Guide have been written as if the reader
were using ML/I at an interactive terminal.
The examples represent a sequence of lines of input to ML/I, starting
from scratch. The input lines are shown in bold type, and are numbered
sequentially to aid cross-referencing. Lines of output are not in bold
type, and are labelled with the word `Output', e.g.
    23) THIS IS A LINE OF INPUT
    Output) THIS IS THE CORRESPONDING OUTPUT
Before actually using ML/I the user should find out the exact form and
manner of the input to ML/I at his installation.
ML/I operates on characters, not numbers. It is fed, as input, a string
of characters called the source text.
There is no restriction on the form of the source text; it may, for
example, be a program in any programming language, a scientific paper, a
circular letter or some data for a program. What ML/I does to the
source text is to scan through it making systematic replacements; the
resultant text is the output. ML/I produces its output as it goes
along. Its usual sequence of operation is: read a line; perform the
necessary replacements; output the line. In some uses of ML/I the
output is subsequently fed to a compiler or an assembler.
The way of making a replacement is by means of a macro.
A macro is defined by writing
    MCDEF x AS y
where x describes what is to be replaced and y, which can be any string
of characters, is what it is to be replaced by. y is called the
replacement text.
We shall now start our imaginary session with ML/I, and will
illustrate some simple macros.
The first two lines of input to ML/I are normally used to define some
special symbols. We shall supply these (input lines 1 and 2) without
explanation now, and will refer back to them later.
We shall now define a macro (input line 3).
This definition causes every subsequent occurrence of JONES to be
replaced by SMITH, as in input lines 4 and 5.
Similarly, macros can be defined to replace punctuation characters, as
in input lines 6 and 7.
Every time a macro is recognised and replaced by its replacement text,
this is termed a call of the macro.
The normal way of using ML/I is first to feed it the definitions of the
macros (and other similar constructions -- see later) to be used, and
then to feed it the source text in which replacements are to be made.
However, it is possible to intersperse macro definitions with the rest
of the source text, as indeed is being done in this Guide.
To show how ML/I splits up its source text, consider input lines 8 to
10.
Note that, in input line 10, the sequence `READ' occurs within DREADED
and within READER but in neither case has it been replaced. This is
because ML/I does not, in fact, scan character by character but rather
atom by atom, where an atom is a single punctuation character (i.e. any
character other than a letter or a digit) or a sequence of letters
and/or digits bounded on each side by punctuation characters. Thus in
that line, DREADED is an atom and hence ML/I does not recognise the
letters `READ' within it.
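The atom rule can be sketched in a few lines of Python. This is only a
rough model written for this Guide, not part of ML/I: the names
split_atoms and replace_atoms are invented for the sketch, and letters
are assumed to be the unaccented letters A to Z in either case.

```python
import re


def split_atoms(text):
    """Split text into atoms: a run of letters and/or digits, or else
    one single punctuation character (space and newline included)."""
    return re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9]", text)


def replace_atoms(text, macros):
    """Replace every atom that is a macro name by its replacement text.
    Single-atom names only; the replacement text is not rescanned."""
    return "".join(macros.get(atom, atom) for atom in split_atoms(text))


macros = {"READ": "INPUT TO THE COMPUTER"}
print(split_atoms("X:=Y\n"))
print(replace_atoms("THE DREADED READER SHOULD READ", macros))
```

Because DREADED and READER are each one atom, neither is looked up as
READ, which is the behaviour described above.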
The character `space' is treated as an atom, and so is the
character `newline', the latter being an imaginary character that occurs
at the end of each line. As regards this newline character, users should
be careful only to use it in replacement text exactly where they want
it, as in input lines 11 to 13.
Note how a newline appears in the output after FIXED(15).
The macros defined so far have replaced a single atom. However, it is
possible, by using the keyword WITHS, to specify that a multi-atom
sequence be replaced, as in input lines 14 to 16.
Macro replacement takes place within the replacement text of other
macros as well as within the source text (in fact recursion, i.e. a
macro calling itself, is allowed), as in input lines 17 to 19.
Here, within the replacement text of THE CHAIRMAN, THE PRIME
MINISTER has been replaced.
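A rough Python model of multi-atom names and of replacement inside
replacement text is given here. It is a sketch under stated assumptions,
not ML/I itself: the names match_name and expand are invented, macros
are tried in the order listed, and WITHS is modelled as `any number of
spaces, possibly none, between the atoms'.

```python
import re


def split_atoms(text):
    """Runs of letters/digits, or single punctuation characters."""
    return re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9]", text)


def match_name(atoms, i, name):
    """Try to match a multi-atom name starting at atoms[i], skipping
    spaces between its parts. Return the index just past the match,
    or None if the name does not occur here."""
    j = i
    for k, part in enumerate(name):
        if k > 0:
            while j < len(atoms) and atoms[j] == " ":
                j += 1
        if j >= len(atoms) or atoms[j] != part:
            return None
        j += 1
    return j


def expand(text, macros):
    """Scan text atom by atom; each replacement text is scanned in turn,
    so calls within calls are replaced as well."""
    atoms = split_atoms(text)
    out, i = [], 0
    while i < len(atoms):
        for name, body in macros:
            end = match_name(atoms, i, name)
            if end is not None:
                out.append(expand(body, macros))
                i = end
                break
        else:
            out.append(atoms[i])
            i += 1
    return "".join(out)


macros = [
    (("THE", "PRIME", "MINISTER"), "MRS. THATCHER"),
    (("THE", "CHAIRMAN"), "THE PRIME MINISTER OR HER DEPUTY"),
]
print(expand("THE CHAIRMAN SPEAKS FIRST", macros))
```

Note that THE on its own (as in THE CABINET) matches neither name and is
copied unchanged.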
In all the examples of MCDEF, the replacement text has been enclosed
within the characters `<' and `>'. These are called literal brackets.
They mean `copy the enclosed text as it stands'. Every user chooses as
his literal brackets a pair of atoms (or multi-atom sequences) that do
not occur naturally in the text to be processed. In this Guide, the
literal brackets were specified in input line 2 above. If it were
required to use the multi-atom sequences `*(' and `)*' as literal
brackets, this would be done as in input line 20.
With this definition (which has, in fact, supplemented rather than
replaced the previous literal brackets) we can now write input lines 21
and 22.
Literal brackets have a secondary use, as illustrated by input lines 23
to 25.
The occurrence of REWIND within the replacement text of the REWIND macro
is enclosed in literal brackets (which are additional to those that
enclose the entire replacement text) to cause it to be copied literally
over to the output; if these literal brackets had been omitted, it
would have been taken as a recursive call of the REWIND macro, and ML/I
would have been set in an endless loop of replacing REWIND at
successively deeper levels.
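The protective effect of the inner literal brackets can be imitated in
Python. This is a sketch only: the function expand and its depth limit
are inventions of the sketch (a stand-in for the endless loop), macro
names are single atoms, and the replacement texts are stored with their
outer brackets already removed.

```python
import re


def split_atoms(text):
    return re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9]", text)


def expand(text, macros, depth=0, limit=50):
    """Replace macro names, but copy anything inside < > as it stands."""
    if depth > limit:
        raise RecursionError("macro calls nested too deeply")
    atoms = split_atoms(text)
    out, i = [], 0
    while i < len(atoms):
        atom = atoms[i]
        if atom == "<":
            # Find the matching ">" and copy the enclosed atoms literally.
            level, j = 1, i + 1
            while j < len(atoms) and level > 0:
                if atoms[j] == "<":
                    level += 1
                elif atoms[j] == ">":
                    level -= 1
                j += 1
            if level != 0:
                raise ValueError("unmatched literal bracket")
            out.append("".join(atoms[i + 1 : j - 1]))
            i = j
        elif atom in macros:
            out.append(expand(macros[atom], macros, depth + 1, limit))
            i += 1
        else:
            out.append(atom)
            i += 1
    return "".join(out)


protected = {"REWIND": "PRINT 'FINISHED WITH TAPE'\n<REWIND>"}
print(expand("REWIND", protected))

unprotected = {"REWIND": "PRINT 'FINISHED WITH TAPE'\nREWIND"}
try:
    expand("REWIND", unprotected)
except RecursionError as err:
    print("without the inner brackets:", err)
```

With the inner brackets the call is replaced once; without them the
sketch keeps descending until its arbitrary limit stops it.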
The macros that have been defined so far have been of a rather
simple kind in that the thing to be replaced was always a single
pre-defined atom or series of atoms. Now we shall consider
more powerful macros, which reflect the true usefulness of ML/I.
Assume we wish to replace
    UNSTACK x;
for any x, by
    x := STACK[PTR];
    PTR := PTR - 1;
Here the macro has an argument, i.e. an arbitrary string between two
predefined delimiters (in this case UNSTACK and semicolon). It is
possible to have more than one argument to a macro, for instance one
might want to replace
    APPEND X TO LIST Y.
by
    Y[0] := Y[0]+1;
    Y[Y[0]] := X;
This macro has two arguments, which are delineated by the three
delimiters: APPEND, TO LIST and full-stop.
Delimiters are numbered 0,1,2, etc. Delimiter zero (in this case APPEND)
is called the macro name and the last delimiter is called the closing
delimiter. (In the case of a macro with no arguments, the macro name is
also its closing delimiter.)
When a macro is defined, the delimiters are simply listed in the order
in which they are to occur; they may be separated by one or more spaces
or newlines. This is called a structure representation.
Thus, the structure representation of the APPEND macro, as used in input
line 26, is
    APPEND TO WITHS LIST .
Before specifying the replacement text of this macro, we will give some
further explanation.
When ML/I is given a definition of a macro that has arguments, each
subsequent occurrence of the macro name in the source text is taken as a
call of the macro; ML/I then searches for the first delimiter, then the
second, and so on until it has found the closing delimiter. The
arbitrary string occurring between delimiter n and delimiter n+1 is
called argument n+1.
Thus, the first argument is argument one.
It is usually necessary to insert arguments into the replacement text of
a macro, and this is done by writing `%An.', where n is the number of
the argument to be inserted. Hence, the definition of the APPEND macro
is completed in input lines 27 and 28, and input lines 29 and 30 call
it.
As a second example, the UNSTACK macro is defined in input lines 31 to
33 and used in input line 34.
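The search for delimiters and the insertion of arguments can be
sketched in Python as follows. The function call_macro is invented for
this sketch. It works on plain substrings rather than atoms (so it
would wrongly find APPEND inside a longer word), it expands only the
first call in the text, and it assumes that leading and trailing spaces
are dropped from an argument when it is inserted.

```python
import re


def call_macro(text, delimiters, body):
    """Expand the first call of a macro whose delimiters are given in
    order; delimiters[0] is the macro name."""
    start = text.find(delimiters[0])
    if start < 0:
        return text  # no call present
    pos = start + len(delimiters[0])
    args = []
    for delim in delimiters[1:]:
        nxt = text.find(delim, pos)
        if nxt < 0:
            raise ValueError(f"delimiter {delim!r} not found")
        # The string between delimiter n and delimiter n+1 is argument n+1.
        args.append(text[pos:nxt].strip(" "))
        pos = nxt + len(delim)
    # %An. inserts argument n (numbered from one).
    expansion = re.sub(r"%A(\d+)\.", lambda m: args[int(m.group(1)) - 1], body)
    return text[:start] + expansion + text[pos:]


append_body = "%A2.[0] := %A2.[0]+1;\n%A2.[%A2.[0]] := %A1.;"
print(call_macro("APPEND PATIENT TO LIST WAIT.", ["APPEND", "TO LIST", "."], append_body))

unstack_body = "%A1. := STACK[PTR];\nPTR := PTR - 1;"
print(call_macro("L:UNSTACK OP;", ["UNSTACK", ";"], unstack_body))
```

Text before the macro name (the label L: in the second call) lies
outside the call and is copied unchanged.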
It is often convenient to use the imaginary character newline as a
delimiter, particularly as a closing one. Newlines themselves are
ignored in structure representations (hence the fact that AS in the
previous definitions started on a new line was not significant), so when
it is required to specify newline as a delimiter it is necessary to use
a keyword, namely NL (similarly, most implementations of ML/I have other
keywords such as SPACE, SPACES and perhaps TAB).
Input lines 35 to 37 define a macro CALL with newline as its closing
delimiter, and input lines 38 and 39 call it.
We will now go one step further in the elaboration of macros, and will
introduce one of the most important concepts in ML/I, namely optional
delimiters.
Assume one wishes to define a macro which has two alternative forms:
    SET a = b + c
which is replaced by
    LSS b
    ADD c
    ST a
and
    SET a = b - c
which is replaced by
    LSS b
    SUB c
    ST a
i.e. delimiter two can be either a plus sign or a minus sign. Options
such as this are specified in structure representations by writing
    OPT branch1 OR branch2 OR ... OR branchN ALL
where each branch can itself be any structure representation. In
practice, a branch is usually a single delimiter. Hence, the structure
representation of the SET macro (used in input line 40) is written
    SET = OPT + OR - ALL NL
Within the replacement text of this macro, it is necessary to test
whether delimiter two of the current call was a plus or minus sign,
and to generate code accordingly. This is done in input lines 41 to 47.
Two new features have been introduced here. The first is macro-time
statements, i.e. statements that are executed by ML/I when it encounters
them at the time it is macro processing. Input lines 42 and 44 show one
such macro-time statement, the `go to' statement MCGO. As can be seen,
MCGO has an optional conditional clause (MCGO ... IF ... ).
A second new feature is the extended use of the `%' notation to include
the insertion of delimiters (e.g. %D2. meaning delimiter two) and to
place labels that are the destination of the MCGO statements
(e.g. %L1. and %L2. above). Unlike other types of insert, the
`inserting' of a label does not generate any text (i.e. it has a `null'
value).
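The decision made by the conditional MCGO can be pictured in Python as
an ordinary test on delimiter two. The function expand_set is invented
for this sketch; a regular expression stands in for ML/I's delimiter
search, and the layout of the generated lines is an assumption.

```python
import re


def expand_set(line):
    """Model of the SET macro: SET a = b + c   or   SET a = b - c."""
    m = re.fullmatch(r"SET(.*?)=(.*?)([+-])(.*)", line.rstrip("\n"))
    if m is None:
        return line  # not a call of SET
    a1, a2, d2, a3 = (group.strip() for group in m.groups())
    # The macro-time test: which delimiter was actually found?
    op = "ADD" if d2 == "+" else "SUB"
    return f"LSS {a2}\n{op} {a3}\nST {a1}\n"


print(expand_set("SET X1 = Y1+Z1\n"), end="")
print(expand_set("SET BROWN = SMITH-ROBINSON\n"), end="")
```

The test itself produces no output text; only the lines chosen by it are
generated, just as the label inserts generate nothing.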
We will now call SET to show that it works (input lines 48 and 49).
(Note how the JONES macro, defined back in input line 3, is still in
existence.)
We will now take the last step in the elaboration of macros, and will
describe how to use macros that have a variable number of arguments.
Assume, therefore, that it is required to define a macro called LET,
which is similar to the SET macro except that it can have, to the right
of the equals sign, an arbitrary expression involving additions and/or
subtractions. Thus typical calls of LET might be
    LET A = B + C - D + F - G
    LET X = Y
    LET X = X + Y + C + D
When scanning a call of this macro, ML/I first encounters LET and
then an equals sign. It then searches for plus, minus or a
newline. If it finds either of the first two it recycles,
i.e. again looks for plus, minus or a newline. If it finds a newline,
then that is the closing delimiter, and thus
the call is complete and the replacement text can be executed.
This scheme for searching for delimiters is specified in structure
representations by the use of nodes
(the word node is used since the delimiter structure of a macro can
conveniently be represented as a directed graph). A node is placed at
a given point in a structure representation, and can be `gone to' from
the end of any branch in the same structure representation. Nodes,
which are local to the structure representation in which they occur, are
written N1, N2, N3, etc. A node is placed just by writing its name
at the appropriate point within
a structure representation, and is gone to simply by placing its
name at the end of a branch; note that there is no explicit `go to'.
Hence the structure representation of the LET macro (used in input line
50) is written
    LET = N1 OPT + N1 OR - N1 OR NL ALL
Here the node, called N1, is placed before the alternative `plus,
minus or newline'. If either the plus branch or the minus branch is
taken, the branch ends by going back to N1. If, on the other hand, the
newline branch is taken, the next delimiter is taken as the one
following the ALL. In this case, nothing follows ALL, so this newline is the
closing delimiter.
We shall now consider the problems that arise in specifying the
replacement text for the LET macro. The main problem is that one does
not know in advance how many arguments there will be. The way to deal
with this is to write a macro-time loop that takes the arguments one by
one until they have run out. To do this, one needs variables for
counting and subscripting; ML/I caters for this need by supplying three
integer variables called T1, T2 and T3 which are local to each macro
call (there also exist permanent, global variables called P1, P2, etc.,
but these are not of immediate interest here). These variables are
called macro variables,
and ML/I contains an assignment statement, MCSET, for manipulating them.
MCGO can be used for testing them. Macro variables or expressions
involving them can be used as subscripts; e.g. if T1 had value 3, then
%AT1. would mean `insert argument three' and %DT1-1. would mean `insert
delimiter two'. In addition, values of macro variables can be inserted
as they stand; e.g. %T1. would generate the character `3'.
Using these facilities, the replacement text of the LET macro is written
as in input lines 51 to 61, and a sample call is given in input line 62.
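The macro-time loop over the arguments can be mirrored in Python, with
an ordinary variable t1 playing the part of T1. The function expand_let
is invented for this sketch, the splitting by regular expression stands
in for ML/I's own scanning of the call, and the sketch assumes the only
equals sign in the line is delimiter one.

```python
import re


def expand_let(line):
    """Model of the LET macro with any number of + and - terms."""
    text = line.rstrip("\n")
    if not text.startswith("LET"):
        return line
    pieces = re.split(r"([=+-])", text[3:])
    args = [p.strip() for p in pieces[0::2]]   # args[0] is argument one
    delims = ["LET"] + pieces[1::2] + ["\n"]   # delims[0] is delimiter zero
    if len(args) < 2 or delims[1] != "=":
        return line
    out = [f"LSS {args[1]}"]                   # LSS %A2.
    t1 = 3                                     # MCSET T1 = 3
    while True:
        d = delims[t1 - 1]                     # %DT1-1.
        if d == "+":
            out.append(f"ADD {args[t1 - 1]}")  # ADD %AT1.
        elif d == "-":
            out.append(f"SUB {args[t1 - 1]}")  # SUB %AT1.
        else:
            break                              # newline: arguments exhausted
        t1 += 1                                # MCSET T1 = T1 + 1
    out.append(f"ST {args[0]}")                # ST %A1.
    return "\n".join(out) + "\n"


print(expand_let("LET A = B-C + D\n"), end="")
```

The loop stops when the delimiter before argument T1 is neither plus nor
minus, i.e. when it is the closing newline.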
As a last example, we shall show a macro which illustrates no new
concepts, but which may be of interest to readers familiar with Polish
notation. The macro, defined in input lines 63 and 64, converts from
fully parenthesised algebraic notation to Polish prefix notation.
In this example, the macro name is a left parenthesis and its closing
delimiter is a right parenthesis. Input line 66 shows a nested and, in
fact, recursive call of this macro.
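For comparison, the same conversion can be written as a small recursive
routine in Python. The function to_prefix is invented for this sketch;
like the macro, it treats a left parenthesis as the start of a call, the
first operator at that level as delimiter one, and the matching right
parenthesis as the closing delimiter, descending a level for each nested
parenthesis.

```python
def to_prefix(text):
    """Convert fully parenthesised infix notation to prefix notation."""

    def parse(i):
        # text[i] is "("; return (converted text, index after its ")").
        op, parts, cur = None, [], []
        i += 1
        while i < len(text) and text[i] != ")":
            ch = text[i]
            if ch == "(":
                sub, i = parse(i)          # nested call: go down a level
                cur.append(sub)
                continue
            if ch in "+-*/" and op is None:
                op = ch                    # delimiter one
                parts.append("".join(cur).strip())
                cur = []
            else:
                cur.append(ch)
            i += 1
        if i >= len(text) or op is None:
            raise ValueError("expected (operand operator operand)")
        parts.append("".join(cur).strip())
        return f"{op} {parts[0]} {parts[1]}", i + 1

    out, i = [], 0
    while i < len(text):
        if text[i] == "(":
            converted, i = parse(i)
            out.append(converted)
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(to_prefix("(A + B)"))
print(to_prefix("((A-(B*C))/(X/Y))"))
```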
This ends the description of macros as such, and we will end this Guide
by clearing up a few isolated features not already covered. Firstly, we
shall consider skips and inserts.
Insert is the name for the `%' facility. All that remains to be said
about this is that, in a similar manner to literal brackets, the user
chooses as his insert marker (i.e. what we have used `%' for) an atom or
sequence of atoms that does not occur naturally in the text to be
processed. The insert marker is normally defined at the start of the
source text, as in input line 1 above. If it had been desired to use
`//' as an insert marker, it would have been defined as in input line
67.
(WITH is similar to WITHS, but means that no spaces are allowed between
the two atoms it connects.)
Skips have already been mentioned, in that literal brackets are a
special case of a skip. A skip has, like a macro, an associated
structure representation. When ML/I encounters the name of a skip, it
`switches off' all the macros until it comes to the closing delimiter of
the skip. The only things that may be recognised within a skip are
other skips, and these are only recognised if the first skip has the `M'
(for matched) option set. What ML/I does to skips is controlled by two
further options: the `D' option means `copy the delimiters to the
output' and the `T' option means `copy the intervening pieces of text'
(i.e. in macro terms, the arguments). Hence if, for example, neither `D'
nor `T' is set, the skip is totally deleted. A skip is defined thus
    MCSKIP options, structure representation
For example, input line 68 means define the quote sign as a skip name
with another quote as the closing delimiter, and set the `D' and `T'
options (but not the `M' option) for it. The skip defined in input line
69 would, on the other hand, be totally deleted since no options are
set.
Input lines 70 and 71 show how these two skips work.
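As a rough Python model of the `D' and `T' options, the sketch below
handles one skip with a single closing delimiter. The function
apply_skip and its `outside' parameter are invented for the sketch:
`outside' stands for whatever macro replacement is in force, and it is
applied only to text that lies outside the skip. The `M' option and
nested skips are not modelled.

```python
def apply_skip(text, name, closing, options="", outside=lambda s: s):
    """Process every occurrence of a skip running from name to closing.
    'D' in options copies the delimiters; 'T' copies the enclosed text."""
    out, pos = [], 0
    while True:
        start = text.find(name, pos)
        if start < 0:
            break
        end = text.find(closing, start + len(name))
        if end < 0:
            break  # no closing delimiter: treat the rest as ordinary text
        out.append(outside(text[pos:start]))
        if "D" in options:
            out.append(name)
        if "T" in options:
            out.append(text[start + len(name):end])  # macros switched off
        if "D" in options:
            out.append(closing)
        pos = end + len(closing)
    out.append(outside(text[pos:]))
    return "".join(out)


def jones_to_smith(s):
    return s.replace("JONES", "SMITH")


print(apply_skip("JONES SAID 'ASK JONES'", "'", "'", "DT", outside=jones_to_smith))
print(apply_skip("X := 1; COMMENT SET UP X; Y := 2;", "COMMENT", ";"))
```

With both options set the skip is copied untouched; with neither set it
disappears from the output altogether.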
If ML/I has found a macro name and is searching for its delimiters, it
may encounter a nested macro call or skip. In this case it `goes down a
level' and searches for the delimiters of the nested construction; only
when the closing delimiter of this is found does it return to the
original search. Thus, if one wrote the nonsense line given as input
line 72, the second equals and the second plus would be delimiters of
the LET macro, since the first equals is within a skip and the first
plus is within a call of the UNSTACK macro. The nonsensical output from
that line has not been shown.
The reader may skip this Section, as it describes a rather
complicated feature of ML/I.
It is sometimes convenient to define a macro or skip so that, after the
macro or skip has been processed, scanning resumes at rather than beyond
the closing delimiter; a closing delimiter with this property is called
an exclusive delimiter.
An exclusive delimiter is specified as such by placing the imaginary
node N0 (N zero) after it.
For example, if it is desired to delete all lines commencing with an
asterisk, this can be done as in input lines 73 to 77.
Here the skip name is a newline immediately followed by an asterisk, and
the closing delimiter is the next newline. However, when the skip has
been processed, scanning resumes at this closing newline, so this
newline is free to form part of a further skip name
if an asterisk occurs at the
start of the next line.
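The difference between resuming at and resuming beyond the closing
newline can be seen with Python regular expressions, where a lookahead
plays a part similar to N0. This is an analogy only and says nothing
about how ML/I is implemented; both function names are invented for the
sketch.

```python
import re


def delete_star_lines(text):
    """Exclusive closing newline: the lookahead leaves it unconsumed,
    so it can begin the next match."""
    return re.sub(r"\n\*[^\n]*(?=\n)", "", text)


def delete_star_lines_inclusive(text):
    """Ordinary closing newline: it is consumed with the skip, so a
    second asterisk line directly after the first is missed."""
    return re.sub(r"\n\*[^\n]*\n", "", text)


source = "AAA**OK\n* THIS WILL VANISH\n* SO WILL THIS\nEND\n"
print(delete_star_lines(source), end="")
print(delete_star_lines_inclusive(source), end="")
```

In the second version the newline that ends the first asterisk line has
already been used up, so the following line is neither deleted nor kept
on a line of its own.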
In fact defining newline as part of a macro name or skip name, though
often useful, has many potential pitfalls and it is best avoided by the
novice (it will also, for the reader who is actually using ML/I at a
terminal, tend to upset the interrelation of input lines with output
ones since ML/I will need to keep `looking ahead' a line).
It is much better to use the `startline' facility described in the ML/I
User's Manual.
All the built-in ML/I statements like MCDEF, MCSKIP, MCSET, MCGO etc.,
have the generic name of operation macro.
Operation macros are analogous to ordinary user-defined macros in the
way they are scanned, but they are different in that they perform some
predefined system action instead of effecting a replacement. There are
many operation macros that have not been covered in this Guide. One
example is MCNOTE, which prints a message together with the current line
number, and is useful for scanning documents and for generating error
messages.
Note that macro replacement takes place everywhere in the text scanned
by ML/I (except within skips). For example, it
takes place within arguments to operation macros (e.g. in structure
representations) and within inserts. In fact, ML/I contains virtually
no restrictions on what one can do and where one can do it. This in
many ways contributes to the power of ML/I, but it does mean that the
user is not restricted as to the depths of the logical mires he can get
himself into, nor in the machine time he can use trying to make ML/I do
things it was not designed for.
We have now come to the end of this Guide. Hopefully, the reader has
reached a stage where, although his understanding of ML/I is of
necessity rather patchy and in some cases superficial, he can still use
ML/I in some simple applications and perhaps in some not-so-simple ones
too. After having some experience with ML/I, he may wish to refer to the
User's Manual to fill in some of the gaps in his knowledge.
23) THIS IS A LINE OF INPUT
Output) THIS IS THE CORRESPONDING OUTPUT
Basic principles
Macros
MCDEF x AS y
1) MCINS % .
2) MCSKIP MT, <>
3) MCDEF JONES AS <SMITH>
4) USERS SHOULD CONSULT MR. JONES
Output) USERS SHOULD CONSULT MR. SMITH
5) .... JONES .... JONES ....
Output) .... SMITH .... SMITH ....
6) MCDEF & AS <;>
7) X := Y & Y := Z&
Output) X := Y ; Y := Z;
Atoms
8) MCDEF READ AS <INPUT TO THE COMPUTER>
9) THEN READ YOUR DATA, MR. JONES
Output) THEN INPUT TO THE COMPUTER YOUR DATA, MR. JONES
10) THE DREADED READER SHOULD READ
Output) THE DREADED READER SHOULD INPUT TO THE COMPUTER
11) MCDEF INTEGER AS <FIXED(15)
12) BINARY>
13) BEGIN INTEGER A,B;
Output) BEGIN FIXED(15)
Output) BINARY A,B;
Multi-atom names
14) MCDEF THE WITHS PRIME WITHS MINISTER
15) AS <MRS. THATCHER>
16) THE PRIME MINISTER CHAIRS THE CABINET
Output) MRS. THATCHER CHAIRS THE CABINET
Calls within calls
17) MCDEF THE WITHS CHAIRMAN
18) AS <THE PRIME MINISTER OR HER DEPUTY>
19) THE CHAIRMAN SPEAKS FIRST
Output) MRS. THATCHER OR HER DEPUTY SPEAKS FIRST
Literal brackets
20) MCSKIP MT, * WITHS ( ) WITHS *
21) MCDEF CUR AS *(DOG)*
22) CUR
Output) DOG
23) MCDEF REWIND AS <PRINT 'FINISHED WITH TAPE'
24) <REWIND>>
25) REWIND
Output) PRINT 'FINISHED WITH TAPE'
Output) REWIND
Arguments
UNSTACK x;
x := STACK[PTR];
PTR := PTR - 1;
APPEND X TO LIST Y.
Y[0] := Y[0]+1;
Y[Y[0]] := X;
26) MCDEF APPEND TO WITHS LIST .
27) AS <%A2.[0] := %A2.[0]+1;
28) %A2.[%A2.[0]] := %A1.;>
29) APPEND PATIENT TO LIST WAIT.
Output) WAIT[0] := WAIT[0]+1;
Output) WAIT[WAIT[0]] := PATIENT;
30) APPEND X/Y+9 TO LIST ARRAY.
Output) ARRAY[0] := ARRAY[0]+1;
Output) ARRAY[ARRAY[0]] := X/Y+9;
31) MCDEF UNSTACK ;
32) AS <%A1. := STACK[PTR];
33) PTR := PTR - 1;>
34) L:UNSTACK OP;
Output) L:OP := STACK[PTR];
Output) PTR := PTR - 1;
35) MCDEF CALL NL
36) AS < BSR %A1.
37) >
38) CALL PIG
Output) BSR PIG
39) UNSTACK Y; LAB: CALL PIGGY
Output) Y := STACK[PTR];
Output) PTR := PTR - 1; LAB: BSR PIGGY
Optional Delimiters
SET a = b + c
LSS b
ADD c
ST a
SET a = b - c
LSS b
SUB c
ST a
OPT branch1 OR branch2 OR ... OR branchN ALL
40) MCDEF SET = OPT + OR - ALL NL
41) AS < LSS %A2.
42) MCGO L1 IF %D2. = +
43) SUB %A3.
44) MCGO L2
45) %L1. ADD %A3.
46) %L2. ST %A1.
47) >
48) SET X1 = Y1+Z1
Output) LSS Y1
Output) ADD Z1
Output) ST X1
49) SET BROWN = JONES-ROBINSON
Output) LSS SMITH
Output) SUB ROBINSON
Output) ST BROWN
Variable numbers of arguments
LET A = B + C - D + F - G
LET X = Y
LET X = X + Y + C + D
50) MCDEF LET = N1 OPT + N1 OR - N1 OR NL ALL
51) AS < LSS %A2.
52) MCSET T1 = 3
53) %L4.MCGO L2 IF %DT1-1. = +
54) MCGO L5 UNLESS %DT1-1. = -
55) SUB %AT1.
56) MCGO L3
57) %L2. ADD %AT1.
58) %L3.MCSET T1 = T1 + 1
59) MCGO L4
60) %L5. ST %A1.
61) >
62) LET A = B-C + D
Output) LSS B
Output) SUB C
Output) ADD D
Output) ST A
Specialised example
63) MCDEF ( OPT + OR - OR * OR / ALL )
64) AS <%D1. %A1. %A2.>
65) (A + B)
Output) + A B
66) ((A-(B*C))/(X/Y))
Output) / - A * B C / X Y
Skips and inserts
67) MCINS /WITH/ .
MCSKIP options, structure representation
68) MCSKIP DT, ' '
69) MCSKIP , COMMENT ;
70) 'LET SET JONES'
Output) 'LET SET JONES'
71) COMMENT LET IS A MACRO; 'COMMENT'
Output) 'COMMENT'
Searching for delimiters
72) LET A <=> = UNSTACK L + 1; + 7
Exclusive delimiters
73) MCSKIP , NL WITH * NL N0
74) AAA**OK
75) * THIS WILL VANISH
76) * SO WILL THIS
77) END
Output) AAA**OK
Output) END
Operation macros
Replacement
Concluding remarks