1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
|
_T_o_k_e_n_i_s_a_t_i_o_n_ _a_n_d_ _l_a_y_o_u_t
A Miranda script or expression is regarded as being composed of _t_o_k_e_n_s,
separated by _l_a_y_o_u_t.
A token is one of the following - an identifier, a literal, a type
variable, or a delimiter. Identifiers and literals each have their own
manual section. A type variable is a sequence of one or more stars,
thus * ** *** etc. (see basic type structure). Delimiters are the
miscellaneous symbols, such as operators, parentheses, and keywords. A
formal definition of the syntax of tokens, including a list of all the
delimiters in given under `Miranda lexical syntax'.
_R_U_L_E_S_ _A_B_O_U_T_ _L_A_Y_O_U_T
Layout consists of white space characters (spaces, tabs, newlines and
formfeeds), and comments. A comment consists of a pair of adjacent
vertical bars, together with all the text to the right of the bars on
the same line. Thus
|| this is a comment
Layout is not permitted inside tokens (except in char and string
constants, where it is significant) but may be inserted freely between
tokens to make scripts more readable. Layout is ignored by the compiler
except in two respects:
1) At least one space (or other layout characters) must be present
between two tokens that would otherwise form an instance of a single
larger token. For example in
f 19 'b'
we have a function, f, applied to a number and a character, but if we
were to omit the two intervening spaces, the compiler would read this as
a single six-character identifier, because both digits and single-quotes
are legal characters in an identifier. (Where it is not required to
force the correct tokenisation, or because of the offside rule, see
below, the presence of layout between tokens is optional.)
2) Certain syntactic objects (roughly, the right hand sides of
declarations -- for an exact account see those entities followed by a
`(;)' in the formal syntax) obey Landin's _o_f_f_s_i_d_e _r_u_l_e [Landin 1966].
This requires that every token of the object lie either directly below
or to the right of its first token. A token which breaks this rule is
said to be `offside' with respect to that object and terminates its
parse. For example in
x = 2 < a
y = f q
the 'y' is offside with respect to the right hand side of the definition
of 'x' (because it is to the left of the initial '2'). In such a case
the trailing semicolon may be omitted from the right hand side of the
equation for x.
It is because of the offside rule that Miranda scripts do not normally
contain explicit semicolons as terminators for definitions. The same
rule enables the compiler to determine the scopes of nested _w_h_e_r_e's by
looking at their indentation levels. For example in
f x = g y z
_w_h_e_r_e
y = (x+1)*(x-1)
z = p x (q y)
g r = groo (r+1)
it is the offside rule which makes it clear that the definition of 'g'
is not local to the right hand side of the definition of 'f', but those
of 'y' and 'z' are.
It is always possible to terminate a right hand side by an EXPLICIT
semicolon, instead of relying on the offside rule. For example the
above script could be written all in one line, as
f x = g y z _w_h_e_r_e y = (x+1)*(x-1); z = p x (q y);; g r = groo (r+1);
Notice that we need TWO semicolons after the definition of z - the first
terminates the rhs of the definition of `z', and the second terminates
the larger rhs of which it is a part, namely that of the definition of
`f'. If we put only one semicolon at this point, the definition of `g'
would be local to that of `f'.
This example should convince the reader that code using layout
information to show the block structure is much more readable, and this
is the normal practise.
[_R_e_f_e_r_e_n_c_e P.J. Landin "The Next 700 Programming Languages", CACM vol 9
pp157-165 (March 1966).]
Note that an additional comment convention applies in scripts whose
first character is a `>'. See separate manual entry on `literate
scripts'.
|