Diffstat (limited to 'miralib/manual/12')
-rw-r--r-- miralib/manual/12 86
1 files changed, 86 insertions, 0 deletions
diff --git a/miralib/manual/12 b/miralib/manual/12
new file mode 100644
index 0000000..a464473
--- /dev/null
+++ b/miralib/manual/12
@@ -0,0 +1,86 @@
+_T_o_k_e_n_i_s_a_t_i_o_n_ _a_n_d_ _l_a_y_o_u_t
+
+A Miranda script or expression is regarded as being composed of _t_o_k_e_n_s,
+separated by _l_a_y_o_u_t.
+
+A token is one of the following - an identifier, a literal, a type
+variable, or a delimiter. Identifiers and literals each have their own
+manual section. A type variable is a sequence of one or more stars,
+thus * ** *** etc. (see basic type structure). Delimiters are the
+miscellaneous symbols, such as operators, parentheses, and keywords. A
+formal definition of the syntax of tokens, including a list of all the
+delimiters, is given under `Miranda lexical syntax'.
+
+_R_U_L_E_S_ _A_B_O_U_T_ _L_A_Y_O_U_T
+
+Layout consists of white space characters (spaces, tabs, newlines and
+formfeeds), and comments. A comment consists of a pair of adjacent
+vertical bars, together with all the text to the right of the bars on
+the same line. Thus
+ || this is a comment
+Layout is not permitted inside tokens (except in char and string
+constants, where it is significant) but may be inserted freely between
+tokens to make scripts more readable. Layout is ignored by the compiler
+except in two respects:
+
+1) At least one space (or other layout character) must be present
+between two tokens that would otherwise form an instance of a single
+larger token. For example in
+ f 19 'b'
+we have a function, f, applied to a number and a character, but if we
+were to omit the two intervening spaces, the compiler would read this as
+a single six-character identifier, because both digits and single-quotes
+are legal characters in an identifier. (Where layout is needed neither
+to force the correct tokenisation nor because of the offside rule, see
+below, its presence between tokens is optional.)
+
+2) Certain syntactic objects (roughly, the right hand sides of
+declarations -- for an exact account see those entities followed by a
+`(;)' in the formal syntax) obey Landin's _o_f_f_s_i_d_e _r_u_l_e [Landin 1966].
+This requires that every token of the object lie either directly below
+or to the right of its first token. A token which breaks this rule is
+said to be `offside' with respect to that object and terminates its
+parse. For example in
+ x = 2 < a
+ y = f q
+the 'y' is offside with respect to the right hand side of the definition
+of 'x' (because it is to the left of the initial '2'). In such a case
+the trailing semicolon may be omitted from the right hand side of the
+equation for x.
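+
+Both rules can be seen together in the following small sketch (the names
+double, total and bigger are invented here purely for illustration, and
+the script is assumed to contain nothing else):
+ double n = n*2      || no layout is needed around the *
+ total = double 1 +
+         double 2    || directly below the first token of the rhs
+ bigger = total+1    || offside, so it ends the rhs of total
+By rule 1 the space in `double 1' is required, since without it double1
+would be read as a single identifier. By rule 2 the second line of the
+definition of total continues its right hand side, whereas bigger lies
+to the left of the first token of that right hand side and therefore
+begins a new definition.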
+
+It is because of the offside rule that Miranda scripts do not normally
+contain explicit semicolons as terminators for definitions. The same
+rule enables the compiler to determine the scopes of nested _w_h_e_r_e's by
+looking at their indentation levels. For example in
+ f x = g y z
+       _w_h_e_r_e
+       y = (x+1)*(x-1)
+       z = p x (q y)
+ g r = groo (r+1)
+
+it is the offside rule which makes it clear that the definition of 'g'
+is not local to the right hand side of the definition of 'f', but those
+of 'y' and 'z' are.
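+
+The same reasoning extends to _w_h_e_r_e's nested more than one level deep.
+As a sketch (the definitions here are hypothetical, chosen only to show
+the indentation levels), in
+ f x = g y
+       _w_h_e_r_e
+       y = h x
+           _w_h_e_r_e
+           h n = n*n
+       g r = r+1
+the definition of 'h' is local to the right hand side of 'y', because it
+lies directly below the first token of that right hand side. The
+definition of 'g' is offside with respect to the right hand side of 'y'
+but not with respect to that of 'f', so it is local to 'f' only.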
+
+It is always possible to terminate a right hand side by an EXPLICIT
+semicolon, instead of relying on the offside rule. For example the
+above script could be written all in one line, as
+ f x = g y z _w_h_e_r_e y = (x+1)*(x-1); z = p x (q y);; g r = groo (r+1);
+
+Notice that we need TWO semicolons after the definition of z - the first
+terminates the rhs of the definition of `z', and the second terminates
+the larger rhs of which it is a part, namely that of the definition of
+`f'. If we put only one semicolon at this point, the definition of `g'
+would be local to that of `f'.
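+
+To illustrate, the reading obtained with only one semicolon after the
+definition of `z' should correspond to the following layout, shown here
+as a sketch of the block structure rather than as a recommended style
+ f x = g y z
+       _w_h_e_r_e
+       y = (x+1)*(x-1)
+       z = p x (q y)
+       g r = groo (r+1)
+in which `g' sits at the same indentation level as `y' and `z' and so
+belongs to the _w_h_e_r_e of `f'.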
+
+This example should convince the reader that code using layout
+information to show the block structure is much more readable, and this
+is the normal practice.
+
+[_R_e_f_e_r_e_n_c_e P.J. Landin "The Next 700 Programming Languages", CACM vol 9
+pp157-165 (March 1966).]
+
+Note that an additional comment convention applies in scripts whose
+first character is a `>'. See separate manual entry on `literate
+scripts'.
+