import Miranda 2.066 from upstream

author: Jakob Kaivo <jkk@ung.org> 2022-03-04 12:32:20 -0500
committer: Jakob Kaivo <jkk@ung.org> 2022-03-04 12:32:20 -0500
commit: 55f277e77428d7423ae906a8e1f1324d35b07a7d (patch)
tree: 5c1c04703dff89c46b349025d2d3ec88ea9b3819 /miralib/manual/31/9
1 files changed, 54 insertions, 0 deletions
diff --git a/miralib/manual/31/9 b/miralib/manual/31/9
new file mode 100644
index 0000000..8e6bb2d
--- /dev/null
+++ b/miralib/manual/31/9
@@ -0,0 +1,54 @@
+Input/output of binary data
+
+From version 2.044 Miranda stdenv.m includes a function
+	readb :: [char]->[char]
+and new sys_message constructors
+	Stdoutb     :: [char]->sys_message
+	Tofileb     :: [char]->[char]->sys_message
+	Appendfileb :: [char]->[char]->sys_message
+
+These  behave  similarly  to  (respectively)   read,   Stdout,   Tofile,
+Appendfile  but  are needed in a UTF-8 locale for reading/writing binary
+data (for further explanation see below).  In a non-UTF-8 locale they do
+not  behave differently from read, Stdout etc., but you may still prefer
+to use them for handling binary data, for portability reasons.
+
+The notation $:- is used for the binary version of the  standard  input.
+In  a  non-UTF-8 locale $:- and $- will produce the same results.  It is
+an error to access both $:- and $- in the same evaluation.
+
+Explanation
+
+The locale of a  UNIX  process  is  a  collection  of  settings  in  the
+environment  which  specify, among other things, what character encoding
+is in use.  To see this information use `locale'  as  a  shell  command.
+The analogous concept in Windows is called a "code page".
+
+UTF-8 is a standard for encoding text from a wide variety  of  languages
+as  a  byte  stream,  in  which  ASCII  characters  (codes  0..127)  are
+represented by themselves while  other  symbols  are  represented  by  a
+sequence of two or more bytes: a `multibyte character'.
+
+The Miranda type `char' consists of characters  in  the  range  (0..255)
+where  the  codes  above  127  represent  various  accented  letters etc
+according to the conventions of Latin-1 (i.e. ISO-8859-1, commonly  used
+for  West  European  languages).  There are national variants on Latin-1
+but since Miranda source, outside  comments  and  string  and  character
+constants, uses only ASCII, this does not normally cause a problem.
+
+In a UTF-8 locale: on reading string/character literals  or  text  files
+Miranda has to translate multibyte characters to the corresponding point
+in the Latin-1 range (128..255).  If the text does not  conform  to  the
+rules  of  UTF-8,  or  includes  a  character not present in Latin-1, an
+"illegal character"  error  occurs.   On  output,  Miranda  strings  are
+translated back to UTF-8.
+
+If data being read/written is not text, but binary data  of  some  kind,
+translation  from/to  UTF-8  is not appropriate and could cause "illegal
+character" errors, and/or corruption of data.  Hence  the need  for  the
+byte-oriented  I/O functions readb etc., which transfer data without any
+conversion from/to UTF-8.
+
+In a non-UTF-8 locale read and readb, Tofile and Tofileb,  etc.  do  not
+differ in their results.
+
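Note (not part of the patch): the difference between the text and binary
I/O paths described in the manual page can be modelled outside Miranda.
The following is a minimal Python sketch, assuming a UTF-8 locale. The
function names (read_text, read_binary, write_text, write_binary) are
hypothetical stand-ins for read/readb and Tofile/Tofileb, and Miranda
strings are represented as lists of character codes in 0..255. It is an
illustration of the translation rules only, not Miranda's implementation.

```python
def read_text(raw: bytes) -> list[int]:
    """Model of text-mode input in a UTF-8 locale (cf. read).

    Multibyte UTF-8 sequences are translated to single codes, and every
    resulting code must fit in the Latin-1 range 0..255.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Input that breaks the rules of UTF-8.
        raise ValueError("illegal character: not valid UTF-8") from exc
    codes = [ord(ch) for ch in text]
    if any(code > 255 for code in codes):
        # Valid UTF-8, but the character has no Latin-1 equivalent.
        raise ValueError("illegal character: not present in Latin-1")
    return codes


def read_binary(raw: bytes) -> list[int]:
    """Model of byte-oriented input (cf. readb): no conversion at all."""
    return list(raw)


def write_text(codes: list[int]) -> bytes:
    """Model of text-mode output (cf. Tofile): codes go back to UTF-8."""
    return "".join(chr(code) for code in codes).encode("utf-8")


def write_binary(codes: list[int]) -> bytes:
    """Model of byte-oriented output (cf. Tofileb): one byte per code."""
    return bytes(codes)
```

In this model the two bytes that encode an accented letter in UTF-8 come
back as one code from read_text but as two codes from read_binary, while
a byte sequence that is not valid UTF-8 raises an error in read_text and
passes through read_binary unchanged. That is the reason the manual page
recommends the byte-oriented functions for non-text data.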