SetDefaultEncoding

Name

SetDefaultEncoding -- set default character encoding (V4.7)

Synopsis

SetDefaultEncoding(tencoding[, sencoding])

Library

system

Function

This function can be used to change the default character encoding for the text and string libraries. Note that for reasons of compatibility Hollywood maintains two different default character encodings: one for the text library and one for the string library. Under normal conditions, however, both default encodings should be set to the same character encoding.

The default character encoding for the text library is specified in tencoding and affects functions such as Print(), NPrint(), TextOut(), and CreateTextObject().

The default character encoding for the string library is specified in the optional sencoding parameter and affects most functions of the string library, for example ReplaceStr() and StrLen().
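
As a minimal sketch of the two parameters described above, the following calls first set both defaults to the same encoding, which is the normal case, and then set them separately. It uses only the constants documented on this page. Whether mixing encodings is sensible depends on your script, and the second call is shown purely for illustration.

```hollywood
; Normal case: same encoding for the text library and the string library
SetDefaultEncoding(#ENCODING_UTF8, #ENCODING_UTF8)

; Illustrative only: the text library stays on UTF-8 while the
; string library treats strings as raw 8-bit characters
SetDefaultEncoding(#ENCODING_UTF8, #ENCODING_RAW)
```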

The following character encodings are currently supported by Hollywood:

#ENCODING_UTF8:: Use UTF-8 encoding. This is the default since Hollywood 7.0.
#ENCODING_ISO8859_1:: This was the default encoding before Hollywood 7.0. It is useful when you need to deal with binary data or strings that aren't formatted as UTF-8. Don't be confused by the name: even though the constant refers to ISO 8859-1, it can be used with any non-UTF-8 8-bit encoding. Most string library functions only require that one character is one byte, which is true for all such encodings, so it makes no difference to them whether the text is ISO 8859-1 or some other charset. The exceptions are commands like UpperStr(), LowerStr(), etc.: they do all upper and lower case mapping based on the ISO 8859-1 charmap, so they won't give correct results for text in other encodings. Because #ENCODING_ISO8859_1 also works with other encodings, there is a synonym constant, #ENCODING_RAW, which is less misleading because it doesn't suggest that strings are in ISO 8859-1 format (see below).
#ENCODING_RAW:: This is the same as #ENCODING_ISO8859_1, but it may be preferable from a semantic point of view because it doesn't suggest that strings are or must be in ISO 8859-1 encoding. Instead, it says that strings are treated as a sequence of raw 8-bit characters.
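
The difference between the encodings can be observed with StrLen(). The sketch below assumes that the script file itself is saved as UTF-8 and uses DebugPrint() only to show the results. The variable name s$ is chosen for illustration. Since "ü" and "ß" each occupy two bytes in UTF-8, the two calls would be expected to report different lengths: a character count under #ENCODING_UTF8 and a byte count under #ENCODING_RAW.

```hollywood
; Assumption: this script file is saved as UTF-8
s$ = "Grüße"

; String library interprets s$ as UTF-8 and counts characters
SetDefaultEncoding(#ENCODING_UTF8, #ENCODING_UTF8)
DebugPrint(StrLen(s$))

; String library treats s$ as raw 8-bit characters and counts bytes
SetDefaultEncoding(#ENCODING_UTF8, #ENCODING_RAW)
DebugPrint(StrLen(s$))
```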

Inputs

tencoding: default character encoding for the text library
sencoding: default character encoding for the string library (V7.0)
