13.2 Character encodings

Most of the string and text library functions accept an optional parameter specifying the character encoding to use. This parameter tells the function how the strings you pass to it are internally formatted, i.e. which character encoding they use.

Normally, you won't need this parameter at all because, starting with Hollywood 7.0, all text should be stored as UTF-8. In some situations, however, the optional character encoding parameter is necessary. For example, Hollywood strings can also contain raw binary data. Such data generally isn't valid UTF-8, so the string functions will reject it. The only way to operate on this data is to tell the respective functions that it isn't UTF-8 encoded text but just a raw sequence of bytes. This is done by passing the #ENCODING_RAW constant in the character encoding parameter.
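As a rough sketch of the difference, the following snippet measures the same string twice with StrLen(), once using the default encoding and once with #ENCODING_RAW. It assumes that StrLen() is one of the functions that accept the optional encoding parameter as its second argument and that the script is saved as UTF-8. The string itself is only an illustration.

```
; Illustrative string: "ä" occupies two bytes in UTF-8
s$ = "Bär"

; Default encoding (UTF-8): the string is treated as text
DebugPrint(StrLen(s$))

; #ENCODING_RAW: the string is treated as a plain sequence of bytes
DebugPrint(StrLen(s$, #ENCODING_RAW))
```

Under these assumptions, the first call should count characters and the second should count bytes, so the two results differ whenever the string contains non-ASCII characters.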

Here is an overview of the different encodings available in Hollywood:

#ENCODING_UTF8:: This is the default encoding since Hollywood 7.0 and should be used whenever you work with text.
#ENCODING_ISO8859_1:: This was the default encoding before Hollywood 7.0. It is still supported for compatibility reasons but its use isn't recommended.
#ENCODING_RAW:: This is a synonym for #ENCODING_ISO8859_1. It can be used to tell the string library functions to treat the string as raw binary data instead of text.
#ENCODING_AMIGA:: This specifies the system's default character set on AmigaOS and compatible systems. This constant is only supported by ConvertStr(), and only on AmigaOS and compatible systems. #ENCODING_AMIGA allows you to convert between AmigaOS' default character set and UTF-8 (both ways).
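To show how the constants above might be combined, here is a minimal sketch that converts a UTF-8 string to ISO 8859-1 with ConvertStr(). The argument order used here (source string, input encoding, output encoding) is an assumption and should be checked against the ConvertStr() documentation. The variable names are hypothetical.

```
; Illustrative UTF-8 text (assumes the script is saved as UTF-8)
utf$ = "Grüße"

; Assumed argument order: source string, input encoding, output encoding
latin$ = ConvertStr(utf$, #ENCODING_UTF8, #ENCODING_ISO8859_1)

; The converted string is no longer UTF-8, so measure it as raw bytes
DebugPrint(StrLen(latin$, #ENCODING_RAW))
```

On AmigaOS and compatible systems, #ENCODING_AMIGA could take the place of #ENCODING_ISO8859_1 in such a call to convert to or from the system's default character set.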

You can use the SetDefaultEncoding() function to change the default character encoding for the string and text libraries. See SetDefaultEncoding for details.
