4.11 Encodings

Polybios provides the following single-byte encodings. A Hollywood script can obtain an encoding handle for any of them by calling doc:GetEncoder():

StandardEncoding
The default encoding of PDF
MacRomanEncoding
The standard encoding of macOS
WinAnsiEncoding
The standard encoding of Windows
FontSpecific
Use the built-in encoding of a font
ISO8859-2
Latin Alphabet No. 2
ISO8859-3
Latin Alphabet No. 3
ISO8859-4
Latin Alphabet No. 4
ISO8859-5
Latin/Cyrillic Alphabet
ISO8859-6
Latin/Arabic Alphabet
ISO8859-7
Latin/Greek Alphabet
ISO8859-8
Latin/Hebrew Alphabet
ISO8859-9
Latin Alphabet No. 5
ISO8859-10
Latin Alphabet No. 6
ISO8859-11
Thai, TIS 620-2533 character set
ISO8859-13
Latin Alphabet No. 7
ISO8859-14
Latin Alphabet No. 8
ISO8859-15
Latin Alphabet No. 9
ISO8859-16
Latin Alphabet No. 10
CP1250
Microsoft Windows Codepage 1250 (EE)
CP1251
Microsoft Windows Codepage 1251 (Cyrl)
CP1252
Microsoft Windows Codepage 1252 (ANSI)
CP1253
Microsoft Windows Codepage 1253 (Greek)
CP1254
Microsoft Windows Codepage 1254 (Turk)
CP1255
Microsoft Windows Codepage 1255 (Hebr)
CP1256
Microsoft Windows Codepage 1256 (Arab)
CP1257
Microsoft Windows Codepage 1257 (BaltRim)
CP1258
Microsoft Windows Codepage 1258 (Viet)
KOI8-R
Russian Net Character Set
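
Choosing the right single-byte encoding matters because the same character is stored as a different byte value in each of them. The following is a minimal sketch in Python (not Hollywood) that illustrates this with Python's standard codecs. The codec names in CODECS are an assumption about which Python codec corresponds to each Polybios encoding name, and byte_for() is a hypothetical helper that is not part of Polybios.

```python
# Illustration only: Python's standard codecs stand in for the Polybios
# encodings of the same name. The name mapping below is an assumption.
CODECS = {
    "ISO8859-2": "iso8859_2",
    "CP1250": "cp1250",
    "ISO8859-5": "iso8859_5",
    "CP1251": "cp1251",
    "KOI8-R": "koi8_r",
}


def byte_for(char, encoding_name):
    """Return the byte value that encodes one character (hypothetical helper)."""
    data = char.encode(CODECS[encoding_name])
    # A single-byte encoding stores every character in exactly one byte.
    if len(data) != 1:
        raise ValueError("expected exactly one byte")
    return data[0]


if __name__ == "__main__":
    # U+017E (Latin small letter z with caron) in two Central European encodings
    for name in ("ISO8859-2", "CP1250"):
        print(name, hex(byte_for("\u017e", name)))
    # U+042F (Cyrillic capital letter Ya) in three Cyrillic encodings
    for name in ("ISO8859-5", "CP1251", "KOI8-R"):
        print(name, hex(byte_for("\u042f", name)))
```

Text prepared for one of these encodings will therefore show the wrong glyphs if the font is created with another one, even when both cover the same alphabet.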

The following multi-byte encodings are available in Polybios:

GB-EUC-H
EUC-CN encoding
GB-EUC-V
Vertical writing version of GB-EUC-H
GBK-EUC-H
Microsoft Code Page 936 (lfCharSet 0x86) GBK encoding
GBK-EUC-V
Vertical writing version of GBK-EUC-H
ETen-B5-H
Microsoft Code Page 950 (lfCharSet 0x88) Big Five character set with ETen extensions
ETen-B5-V
Vertical writing version of ETen-B5-H
90ms-RKSJ-H
Microsoft Code Page 932, JIS X 0208 character set
90ms-RKSJ-V
Vertical writing version of 90ms-RKSJ-H
90msp-RKSJ-H
Microsoft Code Page 932, JIS X 0208 character set (proportional)
EUC-H
JIS X 0208 character set, EUC-JP encoding
EUC-V
Vertical writing version of EUC-H
KSC-EUC-H
KS X 1001:1992 character set, EUC-KR encoding
KSC-EUC-V
Vertical writing version of KSC-EUC-H
KSCms-UHC-H
Microsoft Code Page 949 (lfCharSet 0x81), KS X 1001:1992 character set plus 8822 additional hangul, Unified Hangul Code (UHC) encoding (proportional)
KSCms-UHC-HW-H
Microsoft Code Page 949 (lfCharSet 0x81), KS X 1001:1992 character set plus 8822 additional hangul, Unified Hangul Code (UHC) encoding (fixed width)
KSCms-UHC-HW-V
Vertical writing version of KSCms-UHC-HW-H
UTF-8
UTF-8 encoding

Before using a multi-byte encoding, a Hollywood script must call the corresponding function from the following list to make that encoding available:

doc:UseCNSEncodings()
Makes the Simplified Chinese encodings (GB-EUC-H, GB-EUC-V, GBK-EUC-H, GBK-EUC-V) available.
doc:UseCNTEncodings()
Makes the Traditional Chinese encodings (ETen-B5-H, ETen-B5-V) available.
doc:UseJPEncodings()
Makes the Japanese encodings (90ms-RKSJ-H, 90ms-RKSJ-V, 90msp-RKSJ-H, EUC-H, EUC-V) available.
doc:UseKREncodings()
Makes the Korean encodings (KSC-EUC-H, KSC-EUC-V, KSCms-UHC-H, KSCms-UHC-HW-H, KSCms-UHC-HW-V) available.
doc:UseUTFEncodings()
Makes the UTF-8 encoding available.
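
The relationship between encoding names and the functions that enable them can be summarized as a lookup table. The following is a minimal sketch in Python (not Hollywood) built only from the lists above. ENABLERS and required_call() are hypothetical names used for illustration and are not part of Polybios.

```python
# Lookup table derived from the lists in this section.
# Keys are the Polybios methods, values the encodings each one enables.
ENABLERS = {
    "doc:UseCNSEncodings()": ["GB-EUC-H", "GB-EUC-V", "GBK-EUC-H", "GBK-EUC-V"],
    "doc:UseCNTEncodings()": ["ETen-B5-H", "ETen-B5-V"],
    "doc:UseJPEncodings()": ["90ms-RKSJ-H", "90ms-RKSJ-V", "90msp-RKSJ-H",
                             "EUC-H", "EUC-V"],
    "doc:UseKREncodings()": ["KSC-EUC-H", "KSC-EUC-V", "KSCms-UHC-H",
                             "KSCms-UHC-HW-H", "KSCms-UHC-HW-V"],
    "doc:UseUTFEncodings()": ["UTF-8"],
}


def required_call(encoding_name):
    """Return the method that must be called first, or None if none is listed.

    Hypothetical helper: names that appear in no list above (for example the
    single-byte encodings) yield None.
    """
    for call, names in ENABLERS.items():
        if encoding_name in names:
            return call
    return None


if __name__ == "__main__":
    for name in ("90ms-RKSJ-H", "UTF-8", "CP1252"):
        print(name, "->", required_call(name))
```

In a Hollywood script, the returned method would be called on the document object before the encoding name is used for the first time.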

