Name
CharOffset -- convert byte to character offset (V7.0)
Synopsis
coff = CharOffset(s$, boff[, encoding])
Function
This function returns the character offset that corresponds to the byte offset boff inside string s$. Both offsets start from 0.

The optional encoding parameter can be used to set the character encoding to use. If it is omitted, the default string encoding set using SetDefaultEncoding() is used. See Character encodings for details.

In the UTF-8 character encoding a single character may need a storage space of up to 4 bytes. In the ISO 8859-1 character encoding there is no difference between byte and character sizes. Hence, it doesn't really make sense to call this function with the character encoding set to #ENCODING_ISO8859_1.
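
The size difference described above can be checked outside Hollywood. The following is a minimal Python sketch, not Hollywood code; the sample characters and the name widths are chosen purely for illustration.

```python
# Illustration only (Python, not Hollywood): bytes needed per character in UTF-8.
# The sample characters below are arbitrary picks, one for each possible width.
widths = {ch: len(ch.encode("utf-8")) for ch in ["a", "ä", "€", "😀"]}

for ch, n in widths.items():
    print(repr(ch), n)  # 'a' 1, 'ä' 2, '€' 3, '😀' 4

# ISO 8859-1 stores every character in exactly one byte,
# so the byte length equals the character count.
latin1_len = len("äöü".encode("latin-1"))
print(latin1_len)  # 3
```

Because the byte width varies in UTF-8, a byte offset and a character offset only coincide while all preceding characters are plain ASCII.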

To convert a character offset into a byte offset use the ByteOffset() function. See ByteOffset for details.

Inputs
s$
input string
boff
byte offset to be mapped to a character offset (starting from 0)
encoding
optional: character encoding to use (defaults to default string encoding)
Results
coff
character offset of the specified character
Example
coff = CharOffset("äöü", 2)
Print(coff)
If Hollywood is in Unicode mode, this will return 1 because the "ä" character takes up 2 bytes in UTF-8 code space, so byte offset 2 is where the second character starts. In ISO 8859-1 there is no difference between characters and bytes, so 2 will be returned in that case.
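
The byte-to-character mapping itself can be sketched in a few lines of Python. This is not Hollywood code and says nothing about how Hollywood implements CharOffset(); char_offset is a hypothetical helper that simply counts the characters in front of the given byte offset.

```python
def char_offset(s: str, boff: int, encoding: str = "utf-8") -> int:
    """Hypothetical helper: map a 0-based byte offset to a 0-based character offset."""
    data = s.encode(encoding)
    # Decode only the bytes in front of the offset and count the characters.
    # A byte offset that falls inside a multi-byte character raises
    # UnicodeDecodeError in this sketch.
    return len(data[:boff].decode(encoding))

print(char_offset("äöü", 2))             # 1: in UTF-8, "ä" occupies bytes 0 and 1
print(char_offset("äöü", 2, "latin-1"))  # 2: ISO 8859-1 uses one byte per character
```

The sketch assumes that boff points at the first byte of a character; what Hollywood does for offsets inside a multi-byte sequence is not covered here.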
