UTF8 - ReFreezed/LuaWebGen GitHub Wiki
Note: The documentation has moved to the LuaWebGen website. Information here may be out of date!
[v1.3] The UTF-8 module, available through the utf8 global, contains some UTF-8 related helper functions.
Note: Positions and lengths are given in bytes, unless otherwise specified.
string = utf8.codepointToString( codepoint )
utf8.codepointToString( codepoint, outputArray )
Convert a single Unicode codepoint to a string, optionally adding the result to an array. Raises an error if the codepoint is outside the valid range.
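For readers curious about what such a conversion involves, here is a minimal pure-Lua sketch of UTF-8 encoding. It is not LuaWebGen's implementation: the name encodeCodepoint is hypothetical, and appending to the array without returning anything is an assumption based on the second signature above. Surrogate codepoints are not rejected in this sketch.

```lua
-- Hypothetical stand-in for illustration only; NOT the LuaWebGen implementation.
-- Encodes one codepoint as UTF-8. If outputArray is given, the string is
-- appended to it (assumed behavior); otherwise the string is returned.
local function encodeCodepoint(cp, outputArray)
    local s
    if cp < 0 or cp > 0x10FFFF then
        error("codepoint out of range: " .. tostring(cp), 2)
    elseif cp < 0x80 then
        -- 1 byte: 0xxxxxxx
        s = string.char(cp)
    elseif cp < 0x800 then
        -- 2 bytes: 110xxxxx 10xxxxxx
        s = string.char(0xC0 + math.floor(cp / 0x40),
                        0x80 + cp % 0x40)
    elseif cp < 0x10000 then
        -- 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        s = string.char(0xE0 + math.floor(cp / 0x1000),
                        0x80 + math.floor(cp / 0x40) % 0x40,
                        0x80 + cp % 0x40)
    else
        -- 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        s = string.char(0xF0 + math.floor(cp / 0x40000),
                        0x80 + math.floor(cp / 0x1000) % 0x40,
                        0x80 + math.floor(cp / 0x40) % 0x40,
                        0x80 + cp % 0x40)
    end
    if outputArray then
        outputArray[#outputArray + 1] = s
        return
    end
    return s
end

-- Build a string from several codepoints by collecting the pieces in an array.
local parts = {}
for _, cp in ipairs({0x61, 0xDC, 0x78}) do
    encodeCodepoint(cp, parts)
end
print(table.concat(parts) == "a\195\156x") -- true ("aÜx" written with byte escapes)
```

Collecting pieces in an array and joining them once with table.concat avoids creating many intermediate strings, which is presumably why the array form exists.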
length = utf8.getCharacterLength( string [, position=1 ] )
Get the number of bytes that the character at position takes up (between 1 and 4). Returns nil if the string is invalid at position. Examples:
local s = "aÜx"
print(utf8.getCharacterLength(s, 1)) -- 1 (a)
print(utf8.getCharacterLength(s, 2)) -- 2 (Ü)
print(utf8.getCharacterLength(s, 4)) -- 1 (x)
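A typical use of a per-character byte length is walking a string one character at a time. The sketch below is self-contained so it runs in plain Lua: characterLength is a hypothetical stand-in that looks only at the lead byte, and splitCharacters is likewise an illustrative name. Inside LuaWebGen you would presumably call utf8.getCharacterLength in its place. The stand-in does not validate continuation bytes.

```lua
-- Hypothetical stand-in for utf8.getCharacterLength (lead byte only, no
-- validation of the continuation bytes that follow).
local function characterLength(s, pos)
    local b = s:byte(pos or 1)
    if not b then return nil end             -- position is outside the string
    if b < 0x80 then return 1 end            -- ASCII
    if b >= 0xC2 and b <= 0xDF then return 2 end
    if b >= 0xE0 and b <= 0xEF then return 3 end
    if b >= 0xF0 and b <= 0xF4 then return 4 end
    return nil                               -- continuation or invalid lead byte
end

-- Split a string into an array of characters, advancing by each character's
-- byte length. Returns nil and the offending position on invalid input.
local function splitCharacters(s)
    local chars, pos = {}, 1
    while pos <= #s do
        local len = characterLength(s, pos)
        if not len then return nil, pos end
        chars[#chars + 1] = s:sub(pos, pos + len - 1)
        pos = pos + len
    end
    return chars
end

local chars = splitCharacters("a\195\156x") -- "aÜx" written with byte escapes
print(#chars) -- 3
```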
codepoint, length = utf8.getCodepointAndLength( string [, position=1 ] )
Get the codepoint of, and the number of bytes taken up by, the character at position. Returns nil if the string is invalid at position.
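As an illustration of the decoding involved, here is a minimal pure-Lua sketch that returns a codepoint and byte length in the same shape as the signature above. The name decodeAt is hypothetical and this is not LuaWebGen's code. It checks lead and continuation bytes but, as a simplification, does not reject overlong three- and four-byte forms or surrogates.

```lua
-- Hypothetical stand-in for illustration only; NOT the LuaWebGen implementation.
-- Returns codepoint, byteLength for the character starting at pos, or nil.
local function decodeAt(s, pos)
    pos = pos or 1
    local b1 = s:byte(pos)
    if not b1 then return nil end
    if b1 < 0x80 then return b1, 1 end       -- ASCII

    -- The lead byte gives the length and the top bits of the codepoint.
    local len, cp
    if b1 >= 0xC2 and b1 <= 0xDF then len, cp = 2, b1 - 0xC0
    elseif b1 >= 0xE0 and b1 <= 0xEF then len, cp = 3, b1 - 0xE0
    elseif b1 >= 0xF0 and b1 <= 0xF4 then len, cp = 4, b1 - 0xF0
    else return nil end                      -- continuation or invalid lead byte

    -- Each continuation byte (10xxxxxx) contributes six more bits.
    for i = pos + 1, pos + len - 1 do
        local c = s:byte(i)
        if not c or c < 0x80 or c > 0xBF then return nil end
        cp = cp * 0x40 + (c - 0x80)
    end
    return cp, len
end

print(decodeAt("a\195\156x", 2)) -- 220  2   (U+00DC, Ü)
```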
length = utf8.getLength( string [, startPosition=1 ] )
Get the total length of a string in characters, starting at startPosition. Returns nil and the position of the first error if the string isn't valid UTF-8. Examples:
print(utf8.getLength("aÜx")) -- 3
print(utf8.getLength("a\255x")) -- nil, 2
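The difference between byte length and character length is easy to see in plain Lua. The sketch below uses a hypothetical countCharacters helper that counts every byte that is not a continuation byte. Unlike the function documented above, it assumes the input is already valid UTF-8 and reports no error position.

```lua
-- Hypothetical helper for illustration; assumes s is valid UTF-8.
-- Every character has exactly one byte outside the range 0x80..0xBF
-- (its first byte), so counting those bytes counts characters.
local function countCharacters(s)
    local count = 0
    for i = 1, #s do
        local b = s:byte(i)
        if b < 0x80 or b > 0xBF then
            count = count + 1
        end
    end
    return count
end

local s = "a\195\156x" -- "aÜx" written with byte escapes
print(#s)                 -- 4 (bytes)
print(countCharacters(s)) -- 3 (characters)
```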
startPosition = utf8.getStartOfCharacter( string, position )
Get the position where the character at position begins. Returns nil if the string is invalid at position. Example:
print(utf8.getStartOfCharacter("aÜx", 3)) -- 2
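Finding the start of a character amounts to stepping backwards over continuation bytes until a lead byte is reached. The following pure-Lua sketch shows the idea. The name startOfCharacter is hypothetical and this is not LuaWebGen's implementation. It verifies that the lead byte's declared length reaches the requested position, but it does not validate the bytes after that position.

```lua
-- Hypothetical stand-in for illustration only; NOT the LuaWebGen implementation.
-- Returns the byte position where the character covering pos begins, or nil.
local function startOfCharacter(s, pos)
    if not s:byte(pos) then return nil end   -- position is outside the string

    -- Step back over at most three continuation bytes (10xxxxxx).
    local start = pos
    while start > 1 and pos - start < 3 do
        local c = s:byte(start)
        if c < 0x80 or c > 0xBF then break end
        start = start - 1
    end

    -- The byte we stopped on must be a lead byte whose length covers pos.
    local lead = s:byte(start)
    local len
    if lead < 0x80 then len = 1
    elseif lead >= 0xC2 and lead <= 0xDF then len = 2
    elseif lead >= 0xE0 and lead <= 0xEF then len = 3
    elseif lead >= 0xF0 and lead <= 0xF4 then len = 4
    else return nil end
    if start + len - 1 < pos then return nil end
    return start
end

print(startOfCharacter("a\195\156x", 3)) -- 2
```

This kind of lookup is handy when a byte offset from elsewhere, such as a truncation limit, may land in the middle of a multi-byte character.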