Strings - kevinlawler/kona GitHub Wiki
A character atom is a single character ("c"). A character vector is a list of zero or more characters ("cde"). Symbols are different from both. The count of a symbol atom is always one, but a symbol may consist of more than one character (`sym). Symbols are similar to null-terminated C strings, and character vectors are similar to char arrays, with an associated length. The null char may not be part of a symbol name, but both characters and character vectors may contain nulls.
"c" /character
"cde" /character vector
,"c" /character vector with one element
"" /character vector of length zero
`c /symbol
`cde /also a symbol
`f`g`h`cde /symbol vector
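The C analogy can be made concrete with a small sketch. The Python below only illustrates the two storage models and is not how Kona stores its data; the helper names `as_c_string` and `as_char_array` are hypothetical.

```python
# Illustration of the analogy only; Kona's real storage is not shown here.

def as_c_string(raw: bytes) -> bytes:
    """Null-terminated view (symbol-like): content ends at the first NUL byte."""
    return raw.split(b"\0", 1)[0]


def as_char_array(raw: bytes):
    """Length-carrying view (character-vector-like): every byte is kept."""
    return len(raw), raw


data = b"abc\0def"

symbol_like = as_c_string(data)            # b'abc'
length, vector_like = as_char_array(data)  # 7, with the NUL byte preserved
```

Because the length travels with the data in the second view, an embedded null is just another byte. In the first view the null marks the end, which is why it cannot appear inside a symbol name.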
K supports a few common escape sequences ("\\", "\"", …). ASCII values may be entered directly using octal codes. For example, to create a character atom consisting of a horizontal tab, you can use either "\t" or "\011".
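Python string literals happen to accept the same "\t" and three-digit octal forms, so the equivalence can be checked outside K. This is an analogy rather than K code, and `octal_escape` is a hypothetical helper introduced only for illustration.

```python
# Python analogue of the two spellings of a horizontal tab.
tab_by_name = "\t"
tab_by_octal = "\011"      # octal 011 is decimal 9, the ASCII code for tab

same_character = tab_by_name == tab_by_octal   # True
tab_code = int("011", 8)                       # 9


def octal_escape(ch: str) -> str:
    """Spell a single character as a backslash followed by three octal digits."""
    return "\\%03o" % ord(ch)
```

For instance, `octal_escape("\t")` gives the four characters `\011`, and `octal_escape("A")` gives `\101`.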
The backtick syntax limits which characters can be part of a symbol name. For instance, it makes it impossible to include a backtick in a symbol name, since the parser already expects a backtick to indicate the start of another symbol. For this reason there is an alternate symbol syntax, used for entering and displaying symbols with unusual characters. It consists of a backtick followed by a quoted string.
`"a"
`a
`"ab`cd"
`"ab`cd"
`"127.0.0.1"
`"127.0.0.1"
`"abc\0def" /anything after the null is ignored
`abc
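In the session above, each input line is followed by the value K displays. The reading rule can be sketched in Python as follows. This is a hypothetical reader and not Kona's actual parser: the name `read_quoted_symbol`, the one-to-three-digit limit on octal codes, and the treatment of unrecognized escapes are all assumptions, and only the escapes named in the text are handled.

```python
# Hypothetical reader for the quoted symbol form; not Kona's actual parser.
# Only the escapes named in the text are handled: \\, \", \t, and octal codes.
SIMPLE_ESCAPES = {"\\": "\\", '"': '"', "t": "\t"}
OCTAL_DIGITS = "01234567"


def read_quoted_symbol(src: str) -> str:
    """Return the symbol name written as a backtick followed by a quoted string."""
    if not src.startswith('`"'):
        raise ValueError("expected a backtick followed by a double quote")
    chars = []
    i = 2  # first character after the opening quote
    while i < len(src):
        ch = src[i]
        if ch == '"':
            # Closing quote: anything after the first null is ignored.
            return "".join(chars).split("\0", 1)[0]
        if ch == "\\" and i + 1 < len(src):
            nxt = src[i + 1]
            if nxt in OCTAL_DIGITS:
                # Assumption: an octal code is one to three octal digits.
                j = i + 1
                while j < len(src) and j < i + 4 and src[j] in OCTAL_DIGITS:
                    j += 1
                chars.append(chr(int(src[i + 1:j], 8)))
                i = j
                continue
            # Assumption: an unrecognized escape yields the character itself.
            chars.append(SIMPLE_ESCAPES.get(nxt, nxt))
            i += 2
            continue
        chars.append(ch)
        i += 1
    raise ValueError("unterminated quoted symbol")
```

Under these assumptions, `read_quoted_symbol(r'`"abc\0def"')` returns `abc`, and a backtick inside the quotes is kept as an ordinary character.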
Symbols are stored internally using string interning, like symbols in Lisp or atoms in Prolog/Erlang. This means that a symbol is never represented by the character data comprising its name, but rather by an identifier shared throughout the entire K process. All instances of a symbol use the same identifier, and symbol comparison is O(1). The identifiers last for the life of the process.
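A toy intern table shows why comparison becomes constant-time. This is a minimal sketch of the general technique in Python, not Kona's implementation; the class `SymbolTable` and its methods are hypothetical.

```python
# Toy model of string interning; Kona's real data structures are not shown here.

class SymbolTable:
    """Maps each distinct name to one identifier for the life of the table."""

    def __init__(self):
        self._ids = {}     # name -> identifier
        self._names = []   # identifier -> name

    def intern(self, name: str) -> int:
        """Return the identifier for name, creating it on first use."""
        ident = self._ids.get(name)
        if ident is None:
            ident = len(self._names)
            self._ids[name] = ident
            self._names.append(name)
        return ident

    def name(self, ident: int) -> str:
        """Recover the character data, e.g. for display."""
        return self._names[ident]


table = SymbolTable()
first = table.intern("cde")
second = table.intern("cde")   # same name, so the same identifier comes back
other = table.intern("f")

# Comparing symbols is one integer comparison, whatever the name length.
same_symbol = first == second       # True
different_symbol = first == other   # False
```

Entries are never removed in this sketch, which mirrors the statement that identifiers last for the life of the process.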