Strings - Petewg/harbour-core GitHub Wiki
String functions are those that receive a string as their main argument and process it in some way. [This is not necessarily an "official" definition, if one exists; it is simply how I understand them, and a (simplistic) criterion for including a function in this group.]
-
AllTrim( <cString> ) → cTrimString
removes leading and trailing spaces from <cString>. Note that the term 'spaces' includes not only the space character Chr(32) but also carriage return Chr(13), line feed Chr(10), and the tab character Chr(9), meaning that, if present, they are removed as well.
-
At( <cSearchFor>, <cIntoString> ) → nPosition
returns the position in <cIntoString> where <cSearchFor> is found, or 0 (zero) when <cSearchFor> is not found. NOTE: the search is case sensitive!
-
hb_At( <cSearchFor>, <cIntoString>, [<nStart>], [<nEnd>] ) → nPosition
returns a positive numeric value (or zero if nothing is found), which is the position in <cIntoString> where <cSearchFor> is found. Searching starts from <nStart> (default=1) and runs up to <nEnd> (default=Len( cIntoString )). NOTE: the search is case sensitive!
-
hb_AtI( <cSearchFor>, <cIntoString>, [<nStart>], [<nEnd>] ) → nPos
same as hb_At() above, but case-insensitive.
-
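Example for At(), hb_At() and hb_AtI(): a minimal sketch of how the search functions above differ. The sample string is made up for illustration; the positions in the comments were worked out by hand from the descriptions above.

```harbour
PROCEDURE Main()

   LOCAL cText := "Harbour and harbour"

   // At() is case sensitive, so the capitalized "Harbour" at position 1 is skipped
   ? At( "harbour", cText )        // 13

   // hb_At() with <nStart> = 4 ignores the "ar" at position 2
   ? hb_At( "ar", cText, 4 )       // 14

   // hb_AtI() ignores case, so the first word matches
   ? hb_AtI( "HARBOUR", cText )    // 1

   // nothing found: zero is returned
   ? At( "xyz", cText )            // 0

   RETURN
```

Note that the position returned by hb_At() is counted from the beginning of the string, not from <nStart>.
-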
hb_ATokens( <cString>, [<cDelimiter>|<lEOL>], [<lSkipStrings>], [<lDoubleQuoteOnly>] ) → aTokens
returns an array filled with all individual tokens of a string, that is, all the separate sub-strings that are delimited either by <cDelimiter> or by EOL (end of line) if <lEOL> has been passed (instead of <cDelimiter>) and evaluates to .T.
-
If neither <cDelimiter> nor <lEOL> is specified, a single space is used as the delimiter (default).
-
If <lSkipStrings> is .T. (default=.F.), quoted sub-strings (if any, inside <cString>) are not tokenized (i.e. are not searched for <cDelimiter>).
-
If <lDoubleQuoteOnly> is .T., only the double quote " is considered a quote sign. This argument is meaningful only when <lSkipStrings>=.T.
-
NOTES:
-
- tokenization is case sensitive, in the (rare) case that <cDelimiter> is a letter character.
-
- The delimiters are removed from tokens.
-
-
See also: more tokenization functions
-
-
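Example for hb_ATokens(): a small sketch of the default delimiter, an explicit delimiter, and the effect of <lSkipStrings>. The input strings are invented for illustration.

```harbour
PROCEDURE Main()

   LOCAL cToken

   // no delimiter given: a space is used
   ? Len( hb_ATokens( "one two three" ) )           // 3

   // explicit delimiter; the commas themselves are not part of the tokens
   FOR EACH cToken IN hb_ATokens( "a,b,c", "," )
      ? "[" + cToken + "]"
   NEXT

   // <lSkipStrings> = .T.: the comma inside the quoted part is not a delimiter
   ? Len( hb_ATokens( 'x,"y,z",w', ",", .T. ) )     // 3

   RETURN
```
-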
hb_BAt( <cSubString>, <cString>, [<nFrom>], [<nTo>] ) → nPosition
returns the position in <cString> where <cSubString> is found, or zero if it is not found.
It works like hb_At() but operates on byte/binary strings and is code-page independent.
-
hb_BCode( <cText> ) → nCode
returns the value of the 1st byte in the given <cText> string (code-page independent byte/binary operation).
-
hb_BPeek( <cText>, <n> ) → nCode
returns the value of the <n>-th byte in the given string.
-
hb_BPoke( [@]<cText>, <n>, <nVal> ) → cText
changes the <n>-th byte in the given string to <nVal> and returns the modified text.
-
hb_BRAt( <cSubString>, <cString>, [<nFrom>], [<nTo>] ) → nPosition
same as hb_RAt() but for raw/binary strings.
-
hb_BRight( <cString>, <nCount> ) → cSubstring
-
hb_BStuff( <cString>, <nAt>, <nDel>, <cIns> ) → cResult
-
hb_BSubStr( <cString>, <nStart>, <nCount> ) → cSubstring
-
hb_LeftEq( <cString>, <cSubString> ) → lEqual
returns .T. when all characters of <cSubString> match, in the same order, the leftmost part (of the same length) of <cString>. Basically it is equivalent to ( Left( <cString>, Len( <cSubString> ) ) == <cSubString> ), but faster and shorter. It can be used, for example, to check whether <cString> begins with <cSubString>. NOTE: case sensitive!
-
hb_LeftEqI( <cString>, <cSubString> ) → lEqual
returns .T. when all characters of <cSubString> match, in the same order, the leftmost part (of the same length) of <cString>. Same as hb_LeftEq() but case insensitive!
-
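Example for hb_LeftEq() and hb_LeftEqI(): a brief sketch of the "begins with" check described above, next to the longer Left()-based expression it replaces. The literals are illustrative only.

```harbour
PROCEDURE Main()

   LOCAL cName := "Harbour"

   ? hb_LeftEq( cName, "Har" )                 // .T.
   ? hb_LeftEq( cName, "har" )                 // .F. (case sensitive)
   ? hb_LeftEqI( cName, "har" )                // .T. (case insensitive)

   // the equivalent, longer expression
   ? Left( cName, Len( "Har" ) ) == "Har"      // .T.

   RETURN
```
-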
hb_RAt( <cSearchFor>, <cIntoString>, [<nStart>], [<nEnd>] ) → nPosition
finds the last (rightmost) match of <cSearchFor> in <cIntoString> within the range <nStart> - <nEnd>. In other words, the search is performed from right to left.
-
hb_MemoRead( <cFileName> ) → cString
returns the contents of <cFileName> (a file of any size, limited only by system memory resources) as a character string. If <cFileName> is not found, the function returns an empty string. If <cFileName> does not contain a path, only the current directory is searched (i.e. SET DEFAULT and SET PATH are ignored).
This function is identical to MemoRead() except that it won't truncate the last byte (on non-UNIX compatible systems) if it is an EOF char.
-
hb_MemoWrit( <cFileName>, <cString> ) → lSuccess
writes (saves) a memo field or character string to a text file on disk. If no path is specified, <cFileName> is written to the current directory (SET DEFAULT is ignored). If <cFileName> already exists, it is overwritten. Returns .T. on success or .F. on failure.
NOTE: unlike MemoWrit(), this function never adds an EOF Chr( 26 ) character at the end of the created file.
-
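Example for hb_MemoWrit() and hb_MemoRead(): a minimal round-trip sketch. The file name `demo_memo.txt` is a hypothetical name chosen for this example, and the file is created in (and removed from) the current directory.

```harbour
PROCEDURE Main()

   LOCAL cFile := "demo_memo.txt"   // hypothetical file name, current directory
   LOCAL cText := "line 1" + hb_eol() + "line 2"

   IF hb_MemoWrit( cFile, cText )
      // no EOF character is appended, so the content reads back unchanged
      ? hb_MemoRead( cFile ) == cText
      FErase( cFile )               // clean up the temporary file
   ELSE
      ? "could not write", cFile
   ENDIF

   RETURN
```
-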
hb_StrCDecode( <cStr> [, @<lCont> ] ) → cDecodedString | NIL
decodes a string using C compiler rules.
If the second parameter <lCont> is passed by reference, it allows decoding multiline strings. In such a case, if the string ends with an unclosed "" quoting, <lCont> is set to .T., so that the next call to this function with the <lCont> parameter continues the string decoding.
The function returns the decoded string, or NIL on syntax error.
-
hb_StrClear( @<cVar> ) → lResult
safely erases the content of a string variable, replacing every byte of it with Chr(0).
-
hb_StrDecodEscape( <cEscSeqStr> ) → cDecodedString
decodes a string containing \ escape sequences.
-
hb_StrFormat( <%cFormat1...%cFormatN>, <nParam1, ..., nParamN> ) → cString
returns a string with the values of nParam1, ..., nParamN embedded and formatted according to the <%cFormat1...%cFormatN> format specifiers. It is similar to the sprintf() function of the C language, but stripped down to fit the Harbour data types; currently the format specifiers that are recognized/processed are: %d, %f, %x, %X, %s and %c.

| Specifier | Parameter type/value description |
| --- | --- |
| %d | integer as a signed decimal number |
| %f | non-integer (float) decimal number in normal fixed-point notation |
| %x, %X | unsigned integer as a hexadecimal number; %x uses lower-case letters and %X uses upper-case |
| %s | string |
| %c | ASCII code (numeric) of a single character |

Note: the percent character % is a template character which is removed from the resulting string.
Example usage:

    ? hb_StrFormat( "- an integer : %d " + hb_eol() + ;
                    "- a float : %3.5f " + hb_eol() + ;  // 3 is the minimum width, 5 the decimal digits
                    "- a hex : %x " + hb_eol() + ;
                    "- a HEX : %X " + hb_eol() + ;
                    "- a string : %s " + hb_eol() + ;
                    "- a character: %c " + hb_eol(), ;
                    12345, 42.421, 65535, 65535, "Hello harbour!", 65 )

(copy the above code and paste it into the Harbour Playground to run it).
-
hb_StrIsUTF8( <cString> ) → .T.|.F.
checks whether a string contains UTF-8 characters (?).
-
hb_StrReplace( <cString>, [<cSource>|<acSource>|<hReplace>], [<cDest>|<acDest>] ) → cResult
replaces different sub-strings in the given string. If the 2nd parameter is a string, then each character in <cString> which exists in <cSource> at position <n> is replaced by the corresponding character at position <n> in <cDest>, or by the string from <acDest>[<n>]. If the 2nd parameter is an array, then each sub-string of <cString> which exists in <acSource> at position <n> is replaced by the corresponding character at position <n> in <cDest>, or by the string from <acDest>[<n>]. If <n> is greater than the Len() of <cDest> or <acDest>, the given character/sub-string is removed from the result.
-
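Example for hb_StrReplace(): a sketch of the string and array forms described above, plus the hash form that appears in the signature. The hash behaviour shown (keys replaced by their values) is an assumption drawn from the signature, since the description above does not spell it out.

```harbour
PROCEDURE Main()

   // string form: character by character, e -> i and l -> p
   ? hb_StrReplace( "hello", "el", "ip" )                              // hippo

   // array form: whole sub-strings are replaced
   ? hb_StrReplace( "one two", { "one", "two" }, { "1", "2" } )        // 1 2

   // <cDest> shorter than <cSource>: characters without a counterpart are removed
   ? hb_StrReplace( "a-b-c", "-", "" )                                 // abc

   // hash form (assumed: each key is replaced by its value)
   ? hb_StrReplace( "one two", { "one" => "1", "two" => "2" } )

   RETURN
```
-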
hb_StrShrink( <cString> [, <nCount>] ) → cNewString
shrinks the string by 'eating' <nCount> characters from the right side of the string.
It is similar to Left( cString, Len( cString ) - nCount ) but simpler and faster.
If the optional parameter <nCount> is omitted, it defaults to 1. If <nCount> is equal to or greater than Len( <cString> ), a null string is returned.
-
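Example for hb_StrShrink(): a short sketch covering the default count, an explicit count, and a count larger than the string length. The literals are illustrative.

```harbour
PROCEDURE Main()

   ? hb_StrShrink( "Harbour" )            // Harbou  (default <nCount> is 1)
   ? hb_StrShrink( "Harbour", 3 )         // Harb
   ? Len( hb_StrShrink( "abc", 5 ) )      // 0  (null string)

   RETURN
```
-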
hb_StrToExp( <cString> [, <lEscaped>] ) → cExpression
converts a string item to a valid expression which can be compiled by the macro compiler. The string may contain any characters. However, if <lEscaped> is passed as .T., the returned string will be prefixed by the escape literal 'e' (e.g.: e"String").
-
hb_StrToHex( <cString> [, <cSeparator> ] ) → cHexValues
converts a string (or buffer) into a string of its corresponding hexadecimal values, optionally separated by <cSeparator>, e.g.: hb_StrToHex( "harbour", ":" ) => "68:61:72:62:6F:75:72", or => "686172626F7572" when no ":" separator is given.
-
hb_strToUTF8( <cStr> [, <cFromCPID> ] ) → cUTF8Str
converts ("translates") <cStr> to UTF-8. It is similar to hb_Translate( cStr, cFromCPID, "UTF8EX" ).
<cFromCPID> is a Harbour codepage ID (e.g. "EN", "ES", "ESWIN") in which <cStr> is currently encoded; it is used as the basis for the translation. If it is not given, the default HVM codepage (set by hb_cdpSelect()) is used.
-
hb_StrXor( <cString>, <cnBytes> ) → cResult
performs an XOR operation on each character of <cString> with the string or number supplied in <cnBytes>.
-
hb_TokenCount( <cString> [, <cDelimiter>, <lSkipStrings>, <lDoubleQuoteOnly> ] ) → nTokens
returns the number of tokens (i.e. individual words between delimiters, or between spaces if no delimiter is given) in the string.
NOTE: if <cDelimiter> is specified and <cDelimiter> is not found in <cString>, this function returns 1.
See also: hb_ATokens() for more detailed info about parameters and default values.
-
hb_TokenGet( <cString>, <nToken> [, <cDelimiter>, <lSkipStrings>, <lDoubleQuoteOnly>] ) → cToken
returns the <nToken>'th token in the string.
If there is no <nToken>'th token, it returns a null string. It will also return a null string if <nToken> == NIL or 0.
See also: hb_ATokens() for more detailed info about parameters and default values.
-
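Example for hb_TokenCount() and hb_TokenGet(): a small sketch that counts and picks tokens from a semicolon-separated line. The sample line is made up for illustration.

```harbour
PROCEDURE Main()

   LOCAL cLine := "alpha;beta;gamma"

   ? hb_TokenCount( cLine, ";" )             // 3
   ? hb_TokenGet( cLine, 2, ";" )            // beta

   // asking for a token that does not exist yields a null string
   ? Len( hb_TokenGet( cLine, 9, ";" ) )     // 0

   // delimiter not present in the string: the whole string is one token
   ? hb_TokenCount( cLine, "|" )             // 1

   RETURN
```
-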
hb_TokenPtr( <cString>, [@]<nPos> [, <cDelim>, <lSkipStrings>, <lDoubleQuoteOnly>] ) → cToken
returns the first token found in a string. First, it searches for a token separator starting from the <nPos> character and returns the following token (or a portion of it). It is 0-based, which means you have to pass <nPos>=0 to get the 1st token. If <nPos> is passed by @reference, it will hold the position from where the next token search will start.
-
hb_Translate( <cString>, [<FromCodePageID>], [<ToCodePageID>] ) → cConvertedString
changes the character encoding of the given string from one code-page to the other.
If either <FromCodePageID> or <ToCodePageID> is omitted, the current (hb_vmCDP) codepage is used. If none of them is specified, or one is invalid or not available, a run-time error (RTE) occurs, which means that before invoking the function, the code page module(s) needed for the conversion must have been linked (using a REQUEST HB_CODEPAGE_XXXXX statement).
For a complete list of code-pages supported by Harbour see the hb_cdpList() function.
hb_Translate() usage example:

    REQUEST hb_codepage_elwin, hb_codepage_utf8ex
    LOCAL cUTF8String := hb_Translate( "τα πάντα ρεί", "ELWIN", "UTF8" )
    ? cUTF8String  // prints string converted from codepage `cp-1253` to `UTF-8`

-
hb_UAt( <cSubString>, <cString>, [<nFrom>], [<nTo>] ) → nAt
Unicode counterpart of hb_At().
-
hb_UCode( <cText> ) → nCode
returns the Unicode value of the 1st character (not byte) in the given string. Similar to hb_utf8Asc().
-
hb_ULen( <cText> ) → nChars
returns the length of the Unicode string <cText>, in characters.
-
hb_ULeft( <cString>, <nCount> ) → cSubstring
same as Left() but applicable to UTF8 encoded text.
-
hb_UPadC( <exp>, <nLength>, [<cFillChar>] ) → cPaddedString
same as PadC() but Unicode oriented.
-
hb_UPadL( <exp>, <nLength>, [<cFillChar>] ) → cPaddedString
same as PadL() but Unicode oriented.
-
hb_UPadR( <exp>, <nLength>, [<cFillChar>] ) → cPaddedString
same as PadR() but Unicode oriented.
-
hb_UPeek( <cText>, <n> ) → nCode
returns the Unicode value of the <n>-th character in the given string.
-
hb_UPoke( [@]<cText>, <n>, <nVal> ) → cText
changes the <n>-th character in the given string to the Unicode <nVal> one and returns the modified text.
-
hb_URight( <cString>, <nCount> ) → cSubstring
-
hb_UStuff( <cString>, <nAt>, <nDel>, <cIns> ) → cResult
Unicode counterpart of Stuff().
-
hb_USubStr( <cString>, <nStart>, <nCount> ) → cSubstring
Unicode counterpart of SubStr().
-
hb_utf8Asc( <cExp> ) → nUTF8CharCode
same as Asc() but applicable to UTF8 encoded text. Similar to hb_UCode().
-
hb_utf8At( <...> ) → nPos
same as hb_At() but applicable to UTF8 encoded text.
-
hb_utf8Chr( <n> ) → cUTF8Char
same as Chr() but applicable to UTF8 encoded text.
-
hb_utf8Left( ... ) → cString
same as Left() but applicable to UTF8 encoded text.
-
hb_utf8Len( ... ) → nLen
same as Len() but applicable to UTF8 encoded text.
-
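Example for hb_utf8Chr(), hb_utf8Len() and hb_utf8Asc(): a sketch of the difference between counting bytes and counting UTF-8 characters. It uses hb_BLen(), a Harbour byte-length function that is not described on this page, and builds the text from a code point so that the result does not depend on the encoding of the source file.

```harbour
PROCEDURE Main()

   // U+03C0 (Greek small letter pi) followed by a plain ASCII "r"
   LOCAL cText := hb_utf8Chr( 960 ) + "r"

   ? hb_BLen( cText )        // 3  (pi takes two bytes in UTF-8)
   ? hb_utf8Len( cText )     // 2  (two characters)
   ? hb_utf8Asc( cText )     // 960  (code of the first character)

   RETURN
```
-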
hb_utf8Peek( <cText>, <n> ) → nCode
returns the UTF8 value of the <n>-th character in the given string.
-
hb_utf8Poke( [@]<cText>, <n>, <nVal> ) → cText
replaces the <n>-th character in the given string with the UTF8 <nVal> one and returns the modified text.
-
hb_utf8RAt() → nPos
same as hb_RAt() but applicable to UTF8 encoded text.
-
hb_utf8Right() → cString
same as Right() but applicable to UTF8 encoded text.
-
hb_utf8StrTran() → cString
same as StrTran() but applicable to UTF8 encoded text.
-
hb_utf8Stuff() → cString
same as Stuff() but applicable to UTF8 encoded text.
-
hb_utf8SubStr() → cString
same as SubStr() but applicable to UTF8 encoded text.
-
hb_utf8ToStr( <cUTF8Str> [, <cCPID>] ) → cStr
performs a "translation" from UTF-8 to the <cCPID> Harbour codepage ID (e.g. "EN", "ES", "ESWIN" etc.).
<cUTF8Str> is supposed to be a UTF-8 encoded string.
When <cCPID> is not given, the default HVM codepage (i.e. the one set by hb_cdpSelect()) is used.
-
hb_WildMatch( <cPattern>, <cValue> [, <lExact>] ) → lMatch
compares <cValue> with <cPattern>, which may contain wildcard characters (? and *).
When the optional parameter <lExact> is TRUE, it checks whether the whole <cValue> is covered by <cPattern>; otherwise it checks whether <cPattern> is a prefix of <cValue>. If <cPattern> is an empty string, the function returns .T.
-
hb_WildMatchI( <cPattern>, <cValue> ) → lMatch
similar to hb_WildMatch() above, but case insensitive.
-
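Example for hb_WildMatch() and hb_WildMatchI(): a sketch of wildcard matching and of the difference between prefix and exact matching described above. The file name is invented for illustration.

```harbour
PROCEDURE Main()

   LOCAL cFile := "main.prg"

   ? hb_WildMatch( "*.prg", cFile )         // .T.
   ? hb_WildMatch( "ma?n*", cFile )         // .T.

   // without <lExact> the pattern only has to match a prefix of the value
   ? hb_WildMatch( "main", cFile )          // .T.
   ? hb_WildMatch( "main", cFile, .T. )     // .F.

   // the case-insensitive variant
   ? hb_WildMatchI( "*.PRG", cFile )        // .T.

   RETURN
```
-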
Left( <cString>, <nCount> ) → cSubString
extracts the first <nCount> characters from the left side of <cString>. If <nCount> is larger than the length of <cString>, the entire <cString> is returned.
-
Len( <cString>|<aArray>|<hHash> ) → nLength
returns the number of items contained in a string, array or hash.
-
Lower( <cString> ) → cLowerString
returns a copy of <cString> with all alphabetic characters converted to lowercase. All other characters remain the same as in the original string.
-
LTrim( <cString> ) → cTrimString
removes all leading spaces of <cString>. If the string contains spaces only, a null string ("") is returned.
-
MemoEdit( <cString>, [<nTop>], [<nLeft>], [<nBottom>], [<nRight>], [<lEditMode>], [<cUserFunction>], [<nLineLength>], [<nTabSize>], [<nTextBufferRow>], [<nTextBufferColumn>], [<nWindowRow>], [<nWindowColumn>] ) → cTextBuffer
-
MemoLine( <cString>, [<nLineLength>], [<nLineNumber>], [<nTabSize>], [<lWrap>] ) → cLine
-
MemoRead( <cFileName> ) → cString
returns the contents of <cFileName> (a file of any size, limited only by system memory resources) as a character string, or an empty string if <cFileName> is not found. If <cFileName> does not contain a path, only the current directory is searched (SET DEFAULT and SET PATH are ignored).
-
MemoTran( <cString>, [<cReplaceHardCR>], [<cReplaceSoftCR>] ) → cNewString
returns a copy of <cString> where all CR/LF (carriage return / line feed) pairs are replaced. If <cReplaceHardCR> is not specified, it defaults to a semicolon. If <cReplaceSoftCR> is not specified, it defaults to a single space.
-
MemoWrit( <cFileName>, <cString> ) → lSuccess
writes (saves) a memo field or character string to a text file on disk. If no path is specified, <cFileName> is written to the current directory (SET DEFAULT is ignored). If <cFileName> already exists, it is overwritten. NOTE: this function always adds an EOF Chr( 26 ) character at the end of the created file.
-
MLCount( <cString>, [<nLineLength>], [<nTabSize>], [<lWrap>] ) → nLines
-
MLCToPos( <cText>, <nWidth>, <nLine>, <nCol>, [<nTabSize>], [<lWrap>] ) → nPosition
-
MLPos( <cString>, <nLineLength>, <nLine>, [<nTabSize>], [<lWrap>] ) → nPosition
-
MPosToLC( <cText>, <nWidth>, <nPos>, [<nTabSize>], [<lWrap>] ) → aLineColumn
returns an array containing the line and the column values for the specified <nPos> byte position.
-
RAt( <cSearch>, <cTarget> ) → nPosition
searches the string <cTarget> from right to left for the character string <cSearch>. NOTE: it is case sensitive.
-
Replicate( <cString>, <nCount> ) → cRepeatedString
returns a character string made up of <nCount> repetitions of <cString>.
-
Right( <cString>, <nCount> ) → cSubString
extracts the last <nCount> characters from the right side of <cString>; it is the right-side counterpart of the Left() function (see above).
-
RTrim( <cString> ) → cTrimmedString
removes all trailing spaces from a string.
-
SoundEx( <cString> ) → cSoundExString
converts <cString> to a four-character code used to find similar-sounding words or names. The first character of the code is the first character of <cString> and the remaining three are coded digits. Vowels are ignored unless they are the first letter of the string. It is a potentially useful function for finding sound-alike English words. It does NOT support characters with codes above 127 in the ASCII table.
-
Space( <nNumber> ) → cSpaceChars
returns a string that contains <nNumber> space character(s) (ASCII character 32).
-
StrTran( <cString>, <cFindString>, [<cReplaceWith>], [<nStart>], [<nOccurrences>] ) → cNewString
searches <cString> for any occurrence of <cFindString> and replaces it with <cReplaceWith>. If <cReplaceWith> is not specified, a null string ("") replaces <cFindString>; in other words, <cFindString> is removed. <nStart> is the first occurrence to be replaced (default is the 1st occurrence). <nOccurrences> is the number of occurrences to be replaced (default is all occurrences).
-
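Example for StrTran(): a brief sketch of replacing all occurrences, removing them, and replacing only a chosen occurrence through <nStart> and the occurrence count. The sample string is illustrative.

```harbour
PROCEDURE Main()

   LOCAL cText := "a-b-c-d"

   ? StrTran( cText, "-", "+" )           // a+b+c+d
   ? StrTran( cText, "-" )                // abcd  (no replacement given: removed)

   // start at the 2nd occurrence and replace only one
   ? StrTran( cText, "-", "+", 2, 1 )     // a-b+c-d

   RETURN
```
-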
Stuff( <cString>, <nStart>, <nDelete>, <cInsert> ) → cNewString
inserts and/or deletes characters in a string. Basically, it inserts the <cInsert> character(s) at position <nStart> and at the same time deletes <nDelete> character(s) starting from <nStart>. Evidently, if <nDelete> is 0, no character is deleted. Likewise, if <cInsert> is a null string "", nothing is inserted.
-
SubStr( <cString>, <nStart>, [<nCount>] ) → cSubstring
returns a sub-string extracted from the string <cString>. <nStart> is the position in <cString> from which the extraction starts. <nCount> is the number of characters to be returned; if it is not given, all characters from <nStart> up to the end of <cString> are returned.
-
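Example for Stuff() and SubStr(): a short sketch of replacing, inserting, deleting and extracting by position. The sample string is made up for illustration.

```harbour
PROCEDURE Main()

   LOCAL cText := "ABCDEF"

   ? Stuff( cText, 2, 3, "xyz" )     // AxyzEF     (delete 3, insert 3)
   ? Stuff( cText, 2, 0, "xyz" )     // AxyzBCDEF  (pure insertion)
   ? Stuff( cText, 2, 3, "" )        // AEF        (pure deletion)

   ? SubStr( cText, 3, 2 )           // CD
   ? SubStr( cText, 4 )              // DEF        (up to the end of the string)

   RETURN
```
-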
Trim( <cString> ) → cTrimmedString
removes all trailing spaces from a string. Identical to RTrim().
-
Upper( <cString> ) → cUpperString
converts all characters of <cString> to uppercase.