Strings - Petewg/harbour-core GitHub Wiki

String functions can be considered those that receive a string argument as their main parameter and perform some kind of processing on that string. [This is not necessarily an "official" definition, if there is one; it is simply how I understand them, and a (simplistic) criterion for including them in this group.]



  • AllTrim( <cString> ) ➜ cTrimString
    removes leading and trailing spaces from <cString>; note that the term 'spaces' includes not only space character Chr(32) but also carriage return Chr(13), line feed Chr(10), and tab character Chr(9), meaning that, if present, they are removed as well.

  • At( <cSearchFor>, <cIntoString> ) ➜ nPosition
    returns the position in <cIntoString> where <cSearchFor> is found or 0 (zero) when <cSearchFor> is not found. NOTE: Search is case sensitive!

  • hb_At(<cSearchFor>, <cIntoString>, [<nStart>], [<nEnd>]) ➜ nPosition
    returns the position in <cIntoString> where <cSearchFor> is found, or zero if it is not found. Searching starts at <nStart> (default=1) and stops at <nEnd> (default=Len(cIntoString)). Note: searching is case sensitive!

  • hb_AtI(<cSearchFor>, <cIntoString>, [<nStart>], [<nEnd>]) ➜ nPos
    Same as hb_At() above, but case-Insensitive.
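
    The difference between At(), hb_At() and hb_AtI() is easiest to see side by side. The snippet below is only a small sketch; the sample string and the variable name cText are made up for illustration.

    ```harbour
    PROCEDURE Main()
       LOCAL cText := "abcABCabc"   // illustrative sample string

       ? At( "bc", cText )          // 2: first match, case sensitive
       ? At( "BC", cText )          // 5: upper-case match only
       ? hb_At( "bc", cText, 3 )    // 8: search starts at position 3
       ? hb_AtI( "BC", cText )      // 2: case-insensitive search
       ? hb_At( "xyz", cText )      // 0: not found
       RETURN
    ```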

  • hb_ATokens(<cString>, [<cDelimiter>|lEOL], [<lSkipStrings>], [<lDoubleQuoteOnly>]) ➜ aTokens
    returns an array filled with all individual tokens of a string, that is, all the separate sub-strings that are delimited by either <cDelimiter> or by EOL (end of line) if <lEOL> has been passed (instead of <cDelimiter>) and evaluates to .T.

    • If neither <cDelimiter> nor <lEOL> is specified, a single space is used as the delimiter (default).

    • If <lSkipStrings> is .T. (default=.F.), the quoted sub-strings (if any, inside cString) are not tokenized (i.e. are not searched for <cDelimiter>).

    • If <lDoubleQuoteOnly> is .T. only the double quote " is considered as a quote sign. This argument is meaningful only when <lSkipStrings>=.T.

    • NOTES:

        1. Tokenization is case sensitive, in the (rare) case that <cDelimiter> is a letter character.
        2. The delimiters are removed from tokens.
    • See also: more tokenization functions
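
    As a rough illustration of the parameters described above, the following sketch tokenizes a few made-up strings. Only token counts are shown for the quoted case, since the point is simply that <lSkipStrings> keeps the quoted part in one piece.

    ```harbour
    PROCEDURE Main()
       LOCAL aTokens, cToken

       aTokens := hb_ATokens( "red,green,blue", "," )
       ? Len( aTokens )                               // 3
       FOR EACH cToken IN aTokens
          ? cToken                                    // red, green, blue (delimiters removed)
       NEXT

       // no delimiter given: a space is used
       ? Len( hb_ATokens( "one two three" ) )         // 3

       // the comma inside the quoted part splits the string unless <lSkipStrings> is .T.
       ? Len( hb_ATokens( 'a,"b,c",d', "," ) )        // 4
       ? Len( hb_ATokens( 'a,"b,c",d', ",", .T. ) )   // 3
       RETURN
    ```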

  • hb_BAt(<cSubString>, <cString>, [<nFrom>], [<nTo>]) ➜ nPosition
    returns the position number into <cString> where the <cSubString> is found, or zero if it is not found.
    works like hb_At() but operates on byte/binary strings and is Code-Page independent.

  • hb_BCode(<cText>) ➜ nCode
    returns the value of the 1-st byte in a given <cText> string. (code-page independent byte/binary operation)

  • hb_BPeek(<cText>, <n>) ➜ nCode
    return value of n-th byte in given string.

  • hb_BPoke([@]<cText>, <n>, <nVal>) ➜ cText
    change n-th byte in given string to <nVal> and return modified text.

  • hb_BRAt(<cSubString>, <cString>, [<nFrom>], [<nTo>]) ➜ nPosition
    Same as hb_RAt() but for raw/binary strings.

  • hb_BRight(<cString>, <nCount>) ➜ cSubstring

  • hb_BStuff(<cString>, <nAt>, <nDel>, <cIns>) ➜ cResult

  • hb_BSubStr(<cString>, <nStart>, <nCount>) ➜ cSubstring

  • hb_LeftEq(<cString>, <cSubString>) ➜ lEqual
    returns .T. when all characters of <cSubString> (in the given order) match the leftmost part of <cString> of the same length. Basically it's equivalent to: ( Left( <cString>, Len( <cSubString> ) ) == <cSubString> ), but faster and shorter. Can be used f.e. to check if <cString> begins with <cSubString>. NOTE: Case sensitive!

  • hb_LeftEqI(<cString>, <cSubString>) ➜ lEqual
    returns .T. when all characters of <cSubString> (in the given order) match the leftmost part of <cString> of the same length. Same as 'hb_LeftEq()' but case insensitive!
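
    A short sketch of the "begins with" check, assuming a made-up file name held in the hypothetical variable cFile:

    ```harbour
    PROCEDURE Main()
       LOCAL cFile := "Report_2024.txt"   // illustrative value

       ? hb_LeftEq( cFile, "Report" )     // .T.
       ? hb_LeftEq( cFile, "report" )     // .F. (case sensitive)
       ? hb_LeftEqI( cFile, "report" )    // .T. (case insensitive)

       // the longer equivalent built from Left() and Len()
       ? Left( cFile, Len( "Report" ) ) == "Report"   // .T.
       RETURN
    ```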

  • hb_RAt( <cSearchFor>, <cIntoString>, [<nStart>], [<nEnd>] ) ➜ nPosition
    finds the last (rightmost) match of <cSearchFor> in <cIntoString> within the range <nStart>-<nEnd>. In other words, the search is performed from right to left.

  • hb_MemoRead(<cFileName>) ➜ cString
    Returns the contents of <cFileName> (file of any size, limited only by system memory resources) as a character string. If <cFileName> is not found, the function returns an empty string. If <cFileName> does not contain a path, only the current directory is searched, (i.e. SET DEFAULT or SET PATH are ignored).
    This function is identical to MemoRead() except it won't truncate the last byte (on non-UNIX compatible systems) if it's an EOF char.

  • hb_MemoWrit( <cFileName>, <cString>) ➜ lSuccess
    writes (saves) a memo field or character string to a text file on disk. If no path is specified, <cFileName> is written to the current directory (SET DEFAULT is ignored). If <cFileName> already exists, it is overwritten. Returns .T. on success or .F. on failure.
    NOTE: unlike MemoWrit(), this function never adds an EOF Chr( 26 ) character at the end of the created file.
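
    A write/read round trip shows that the text comes back unchanged. This is only a sketch: the file name demo_note.txt is hypothetical, and the file is created in the current directory and erased afterwards.

    ```harbour
    PROCEDURE Main()
       LOCAL cFile := "demo_note.txt"   // hypothetical file name
       LOCAL cText := "first line" + hb_eol() + "second line"

       IF hb_MemoWrit( cFile, cText )
          ? hb_MemoRead( cFile ) == cText   // .T.: no EOF byte was appended
          FErase( cFile )                   // remove the temporary file
       ELSE
          ? "could not write", cFile
       ENDIF
       RETURN
    ```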

  • hb_StrCDecode( <cStr> [, @<lCont> ] ) ➜ cDecodedString | NIL
    decodes a string using C compiler rules.
    If the second parameter <lCont> is passed by reference, multiline strings can be decoded. In that case, if the string ends with an unclosed " quote, <lCont> is set to .T., so that the next call to this function with the <lCont> parameter continues the string decoding.
    The function returns decoded string or NIL on syntax error.

  • hb_StrClear( @<cVar> ) ➜ lResult
    safely erases the content of a string variable, replacing every byte of it with Chr(0).

  • hb_StrDecodEscape( <cEscSeqStr> ) ➜ cDecodedString
    decodes a string containing \ escape sequences.

  • hb_StrFormat( <%cFormat1...%cFormatN>, <nParam1, ..., nParamN> ) ➜ cString
    returns a string with the values of the nParam1, ..., nParamN embedded and formatted according to <%cFormat1...%cFormatN> format specifiers. It is similar to sprintf() function in C-language, but stripped down to fit the Harbour data types; currently the format specifiers that are recognized/processed are: %d, %f, %x, %X, %s and %c.

    | Specifier | Parameter type/value description |
    |-----------|----------------------------------|
    | %d        | integer as a signed decimal number |
    | %f        | non-integer (float) decimal number in normal fixed-point notation |
    | %x, %X    | unsigned integer as a hexadecimal number; %x uses lower-case letters and %X uses upper-case |
    | %s        | string |
    | %c        | ASCII code (numeric) of a single character |

    Note: the percent character % is a template character which is removed from the resulting string. Example usage:

    ? hb_StrFormat("- an integer : %d " + hb_eol() + ;
                   "- a float    : %3.5f " + hb_eol() + ; // 3.5 = min. width 3, 5 decimal digits
                   "- a hex      : %x " + hb_eol() + ;
                   "- a HEX      : %X " + hb_eol() + ;
                   "- a string   : %s " + hb_eol() + ;
                   "- a character: %c " + hb_eol() , ;
                  12345, 42.421, 65535, 65535, "Hello harbour!", 65 )

    (copy the above code and paste it into the Harbour Playground to run it).

  • hb_StrIsUTF8( <cString> ) ➜ .T.|.F.
    checks whether a string contains UTF8 characters(?).

  • hb_StrReplace( <cString>, <cSource>|<acSource>|<hReplace>, [<cDest>|<acDest>] ) ➜ cResult
    replaces different sub-strings in the given string. If the 2nd parameter is a string, then each character of <cString> which exists in <cSource> at position <n> is replaced by the corresponding character at position <n> in <cDest>, or by the string <acDest>[<n>]. If the 2nd parameter is an array, then each sub-string of <cString> which exists in <acSource> at position <n> is replaced by the corresponding character at position <n> in <cDest>, or by the string <acDest>[<n>]. If <n> is greater than Len() of <cDest> or <acDest>, the given character/sub-string is removed from the result.
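
    The position-by-position mapping is easier to follow with concrete values. The calls below are a sketch based on the description above; all strings are made up for illustration.

    ```harbour
    PROCEDURE Main()
       // string form: characters are mapped by position ("e" -> "i", "l" -> "p")
       ? hb_StrReplace( "hello", "el", "ip" )                         // hippo

       // array form: sub-strings are mapped by position
       ? hb_StrReplace( "one two", { "one", "two" }, { "1", "2" } )   // 1 2

       // "_" is at position 2 of <cSource> but <cDest> has only 1 character,
       // so "_" is removed from the result
       ? hb_StrReplace( "a-b_c", "-_", "+" )                          // a+bc
       RETURN
    ```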

  • hb_StrShrink( <cString> [, <nCount>] ) ➜ cNewString
    Shrinks the string by 'eating' <nCount> characters from right side of the string.
    It's similar to Left( cString, Len( cString ) - nCount ) but simpler and faster.
    If optional parameter <nCount> is omitted, it defaults to 1. If <nCount> is equal to or greater than Len(<cString>) then a null string is returned.
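
    For instance (a minimal sketch with a made-up string):

    ```harbour
    PROCEDURE Main()
       ? hb_StrShrink( "harbour" )               // harbou (default <nCount> is 1)
       ? hb_StrShrink( "harbour", 4 )            // har
       ? Len( hb_StrShrink( "harbour", 10 ) )    // 0: null string returned
       RETURN
    ```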

  • hb_StrToExp( <cString> [, <lEscaped>] ) ➜ cExpression
    converts string item to valid expression which can be compiled by macro compiler. String may contain any characters. However, if <lEscaped> passed as .T., the string returned will be prefixed by the escape literal 'e' (e.g.: e"String").

  • hb_StrToHex( <cString> [, <cSeparator> ] ) ➜ cHexValues
    converts a string (or buffer) into a string of its corresponding hexadecimal values, optionally separated by <cSeparator>. f.e.: hb_StrToHex( "harbour", ":" ) => "68:61:72:62:6F:75:72" or => "686172626F7572" when no ":" separator is given.

  • hb_strToUTF8( <cStr> [, <cFromCPID> ] ) ➜ cUTF8Str
    converts ("translate") <cStr> to UTF-8. It's similar to hb_Translate( cStr, cFromCPID, "UTF8EX" ).
    <cFromCPID> is a Harbour codepage ID, f.e.: "EN", "ES", "ESWIN" in which <cStr> is currently encoded and will be used as basis for translation and if not given then default HVM codepage (set by hb_cdpSelect()) is used.

  • hb_StrXor( <cString>, <cnBytes> ) ➜ cResult
    Performs an XOR operation on each character of <cString> with the string or number supplied in <cnBytes>.

  • hb_TokenCount( <cString> [, <cDelimiter>, <lSkipStrings>, <lDoubleQuoteOnly> ] ) ➜ nTokens
    returns the number of tokens (i.e. individual words between delimiters, or between spaces if no delimiter is given) in the string.
    NOTE: If <cDelimiter> is specified but not found in <cString>, this function returns 1.
    See also: hb_ATokens() for more detailed info about parameters and default values.

  • hb_TokenGet( <cString>, <nToken> [, <cDelimiter>, <lSkipStrings>, <lDoubleQuoteOnly>] ) ➜ cToken
    returns the <nToken>'th token in the string.
    If there is no <nToken> it returns null string. It will also return a null string if <nToken> == NIL or 0.
    See also: hb_ATokens() for more detailed info about parameters and default values.
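
    hb_TokenCount() and hb_TokenGet() combine naturally into a loop over all tokens. A small sketch, with an illustrative string and the hypothetical variable names cLine and n:

    ```harbour
    PROCEDURE Main()
       LOCAL cLine := "alpha;beta;gamma"   // illustrative sample string
       LOCAL n

       FOR n := 1 TO hb_TokenCount( cLine, ";" )
          ? n, hb_TokenGet( cLine, n, ";" )      // alpha, beta, gamma in turn
       NEXT

       ? hb_TokenCount( cLine, "," )             // 1: delimiter not found
       ? Len( hb_TokenGet( cLine, 9, ";" ) )     // 0: there is no 9th token
       RETURN
    ```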

  • hb_TokenPtr( <cString>, [@]<nPos> [, <cDelim>, <lSkipStrings>, <lDoubleQuoteOnly>] ) ➜ cToken
    returns the first token found in a string. First, it will search for a token separator starting from <nPos> character and will return the following token (or portion of it). It is 0-based, which means you have to pass <nPos>=0 to get the 1st token. If <nPos> is passed by @reference, it will hold the position from where the next token search will start.

  • hb_Translate( <cString>, [<FromCodePageID>], [<ToCodePageID>] ) ➜ cConvertedString
    changes character encoding of given string from one code-page to the other.
    If either <FromCodePageID> or <ToCodePageID> is omitted, the current (hb_vmCDP) codepage is used. If neither is specified, or a specified one is invalid or not available, an RTE (run-time error) occurs, which means that before invoking the function, the code page module(s) needed for conversion must have been linked (using a REQUEST HB_CODEPAGE_XXXXX statement).
    For a complete list of code-pages supported by Harbour see hb_cdpList() function.

    hb_Translate() usage example:

      REQUEST hb_codepage_elwin, hb_codepage_utf8ex

      PROCEDURE Main()
         // the literal below must be stored in the source file as cp-1253 (ELWIN)
         LOCAL cUTF8String := hb_Translate( "τα πάντα ρεί", "ELWIN", "UTF8" )
         ? cUTF8String  // prints string converted from codepage `cp-1253` to `UTF-8`
         RETURN

  • hb_UAt( <cSubString>, <cString>, [<nFrom>], [<nTo>] ) ➜ nAt
    Unicode counterpart of 'hb_At()'.

  • hb_UCode( <cText> ) ➜ nCode
    return Unicode value of 1-st character (not byte) in given string. Similar to 'hb_UTF8Asc()'.

  • hb_ULen( <cText> ) ➜ nChars
    returns the length of unicode string <cText>, in characters.

  • hb_ULeft( <cString>, <nCount> ) ➜ cSubstring
    same as 'Left()' but applicable to UTF8 encoded text.

  • hb_UPadC(<exp>, <nLength>, [<cFillChar>]) ➜ cPaddedString
    same as 'PadC()' but Unicode oriented.

  • hb_UPadL(<exp>, <nLength>, [<cFillChar>]) ➜ cPaddedString
    same as 'PadL()' but Unicode oriented.

  • hb_UPadR(<exp>, <nLength>, [<cFillChar>]) ➜ cPaddedString
    same as 'PadR()' but Unicode oriented.

  • hb_UPeek( <cText>, <n> ) ➜ nCode
    return unicode value of <n>-th character in given string

  • hb_UPoke( [@]<cText>, <n>, <nVal> ) ➜ cText
    change <n>-th character in given string to unicode <nVal> one and return modified text.

  • hb_URight( <cString>, <nCount> ) ➜ cSubstring

  • hb_UStuff( <cString>, <nAt>, <nDel>, <cIns> ) ➜ cResult
    Unicode counterpart of 'Stuff()'.

  • hb_USubStr( <cString>, <nStart>, <nCount> ) ➜ cSubstring
    Unicode counterpart of 'SubStr()'.

  • hb_utf8Asc( <cExp> ) ➜ nUTF8CharCode
    same as 'Asc()' but applicable to UTF8 encoded text. Similar to 'hb_UCode()'.

  • hb_utf8At(<...>) ➜ nPos
    same as hb_At() but applicable to UTF8 encoded text.

  • hb_utf8Chr( <n> ) ➜ cUTF8Char
    same as 'Chr()' but applicable to UTF8 encoded text.

  • hb_utf8Left(...) ➜ cString
    same as 'Left()' but applicable to UTF8 encoded text.

  • hb_utf8Len(...) ➜ nLen
    same as 'Len()' but applicable to UTF8 encoded text.

  • hb_utf8Peek( <cText>, <n> ) ➜ nCode
    return UTF8 value of <n>-th character in given string.

  • hb_utf8Poke( [@]<cText>, <n>, <nVal> ) ➜ cText
    replace <n>-th character in given string to UTF8 <nVal> one and return modified text.

  • hb_utf8RAt() ➜ nPos
    same as 'hb_RAt()' but applicable to UTF8 encoded text.

  • hb_utf8Right() ➜ cString
    same as 'Right()' but applicable to UTF8 encoded text.

  • hb_utf8StrTran() ➜ cString
    same as 'StrTran()' but applicable to UTF8 encoded text.

  • hb_utf8Stuff() ➜ cString
    same as 'Stuff()' but applicable to UTF8 encoded text.

  • hb_utf8SubStr() ➜ cString
    same as 'SubStr()' but applicable to UTF8 encoded text.

  • hb_utf8ToStr( <cUTF8Str> [, <cCPID>] ) ➜ cStr
    it performs "translation" from UTF-8 to <cCPID> Harbour codepage id, f.e.: "EN", "ES", "ESWIN" etc.
    <cUTF8Str> is supposed to be a UTF-8 encoded string.
    When <cCPID> is not given then the default HVM codepage (i.e. that set by hb_cdpSelect()) is used.

  • hb_WildMatch( <cPattern>, <cValue> [, <lExact>] ) ➜ lMatch
    compares <cValue> with <cPattern> which may contain wildcard characters (?*).
    When optional parameter <lExact> is TRUE, then it will check if whole <cValue> is covered by <cPattern> else it will check if <cPattern> is a prefix of <cValue>. If the cPattern is an empty string, function returns .T.

  • hb_WildMatchI(<cPattern>, <cValue>) ➜ lMatch
    similar to above 'hb_WildMatch()' but case insensitive.
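
    The effect of <lExact> and of the case-insensitive variant can be sketched with a made-up file name:

    ```harbour
    PROCEDURE Main()
       LOCAL cName := "hello.prg"   // illustrative value

       ? hb_WildMatch( "*.prg", cName )         // .T.
       ? hb_WildMatch( "h?llo*", cName )        // .T.
       ? hb_WildMatch( "hello", cName )         // .T.: pattern is a prefix of the value
       ? hb_WildMatch( "hello", cName, .T. )    // .F.: whole value must be covered
       ? hb_WildMatchI( "HELLO*", cName )       // .T.: case is ignored
       RETURN
    ```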

  • Left( <cString>, <nCount> ) ➜ cSubString
    extracts the first <nCount> characters from the left side of <cString>. If <nCount> is larger than the length of <cString>, entire <cString> is returned.

  • Len( <cString>|<aArray>|<hHash> ) ➜ nLength
    returns the number of characters in a string, or the number of items contained in an array or hash.

  • Lower(<cString>) ➜ cLowerString
    returns a copy of <cString> with all alphabetic characters converted to lowercase. All other characters remain the same as in the original string.

  • LTrim(<cString>) ➜ cTrimString
    removes all leading spaces of <cString>. If the string contains spaces only, a null string ("") is returned.

  • MemoEdit(<cString>, [<nTop>], [<nLeft>], [<nBottom>], [<nRight>], [<lEditMode>], [<cUserFunction>], [<nLineLength>], [<nTabSize>], [<nTextBufferRow>], [<nTextBufferColumn>], [<nWindowRow>], [<nWindowColumn>]) ➜ cTextBuffer

  • MemoLine(<cString>, [<nLineLength>], [<nLineNumber>], [<nTabSize>], [<lWrap>]) ➜ cLine

  • MemoRead( <cFileName> ) ➜ cString
    Returns the contents of <cFileName> (file of any size, limited only by system memory resources) as a character string, or an empty string if <cFileName> is not found. If <cFileName> does not contain a path, only the current directory is searched (SET DEFAULT and SET PATH are ignored).

  • MemoTran(<cString>, [<cReplaceHardCR>], [<cReplaceSoftCR>]) ➜ cNewString
    returns a copy of <cString> in which all CR/LF (carriage return / line feed) pairs are replaced. If <cReplaceHardCR> is not specified, it defaults to a semicolon. If <cReplaceSoftCR> is not specified, it defaults to a single space.

  • MemoWrit( <cFileName>, <cString> ) ➜ lSuccess
    writes (saves) a memo field or character string to a text file on disk. If no path is specified, <cFileName> is written to the current directory (SET DEFAULT is ignored). If <cFileName> already exists, it is overwritten. NOTE: this function always adds an EOF Chr( 26 ) character at the end of the created file.

  • MLCount(<cString>, [<nLineLength>], [<nTabSize>], [<lWrap>]) ➜ nLines

  • MLCToPos(<cText>, <nWidth>, <nLine>, <nCol>, [<nTabSize>], [<lWrap>]) ➜ nPosition

  • MLPos(<cString>, <nLineLength>, <nLine>, [<nTabSize>], [<lWrap>]) ➜ nPosition

  • MPosToLC(<cText>, <nWidth>, <nPos>, [<nTabSize>], [<lWrap>]) ➜ aLineColumn
    returns an array containing the line and the column values for the specified <nPos> byte position.

  • RAt(<cSearch>, <cTarget>) ➜ nPosition
    searches the string <cTarget> from right to left for the character string <cSearch> and returns the position of the match, or 0 if it is not found. NOTE: it is case-sensitive.

  • Replicate( <cString>, <nCount> ) ➜ cRepeatedString
    returns a character string made up of <cString> repeated <nCount> times.

  • Right(<cString>, <nCount>) ➜ cSubString
    the right-to-left counterpart of the Left() function: extracts the last <nCount> characters from the right side of <cString>.

  • RTrim(<cString>) ➜ cTrimmedString
    removes all trailing spaces from a string.

  • SoundEx(<cString>) ➜ cSoundExString
    converts <cString> to a four-character code used to find similar-sounding words or names. The first character of the code is the first character of <cString> and the remaining three are coded numbers. Vowels are ignored unless they are the first letter of the string. It is possibly a useful function for searching for sound-alike English words. Does NOT support characters with codes above 127 in the ASCII table.

  • Space( <nNumber> ) ➜ cSpaceChars
    returns a string that contains <nNumber> space character(s) (ASCII character 32).

  • StrTran(<cString>, <cFindString>, [<cReplaceWith>], [<nStart>], [<nOccurrences>]) ➜ cNewString
    searches <cString> for occurrences of <cFindString> and replaces them with <cReplaceWith>. If <cReplaceWith> is not specified, a null string ("") replaces <cFindString>; in other words, <cFindString> is removed. <nStart> is the first occurrence to be replaced (default is the 1st occurrence). <nOccurrences> is the number of occurrences to be replaced (default is all occurrences).
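
    The roles of the optional parameters are perhaps clearest in a short sketch (the input string is made up):

    ```harbour
    PROCEDURE Main()
       LOCAL cText := "a-b-c-d"   // illustrative sample string

       ? StrTran( cText, "-", "+" )         // a+b+c+d: every occurrence replaced
       ? StrTran( cText, "-" )              // abcd: occurrences removed
       ? StrTran( cText, "-", "+", 2 )      // a-b+c+d: starts at the 2nd occurrence
       ? StrTran( cText, "-", "+", 2, 1 )   // a-b+c-d: only one replacement made
       RETURN
    ```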

  • Stuff(<cString>, <nStart>, <nDelete>, <cInsert>) ➜ cNewString
    inserts and/or deletes characters in a string. Basically, it deletes <nDelete> character(s) starting at position <nStart> and inserts <cInsert> at that same position. If <nDelete> is 0, no character is deleted. Likewise, if <cInsert> is a null string "", nothing is inserted.
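
    The three typical uses (replace, pure insert, pure delete) can be sketched as follows, using a made-up string:

    ```harbour
    PROCEDURE Main()
       LOCAL cText := "ABCDEF"   // illustrative sample string

       ? Stuff( cText, 3, 2, "xyz" )   // ABxyzEF: "CD" deleted, "xyz" inserted
       ? Stuff( cText, 3, 0, "xyz" )   // ABxyzCDEF: insert only
       ? Stuff( cText, 3, 2, "" )      // ABEF: delete only
       RETURN
    ```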

  • SubStr(<cString>, <nStart>, [<nCount>]) ➜ cSubstring
    returns a sub-string extracted from the string <cString>. <nStart> is the position of <cString> from which it'll start. <nCount> is the number of characters to be returned and if not given, all characters from <nStart> up to the end of <cString> will be returned.

  • Trim(<cString>) ➜ cTrimmedString
    removes all trailing spaces from a string. Identical to RTrim().

  • Upper(<cString>) ➜ cUpperString
    converts all characters of <cString> to uppercase.

