GREL String Functions - tfmorris/OpenRefine GitHub Wiki
String functions supported by OpenRefine Expression Language (GREL)
See also: All GREL Functions.
Returns the length of s as a number.
Returns boolean indicating whether s starts with sub. For example, startsWith("food", "foo") returns true, whereas startsWith("food", "bar") returns false. You could also write the first case as "food".startsWith("foo").
Returns boolean indicating whether s ends with sub. For example, endsWith("food", "ood") returns true, whereas endsWith("food", "odd") returns false. You could also write the first case as "food".endsWith("ood").
Returns boolean indicating whether s contains sub. For example, contains("food", "oo") returns true whereas contains("food", "ee") returns false. You could also write the first case as "food".contains("oo").
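The three checks above map closely onto ordinary string operations in other languages. The snippet below is only a Python analogy, not GREL itself; the variable names are illustrative.

```python
s = "food"

starts = s.startswith("foo")   # analogous to startsWith("food", "foo")
ends = s.endswith("ood")       # analogous to endsWith("food", "ood")
has = "oo" in s                # analogous to contains("food", "oo")

print(starts, ends, has)
```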
Returns s converted to lowercase.
Returns s converted to uppercase.
Returns s converted to titlecase. For example, toTitlecase("Once upon a midnight dreary") returns the string Once Upon A Midnight Dreary.
Returns a copy of the string, with leading and trailing whitespace removed. For example, trim(" island ") returns the string island.
Returns a copy of s with sep removed from the end if s ends with sep; otherwise, just returns s. For example, chomp("hardly", "ly") and chomp("hard", "ly") both return the string hard.
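As a rough illustration of the behavior described for chomp, here is a minimal Python sketch. It is not OpenRefine's implementation; the function name simply mirrors the GREL one.

```python
def chomp(s: str, sep: str) -> str:
    """Return s without a trailing sep; return s unchanged otherwise."""
    if sep and s.endswith(sep):
        return s[: -len(sep)]
    return s


print(chomp("hardly", "ly"))  # trailing "ly" is removed
print(chomp("hard", "ly"))    # no trailing "ly", so unchanged
```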
Returns the substring of s starting from character index from and running up to, but not including, character index to. If to is omitted, the substring runs to the end of s. For example, substring("profound", 3) returns the string found, and substring("profound", 2, 4) returns the string of.
Character indexes start from zero. Negative character indexes are understood as counting from the end of the string. For example, substring("profound", 1, -1) returns the string rofoun.
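The indexing rules above (zero-based, end index excluded, negative indexes counted from the end) happen to match Python slicing, so the three examples can be checked with a small Python sketch. This is an analogy, not the GREL implementation.

```python
def substring(s: str, start: int, end=None) -> str:
    """Slice s from start up to, but not including, end."""
    return s[start:end]


print(substring("profound", 3))      # found
print(substring("profound", 2, 4))   # of
print(substring("profound", 1, -1))  # rofoun
```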
See substring function above.
See substring function above.
Returns the character index of the first occurrence of sub in s, or -1 if s does not contain sub. For example, indexOf("internationalization", "nation") returns 5, whereas indexOf("internationalization", "world") returns -1.
Returns the character index of the last occurrence of sub in s, or -1 if s does not contain sub. For example, lastIndexOf("parallel", "a") returns 3 (pointing at the second character "a").
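Python's str.find and str.rfind follow the same convention (zero-based index, -1 when the substring is absent), so they can serve as a quick check of the examples above. The snippet is a Python analogy only.

```python
first = "internationalization".find("nation")   # like indexOf
missing = "internationalization".find("world")  # substring absent
last = "parallel".rfind("a")                    # like lastIndexOf

print(first, missing, last)  # 5 -1 3
```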
Returns the string obtained by replacing f with r in s. f can be a regular expression, in which case r can also refer to capture groups declared in f.
For a simple example, replace("The cow jumps over the moon and moos", "oo", "ee") returns the string The cow jumps over the meen and mees.
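To illustrate both the plain and the capture-group cases, here is a Python sketch using the re module. The group-reference syntax shown (\1, \2) is Python's and is not necessarily what GREL uses; the date-like string is a made-up example.

```python
import re

sentence = "The cow jumps over the moon and moos"

# Plain replacement: every "oo" becomes "ee".
plain = re.sub("oo", "ee", sentence)

# Capture groups: swap the two numbers around the hyphen.
swapped = re.sub(r"(\d+)-(\d+)", r"\2-\1", "10-20")

print(plain)
print(swapped)
```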
Returns the string obtained by replacing each character in s that also appears in f with the character at the corresponding position in r. For example, replaceChars("commas , and semicolons ; are separators", ",;", "**") returns the string commas * and semicolons * are separators.
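A character-by-character replacement of this kind can be sketched in Python with str.translate. The sketch assumes that each character of f maps to the character at the same position in r and that both strings have equal length; the helper name and the sample text are illustrative.

```python
def replace_chars(s: str, find: str, repl: str) -> str:
    """Map each character in find to the character at the same index in repl."""
    # Assumption: find and repl have the same length.
    return s.translate(str.maketrans(find, repl))


print(replace_chars("héllo wörld", "éö", "eo"))  # hello world
```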
Attempts to match the string s in its entirety against the regex pattern p and returns an array of capture groups. Because the whole string must match, the pattern usually needs a leading .* to skip over text before the part of interest. For example, match("230.22398, 12.3480", /.*\.(\d\d\d+)/) returns an array of 1 string: 3480. match("230.22398, 12.3480", /.*\.(\d+).*\.(\d+)/) returns an array of 2 strings: 22398 and 3480.
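The "in its entirety" requirement is the part that most often surprises people, so here is a Python sketch using re.fullmatch, which has the same whole-string semantics. Returning None on failure is a choice made for this sketch; how GREL signals a failed match is not stated above.

```python
import re


def match(s: str, pattern: str):
    """Return the capture groups if pattern matches all of s, else None."""
    m = re.fullmatch(pattern, s)
    return list(m.groups()) if m else None


text = "230.22398, 12.3480"
print(match(text, r".*\.(\d+).*\.(\d+)"))  # both fractional parts
print(match(text, r"\.(\d\d\d+)"))         # None: pattern covers only part of text
```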
Returns o converted to a number.
Returns the array of strings obtained by splitting s wherever sep is found in it. sep can be either a string or a regular expression. For example, split("fire, water, earth, air", ",") returns the array of 4 strings: "fire", " water", " earth", and " air". The double quotation marks are shown here only to highlight the fact that the spaces are retained.
Returns the array of strings obtained by splitting s into substrings with the given lengths. For example, splitByLengths("internationalization", 5, 6, 3) returns an array of 3 strings: inter, nation, and ali.
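The fixed-width splitting described above can be sketched in a few lines of Python. This is an illustration of the behavior, not the OpenRefine source; characters beyond the last requested length are simply dropped, as in the example.

```python
def split_by_lengths(s: str, *lengths: int) -> list:
    """Cut s into consecutive pieces with the given lengths."""
    parts = []
    pos = 0
    for n in lengths:
        parts.append(s[pos:pos + n])
        pos += n
    return parts


print(split_by_lengths("internationalization", 5, 6, 3))
```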
Returns the array of strings obtained by splitting s by the separator sep. Handles quotes properly. Guesses a tab or comma separator if sep is not given. Also, value.escape('javascript') is useful for previewing unprintable characters before using smartSplit.
smartSplit(value,"\n") //split cell at Carriage Return or New Line char
Returns an array of strings obtained by splitting s into groups of consecutive characters where the characters within each group share the same Unicode type, and consecutive groups differ in their Unicode types.
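The grouping rule can be sketched in Python by grouping consecutive characters on their Unicode general category. Treating unicodedata.category as the "type" is an assumption of this sketch; OpenRefine's own classification may differ in detail.

```python
from itertools import groupby
import unicodedata


def split_by_char_type(s: str) -> list:
    """Group consecutive characters that share a Unicode general category."""
    return ["".join(group) for _, group in groupby(s, key=unicodedata.category)]


print(split_by_char_type("HenryCTaylor"))  # upper- and lowercase runs alternate
print(split_by_char_type("abc123def"))     # letters, digits, letters
```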
Returns an array of strings [ a, frag, b ] where a is the substring within s before the first occurrence of frag in s, and b is the substring after frag in s. For example, partition("internationalization", "nation") returns 3 strings: inter, nation, and alization. If s does not contain frag, it returns an array of [ s, "", "" ] (the first string is the original s and the second and third strings are empty).
If omitFragment is true, frag is not returned. That is, the result is an array of only 2 elements.
Returns an array of strings [ a, frag, b ] where a is the substring within s before the last occurrence of frag in s, and b is the substring after frag in s. For example, rpartition("parallel", "a") returns 3 strings: par, a, and llel. If s does not contain frag, it returns an array of [ s, "", "" ] (the first string is the original s and the second and third strings are empty).
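The first-occurrence and last-occurrence variants differ only in where they look for frag, which the following Python sketch makes explicit. The helper name and the omit_fragment and last parameters are illustrative; the two-element result for a missing frag with omit_fragment set is an assumption, since the text above only specifies the three-element case.

```python
def partition(s: str, frag: str, omit_fragment: bool = False, last: bool = False) -> list:
    """Split s around the first (or, with last=True, the last) occurrence of frag."""
    idx = s.rfind(frag) if last else s.find(frag)
    if idx < 0:
        # frag not found: the original string comes first, the rest is empty.
        a, found, b = s, "", ""
    else:
        a, found, b = s[:idx], frag, s[idx + len(frag):]
    return [a, b] if omit_fragment else [a, found, b]


print(partition("internationalization", "nation"))
print(partition("parallel", "a", last=True))
```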
For strings, returns the portion where they differ. For dates, it returns the difference in given time units.
Escapes s in the given escaping mode: html, xml, csv, url, javascript.
Unescapes s in the given escaping mode: html, xml, csv, url, javascript.
Returns the MD5 hash of s.
Returns the SHA-1 hash of s.
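For comparison, both hashes can be computed in Python with hashlib. The sketch assumes the string is encoded as UTF-8 and the hash is returned as a lowercase hexadecimal string; neither detail is stated above, and the helper names are illustrative.

```python
import hashlib


def md5_hex(s: str) -> str:
    """MD5 of the UTF-8 bytes of s, as a hex string."""
    return hashlib.md5(s.encode("utf-8")).hexdigest()


def sha1_hex(s: str) -> str:
    """SHA-1 of the UTF-8 bytes of s, as a hex string."""
    return hashlib.sha1(s.encode("utf-8")).hexdigest()


print(md5_hex("abc"))
print(sha1_hex("abc"))
```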
Returns a phonetic encoding of s, optionally indicating which encoding to use (defaults to DoubleMetaphone).
Returns s reinterpreted through the given encoder. Supported encodings here.
Returns the fingerprint of s, a derived string that aims to be a more canonical form of it (this is mostly useful for finding clusters of strings related to the same information).
Returns an array of the word ngrams of s.
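Word n-grams are runs of n consecutive words. The Python sketch below shows the idea under a few assumptions that the text above does not confirm: the size n is passed as a second argument, words are separated by whitespace, and each n-gram is returned as a space-joined string. In this sketch a string with fewer than n words yields an empty list.

```python
def word_ngrams(s: str, n: int) -> list:
    """Return every run of n consecutive whitespace-separated words in s."""
    words = s.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


print(word_ngrams("the quick brown fox", 2))
```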
Returns the n-gram fingerprint of s.
Returns an array of strings describing each character of s in their full unicode notation.
Returns an array of strings describing each character of s by its Unicode type.
Quotes s so that it can be used as a Freebase key. More info: MQL key escaping
Unquotes the MQL quoted string key back to its original form. More info: MQL key escaping