Strings are null-terminated sequences of bytes representing sequences of 
characters. 
The usual ASCII characters are represented with a single byte. Some 
characters are represented with multiple bytes. Most Lush functions deal 
with strings as sequences of bytes without regard to their character 
interpretation. Exceptions to this rule are indicated when appropriate. 
The textual representation of a string is composed of the characters 
enclosed between double-quotes. A string may contain macro-characters, 
parentheses, semi-colons, as well as any other character. A backslash 
at the end of a line indicates that the string continues on the next line. 
The following ``C style'' escape sequences are recognized inside a 
string: 
-  \\ for a single backslash, 

-  \" for a double quote, 

-  \n , \r , \t , \b , \f respectively for a linefeed character 
(ASCII LF), a carriage return (ASCII CR), a tab character (ASCII TAB), 
a backspace character (ASCII BS), and a formfeed character (ASCII FF), 

-  \e for an end-of-file character (Stdio's EOF), 

-  \^? for the control character control-? (for instance \^C for 
control-C), 

-  \ooo for a byte whose octal representation is ooo , 

-  \xhh for a byte whose hexadecimal representation is hh , 

-  \uhhhh or \Uhhhhhh for the representation of unicode character 
hhhh or hhhhhh in the current locale. If no such representation exists, 
the utf8 representation is used. 
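The escape sequences above can be checked by looking at the bytes they 
produce. The following is a small sketch using functions documented later 
in this chapter ( len , asc and explode-bytes ). The results shown are 
those implied by the descriptions above, assuming an ASCII-compatible 
encoding; they were not captured from a particular Lush build. 
Example: 
? (len "a\tb")
= 3
? (asc "\x41")
= 65
? (asc "\101")
= 65
? (explode-bytes "\n\r\t")
= (10 13 9)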
 
| 3.7.0. Basic String Functions 
 |  | 
Like most Lush functions, the basic functions operating on strings do 
not modify their arguments. They create instead a new string on the 
basis of their arguments. 
See: (> n1 n2) 
| 3.7.0.0. (concat  s1 ... sn) 
 | [DX] | 
Concatenates strings s1 to sn. 
Example: 
? (concat "hello" " my friends")
= "hello my friends"
| 3.7.0.1. (len s) 
 |  | 
Returns the number of bytes in string s. 
Example: 
? (len "abcd")
= 4
| 3.7.0.2. (mid  s n [l]) 
 | [DX] | 
Returns a substring of s composed of l bytes starting at byte 
position n. The position n is a number between 1 
and the byte length of the string minus 1. When argument 
l is omitted, function mid returns all bytes until the end of the 
string s. 
Example: 
? (mid "alphabet" 3 2)
= "ph"
? (mid "alphabet" 3)
= "phabet"
| 3.7.0.3. (right  s n) 
 | [DX] | 
Returns a string composed of the n rightmost bytes of s. 
Example: 
? (right "alphabet" 3)
= "bet"
| 3.7.0.4. (left s n) 
 |  | 
Returns a string composed of the n leftmost bytes of s. 
Example: 
? (left "alphabet" 3)
= "alp"
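The functions left , right and mid combine naturally with concat and 
len . The sketch below assumes the 1-based byte positions used in the 
examples above; the results follow from those descriptions. 
Example: 
? (concat (left "alphabet" 3) (right "alphabet" 5))
= "alphabet"
? (mid "alphabet" 4 (len "bet"))
= "hab"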
| 3.7.0.5. (strins s1 n s2) 
 | [DX] | 
Inserts string s2 at byte position n into the string s1, and 
returns the result. When n is equal to 0, the strins function 
actually concatenates s2 and s1. 
Example: 
? (strins "alphabet" 3 "***")
= "alp***habet"
| 3.7.0.6. (strdel s1 n l) 
 | [DX] | 
Removes l bytes from string s1 starting at byte offset n. 
Example: 
? (strdel "alphabet" 3 2)
= "alabet"
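Since strins and strdel both return new strings, one can be used to undo 
the other. In the sketch below, the text inserted after byte 3 starts at 
byte offset 4, which is the offset passed to strdel . The results are 
those implied by the two descriptions above. 
Example: 
? (strins "alphabet" 0 ">> ")
= ">> alphabet"
? (strdel (strins "alphabet" 3 "***") 4 3)
= "alphabet"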
| 3.7.0.7. (index s r [n]) 
 | [DX] | 
Searches for the first occurrence of the string s in the string r, 
starting at byte position n. 
index returns the position of the first match. If no such 
occurrence can be found, it returns the empty list. 
Example: 
? (index "pha" "alpha alphabet alphabetical" 4)
= 9
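Because index returns the empty list when nothing is found, its result 
can be tested directly before being used as a position. The following 
sketch assumes the behavior described above; the "not found" string is 
only an illustrative fallback. 
Example: 
? (index "pha" "alpha alphabet")
= 3
? (index "xyz" "alpha alphabet")
= ()
? (let ((p (index "pha" "alpha alphabet" 4)))
    (if p (mid "alpha alphabet" p 3) "not found"))
= "pha"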
| 3.7.0.8. (upcase s) 
 |  | 
Returns string s with all characters 
converted to uppercase according to the current locale. 
Example: 
? (upcase "alphabet")
= "ALPHABET"
| 3.7.0.9. (upcase1 s) 
 | [DX] | 
Returns string s with its first character 
converted to uppercase according to the current locale. 
Example: 
? (upcase1 "alphabet")
= "Alphabet"
| 3.7.0.10. (downcase s) 
 | [DX] | 
Returns string s with all characters 
converted to lowercase according to the current locale. 
Example: 
? (downcase "alPHABet")
= "alphabet"
| 3.7.0.11. (val s) 
 |  | 
Returns the numerical value of string s. Returns the empty list if 
s does not represent a decimal or hexadecimal number. 
Example: 
? (val "3.14")
= 3.14
? (val "abcd")
= ()
? (val "0xABCD")
= 43981
| 3.7.0.12. (str n) 
 |  | 
Returns the decimal string representation of the number n. 
Example: 
? (str (2* 3.14))
= "6.28"
| 3.7.0.13. (strhex n) 
 | [DX] | 
Returns the hexadecimal string representation of integer number 
n . 
Example: 
? (strhex 18)
= "0x12"
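The functions val , str and strhex are inverses of each other for simple 
values. A brief sketch, assuming that val accepts the "0x" prefix 
produced by strhex as shown in the examples above: 
Example: 
? (val (strhex 18))
= 18
? (val (str 3.14))
= 3.14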
| 3.7.0.14. (strgptr p) 
 | [DX] | 
Returns the hexadecimal string representation of pointer 
p preceded by an ampersand. 
| 3.7.0.15. (asc s) 
 |  | 
Returns the value of the first byte of string s. This function causes 
an error if s is an empty string. 
Example: 
? (asc "abcd")
= 97
| 3.7.0.16. (chr n) 
 |  | 
Returns a string containing a single byte whose value is 
n. Integer n must be in the range 0 to 255. 
Example: 
? (chr 48)
= "0"
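The functions asc and chr convert between a byte value and a one-byte 
string, so they can be composed. The results below follow from the two 
descriptions above and the ASCII codes of the characters involved. 
Example: 
? (chr (asc "abcd"))
= "a"
? (asc (chr 48))
= 48
? (chr (+ (asc "a") 1))
= "b"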
| 3.7.0.17. (isprint s) 
 | [DX] | 
Returns t if string s contains only printable characters according to 
the current locale. 
Example: 
? (isprint "alpha bet")
= t
? (isprint "alpha\^Cbet")
= ()
| 3.7.0.18. (pname l) 
 |  | 
Returns a string representation for the lisp object l. 
pname is able to give a string representation for numbers, strings, 
symbols, lists, etc. 
Example: 
? (pname (cons 'a '(b c)))
= "(a b c)"
| 3.7.0.19. (sprintf format ... args ... ) 
 | [DX] | 
Like the C language function sprintf, this function returns a string 
built from the format string format. The following escape sequences in 
format are replaced by a representation of the corresponding arguments 
of sprintf: 
-  "%%" is replaced by a single %. 

-  "%l" is replaced by a representation of a lisp object. 

-  "%[-][n]s" is replaced by a string, right justified in a field of 
length n if n is specified. When the optional minus sign is present, 
the string is left justified. 

-  "%[-][n]d" is replaced by an integer, right justified in a field of 
n characters if n is specified. When the optional minus sign is 
present, the integer is left justified. 

-  "%[-][n[.m]]c", where c is one of the characters e, f or g, is 
replaced by a floating point number in a field of n characters, with m 
digits after the decimal point. e specifies a format with an exponent, 
f specifies a format without an exponent, and g uses whichever format 
is more compact. When the optional minus sign is present, the number is 
left justified. 
 
Example: 
? (sprintf "%5s(%3d) is equal to %6.3f\n" "sqrt" 2 (sqrt 2))
= " sqrt(  2) is equal to  1.414\n"
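A few further calls illustrate the remaining escape sequences. This is a 
sketch based on the descriptions above (left justification with the 
minus sign, the literal percent sign, and the lisp object format); the 
results were derived from those descriptions. 
Example: 
? (sprintf "%-6s|" "ab")
= "ab    |"
? (sprintf "%d%%" 50)
= "50%"
? (sprintf "%l" '(a b c))
= "(a b c)"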
| 3.7.0.20. (strip s) 
 | [DE] (sysenv.lsh) | 
This function deletes the leftmost and rightmost spaces in string s. 
Example: 
? (strip "  This sentences is full   of spaces.   ")
| 3.7.0.21. (stripl s) 
 | [DE] (sysenv.lsh) | 
This function deletes the leftmost spaces in string s. 
Example: 
? (stripl "  This sentences is full   of spaces.   ")
| 3.7.0.22. (stripr s) 
 | [DE] (sysenv.lsh) | 
This function deletes the rightmost spaces in string s. 
Example: 
? (stripr "  This sentences is full   of spaces.   ")
| 3.7.1. Regular Expressions (regex) 
 |  | 
A regular expression describes a family of strings built according to 
the same pattern. A regular expression is represented by a string which 
``matches'' (using certain conventions) any string in the family. Lush 
provides four regular expression primitives (regex-match, 
regex-extract, regex-seek, and regex-subst) and several library 
functions. 
The conventions for describing regular expressions in Lush are quite 
similar to those used by the egrep unix utility: 
-  An ordinary character matches itself. Some characters, namely 
( ) [ ] | . ? * and \ , have a special meaning, and should be quoted by 
prepending a backslash \ . The string "\\\\" is actually composed of 
two backslashes (because backslashes in strings must themselves be 
escaped), and thus matches a single backslash. 
 
-  A dot . matches any byte. 
 
-  A caret ^ matches the beginning 
of the string. 
 
-  A dollar sign $ matches the end 
of the string. 
 
-  A range specification matches any specified byte. For example, 
regular expression [YyNn] matches Y, y, N or n; regular expression 
[0-9] matches any digit; regular expression [^0-9] matches any byte 
that is not a digit; and regular expression []A-Za-z] matches a closing 
bracket or any uppercase or lowercase letter. 

-  The concatenation of two regular expressions matches the 
concatenation of two strings that match the respective regular 
expressions. Regular expressions can be grouped with parentheses, and 
modified by the ? , + and * characters. 
 
-  A regular expression followed by a question mark 
? matches 0 or 1 instance of the single regular expression. 
 
-  A regular expression followed by a plus sign 
+ matches 1 or more instances of the single regular 
expression. 
 
-  A regular expression followed by a star * 
matches 0 or more instances of the single regular expression. 
 
-  Finally, two regular expressions separated by a bar | match any 
string matching the first or the second regular expression. 
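The conventions listed above can be tried out with regex-match 
(described below), which tests whether a regular expression matches an 
entire string. The following sketch shows one call per convention; the 
results are those implied by the rules above. 
Example: 
? (regex-match "[YyNn]" "y")
= t
? (regex-match "[0-9]+" "2024")
= t
? (regex-match "[0-9]+" "20x4")
= ()
? (regex-match "ab?c" "ac")
= t
? (regex-match "(ab|cd)*" "abcdab")
= t
? (regex-match "\\\\" "\\")
= t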
 
Parentheses can be used to group regular expressions. For instance, the 
regular expression "(+|-)?[0-9]+(\\.[0-9]*)?" 
matches a signed number with an optional fractional part. Furthermore, 
there is a ``register'' associated with each parenthesized part of a 
regular expression. The matching routines use these registers to keep 
track of the characters matched by the corresponding part of the regular 
expression. This is useful with functions regex-extract and 
regex-subst. 
| 3.7.1.0. (regex-match r s) 
 | [DX] | 
Returns t if regular expression r exactly matches the entire string s. 
Returns the empty list otherwise. 
Example: 
? (regex-match "(+|-)?[0-9]+(\\.[0-9]*)?" "-56")
= t
| 3.7.1.1. (regex-extract r s) 
 | [DX] | 
If regular expression r matches the entire string s, this function 
returns a list of strings representing the contents of each register, 
that is to say the characters matched by each section of the regular 
expression r delimited by parentheses. 
This is useful for extracting specific segments of a string. 
If the regular expression r does not match the string s, function 
regex-extract returns the empty list. If the regular expression r 
matches the string but does not contain parentheses, this function 
returns a list containing the initial string s. 
Example: 
? (regex-extract "(+|-)?([0-9]+)(\\.[0-9]*)?" "-56.23")
= ("-" "56" ".23")
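The three cases described above (no parentheses, parenthesized sections, 
and no match) can be sketched as follows. The regular expressions and 
strings are illustrative; the results are those implied by the 
description. 
Example: 
? (regex-extract "[0-9]+" "1234")
= ("1234")
? (regex-extract "([a-z]+)=([0-9]+)" "x=42")
= ("x" "42")
? (regex-extract "([a-z]+)=([0-9]+)" "x=")
= ()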
| 3.7.1.2. (regex-seek r s [start]) 
 | [DX] | 
Searches for the first substring in s that matches the regular 
expression r, starting at position start in s. If the argument start 
is not provided, string s is searched from the beginning. 
If such a substring is found, regex-seek returns a list 
(begin length), where begin is the index of the first character of the 
substring, and length is the length of the substring. The expression 
(mid s begin length) may be used to extract this substring. 
If no such substring exists, regex-seek returns the empty list. 
Example: 
? (regex-seek "(+|-)?[0-9]+(\\.[0-9]*)?," "a=56.2, b=57,")
= (3 5)
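The result of regex-seek can be passed to mid to extract the matching 
substring. The sketch below reuses the regular expression and the string 
of the example above; the variable names s and m are only illustrative. 
Example: 
? (let* ((s "a=56.2, b=57,")
         (m (regex-seek "(+|-)?[0-9]+(\\.[0-9]*)?," s)))
    (mid s (car m) (cadr m)))
= "56.2,"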
| 3.7.1.3. (regex-subst r s str) 
 | [DX] | 
Replaces all substrings matching regular expression r in string str by 
string s. 
A ``register'' is associated with each piece of the regular expression 
r enclosed within parentheses. Registers are numbered from 
%0 to %9. During each match, the substring of str matching 
each piece of the regular expression is stored into the corresponding 
register. 
During the replacement process, the sequences %0 to %9 in the 
replacement string s are replaced by the contents of the corresponding 
register. (A single % is denoted as %%.) 
Example: 
? (regex-subst "([a-h])([1-8])" "%1%0" "e2-e4, d7-d5, d2-d4, d5xd4?")
= "2e-4e, 7d-5d, 2d-4d, 5dx4d?"
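Two smaller calls may make the register mechanism and the %% escape 
easier to follow. This is a sketch with illustrative strings; the 
results are those implied by the description above. 
Example: 
? (regex-subst "([a-z])([0-9])" "%1%0" "a1 b2")
= "1a 2b"
? (regex-subst "percent" "%%" "50 percent")
= "50 %"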
| 3.7.1.4. (regex-rseek r s [n [gr]]) 
 | [DE] (sysenv.lsh) | 
This function recursively seeks the occurrences of r in s, starting 
with the first one, and returns the list of their locations. 
When argument n is provided, it seeks and returns the locations of the 
first n occurrences, and it returns () on failure. 
Optional regex gr defines the allowed garbage, that is, the text found 
before and between occurrences. When n is not provided, this function 
checks the garbage after the occurrences too. If garbage that is not 
allowed is found, the function returns (). By default, any garbage is 
allowed. 
Since even empty garbage is checked, a caret "^" is often added to gr. 
| 3.7.1.5. (regex-split  r s [n [gr [neg]]]) 
 | [DE] (sysenv.lsh) | 
This function splits a string s into occurrences of r. 
When integer n is provided, this function provides only the first n 
occurrences. 
When regex gr is provided, garbage is checked (see function 
regex-rseek). 
When neg is provided and non nil, this function returns the garbage 
instead. When both n and neg are provided and non nil, the n pieces of 
garbage found before and between the first n occurrences are returned. 
| 3.7.1.6. (regex-skip  r s [n [gr [neg]]]) 
 | [DE] (sysenv.lsh) | 
This function skips the first n occurrences of regex r in a string s. 
When n is equal to 0, it returns s. When n is lower than 0, 
it generates an error. When n is either nil or undefined, it is set to 1. 
When neg is either nil or undefined, it returns the right residual of s 
just following the n-th occurrence. 
When neg is not nil, it returns the right residual of s beginning with 
the n-th occurrence. 
When regex gr is provided, garbage is checked (see function 
regex-rseek). 
| 3.7.1.7. (regex-count r s) 
 | [DE] (sysenv.lsh) | 
This function recursively seeks the occurrences of regex r in string s 
and returns the number of occurrences found. 
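As a small sketch, counting single-byte patterns gives results that are 
easy to check by hand. The results below are those implied by the 
description above and were not captured from a particular Lush build. 
Example: 
? (regex-count "a" "banana")
= 3
? (regex-count "[0-9]" "a=56, b=57")
= 4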
| 3.7.1.8. (regex-tail r s [n [gr [neg]]]) 
 | [DE] (sysenv.lsh) | 
This function recursively seeks the occurrences of regex r in string s. 
When neg is either nil or undefined, it returns the right residual of s 
beginning before the n-th last occurrence. 
When neg is non nil, it returns the right residual of s beginning after 
the n-th last occurrence (and thus beginning before the n-th garbage). 
When n is either nil or undefined, it is set to 1. 
When regex gr is provided, garbage is checked (see function 
regex-rseek). 
| 3.7.1.9. (regex-member rl s) 
 | [DE] (sysenv.lsh) | 
This function returns the first member of list rl that is a regex 
matching string s. 
| 3.7.2. International Strings 
 |  | 
Lush contains partial support for multibyte strings using an encoding 
specified by the locale. This is work in progress. 
| 3.7.2.0. (locale-to-utf8 s) 
 | [DX] | 
Converts a string from locale encoding to UTF-8 encoding. This is a best 
effort function: The unmodified string is returned if the conversion is 
impossible, either because the string s 
is incorrect, or because the system does not provide suitable conversion 
facilities. 
| 3.7.2.1. (utf8-to-locale s) 
 |  | 
Converts a string from UTF-8 encoding to locale encoding. This is a best 
effort function: The unmodified string is returned if the conversion is 
impossible, either because the string s 
is incorrect, or because the system does not provide suitable conversion 
facilities. 
| 3.7.2.2. (explode-chars s) 
 | [DX] | 
Returns a list of integers with the wide character codes of all 
characters in the string. This function interprets multibyte sequences 
according to the encoding specified by the current locale. 
Example (under a UTF8 locale): 
? (explode-chars "\xe2\x82\xac")
= (8364)
| 3.7.2.3. (implode-chars l) 
 | [DX] | 
Returns a string composed of the characters whose wide character codes 
are specified by the list of integers l. Multibyte characters are 
generated according to the current locale. 
Example (under a UTF8 locale): 
? (implode-chars '(8364 50 51 46 53 32 61 32 32 162 50 51 53 48))
= "€23.5 =  ¢2350"
| 3.7.2.4. (explode-bytes s) 
 | [DX] | 
Returns a list of integers representing the sequence of bytes in string 
s, regardless of their character interpretation. 
Example: 
? (explode-bytes "€")
= (226 130 172)
| 3.7.2.5. (implode-bytes l) 
 | [DX] | 
Assembles a string composed of the bytes whose values are specified by 
the list of integers l, regardless of their multibyte representation. 
Example: 
? (implode-bytes '(226 130 172 50 51))
= "€23"
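The byte-oriented and character-oriented views of a string can be 
compared directly. In the sketch below, the last call assumes a UTF8 
locale, as in the explode-chars example above; the other results do not 
depend on the locale. 
Example: 
? (explode-bytes "abc")
= (97 98 99)
? (implode-bytes (explode-bytes "abc"))
= "abc"
? (len "\xe2\x82\xac")
= 3
? (length (explode-chars "\xe2\x82\xac"))
= 1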