

Literals

In KSM-Scheme, literals represent characters, strings, numbers, booleans, and other objects.

Characters

Characters are written using the notation #\<character> or #\<character name>. For example,

#\a             ; lower case letter
#\A             ; upper case letter
#\(             ; left parenthesis
#\              ; the space character
#\space         ; the preferred way to write a space
#\newline       ; the newline character

Case is significant in #\<character>, but not in #\<character name>. If <character> in #\<character> is alphabetic, then the character following <character> must be a delimiter character such as a space or parenthesis. This rule resolves the ambiguity where, for example, the sequence of characters #\space could be taken to be either a representation of the space character or a representation of the character #\s followed by a representation of the identifier pace.

In KSM-Scheme, characters can also be written in the form #\U{XXXX}, which represents the character whose Unicode code value is XXXX (hexadecimal).

#\U{61}       ; small letter a
#\U{D}        ; carriage return
#\U{3042}     ; hiragana letter a
#\U{feff}     ; zero width no-break space
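
The character notations above can be checked with a short sketch. It uses only the standard R5RS procedures char=?, char->integer, and integer->char, and assumes that KSM-Scheme provides them and that char->integer returns the Unicode code value (this section does not state either point). The variable names are hypothetical.

```scheme
;; Assumption: char->integer returns the Unicode code value of a character.
(define same-space?    (char=? #\space #\ ))       ; both notations name the space character
(define case-matters?  (not (char=? #\a #\A)))     ; case is significant in #\<character>
(define code-of-a      (char->integer #\a))        ; 97, that is, hexadecimal 61
(define char-from-code (integer->char 97))         ; the character #\a
```

Under these assumptions, same-space? and case-matters? are both true, and char-from-code is the same character as #\a.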

Strings

Strings are sequences of characters. Strings are written as sequences of characters enclosed within doublequotes ("). A doublequote can be written inside a string only by escaping it with a backslash (\), as in

"The word \"recursion\" has many meanings."

A backslash can be written inside a string only by escaping it with another backslash. KSM-Scheme also provides additional escape sequences, as follows.

"\""      ==> "  (a string composed of a doublequote character)
"\\"      ==> \
"\n"      ==> newline
"\r"      ==> carriage return
"\t"      ==> tab
"\U{XXXX}" ==> character corresponding to Unicode 
              code value XXXX (hexadecimal)

When a backslash is followed by a character that is not listed above, the backslash loses its special meaning and represents a backslash character by itself.

"\b"  ==> a string composed of two characters ('\' and 'b').

In KSM-Scheme, a string constant may include a newline character. In other words, a string constant may continue from one line to the next.

"line 1
line 2"    ==> a string including a newline 

The length of a string is the number of characters that it contains. This number is an exact, non-negative integer that is fixed when the string is created. The valid indexes of a string are the exact non-negative integers less than the length of the string. The first character of a string has index 0, the second has index 1, and so on.
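
The following sketch illustrates lengths and indexes with the standard R5RS procedures string-length and string-ref, which are assumed to be available in KSM-Scheme. It also shows that an escape sequence counts as a single character. The variable names are hypothetical, and only the two escapes shared with standard Scheme are used so that the sketch is not tied to one implementation.

```scheme
(define s "Scheme")
(define len        (string-length s))              ; 6
(define first-char (string-ref s 0))               ; #\S, index 0
(define last-char  (string-ref s (- len 1)))       ; #\e, the largest valid index is len - 1

;; Escape sequences are counted after they have been interpreted.
(define quote-len     (string-length "\""))        ; 1: a single doublequote
(define backslash-len (string-length "\\"))        ; 1: a single backslash
(define mixed-len     (string-length "a\\b"))      ; 3: a, backslash, b
```

Calling (string-ref s len) would be an error, because len itself is not a valid index.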

In KSM-Scheme, a string is encoded in UTF-8 (Unicode Transformation Format, 8-bit form). This means that if all the characters in a string are ASCII characters (that is, #\U{00} through #\U{7F}), the length of the string is equal to the byte size of the string. However, if the string includes a character outside this range, the byte size of the string is larger than its length.
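
The relation between length and byte size can be sketched as follows. The procedures utf8-bytes-for-code and utf8-byte-size are hypothetical helpers written for this illustration, not part of KSM-Scheme. The sketch assumes that char->integer returns the Unicode code value of a character; the byte counts per range follow the UTF-8 definition.

```scheme
;; Number of bytes UTF-8 uses to encode one Unicode code value.
(define (utf8-bytes-for-code code)
  (cond ((< code #x80) 1)        ; ASCII range: one byte
        ((< code #x800) 2)
        ((< code #x10000) 3)
        (else 4)))

;; Sum the per-character byte counts over a whole string.
;; Assumption: char->integer yields the Unicode code value.
(define (utf8-byte-size str)
  (let loop ((i 0) (total 0))
    (if (= i (string-length str))
        total
        (loop (+ i 1)
              (+ total (utf8-bytes-for-code
                        (char->integer (string-ref str i))))))))
```

For an ASCII string such as "abc", utf8-byte-size returns 3, the same as its length. For code value 3042 (hexadecimal), utf8-bytes-for-code returns 3, so a string containing that character has a byte size larger than its length.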

In phrases such as "the characters of string beginning with index start and ending with index end," it is understood that the index start is inclusive and the index end is exclusive. Thus if start and end are the same index, a null substring is referred to, and if start is zero and end is the length of string, then the entire string is referred to.
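
The inclusive start and exclusive end can be seen with the standard R5RS procedure substring, assumed here to be available in KSM-Scheme. The variable names are hypothetical.

```scheme
(define s "hello")
(define middle (substring s 1 3))                  ; "el": indexes 1 and 2, but not 3
(define none   (substring s 2 2))                  ; "": start and end are the same index
(define whole  (substring s 0 (string-length s)))  ; "hello": the entire string
```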

Numbers

A number may be written in binary, octal, decimal, or hexadecimal by the use of a radix prefix. The radix prefixes are #b (binary), #o (octal), #d (decimal), and #x (hexadecimal). With no radix prefix, a number is assumed to be expressed in decimal.

#b1011    ==> 11   (decimal)
#o1011    ==> 521  (decimal)
#d1011    ==> 1011 (decimal)
#x1011    ==> 4113 (decimal)
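
The values above can be reproduced with the standard R5RS procedure string->number, which takes the radix as an optional second argument; its availability in KSM-Scheme is an assumption. The last definition expands the octal case by hand. The variable names are hypothetical.

```scheme
(define from-binary (string->number "1011" 2))     ; 11
(define from-octal  (string->number "1011" 8))     ; 521
(define from-hex    (string->number "1011" 16))    ; 4113

;; Positional expansion of octal 1011: 1*8^3 + 0*8^2 + 1*8 + 1
(define octal-by-hand (+ (* 1 8 8 8) (* 0 8 8) (* 1 8) 1))  ; 521
```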

A numerical constant may be specified to be either exact (that is, integer) or inexact (that is, floating point number) by a prefix. The prefixes are #e for exact, and #i for inexact. An exactness prefix may appear before or after any radix prefix that is used. If the written representation of a number has no exactness prefix, the constant may be either inexact or exact. It is inexact if it contains a decimal point or an exponent, and exact otherwise.

12345     ==> integer (exact)
123.45    ==> floating point number (inexact)
123.4e5   ==> floating point number
#eb1011   ==> 11 (decimal, exact)
#ib1011   ==> 11.0 (decimal, inexact)
#bi1011   ==> 11.0 (same as above)
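
Exactness can be tested and converted at run time with the standard R5RS procedures exact?, inexact?, exact->inexact, and inexact->exact. The sketch below assumes KSM-Scheme provides them; the variable names are hypothetical.

```scheme
(define int-is-exact     (exact? 12345))           ; #t: no decimal point or exponent
(define float-is-inexact (inexact? 123.45))        ; #t: contains a decimal point
(define forced-inexact   (exact->inexact 11))      ; 11.0, comparable to writing #i11
(define forced-exact     (inexact->exact 11.0))    ; 11, an exact integer
```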

A rational number is written in the form <integer>/<positive integer>. No space is allowed between <integer>, '/', and <positive integer>. Radix prefixes cannot be used when writing <integer>. Rational numbers are supported only if the GMP library is available (that is, "libgmp.so" and "gmp.h" exist).

1/2       ==> one half (rational)
2/4       ==> 1/2 (same as above)
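
The reduction to lowest terms can be observed with the standard R5RS procedures numerator and denominator. This is a sketch only: it assumes a KSM-Scheme built with rational support and assumes these procedures are provided. The variable names are hypothetical.

```scheme
(define half  1/2)
(define same? (= half 2/4))                        ; #t: 2/4 denotes the same number as 1/2
(define n     (numerator 2/4))                     ; 1: computed in lowest terms
(define d     (denominator 2/4))                   ; 2
```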

A complex number is written in the form <number>+<number>i, that is, a number followed immediately by '+', another number, and 'i'. The first number specifies the real part and the second specifies the imaginary part. Radix prefixes cannot be used when writing <number>.

1+2i
0+-2i       ==> real part is 0; imaginary part is -2
1.2+3.4i
-2.3+4.5i   ==> real part is -2.3; imaginary part is 4.5
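
The two parts of a complex constant can be extracted with the procedures real-part and imag-part, which R5RS defines for implementations that support complex numbers. Whether KSM-Scheme provides them under these names is an assumption; the variable names are hypothetical.

```scheme
(define z  1+2i)
(define re (real-part z))                          ; the real part, numerically equal to 1
(define im (imag-part z))                          ; the imaginary part, numerically equal to 2
```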

