Mathematical model of Rebol series - r3n/rebol-wiki GitHub Wiki
This article:
- defines a mathematical model of Rebol series
- discusses the current state of the interpreter
- proposes changes
Definition: we say that a mathematical function S is a Rebol series if:
S datatype
is one of the series! datatypes
S displacement
is an integer value
S payload
is a payload.
Series! datatypes are: string!, binary!, block!, ...
Definition: we say that mathematical function D is a payload if for every executionTime t
D t size
is a nonnegative integer value N and for every integer value i such that 0 <= i < N
D t i
is a Rebol value.
Note that N = 0 is allowed, in which case the payload size is 0 and the payload is empty. Note also that payloads depend on execution time, i.e., they can change in time.
The goal of this section is to define some simple functions to establish basic terminology necessary to discuss the current state of the interpreter.
Convention: from now on, unless otherwise stated, symbol t will denote the executionTime value of the time when the evaluation of the expression started.
The at-head? function is a function accepting a Rebol series S and returning a logic value determining whether the series is at its head.
(at-head? S) = #[true]
, if (S displacement) = 0
(at-head? S) = #[false]
, if (S displacement) <> 0
Tests suggest that this model matches the head? function defined in the rebol/version == 2.7.8.3.1 interpreter. The at-head? name has been chosen just for compatibility with the function naming convention in the subsequent cases.
The head function is a function accepting a Rebol series S and returning its head series T.
T is a Rebol series having the same datatype as S, displacement 0 and the same payload as S, i.e.,
(T datatype) = (S datatype) (T displacement) = 0 (T payload) = (S payload)
Tests suggest that this model of the head function is implemented in, e.g., the rebol/version == 2.7.8.3.1 interpreter.
The pre-head? function is a function accepting a Rebol series S and returning a logic value determining whether the series is at a pre-head position.
(pre-head? S) = #[true]
, if S displacement is negative
(pre-head? S) = #[false]
, if S displacement is nonnegative
This function is not available in the interpreter, even though the rebol/version == 2.7.8.3.1 interpreter can construct such series, as this example demonstrates:
index? #[block! [a] -2] ; == -2
This function can be defined as a mezzanine in the rebol/version == 2.7.8.3.1 interpreter; for example:
pre-head?: func [
"Returns TRUE if SERIES is at a pre-head position"
series [series!]
] [
0 >= index? :series
]
The at-tail? function is a function accepting a Rebol series S and returning a logic value determining whether the series is at its tail.
(at-tail? S) = #[true]
, if (S displacement) = (do S payload t size)
(at-tail? S) = #[false]
, if (S displacement) <> (do S payload t size)
This function does not exist in the current interpreter. It can be defined as a mezzanine:
at-tail?: func [
"Returns TRUE if SERIES is exactly at its tail position."
series [series!]
] [
same? :series tail :series
]
The tail function is a function accepting a Rebol series S and returning its tail series T.
T is a Rebol series with the same datatype as S, a displacement equal to the current payload size and the same payload as S, i.e.,
(T datatype) = (S datatype) (T displacement) = (do S payload t size) (T payload) = (S payload)
Tests suggest that this model is implemented in the rebol/version == 2.7.8.3.1 interpreter as the tail function.
The past-tail? function is a function accepting a Rebol series S and returning a logic value determining whether the series is at a past-tail position.
(past-tail? S) = #[true]
, if (S displacement) > (do S payload t size)
(past-tail? S) = #[false]
, if (S displacement) <= (do S payload t size)
This function does not exist in the current interpreter, while the interpreter can construct such series as this example demonstrates:
s: tail [a] clear head s same? s tail s ; == false
This function can be defined as a mezzanine:
past-tail?: func [
"Returns TRUE if SERIES is at a past-tail position"
series [series!]
] [
and~ tail? :series not same? :series tail :series
]
The in-bounds? function is a function accepting a Rebol series S and returning a logic value determinig whether the position of the series is in bounds, i.e., neither pre-head nor past-tail.
(in-bounds? S) = #[true]
, if ((S displacement) >= 0) and ((S displacement) <= (do S payload t size))
(in-bounds? S) = #[false]
otherwise.
This function is not defined in the current interpreter, but it can be defined as a mezzanine:
in-bounds?: func [
"Returns TRUE if SERIES is neither pre-head nor past-tail."
series [series!]
] [
not any [pre-head? :series past-tail? :series]
]
The constrain function is a function accepting a Rebol series S and returning a Rebol series T with a position constrained to range bounded by series head and tail positions.
T = head S
, if pre-head? S
T = S
, unless (pre-head? S) or (past-tail? S)
T = tail S
otherwise.
This function can be defined as a mezzanine:
constrain: func [
"Returns SERIES constrained to the range bounded by the SERIES head and tail positions"
series [series!]
] [
case [
pre-head? :series [head :series]
past-tail? :series [tail :series]
'else [:series]
]
]
The length-of function is a function accepting a Rebol series S and returning a nonnegative integer determining the length of the series.
(length-of S) = (max 0 (do S payload t size) - (max 0 S displacement))
The length? function in the rebol/version == 2.7.8.3.1 interpreter does not work like that as this test demonstrates:
length? #[block! [] -2] ; == 3
, while the length-of function would yield 0.
Definition: Pre-head series are series displaced backwards from their head, i.e., series having a negative displacement.
In the rebol/version == 2.7.8.3.1 interpreter these inconsistencies can be found:
- When discussed (if discussed at all), pre-head series are called "Illegal series". That is inconsistent. If it is a bug for such a series to exist, a consistent interpreter shall not produce it.
- Pre-head series are treated obscurely, as "Series that must not be mentioned in the documentation".
- Simple functions and reflectors (like the ones defined above) for handling and detecting pre-head series are missing in the interpreter.
- The
moldfunction is not able to handle pre-head series triggering an error (whilemold/allworks without a quirk). This is a serious problem - themoldfunction is used to print error reports and must not trigger an error when printing an error report unless we want the interpreter to crash. - The
pickfunction acknowledges pre-head positions allowing picking at such positions without triggering an error. - The
back,skipandatfunctions refuse to go to pre-head positions, ignoring that such positions exist and are acknowledged bypick.
There are two different methods trying to handle the inconsistencies
The rebol/version == 2.100.111.3.1 interpreter has been amended so that it does not produce pre-head series now. However, this leads to
s: load "#[block! [a] -3]" mold/all s ; == "#[block! [a] 2]"
- The inconsistency is that if it is "illegal" for the interpreter to create the
#[block! [a] -3]block, why does theloadfunction accept the source string as "legal" and produce a result that is incompatible with the specification instead of triggering an error? - As discussed below, this provision is not usable for past-tail series meaning that for past-tail series a different (incompatible and therefore inconsistent with this) provision should take place.
- As noted above this provision is incompatible with the behaviour of the
pickfunction. It would be possible to change the behaviour of the function, but there is a broad agreement between Rebol users that the current behaviour of the function when picking at pre-head positions is convenient.
This provision is less comfortable than the alternative proposed below.
Example: Let's assume that a user wants to use a series of values [d] indexed by numbers in the [3..4] range. To be able to do it using the rebol/version == 2.100.111.3.1 interpreter the user needs to define:
my-series: [#[none] #[none] d e]
Now:
pick my-series 3 ; == d pick my-series 4 ; == e
In rebol/version == 2.7.8.3.1 the following works:
index? my-series: #[block! [d e] -1] ; == -1 pick my-series 3 ; == d pick my-series 4 ; == e
Comparing the two alternatives we see that in rebol/version == 2.7.8.3.1 we did not have to insert the extraneous #[none] values to the block to achieve the same effect, i.e., the rebol/version == 2.7.8.3.1 solution is faster and more memory efficient.
In addition to that, the length? function yields:
length? my-series ; == 4
, which does not look desirable. The length-of function defined above would yield a more natural result 2 if used for the rebol/version == 2.7.8.3.1 case.
This includes:
- Mention pre-head series in the documentation (a definition suffices, in my opinion).
- Stop calling pre-head series "Illegal series" (this is the "easiest to implement" part of the proposal).
- Define the simple functions described above (quite easy as well).
- Amend the
moldfunction to not trigger an error. An expression likemold #[block! [a] -2]can yield "[a]". - Amend the
length?function to work like thelength-offunction above. - Define new
back-anyfunction going to the previous position even when it is a pre-head position. - Define new
skip-anyfunction going to the specified position even if it is a pre-head position.
- consistence
- comfortable indexing (see below)
Definition: Past-tail series are series having displacement greater than the payload size.
Example: In rebol/version == 2.7.8.3.1 interpreter the help string of the tail? function states:
"Returns TRUE if a series is at its tail."
However, the the actual behaviour is:
s: skip [a b c d] 3 clear skip s -2 same? s tail s ; == false tail? s ; == true
, which, in my opinion, contradicts the help string. In the rebol/version == 2.100.111.3.1 interpreter the help string of the tail? function has been corrected, but the 'tail' and 'tail?' function names alone (when not reading their help strings) still look misleading. Also, we obtain:
index? s ; == 2
, while
mold/all s ; == "#[block![a]4]"
, which shows the correct result.
Example:
mold/all load "#[block! [a] 4]" ; == "#[block![a]2]"
- The inconsistency is that if it is "illegal" for the interpreter to create the
#[block! [a] 4]block, why does theloadfunction accept the string as "legal" and produce a result that is incompatible with the specification instead of triggering an error? This is strange knowing that the interpreter does (and cannot stop, in fact) produce the#[block! [a] 4]series as demonstrated above. - In
rebol/version == 2.7.8.3.1interpreter theindex?function pretends that the past-tail positions it encounters aren't past-tail. This just impairs the reflexivity of the language, because the user can obtain correct informations using other means as demonstrated. - The
tail?function yields#[true]for past-tail, which looks misleading even if the help string of the function is amended (some users may not read help strings). - Past-tail series are treated as "Series that must not be mentioned in the documentation".
- When discussed (if discussed at all), past-tail series are called "Illegal series". That is inconsistent. If it is a bug for such a value to exist, a consistent interpreter shall not produce it.
- Simple functions and reflectors defined above designed for handling and detecting past-tail series are missing in the interpreter.
- The
moldfunction is not able to handle past-tail series triggering an error (mold/allworks without a quirk). This is a serious problem - themoldfunction is used to print error reports and must not trigger an error when printing an error report unless we want the interpreter to crash. - The
pickfunction acknowledges past-tail positions allowing picking at such positions without triggering an error. - The
mext,skipandatfunctions refuse to go to past-tail positions, ignoring that such positions exist and are acknowledged bypick.
Just the incorrect behaviour of the index? function has been corrected in rebol/version == 2.100.111.3.1 interpreter. Due to the fact that series payloads can be cleared, any series with positive displacement can become past-tail. The interpreter cannot detect which series become past-tail when a payload is cleared, because that would make the remove and clear operations slower and more complicated than acceptable. Thus, it is not possible to amend the interpreter to not produce past-tail series at all. Therefore, the only serious alternative is to handle past-tail series consistently, which includes:
- Mention past-tail series in the documentation (a definition suffices, in my opinion).
- Stop calling past-tail series "Illegal series" (this is the easiest part of the proposal).
- Define the simple functions described above (quite easy as well).
- Amend the
loadfunction to yield the specified result when handling the "#[block!]" source string. - Amend the
moldfunction to not trigger an error when handling past-tail series. An expression likemold #[block! [a] 4]can yield "[a]". - Amend the
length?function to work like thelength-offunction above. - Define new
next-anyfunction going to the next position even when it is a past-tail position. - Define the
skip-anyfunction so that it goes to the specified position even if it is a past-tail position.
- consistence
- comfortable indexing (see below)
Before discussing indexing, let's take a "philosophical detour"
Some opponents of 0 state: "0 does not even exist, it should not be allowed as an index, then"
It is fair to admit that the question whether 0 "exists" may indeed be resolved by stating that it actually does not "exist". However, that does not mean the opponents of 0 could start to celebrate. The situation is the same as with the number 1, e.g. Does 1 "exist"? Not "much", we cannot find it in the real world any easier than we can find 0. Note that many Greek mathematicians did not consider 1 to be "a number", so to them 2 was "the smallest number" (they used the term "multitude" and it is easy to see why they considered inappropriate to call 1 "multitude").
It suffices to say that 1 is not something that "exists" in the common sense of the word, it is just an abstract notion invented by clever people to point to the common property of a hand containing one apple, a letter consisting of one page, a room in which one man is sitting, a net containing just one fish, a purse containing just one coin, etc.
In a similar sense 0 is an abstract notion pointing to the common property of a hand containing no apple, an empty envelope, a room in which nobody is sitting, a net containing no fish, an empty purse, etc.
Unsurprisingly, the answer is "yes".
Example:
block: [a b] pick block 1 ; == a pick block 2 ; == b
Perhaps surprisingly, the answer is "no".
Example:
index? block: #[block! [a b] 0] ; == 0 pick block 2 ; == a pick block 3 ; == b
As we use indices from the 2 <= index < 4 range to pick the values of the block, we are actually using 2-based indexing.
rebol/version == 2.100.111.3.1 actually does.
Example:
block: #[block! [a b] 2] pick block 0 ; == a pick block 1 ; == b
, i.e., we are using indices in the 0 <= index < 2 range to pick the values of the block, which means that we are using 0-based indexing.
However, rebol/version == 2.7.8.3.1 does not have 0-based indexing:
block: #[block! [a b] 2] pick block -1 ; == a pick block 1 ; == b
, i.e., in this case, the "index space" consists of two ranges, in fact. The first is the range containing just -1 (, i.e., -1-based), while the second is the range containing just 1 (, i.e., 1-based). This also demonstrates that rebol/version == 2.7.8.3.1 does not have true -1-based, or negative-based indexing, because the pick function uses a discontinuous range made of two arithmetically incompatible (using different index arithmetic) segments.
I think that it is good to discuss why this happened to rebol/version == 2.7.8.3.1. The main reason seems to be the "1 pointing at the current position" indexing. This, when using the "common indexing principle" assigning the previous position the previous integer number induces the "0 pointing backwards" case. "0 pointing backwards" looks incomfortable/unusual as some users object. However, the approach using smell of garlic to take away the smell of onion (removing 0 from the set of possible index values) did not bring any good. Hurting the index arithmetic this way is not just incomfortable as the "0 pointing backwards" case was, it is inconsistent. (Even the Rebol designer felt victim to the problem in his code.)
- In
rebol/version == 2.7.8.3.1theindex?function is not injective (one to one) function, because it assigns the tail index to all past-tail positions. This contradicts the principle of indexing that a one to one correspondence between indices and positions shall be established. Inrebol/version == 2.100.111.3.1this bug has been corrected. - The
atfunction is not injective (one to one) because it assigns the same series position to two different indices 0 and 1. Note that this is, in fact, a different property than the previous one. A mathematical object having both properties would not even be representable as a function. This bug is detectable in bothrebol/version == 2.7.8.3.1andrebol/version == 2.100.111.3.1. - In
rebol/version == 2.7.8.3.1thepickfunction subdivides its range to two arithmetically incompatible segments. This has been corrected inrebol/version == 2.100.111.3.1. - In
rebol/version == 2.100.111.3.1the index 0 "points backwards" when thepickfunction is used. This is not an inconsistency, but it contradicts the expectations of users unaccustomed to 0 "pointing backwards" meaning that this property is inconvenient for such users.
Task:
Define a head-index? function obtaining a series s and an index value i yielding an index value j such that the pick s i expression will be equivalent to the pick head s j expression.
Solutions:
This is the solution working when the index? function and the pick function both use indexing method compatible with the skip function indexing method:
head-index?: func [s [series!] i [integer!]] [i + index? s]
This is the solution working when the index? function and the pick function use the indexing methods from R3:
head-index?: func [s [series!] i [integer!]] [i - 1 + index? s]
This is the solution working in the present state of R2:
head-index?: func [s [series!] i [integer!]] [case [i < 0 [i - index? s] i > 0 [i - 1 + index? s] i = 0 [0]]]
Summary:
- The "skip indexing" uses the simplest index arithmetic, allowing the solution to be the simplest one.
- The index arithmetic used in
rebol/version == 2.100.111.3.1is simple as well, but not as simple as the optimal indexing. - The non-arithmetic indexing used in
rebol/version == 2.7.8.3.1does not support any simple solution and it provokes errors even when experienced users are writing code.
The curr function is a function accepting a Rebol series S and returning the Rebol value at its current position.
(curr S) = (do S payload t S displacement)
, if in-bounds? S
(curr S) = #[none]
otherwise.