Indexing - r3n/rebol-wiki GitHub Wiki
This article is meant as an illustration of the principles behind indexing. Unless stated otherwise, the examples were tested using:
rebol/version ; == 2.7.8.3.1
To start the discussion in a "philosophical" way, let's discuss one of the "philosophical" themes first.
Some opponents of 0 state: "0 does not even exist, it should not be allowed as an index, then"
It is fair to admit that the question whether 0 "exists" may indeed be resolved by stating that it actually does not "exist". However, that does not mean the opponents of 0 could start to celebrate. The situation is the same as with the number 1, e.g. Does 1 "exist"? Not "much", we cannot find it in the real world any easier than we can find 0. Note that many Greek mathematicians did not consider 1 to be "a number", so to them 2 was "the smallest number" (they used the term "multitude" and it is easy to see why they considered inappropriate to call 1 "multitude").
It suffices to say that 1 is not something that "exists" in the common sense of the word, it is just an abstract notion invented by clever people to point to the common property of a hand containing one apple, a letter consisting of one page, a room in which one man is sitting, a net containing just one fish, a purse containing just one coin, etc.
In a similar sense 0 is as an abstract notion pointing to the common property of a hand containing no apple, an empty envelope, a room in which nobody is sitting, a net containing no fish, an empty purse, etc.
For the purpose of referring to positions (elements) in a series like Rebol blocks, strings, etc. we need an index value.
The index value should be an integer. This statement may look uncontroversial, but it may be useful to mention the reasons why:
An element in a series may have a next element, its successor in the series. The successor element is such an element that there is no element between the element and its successor in the series.
In integer arithmetic there is also a successor operation. The successor of an integer is the integer increased by one.
The correspondence between the successor operations in the series and in the range of integers (the "successor to successor correspondence") is the reason why integers are used for indexing.
When the successor to successor correspondence is maintained, other operations are in correspondence as well, for example:
- skipping two elements in a series corresponds to adding 2 to the index,
- skipping one element in a series corresponds to adding 1 to the index,
- skipping no element in a series (remaining at the same place, in fact) corresponds to adding 0 to the index
- skipping one element backwards corresponds to subtracting 1 from the index (or adding -1 to the index),
- skipping two elements backwards corresponds to subtracting 2 from the index (or adding -2 to the index)
If the successor to successor correspondence is kept, knowing the index of one element in a series determines the indices of all other elements. Usually, the element with the smallest index value is called the base element. If the index of the "base element" is x, the index is called x-based.
The next function is not an "indexing operator" not doing any indexing for us (not defining any correspondence between indices and positions in a series). However, as discussed above, the "successor to successor correspondence" is crucial for indexing. Therefore, it is wise to examine whether the NEXT function can serve as a successor operator.
The help string of the next function says:
"Returns the series at its next position."
Let's test how it works:
s: [a] index? s ; == 1 mold/all s ; == [a] next-s: next s index? next-s ; == 2 mold/all next-s ; == #[block! [a] 2] same? next-s s ; == false
In this case we took series s and used the next function to obtain its successor next-s. We obtained a new, distinct, "successor" series, as all our examinations proved.
Also, there is one more property worth mentioning. The next-s series is a tail, as the following test reveals:
same? next-s tail next-s ; == true
However, if we continue using the next function on next-s:
next-next-s: next next-s
we get:
index? next-next-s ; == 2 mold/all next-next-s ; == #[block! [a] 2] same? next-next-s next-s ; == true
This illustrates that when supplying the tail series, the next function does not yield its successor, instead it just yields the series tail.
To determine whether the next function is the successor operator we need to find out whether it could yield the successor of the next-s series. If yes, then the next function isn't actually the successor operator, and its help string is misleading.
It turns out that our concerns are justified. To prove it, let's define:
succ: func [
"Returns the series at its next position."
series [series!]
/local t
] [
t: tail :series
insert/dup t #"^(00)" 1 + (index? :series) - (index? :t)
series: next :series
clear :t
:series
]
Now our testing reveals:
succ-next-s: succ next-s mold/all succ-next-s ; == "#[block![a]3]" same? succ-next-s next-s ; == false
, i.e., we really obtained the successive position confirming our concerns.
Our new succ function is actually our proposal how the new next function implementation should behave. Let's list some of the advantages of the new behaviour.
The help string of the succ function isn't misleading. This is not a big advantage, however. For the original next function implementation to not have a misleading help string it would suffice to adjust the help string to "Returns the series at its next position, stops at tail."
This is a bigger advantage. To demonstrate it, let's see how the current next function behaves:
same? s: tail [a] back next s ; == false
, while:
same? s: tail [a] back succ s ; == true
Together with the improved back function it generates the new, improved skip function, which definines new (essentially "any base") indexing. (see the discussion in the SKIP section)
Defining the new curr function and pick function different index access methods become compatible (finally).
The back function help string states:
"Returns the series at its previous position."
Similarly as with the next function we may want to determine whether the help string is misleading or not. We can find out that the back function does not yield the preceding position in case it obtains a head series.
Now we may want to find out whether it is actually possible to obtain a position preceding the head. Like in the case of the next function, such series exist in Rebol:
index? #[block! [1 2] -2] ; == -2
Therefore, it makes sense to redefine the back function as proposed for the next function, i.e., so that the function yields a predecessor of the given series without stopping at head.
Such a function already exists in R2, it is called first, but there are several issues with it:
- the function name assumes that the current position is (relatively to itself) indexed by 1. If the successor to successor correspondence is kept, index 0 (relative to the current position) would need to be assigned to the position immediately preceding the current position. Opponents of this (relative) indexing method correctly object that it is counterintuitive and uncomfortable for 0 to "point one step backwards from the current position". The simple indexing arithmetic example below demonstrates that such choice makes index arithmetic more complicated than necessary. Last but not least, the function name doesn't need to assume any (relative) index value.
- the behaviour of the
firstfunction was found uncomfortable when compared to the behaviour of thepickfunction in that it triggered an error for out-of-range (past-head, tail or past-tail) positions.
The index? function uses 1-based indexing, the "base element" being the series head. The series head has index 1. The correspondence between series successor and the integer successor (addition of 1) looks like being kept.
When testing the above succ-next-s series we obtain:
index? succ-next-s ; == 2
, while we know that all other indicators (the mold/all output, the same? function) demonstrate that it is not true.
Similarly as we patched the next function we can patch also the index? function:
index?': func [
{Returns the index number of the current position in the series.}
series [series! port!]
/local t i
] [
t: tail :series
while [all [tail? :series not :series =? :t]] [
insert tail :series #"^(00)"
]
i: index? :series
clear :t
i
]
Now we obtain:
index?' next-s ; == 2 index?' next-next-s ; == 3
This function uses different indexing than the index? function, the indexing of positions is "relative to the given series argument". If we examine the function thoroughly, we find that "the base element" (the element with the smallest index) can actually (depending on the situation) have any index not exceeding 1. Example:
s: skip [a b c d e f] 3 at s -2 ; == [b c d e f] at s -3 ; == [a b c d e f]
However, the at function uses "ambiguous indexing":
at s 1 ; == [d e f g] at s 0 ; == [d e f g]
, since it assigns two different indices (0 and 1) to one position. The ambiguous indexing cannot maintain the successor to successor correspondence. Moreover, it is the only ambiguous-type indexing used, i.e., it is not compatible with any other indexing used.
The skip function uses indexing relative to the series argument as well, but, fortunately, it does not use ambiguous indexing like the at function. Maintains the correspondence between series positions and integers (its integer argument is called "offset", but the argument name is not relevant).
Similarly as the next function, though, it does not skip past tail, nor does it skip past head (interestingly, past-head positions are equally possible in R2 as past-tail positions). While this property looks like logically justifiable, there is a problem that this property is not compatible with indexing used by the pick function indexing.
Note that if skip were able to skip past head as well as past tail, any-based indexing would be possible. That is of advantage for some code, Rebol would be one of languages allowing any bounds for series (when using the appropriate skip), not just nonpositive minimal index and positive maximal index.
The pick function uses indexing relative to the given series. It is able to refer to the past-head:
pick [a b] -1 ; == none
yielding #[none] in such case or past tail:
pick [a b] 3 ; == none
yielding #[none] as well.
However, the pick function does not maintain the successor to successor correspondence. Example:
s: skip [a b c d e f g] 3 pick s -1 ; == c pick s 1 ; == d
R2 assigns index -1 to the c element, while it does not assign 0 (the successor of -1) to the d element, assigning 1 to it instead. Because of that, the index arithmetic (the arithmetic correspondence between indices and positions in the series) is broken.
Define a head-index? function obtaining a series s and an index value i yielding an index value j such that the pick s i expression will be equivalent to the pick head s j expression. Here are the solutions comparing different approaches:
The solution working when the index? function and the pick function both use indexing method compatible with the skip function indexing method:
head-index?: func [s [series!] i [integer!]] [i + index? s]
(This solution is the simplest.)
The solution working when the index? function and the pick function use the indexing methods from R3:
head-index?: func [s [series!] i [integer!]] [i - 1 + index? s]
(This solution is the second simplest.)
The solution working in the present state of R2:
head-index?: func [s [series!] i [integer!]] [case [i < 0 [i - index? s] i > 0 [i - 1 + index? s] i = 0 [0]]]
(This solution is the most complicated and it is neither "simple" nor "arithmetic". Neither it can be, having to deal with a successor to successor incompatible non-arithmetic indexing method.)
| Function | Properties | Problems |
|---|---|---|
| NEXT | go to the successive position | unable to go past-tail |
| BACK | go to preceding position | unable to go past-head |
| SKIP | relative to the SERIES, nonpositive-based | unable to go past-tail or -head |
| AT | relative to the SERIES | ambiguous, unable to go past-tail or -head |
| INDEX? | 1-based | lying for past-tail positions |
| PICK | relative to the SERIES, goes past-tail, -head | not s-to-s |
| POKE | relative to the SERIES, does not go past-tail, -head, otherwise compatible with PICK | not s-to-s |
The next and back functions are not indexing functions. However, if they don't yield true successor and predecessor, they cannot be used for construction of a successor to successor compatible index, i.e. of a system with fundamentally correct indexing arithmetic.
The skip function based on true successor and predecessor functions would generate the proper indexing arithmetic automatically, but when based on incorrect principles, the indexing arithmetic generated by this function will also become incorrect (having to handle many unnecessary and incomfortable exceptions).
The at function is even worse:
- it generates ambiguous indexing adding more exceptions to the already existing set of exceptions from the
skipfunction - it uses index shift (addition of 1) of the series position generating some new boundary exceptions
- due to the newly introduced sets of exceptions and due to shifting indexing generated by the
atfunction is also incompatible with the indexing generated by theskipfunction
pick function is similarly bad as the indexing generated by the at function:
- it generates "indexing hole" introducing new exceptions to the already existing set of exceptions from the
skipfunction - it uses index shift (addition of 1) of the series position generating the same additional boundary exceptions as the
atfunction - the
head-index?function is quite complicated, most probably not circumventing all the indexing bugs nevertheless - the indexing is incompatible with the indexing generated by the
skipfunction - the indexing is incompatible with the indexing generated by the
atfunction
index? function is less general than other functions (not defining relative indexing, just absolute indexing, both the indexing generated by the at function as well as the indexing generated by the pick function generalize it, although they are mutually incompatible), yet it introduces unacceptable bugs for past-tail positions.
For the mathematical model of Rebol series and for the definitions of the new, improved indexing system see the Mathematical_model_of_Rebol_series article.