Iterators overview - MIPT-ILab/mipt-mips GitHub Wiki
C++ standart template library (STL) provides a lot of containers with different semantics: variable-length arrays (std::vector
), static-length arrays (std::array
), linked lists (std::list
and std::forward_list
), hash tables etc.
They are implemented in different ways, but some operations are common: apply the same function to all elements, find an element, accumulate values using addition or other function etc. Additionally, there is a constant need to get a reference to an element for reading value, swapping, insertion, or deletion. All of these tasks may generalized with iterators
Assume you have a C-style array:
struct Entry* arr = (struct Entry*)malloc(sizeof(struct Entry*) * length);
Let's assume we want to get a sum of some Entry fields. To do this, we usually write a loop and check every elements (or, in other words, iterating it):
unsigned i;
unsigned sum = 0;
for (i = 0; i != length; ++i)
sum += arr[i].field; // sum += (arr + i)->field
Advanced C programmer would avoid address calculation in a loop and uses pointers instead:
struct Entry* begin = arr;
struct Entry* end = arr + length;
struct Entry* ptr;
for (ptr = begin; ptr != end; ++ptr)
sum += ptr->field;
That points are actually iterators. What operations are possible with them? Here is a short list:
-
ptr++
,++ptr
— moving to the next element -
ptr--
,--ptr
— moving to the previous element -
*ptr
,ptr->
— dereferencing -
ptr1 - ptr2
— subtraction (distance)
In C++, we may overload operators for classes. Thus, we may introduce a class Iterator which has the same semantics as the pointer. Let's start with std::vector
which is just a wrapper over C-style heap array. Its iterator may be just a pointer with some wrapping.
template<typename E>
class iterator {
E* ptr;
public:
Iterator<E> operator++() { ++ptr; return *this; }
const E& operator*() const { return *ptr; }
const E* operator->() const { return ptr; }
// ...
};
std::vector<Entry> my_vec(length);
// ...
std::vector<Entry>::iterator begin = my_vec.begin();
std::vector<Entry>::iterator end = my_vec.end();
std::vector<Entry>::iterator it;
unsigned sum = 0;
for (it = begin; it != end; ++it)
sum = it->field;
The big advantage of iterators in C++ is automatization. The code above could be simplified:
// Variant 1
unsigned sum = 0;
for (auto it = my_vec.begin(); it != my_vec.end(); ++it)
sum += it->field;
// Variant 2
unsigned sum = 0;
for (const auto& entry : my_vec) // automated begin(), end(), and dereference
sum += entry.field;
// Variant 3
unsigned sum = std::accumulate(my_vec.begin(), my_vec.end(), 0); // automated summation
We demonstrated the simplest iterator for std::vector
. But what about linked list? Let's go back to C once again. Assume we have a following data structure:
struct Node {
void* data;
struct Node* prev;
struct Node* next;
};
struct List {
struct Node* head; // that contains nullptr as 'prev'
struct Node* tail; // that contains nullptr as 'next'
};
// Some interfaces
struct Node* get_next(struct Node* node) { return node->next; }
struct Node* get_prev(struct Node* node) { return node->prev; }
void* get_data(struct Node* node) { return node->data; }
struct Node* get_head(struct List* list) { return list->head; }
struct Node* get_tail(struct List* list) { return list->tail; }
These interface are 100% analogues to std::list
iterators. The only thing we need is to carefully wrap them into C++ operator overloading:
template<class E>
struct Node {
E* data;
Node<E>* prev;
Node<E>* next;
}
template<class E>
class ListIterator {
Node<E>* ptr;
public:
ListIterator<E> operator++() { ptr = ptr->next; return *this; }
ListIterator<E> operator--() { ptr = ptr->prev; return *this; }
const E& operator*() const { return *(ptr->data); }
const E* operator->() const { return ptr->data; }
// ...
};
template<class E>
class List {
Node<E>* head;
Node<E>* tail;
public:
ListIterator<E> begin() { return ListIterator(head); }
ListIterator<E> end() { return ListIterator(end); }
};
Thus, summing all elements in std::list
is the same code as for std::vector
! As a result, you may write similar algorithms for any containers!
std::list<Entry> my_list;
// ...
unsigned sum = 0;
for (const auto& entry : my_list)
sum += entry.field;
You may iterate other containers as well, for example std::set
and std::unordered_set
which may be implemented as trees or hash tables. Iterator of std::map
and std::unordered_map
is more complicated, as result of its dereference is std::pair<Key, Value>:
for (auto it = my_map.begin(); it != my_map.end(); ++it)
std::cout << "key is " << it->first << " , value is " << it->second << std::endl;
In C++, Iterator is a concept of class which supports following operations:
- basic dereferencing (
operator*
): iterator must now if it points to valid or invalid element - increment (
operator++
,operator++(int)
): creates of iterator to the consequent element
Derived concepts are InputOperator and OutputOperator. In addition, they must support following:
- equality comparability (
operator==
,operator!=
):==
returns true if the same object is iterated - dereferencing (
operator*
,operator->
): returns reference to underlying element for input/output - dereferenceable increment (
operator++
,operator++(int)
): returns an iterator to the consequent element
Such iterators are used for std::istreambuf
and std::ostreambuf
classes respectively
ForwardIterator is derived from InputIterator and OutputIterator (if mutable). It is used by std::forward_list
BidirectionalIterator is an extension of ForwardIterator by addition of decrement operations (operator--
). It is used by majority of STL containers: std::list
, std::map
, std::unordered_map
, std::set
, std::unordered_set
etc.
RandomAccessIterator following operations must be defined:
- addition/subtraction with numbers: advances iterator to amount of elements forward or backward
- subtraction of iterators: returns distance between elements in container
- greater-less comparison of iterators: expresses precedence relations
Obviously, such functions are available only for random access structures, like std::vector
and std::array
.
For details, you may check the table on C++ reference site.