ESTL Notes - yszheda/wiki GitHub Wiki
empty
is a constant-time operation for all standard containers, but for somelist
implementations,size
takes linear time.
Item 7. When using containers of newed pointers, remember to delete the pointers before the container is destroyed.
Many string implementations employ reference counting behind the scenes. a strategy that eliminates some unnecessary memory allocations and copying of characters and that can improve performance for many applications.
Sutter's article. "Optimizations That Aren't (In a Multithreaded World)"
For vector
and string
, growth is handled by doing the moral equivalent of a realloc whenever more space is needed.
- Allocate a new block of memory that is some multiple of the container's current capacity. In most implementations,
vector
andstring
capacities grow by a factor of two each time. i.e. their capacity is doubled each time the container must be expanded. - Copy all the elements from the container's old memory into its new memory.
- Destroy the objects in the old memory.
- Deallocate the old memory.
The reserve member function allows you to minimize the number of reallocations that must be performed, thus avoiding the costs of real location and iterator/pointer/reference invalidation.
-
size()
tells you how many elements are in the container. It does not tell you how much memory the container has allocated for the elements it holds. -
capacity()
tells you how many elements the container can hold in the memory it has already allocated. This is how many total elements the container can hold in that memory, not how many more elements it can hold. -
resize(size_t n)
forces the container to change to n the number of elements it holds. After the call toresize
,size
will return n. -
reserve(size_t n)
forces the container to change its capacity to at least n. provided n is no less than the current size. (If n is less than the current capacity, vector ignores the call and does nothing, string may reduce its capacity to the maximum of size() and n. but the string's size definitely remains unchanged.)
two common ways to use reserve to avoid unneeded reallocations:
- The first is applicable when you know exactly or approximately how many elements will ultimately end up in your container.
- The second way is to reserve the maximum space you could ever need. then, once you've added all your data, trim off any excess capacity.
vector<int> v;
void doSomething(const int* pInts, size_t numlnts);
if (!v.empty()) {
doSomething(&v[0], v.size());
}
string s;
void doSomething(const char *pString);
// This works even if the string is of length zero.
doSomething(s.c_str());
// C API: this function takes a pointer to an array of at most arraySize
// chars and writes data to it. It returns the number of chars written,
// which is never more than maxNumChars.
size_t fillString(char 'pArray, size_t arraySize);
// create a vector whose size is maxNumChars
vector<char> vc(maxNumChars);
size_t charsWritten = fillString(&vc[0], vc.size());
string s(vc.begin(), vc.begin()+charsWritten);
// shrink-to-fit
class Contestant {...};
vector<Contestant> contestants;
vector<Contestant>(contestants).swap(contestants);
// "make the capacity as small as this implementation is willing to make it given the current size of the container."
string s;
// make s large, then erase most of its character
string(s).swap(s); // do a "shrink-to-fit" on s
vector<Contestant> v;
string s;
// use v and s
vector<Contestant>().swap(v); //clear v and minimize its capacity
string().swap(s); // clear s and minimize its capacity
what should you use when you need a vector<bool>
?
-
deque<bool>
is an STL container that really containsbool
s. - bitset. If you can live without iterators and dynamic changes in size, you'll probably find that a bit-set will serve you well.
- Operationally, the notion of equality is based on
operator==
.- Equivalence is based on the relative ordering of object values in a sorted range. In the general case, the comparison function for an associative container isn't
operator<
or evenless
, it's a user-defined predicate.
why the standard associative containers are based on equivalence instead of equality
- The standard associative containers are kept in sorted order, so each container must have a comparison function (
less
, by default) that defines how to keep things sorted.
// TODO
It applies equally well to containers of objects that act like pointers, e.g., smart pointers and iterators.
return value of a comparison function indicates whether one value precedes another in the sort order defined by that function. Equal values never precede one another, so comparison functions should always return false for equal values.
- For
set
andmap
: duplicate copies in containers- For
multiset
andmultimap
: copies cannot be idetified byequal_range
, which identifies a range of equivalent values
Technically speaking, comparison functions used to sort associative containers must define a “strict weak ordering” over the objects they compare.
Refer to Austern's 《Generic Programming and the STL》
programs that attempt to change the value of a key in these containers won't compile, because the elements in an object of type
map<K, V>
ormulti-map<K, V>
are of typepair<const K, V>
.
if you do change an element in a
set
ormultiset
, you must be sure not to change a key part — a part of the element that affects the sortedness of the container.
- If portability is not a concern, you want to change the value of an element in a
set
ormultiset
, and your STL implementation will let you get away with it. go ahead and do it. Just be sure not to change a key part of the element, i.e., a part of the element that could affect the sortedness of the container.- If you value portability, assume that the elements in
set
s andmultiset
s cannot be modified, at least not without a cast.
Bottom line: storing data in a sorted vector is likely to consume less memory than storing the same data in a standard associative container, and searching a sorted vector via binary search is likely to be faster than searching a standard associative container when page faults are taken into account.
map::operator[]
is designed to facilitate "add or update" functionality.
// ADD
map<int, Widget> m;
m[1] = 1.50;
// is functionally equivalent to this
typedef map<int, Widget> IntWidgetMap;
pair<lntWidgetMap::iterator, bool> result = m.insert(lntWidgetMap::value_type(1, Widget()));
result.first->second = 1.50;
//-----//
// we'd be better off replacing our use of operator[] (including its attendant construction plus assignment) with a straightforward call to insert
m.insert(lntWidgetMap::value_type(1,1.50));
// UPDATE
// use operator[] to update k's value to be v
m[k] = v;
// use insert to update k's value to be v
// this costs us a pair constructor and destructor
m.insert(IntWidgetMap::value_type(k, v)).first->second = v;
template<typename MapType, typename KeyArgType, typename ValueArgtype>
typename MapType::iterator efficientAddOrUpdate(MapType& m,
const KeyArgType& k,
const ValueArgtype& v)
{
typename MapType::iterator Ib = m.lower_bound(k);
if (lb != m.end() && !(m.key_comp()(k, lb->first))) {
lb->second = v;
return Ib;
} else {
typedef typename MapType::value_type MVT;
return m.insert(lb, MVT(k, v));
}
}
some container member functions that take iterator
s as parameters insist on iterator
s: const_iterator
s won't do.
typedef deque<int> IntDeque; // convenience typedefs
typedef lntDeque::iterator Iter;
typedef lntDeque::const_iterator Constlter;
Constlter ci; // ci is a const_iterator
// make ci point into d
Iter i(d.begin()); // initialize i to d.begin()
advance(i, distance<ConstIter>(i, ci)); // figure the distance between i and ci (as const_iterators), then move i that distance
-
How to convert const_iterators to iterators using std::distance and std::advance
-
What is the difference between const_iterator and non-const iterator in the C++ STL?
vector<int> v;
// put 1-5 in the vector
for (int i = 1; i <= 5; ++i) {
v.push_back(i);
}
// make ri point to the 3
vector<int>::reverse_iterator ri = find(v.rbegin(), v.rend(), 3);
// make i the same as ri's base
vector<int>::iterator i(ri.base());
// erase the element pointed to by
// ri; this should always compile
v.erase((++ri).base());
The
operator<<
functions on whichistream_iterator
s depend perform formatted input, and that means they must undertake a fair amount of work on your behalf each time you call one. They have to create and destroy sentry objects (specialiostream
objects that perform setup and cleanup activities for each call tooperator<<
), they have to check stream flags that might affect their behavior (e.g.. skipws), they have to perform comprehensive checking for read errors, and. if they encounter a problem, they have to check the stream's exception mask to determine whether an exception should be thrown. Those are all important activities if you're performing formatted input, but if all you want to do is grab the next character from the input stream, it's overkill.
ifstream inputFile("interestingData.txt");
inputFile.unset(ios::skipws); //disable the skipping of whitespace in inputFile
string fileData((istream_iterator<char>(inputFile)), istream_iterator<char>()};
ifstream inputFile("interestingData.txt");
string fileData((istreambuf_iterator<char>(inputFile)), istreambuf_iterator<char>());
When using
reserve
to improve the efficiency of a series of insertions, always remember thatreserve
increases only a container'scapacity
: the container'ssize
remains unchanged. Even after callingreserve
, you must use an insert iterator (e.g., one of the iterators returned fromback_inserter
,front_inserter
, orinserter
) with an algorithm when you want that algorithm to add new elements to avector
orstring
.
vector<int> values;
vector<int> results;
results.reserve(results.size() + values.size());
//write the results of the transmogrifications to uninitialized memory
// behavior is undefined!
transform(values.begin(), values.end(), results.end(), transmogrify);
- http://en.cppreference.com/w/cpp/algorithm/partial_sort
- http://en.cppreference.com/w/cpp/algorithm/nth_element
In a stable sort, if two elements in a range have equivalent values, their relative positions are unchanged after sorting.
partial_sort
,nth_element
is not stable.
- If you need to perform a full sort on a
vector
,string
,deque
, orarray
, you can use sort orstable_sort
.- If you have a
vector
,string
,deque
, orarray
and you need to put only the top n elements in order,partial_sort
is available.- If you have a
vector
,string
,deque
, orarray
and you need to identify the element at position n or you need to identify the top n elements without putting them in order.nth_element
is at your beck and call.- If you need to separate the elements of a standard sequence container or an array into those that do and do not satisfy some criterion, you're probably looking for
partition
orstable_partition
.- If your data is in a
list
, you can usepartition
andstable_partition
directly, and you can use list-sort in place ofsort
andstable_sort
. If you need the effects offered bypartial_sort
ornth_element
, you'll have to approach the task indirectly.
- One indirect approach is to copy the elements into a container with random access iterators, then apply the desired algorithm to that.
- Another is to create a container of
list::iterator
s, use the algorithm on that container, then access the list elements via the iterators.
- A third is to use the information in an ordered container of iterators to iteratively splice the list's elements into the positions you'd like them to be in.
resources (time and space) less->more
partition
stable_partition
nth_element
partial_sort
sort
stable_sort
template<class ForwardIterator, class T>
ForwardIterator remove(ForwardIterator first, ForwardIterator last, const T& value);
because
remove
cannot know the container holding the elements on which it is operating, it is not possible forremove
to eliminate elements from a container. That explains the otherwise baffling observation that removing elements from a container never changes the number of elements in the container
remove
doesn't "really" remove anything, because it can't.Very briefly,
remove
moves elements in the range it's given until all the "unremoved" elements are at the front of the range (in the same relative order they were in originally). It returns an iterator pointing one past the last "unremoved" element. The "removed" values aren't necessarily in v any longer at all. remove doesn't change the order of the elements in a range so that all the "removed" ones are at the end, it arranges for all the "unremoved" values to be at the beginning. Though the Standard doesn't require it, the elements beyond the new logical end of the range typically retain their old values.
You should follow
remove
byerase
if you really want to remove something.
vector<int> v;
v.erase(remove(v.begin(), v.end(), 99), v.end());
Passing remove's return value as the first argument to the range form of erase is so common, it's idiomatic. In fact, remove and erase are so closely allied, the two are merged in the
list
member functionremove
.
list::remove
really removes thingsfor
list
s, calling the remove member function is more efficient than applying the erase-remove idiom.
list<int> li; //create a list
// put some values into it
// eliminate all elements with value 99:
// this really removes elements, so li's
// size may change
li.remove(99);
list::unique
really removes adjacent duplicates (also with greater efficiency than would erase-unique).
class Widget{
public:
…
bool isCertified() const;
}
vector<Widget*> v;
v.push_back(new Widget);
// Memory LEAK!!
v.erase( remove_if(v.begin(), v.end(), not1(mem_fun(&Widget::isCertified))), v.end());
// if *pWidget is an uncertified Widget,
// delete the pointer and set it to null
void delAndNullifyUncertified(Widget*& pWidget)
{
if (!pWidget->isCertified()) {
delete pWidget;
pWidget = NULL;
}
}
// delete and set to null all ptrs to uncertified widgets
for_each(v.begin(), v.end(),
delAndNullifyUncertified);
// eliminate null ptrs from v
v.erase(remove(v.begin(), v.end(), static_cast<Widget*>(0)), v.end());
If you pass a sorted range to an algorithm that also takes a comparison function, be sure that the comparison function you pass behaves the same as the one you used to sort the range.
// create a vector, put some
// data into it
vector<int> v;
// sort it into descending order
sort(v.begin(), v.end(), greater<int>());
// WRONG: search for a 5 in the vector, assuming it's sorted in ascending order!
bool a5Exists = binary_search(v.begin(), v.end(), 5);
// CORRECT: search for a 5, using greater as the comparison function
bool a5Exists = binary_search(v.begin(), v.end(), 5, greater<int>());
Item 35. Implement simple case-insensitive string comparisons via mismatch or lexicographical compare.
- http://en.cppreference.com/w/cpp/algorithm/mismatch
- http://en.cppreference.com/w/cpp/algorithm/lexicographical_compare
- stricmp() - Compare Strings without Case Sensitivity
- http://en.cppreference.com/w/cpp/algorithm/copy
- Why there is no std::copy_if algorithm?
- are they adding copy_if to c++0x?
template<typename Inputlterator,
typename Outputlterator,
typename Predicate>
Outputlterator copy_if(Inputlterator begin,
Inputlterator end,
Outputlterator destBegin,
Predicate p)
{
while (begin != end) {
if (p(*begin)) *destBegin++ = *begin;
++begin;
}
return destBegin;
}
-
"numeric algorithms" in
-
accumulate
-
inner_product
-
adjacent_difference
-
partial_sum
// accumulate requires only input iterators, so you can use it even with istream_iterators and istreambuf_iterators
// print the sum of the ints in cin
cout << "The sum of the ints on the standard input is" << accumulate(istream_iterator<int>(cin), istream_iterator<int>(), 0);
function pointers are passed by value. STL function objects are modeled after function pointers, so the convention in the STL is that function objects, too, are passed by value (i.e. copied) when passed to and from functions.
- First, your function objects need to be small. Otherwise they will be too expensive to copy.
- Second, your function objects must be monomorphic (i.e., not polymorphic) — they must not use virtual functions.
template <typename T>
class BPFCImpl
{
public:
virtual void operator(const T& val) const;
virtual ~BPFCImpl();
private:
Widget w;
int x;
};
template <typename T>
class BPFC : public unary_functor<T,void>
{
public:
void operator()(const T& val) const
{
pImpl->operator(val); // forward it to BPFCImpl
}
private:
BPFCImpl<T>* pImpl;
};
- A predicate is a function that returns bool (or something that can be implicitly converted to bool).
- A pure function is a function whose return value depends only on its parameters.
- A predicate class is a functor class whose
operator()
function is a predicate, i.e., itsoperator()
returns true or false. As you might expect, any place the STL expects a predicate, it will accept either a real predicate or an object of a predicate class.
- [deprecated]http://en.cppreference.com/w/cpp/utility/functional/ptr_fun
- http://en.cppreference.com/w/cpp/utility/functional/unary_function
- http://en.cppreference.com/w/cpp/utility/functional/binary_function
- [deprecated] http://en.cppreference.com/w/cpp/utility/functional/mem_fun
- [deprecated] http://en.cppreference.com/w/cpp/utility/functional/ptr_fun
- [deprecated] http://en.cppreference.com/w/cpp/utility/functional/mem_fun_ref
- Efficiency: Algorithms are often more efficient than the loops programmers produce.
- Correctness: Writing loops is more subject to errors than is calling algorithms.
- Maintainability: Algorithm calls often yield code that is clearer and more straightforward than the corresponding explicit loops.
- First, the member functions are faster.
- Second, they integrate better with the containers (especially the associative containers) than do the algorithms.
STL algorithms determine whether two objects have the same value by checking for equality, while associative containers use equivalence as their "sameness" test.
For the standard associative containers, then, choosing member functions over algorithms with the same names offers several benefits.
- First, you get logarithmic-time instead of linear-time performance.
- Second, you determine whether two values are "the same" using equivalence, which is the natural definition for associative containers.
- Third, when working with
map
s andmultimap
s, you automatically deal only with key values instead of with (key-value) pairs.
std::list
:
- Each of the algorithms that
list
specializes (remove
,remove_if
,unique
,sort
,merge
, andreverse
) copies objects, butlist
-specific versions copy nothing: they simply manipulate the pointers connectinglist
nodes.list
'sremove
,remove_if
, andunique
member functions honestly get rid of elements: no subsequent call toerase
is necessary.- the
sort
algorithm can't be applied tolist
s. Being only bidirectional iterators,list
's iterators can't be passed tosort
.- the
merge
algorithm can't be applied tolist
s. The algorithm isn't permitted to modify its source ranges, butlist
-merge always modifies thelist
s it works on.
If you have an unsorted range, your choices are
count
orfind
.
count
andfind
algorithms both search using equality, whilebinary_search
,lower_bound
,upper_bound
, andequal_range
employ equivalence.
To test for the existence of a value in a sorted range, use
binary_search
.
If you have a sorted range and your question is, "Is it there, and if so, where is it?" you want
equal_range
.
equal_range
returns a pair of iterators, the first equal to the iteratorlower_bound
would return, the second equal to the oneupper_bound
would return (i.e., the one-past-the-end iterator for the range of values equivalent to the one searched for), equal_range, then, returns a pair of iterators that demarcate the range of values equivalent to the one you searched for.First, if the two iterators are the same, that means the range of objects is empty: the value wasn't found.
vector<Widget> vw;
sort(vw.begin(), vw.end());
typedef vector<Widget>::iterator VWIter; // convenience typedefs
typedef pair<VWIter, VWIter> VWIterPair;
VWIterPair p = equal_ range(vw.begin(), vw.end(), w);
if (p.first != p.second) {
// if equal_range didn’t return an empty range
// found it, p.first points to the first one and p.second points to one past the last
} else {
// not found, both p.first and p.second point to the insertion location for the value searched for
}
The second thing to note about equal_range's return value is that the distance between its iterators is equal to the number of objects in the range, i.e., the objects with a value equivalent to the one that was searched for.
VWIterPair p = equal_range(vw.begin(), vw.end(), w);
cout << "There are " << distance(p.first, p.second) << " elements in vw equivalent to w.";
lower_bound
answers the question: "Is it there? If so, where is the first copy, and if it's not, where would it go?"
upper_bound
is also useful if you want to insert things into a sorted range so that objects with equivalent values are stored in the order in which they were inserted.
class Person {
public:
const string& name() const;
};
struct PersonNameLess: public binary_function<Person, Person, bool> {
bool operator()(const Persons lhs, const Person& rhs) const
{
return lhs.name() < rhs.name();
}
};
list<Person> Ip;
lp.sort(PersonNameLess());
Person newPerson;
lp.insert(upper_bound(lp.begin(), lp.end(), newPerson, PersonNameLess()), newPerson);
write-only code: it's easy to write, but it's hard to read and understand.
- Almost all the containers are declared in headers of the same name, i.e., vector is declared in . list is declared in , etc. The exceptions are
<set>
and<map>
.<set>
declares bothset
andmultiset
, and<map>
declares bothmap
andmultimap
. - All but four algorithms are declared in
<algorithm>
. The exceptions areaccumulate
,inner_product
,adjacent_difference
, andpartial_sum
. Those algorithms are declared in<numeric>
. - Special kinds of iterators, including
istream_iterator
s and istreambuf_iterator
s, are declared in<iterator>
. - Standard functors (e.g.,
less<T>
) and functor adapters (e.g.,not1
,bind2nd
) are declared in<functional>
.