STL performance improvements

This is essentially a collection of remarks and ideas, which suggest improvements regarding the performance and the generality of some STL components. The items of this collection have been divided in two basic categories. The first category comprises the items which suggest algorithmic improvements, whereas the second category comprises the items which merely suggest coding improvements.

Remarks for algorithmic improvements


Performance enhancement of the search_n STL algorithm

  • Abstract: Early in the year 2004, I have noticed an interesting opportunity for significant performance improvement of the search_n STL algorithm, [i] in the most notable STL implementations of that time. Namely, it was possible to implement a search_n specialization for random access iterators, using an efficient internal algorithm, which offered time complexity significantly lower compared to the search_n implementations of that time.

  • Details: For more details about this search_n algorithmic improvement, please read my corresponding article in CodeProject: Can search_n be more efficient?

  • Outcome: Both the STLport and libstdc++ STL implementations have followed my suggestions. (STLport Changelog, libstdc++ Changelog)

Performance issues of the find_first_of STL algorithm

  • Abstract: The find_first_of generic STL algorithm [i] is a generalization of the older strpbrk C library function, more versatile and superior in theory. However, during the year 2007, I have noticed that the find_first_of algorithm had a serious performance drawback in the most notable STL implementations of that time. Not only find_first_of was significantly slower than the strpbrk function under the same circumstances, but it also had inferior time complexity compared to its C library counterpart. Furthermore, I have also noticed that it was both feasible and practical to eliminate the above performance drawback, by implementing a find_first_of specialization for one-byte integers.

  • Details: For more details about this find_first_of performance drawback and the proposed enhancements, please read my corresponding article in CodeProject: find_first_of: A performance pitfall among the STL algorithms

  • Outcome: The STLport STL implementation has followed my suggestions. (STLport Changelog)

single_pass_search: Generic sequence searching through single-pass iterators

  • Abstract: During the year 2008, I have implemented a generic sequence searching template function, which was more generic than the search STL algorithm [i] and had better time complexity at the same time. I named this template function single_pass_search, to denote its ability to search for a sub-sequence in a search-range that is accessed through a pair of single-pass (input) iterators.

  • Details: For more details about the single_pass_search template function, please read my corresponding article in CodeProject: single_pass_search: Generic sequence searching through single-pass iterators

Remarks for coding improvements


Significant coding inefficiencies in notable implementations of the search STL algorithm

  • Abstract: During the year 2007, I have noticed some coding inefficiencies in the search algorithm [i] implementation of the SGI STL v3.3 library, which have been also inherited to both the STLport and the libstdc++ libraries. These performance deficiencies can be considered significant, since search is among the most frequently used STL algorithms and the above three libraries are also among the most notable STL implementations.

  • Details: For more details about these coding inefficiencies of the search STL algorithm please read this explanation.

  • Outcome: Both the STLport and libstdc++ STL implementations have followed my suggestions. (STLport Commit, libstdc++ Commit)

Excessive object creation in the swap_ranges STL algorithm

  • Abstract: Early in the year 2008, I have noticed that in the most notable STL implementations, the swap_ranges algorithm [i] has the tendency to create much more objects than normally required for carrying out its task. This fact is probably harmless as long as the swapped ranges contain C/C++ primitives or PODs, but can also make the swap_ranges algorithm very inefficient, when working with objects which have relatively expensive constructors.

  • Details: For more details about this coding inefficiency of the swap_ranges STL algorithm please read this explanation.