Performance
http://www.agner.org/optimize/#manuals
http://www.ece.cmu.edu/~pueschel/teaching/18-645-CMU-spring08/course.html
http://lbrandy.com/blog/2009/07/computational-performance-a-beginners-case-study/
http://assemblyrequired.crashworks.org/
http://www.memorymanagement.org/
http://en.wikipedia.org/wiki/Automatic_vectorization#Automatic_vectorization
http://en.wikipedia.org/wiki/SSE2
http://www.nullstone.com/htmls/category.htm
http://www.tantalon.com/pete/cppopt/main.htm
http://www.tantalon.com/pete.htm
http://groups.google.com/group/perfo/
http://code.google.com/p/google-perftools/ performance tools
http://oprofile.sourceforge.net OProfile
http://www.artima.com/cppsource/lazy_builder.html
code coverage: gcov http://gcc.gnu.org/onlinedocs/gcc/Gcov-Intro.html
http://www.verifysoft.net/en_ctcpp.html http://www.bullseye.com/
http://www.azillionmonkeys.com/qed/tech.shtml
http://www.azillionmonkeys.com/qed/optimize.html Performance optimization
http://en.wikibooks.org/wiki/Category:Optimizing_C%2B%2B Optimizing C++
cflow: http://www.gnu.org/software/cflow/ http://www.kimbly.com/code/egypt++/
#include <ctime>
#include <string>

class Timer
{
    std::string m_name;
    std::clock_t m_started;
public:
    Timer (const std::string &name = "undef");
    ~Timer (void);
};

// ----- source -----
#include <iostream>

using std::cout;
using std::endl;
using std::clock;

Timer::Timer (const std::string &name):
    m_name(name), m_started(clock())
{
    // empty
}

// Prints the CPU time used since construction.
Timer::~Timer (void)
{
    double secs = static_cast<double>(clock() - m_started) / CLOCKS_PER_SEC;
    cout << m_name << ": " << secs << " secs." << endl;
}
#include <ctime>
#include <iostream>
#include <string>

using namespace std;

class timer
{
public:
    timer(std::string description = "") // constructor
    {
        _description = description;
        _start_time = clock();
    }
    ~timer() // destructor
    {
        clock_t end_time = clock();
        double total_time = static_cast<double>(end_time - _start_time);
        cout << _description << "; elapsed [s]: " << total_time/CLOCKS_PER_SEC << endl;
    }
private:
    string _description;
    clock_t _start_time;
};
http://memfix.phaseit.net/publications
Constructing an Object at a Pre-Determined Memory Position:
http://gethelp.devx.com/techtips/cpp_pro/10min/10min0999.asp
http://www.devx.com/cplus/10%20Minute%20Solution/37078
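Constructing an object at a pre-determined position is usually done with placement new. A minimal sketch, assuming a hypothetical Record type and helper names (constructAt, destroyAt) that are not taken from the linked articles:

```cpp
#include <new>      // placement new
#include <string>

// Hypothetical example type; any class with a constructor would do.
struct Record {
    int id;
    std::string name;
    Record(int i, const std::string& n) : id(i), name(n) {}
};

// Constructs a Record inside caller-supplied storage instead of on the heap.
// The buffer must be at least sizeof(Record) bytes and suitably aligned.
inline Record* constructAt(void* buffer, int id, const std::string& name) {
    return new (buffer) Record(id, name);   // placement new: no allocation
}

// Objects created with placement new must be destroyed by calling the
// destructor explicitly; delete would try to free memory we do not own.
inline void destroyAt(Record* r) {
    r->~Record();
}
```

Typical use: declare the storage as alignas(Record) unsigned char storage[sizeof(Record)], call constructAt(storage, ...), and call destroyAt() before the storage goes away.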
http://www.oroboro.com/rafael/docserv.php/index/programming/article/memmgr
http://www.ibm.com/developerworks/edu/au-dw-au-memorymanager-i.html
http://www.ibm.com/developerworks/linux/library/l-memory
Local: IBM article C++ memory manager
http://www.devx.com/cplus/10MinuteSolution/20676/1954?pf=true Optimize Your Member Layout
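A rough illustration of why member order matters, assuming two hypothetical structs with the same members in different orders. Exact sizes are implementation-defined, so compare them with sizeof on the target platform instead of relying on the numbers in the comments:

```cpp
// Members declared in an order that tends to force padding between them.
struct Padded {
    char   flag;    // 1 byte, typically followed by padding up to the double
    double value;   // typically 8 bytes
    char   tag;     // 1 byte, typically followed by tail padding
};

// Same members, largest first, so the two chars can share one padded slot.
struct Reordered {
    double value;
    char   flag;
    char   tag;
};

// On many 64-bit platforms sizeof(Padded) is 24 and sizeof(Reordered) is 16,
// but this is an assumption about the ABI, not a guarantee.
```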
http://gethelp.devx.com/techtips/cpp_pro/10min/10min1100.asp overloading new and delete
valgrind --tool=callgrind http://valgrind.org/docs/manual/cl-manual.html
It will generate a file called callgrind.out.<pid>. You can then use the kcachegrind tool to read this file. It gives a graphical analysis of the results, such as which lines cost the most.
massif: Valgrind's heap profiler (valgrind --tool=massif)
http://kcachegrind.sourceforge.net/html/Home.html
http://www.devx.com/getHelpOn/10MinuteSolution/18019
http://gethelp.devx.com/techtips/cpp_pro/10min/2002/July/10min0702.asp
The bitwise operators and their names are:
~ bitwise NOT, or one's complement (unary)
& bitwise AND
| bitwise OR
^ bitwise XOR (exclusive OR)
Here's a brief description of each operator and its functionality. The ~ operator flips, or reverses, every bit in a sequence:
typedef unsigned char BYTE;
BYTE s=127; //binary: 0111 1111
BYTE flipped_s= ~s; //dec 128, binary 1000 0000
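The three binary operators can be sketched the same way. The helper names below (bitAnd, bitOr, bitXor) are made up for illustration; the casts are needed because the operands are promoted to int before the operation:

```cpp
typedef unsigned char BYTE;

// The results of &, | and ^ on BYTE operands have type int, so cast back.
inline BYTE bitAnd(BYTE a, BYTE b) { return static_cast<BYTE>(a & b); }
inline BYTE bitOr (BYTE a, BYTE b) { return static_cast<BYTE>(a | b); }
inline BYTE bitXor(BYTE a, BYTE b) { return static_cast<BYTE>(a ^ b); }

// With a = 63 (0011 1111) and b = 67 (0100 0011):
//   bitAnd(a, b) == 3    (0000 0011)  bits set in both
//   bitOr (a, b) == 127  (0111 1111)  bits set in either
//   bitXor(a, b) == 124  (0111 1100)  bits set in exactly one
```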
http://beej.us/guide/bggdb/ GDB
Self-assignment
In addition to the basic four operators shown, C++ defines three self-assigning versions thereof:
&= self-assigning bitwise AND
|= self-assigning bitwise OR
^= self-assigning bitwise XOR
For example, a bitwise XOR of two values can be written in self-assigning form like this:
BYTE s1=63; // 0x3f, 0011 1111
BYTE s2=67; // 0x43, 0100 0011
s1^=s2; // equivalent to: s1 = s1 ^ s2
Shifting
Two additional operators are left shift and right shift:
<< left shift
>> right shift
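A small sketch of the shift operators on unsigned bytes, plus the common use of a shift to build a single-bit mask. The helper names are hypothetical:

```cpp
typedef unsigned char BYTE;

// Left shift by n multiplies by 2^n; bits pushed past bit 7 are lost
// when the result is stored back into a BYTE.
inline BYTE shiftLeft(BYTE v, unsigned n)  { return static_cast<BYTE>(v << n); }

// Right shift of an unsigned value divides by 2^n and fills with zeros.
inline BYTE shiftRight(BYTE v, unsigned n) { return static_cast<BYTE>(v >> n); }

// Mask with only bit `pos` set (pos must be in 0..7).
inline BYTE bitMask(unsigned pos) { return static_cast<BYTE>(1u << pos); }

// Tests whether bit `pos` of v is set.
inline bool testBit(BYTE v, unsigned pos) { return (v & bitMask(pos)) != 0; }
```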
__restrict (a compiler extension in C++; C99 spells it restrict) on a pointer promises the compiler that it has no aliases: nothing else in the function points to that same data. Thus the compiler knows that after it writes data through a pointer, it doesn't need to read it back into a register later on, because nothing else could have written to that address. Without __restrict, the compiler has to reload data through a pointer after any write through another pointer of a compatible type, because that other pointer may alias it.
This code will run slowly:
int slow(int *a, int *b)
{
    *a = 5;
    *b = 7;
    return *a + *b; // LHS (load-hit-store) stall: the compiler doesn't
                    // know whether a == b, so it has to
                    // reload *a before the add
}
Whereas this code will run quickly:
int fast(int * __restrict a, int * __restrict b)
{
    *a = 5;
    *b = 7;         // __restrict promises that a != b
    return *a + *b; // no stall; the compiler hangs onto
                    // 5 and 7 in the registers.
}
inlining in C++:
bug in g++ 4.1.2: if an inline method is defined in the .cpp file, put these 2 pragmas after #include "a.h":
#pragma implementation
#pragma interface
another way is to put the body of the inline methods inside a.h, after the class declaration:
class a { void f(); };
inline void a::f() {}
allocating memory in big chunks
reusing the memory - declare a static memory object and reuse/reinitialize it
Pools are generally used when there is a lot of allocation and deallocation of small objects
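A minimal sketch of such a pool, assuming fixed-size blocks and single-threaded use. FixedPool and its members are hypothetical names, not the API of any of the libraries linked below:

```cpp
#include <cstddef>
#include <new>
#include <vector>

// One big allocation carved into equal blocks; freed blocks are kept on an
// intrusive singly linked free list, so acquire/release are O(1).
class FixedPool {
public:
    FixedPool(std::size_t blockSize, std::size_t blockCount)
        : m_unitsPerBlock(unitsFor(blockSize)),
          m_storage(m_unitsPerBlock * blockCount),
          m_free(nullptr)
    {
        // Thread every block onto the free list; block 0 ends up first.
        for (std::size_t i = blockCount; i > 0; --i)
            release(&m_storage[(i - 1) * m_unitsPerBlock]);
    }
    FixedPool(const FixedPool&) = delete;             // free list points
    FixedPool& operator=(const FixedPool&) = delete;  // into m_storage

    // Returns a block, or nullptr when the pool is exhausted.
    void* acquire() {
        if (m_free == nullptr) return nullptr;
        Node* n = m_free;
        m_free = n->next;
        return n;
    }

    // Gives a block obtained from acquire() back to the pool.
    void release(void* p) {
        Node* n = new (p) Node;   // reuse the block itself as a list node
        n->next = m_free;
        m_free = n;
    }

private:
    struct Node { Node* next; };

    // Blocks are measured in max_align_t units so each one is aligned
    // for any fundamental type and large enough to hold a Node.
    static std::size_t unitsFor(std::size_t bytes) {
        static_assert(sizeof(Node) <= sizeof(std::max_align_t),
                      "a free block must be able to hold a list node");
        const std::size_t u = sizeof(std::max_align_t);
        return bytes == 0 ? 1 : (bytes + u - 1) / u;
    }

    std::size_t m_unitsPerBlock;
    std::vector<std::max_align_t> m_storage;
    Node* m_free;
};
```

Objects would be built in an acquired block with placement new and destroyed explicitly before release(). A production pool would also need growth, thread safety and debugging checks, which this sketch leaves out.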
http://www.boost.org/doc/libs/1_42_0/libs/pool/doc/index.html
http://www.flipcode.com/archives/Fast_Allocation_Pool.shtml
http://www.codeproject.com/KB/cpp/MemoryPool.aspx
http://www.drdobbs.com/article/printableArticle.jhtml?articleID=184406243&dept_url=/cpp/
http://warp.povusers.org/FSBAllocator/
http://www.embedded.com/1999/9901/9901feat2.htm
http://www.bearcave.com/software/c++_mem.html
http://accu.org/index.php/journals/1308
http://code.google.com/p/microallocator/
http://codesuppository.blogspot.com/2009/09/free-open-source-micro-allocator-in-c.html
http://eli.thegreenplace.net/2008/10/17/memmgr-a-fixed-pool-memory-allocator/
A flyweight is an object that minimizes memory use by sharing as much data as possible with other similar objects; it is a way to use objects in large numbers when a simple repeated representation would use an unacceptable amount of memory.
http://en.wikipedia.org/wiki/Flyweight_pattern
http://www.boost.org/doc/libs/1_42_0/libs/flyweight/doc/index.html
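A hand-rolled sketch of the idea, assuming a hypothetical Glyph type as the shared state and a factory that hands out one shared instance per distinct value. This is not the Boost.Flyweight interface:

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical shared (intrinsic) state that many objects have in common.
struct Glyph {
    std::string fontName;
    int size;
};

// Returns the same immutable Glyph for equal (fontName, size) requests,
// so a million users of "Helvetica 12" share a single object.
class GlyphFactory {
public:
    std::shared_ptr<const Glyph> get(const std::string& fontName, int size) {
        const std::string key = fontName + "#" + std::to_string(size);
        auto it = m_cache.find(key);
        if (it != m_cache.end()) return it->second;   // already shared
        std::shared_ptr<const Glyph> g =
            std::make_shared<Glyph>(Glyph{fontName, size});
        m_cache.emplace(key, g);
        return g;
    }

    // Number of distinct Glyph objects actually stored.
    std::size_t uniqueCount() const { return m_cache.size(); }

private:
    std::unordered_map<std::string, std::shared_ptr<const Glyph>> m_cache;
};
```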
for std::sort() define operator< and inline it
inline bool iPair::operator<(const iPair& b) const
{
if (this->start == b.start)
return (this->end < b.end);
else
return (this->start < b.start);
}
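How this plugs into std::sort, as a sketch. The layout of iPair is assumed here (only start and end appear in the snippet above), and sortPairs is a hypothetical helper:

```cpp
#include <algorithm>
#include <vector>

// Assumed shape of iPair: two int members, start and end.
struct iPair {
    int start;
    int end;
    bool operator<(const iPair& b) const;
};

// Orders by start, then by end; defined inline so std::sort can inline
// the comparison instead of calling through a function pointer.
inline bool iPair::operator<(const iPair& b) const {
    if (start == b.start)
        return end < b.end;
    return start < b.start;
}

// std::sort without a comparator argument uses operator<.
inline void sortPairs(std::vector<iPair>& v) {
    std::sort(v.begin(), v.end());
}
```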
http://www.cs.utah.edu/dept/old/texinfo/as/gprof_toc.html
http://en.wikipedia.org/wiki/Setvbuf
http://www.cplusplus.com/reference/clibrary/cstdio/setvbuf/
Compile the program with profiling enabled:
gcc -pg test.c -o test
And run it with the command line:
./test
The output file that was generated during the run is called gmon.out. It contains the results of the profile action and can be viewed with the following command line:
gprof test > test.out
This command redirects the output of the profiler to yet another file called test.out, which is a normal text file that can be viewed with any editor. Because the test.out file contains a lot of information and explanation, only the most important part is shown here:
Each sample counts as 0.01 seconds.
   %    cumulative    self               self       total
  time    seconds    seconds   calls    us/call    us/call   name
 62.50      0.10       0.10        1  100000.00  100000.00   mul_double
 37.50      0.16       0.06        1   60000.00   60000.00   mul_int
  0.00      0.16       0.00        1       0.00  160000.00   main
% time              the percentage of the total running time of the
                    program used by this function.
cumulative seconds  a running sum of the number of seconds accounted
                    for by this function and those listed above it.
self seconds        the number of seconds accounted for by this
                    function alone. This is the major sort for this
                    listing.
calls               the number of times this function was invoked, if
                    this function is profiled, else blank.
self us/call        the average number of microseconds spent in this
                    function per call, if this function is profiled,
                    else blank.
total us/call       the average number of microseconds spent in this
                    function and its descendants per call, if this
                    function is profiled, else blank.
name                the name of the function. This is the minor sort
                    for this listing. The index shows the location of
                    the function in the gprof listing. If the index is
                    in parentheses it shows where it would appear in
                    the gprof listing if it were to be printed.
https://www.ibm.com/developerworks/library/pa-dalign/
http://itman.livejournal.com/316983.html
http://codesynthesis.com/~boris/blog/category/c-compilers/gcc-g/
The reasoning behind all-pairs testing is this: the simplest bugs in a program are generally triggered by a single input parameter. The next simplest category of bugs consists of those dependent on interactions between pairs of parameters, which can be caught with all-pairs testing.
If a system has 13 input parameters and each can take 3 values, the number of combinations is 3^13 = 1,594,323. It is possible to cover all pairwise input combinations in 15 tests.
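A sketch of what "covering all pairs" means, using a much smaller case than the one above: three parameters with two values each need 8 tests exhaustively, but 4 tests already hit every value pair. The names (TestCase, coversAllPairs, exampleSuite) are hypothetical, and the checker assumes every parameter takes values 0..numValues-1:

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

typedef std::vector<int> TestCase;   // one value index per parameter

// True if, for every pair of parameters (i, j), every combination of their
// values appears in at least one test case.
inline bool coversAllPairs(const std::vector<TestCase>& tests,
                           std::size_t numParams, std::size_t numValues) {
    for (std::size_t i = 0; i < numParams; ++i) {
        for (std::size_t j = i + 1; j < numParams; ++j) {
            std::set<std::pair<int, int> > seen;
            for (std::size_t t = 0; t < tests.size(); ++t)
                seen.insert(std::make_pair(tests[t][i], tests[t][j]));
            if (seen.size() != numValues * numValues)
                return false;   // some value pair for (i, j) never occurs
        }
    }
    return true;
}

// Four tests for three two-valued parameters (8 combinations exhaustively).
inline std::vector<TestCase> exampleSuite() {
    return { {0, 0, 0}, {0, 1, 1}, {1, 0, 1}, {1, 1, 0} };
}
```

Generating a small covering suite for larger inputs (such as the 13-parameter case) is a separate search problem that this checker does not attempt.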