There are a few ways to profile Chromium and Blink. Here are some of the tools that work well for diagnosing performance problems.
See also the Deep memory profiler.
For a broader understanding of Chromium speed and bottlenecks, and of how posted tasks and threads interact in aggregate, there is a cross-platform, task-level profiler built in. Profiler results can be seen in about:profiler (or equivalently chrome://profiler). For more details, see http://www.chromium.org/developers/threaded-task-tracking.
See chrome://tracing for timelines showing TRACE_EVENT activity across all the different threads. It was originally used for GPU performance work, so to profile features outside of compositing and rendering you will probably need to add TRACE_EVENT calls to the code you're interested in. (This page was named about://gpu through M14.)
For native C++ code the tools depend on the OS.
Note that basic printf debugging and using a general debugger (such as gdb) may be sufficient for some purposes. However, more specialized tools are available.
See LinuxProfiling for an alternative discussion.
The gperftools project, from which we get TCMalloc, also includes a very nice profiler: Google CPU Profiler.
You should be able to increase (or decrease) the sampling frequency (defaults to 100 Hz = every 10 milliseconds) via the CPUPROFILE_FREQUENCY environment variable, but
For nice viewing, output in DOT format and view with one of these programs: XDot (packaged in Ubuntu), ZGRViewer.
You can also pipe directly to xdot if you don't want a temporary file:
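A minimal sketch of such a pipeline, assuming the gperftools pprof script is on your PATH (some distributions install it as google-pprof), that the binary was built to out/Release/chrome, and that the profile is named chrome-profile. The variable names and the view_profile helper are hypothetical; adjust them to your setup.

```shell
# Assumed locations; override them in the environment if yours differ.
PPROF="${PPROF:-pprof}"                               # may be "google-pprof" on some distros
CHROME_BINARY="${CHROME_BINARY:-out/Release/chrome}"  # assumption: Release build output
PROFILE_FILE="${PROFILE_FILE:-chrome-profile}"        # assumption: profile in the working directory

# Render the call graph in DOT format and hand it straight to xdot.
# The "-" argument tells xdot to read the graph from standard input.
view_profile() {
  "$PPROF" --dot "$CHROME_BINARY" "$PROFILE_FILE" | xdot -
}
```

Calling view_profile after a profiling run should open the call graph without leaving a .dot file behind.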
The profile will be saved in a file called "chrome-profile" in the working directory. Currently, you can't stop and restart the profiler without overwriting the previously stored data.
You can also use the standard Linux perf tool:
By default this saves "perf.data" in the current working directory, which can be renamed. perf report may be able to run on older data, but perf annotate will be inaccurate if you've since rebuilt the executable.
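As a rough sketch of the record-then-report cycle, assuming perf is installed and you already know the PID of the Chrome process you want to sample (for example from Chrome's task manager). The helper names record_profile and show_report are hypothetical, and the flags shown are only one reasonable choice.

```shell
# Record call-graph samples (-g) from a running process until interrupted with Ctrl+C.
# $1: PID to attach to.
# $2: output file; defaults to perf.data, which is also perf's own default.
record_profile() {
  perf record -g -p "$1" -o "${2:-perf.data}"
}

# Summarize a saved recording.
# $1: input file; defaults to perf.data.
show_report() {
  perf report -i "${1:-perf.data}"
}
```

For example, record_profile 1234 renderer.data followed by show_report renderer.data keeps the recording under its own name, so a later run does not overwrite it.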
Profiling for Chrome OS is very similar to Linux, with a couple of key differences
DTrace and the pre-packaged "CPU Sampler" tool in Xcode work well. Shark and the command-line sample tool also work, though both will spend an exceedingly long time processing symbols if you are running Leopard (10.5). Anecdotally, this is much faster in Snow Leopard (10.6).
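For the command-line route, a small sketch of invoking sample, assuming you pass the PID (or a partial process name) of the process of interest. The sample_process helper, the 10-second default, and the output path are illustrative assumptions rather than recommended values.

```shell
# Sample a process for a fixed duration and write the call-tree report to a file.
# $1: PID or partial process name.
# $2: duration in seconds (assumed default: 10).
# $3: report path (assumed default: /tmp/chrome-sample.txt).
sample_process() {
  sample "$1" "${2:-10}" -file "${3:-/tmp/chrome-sample.txt}"
}
```

The resulting report is plain text and can be read in any editor.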
I've heard that Purify has a profiler but have no experience with this personally.
AMD CodeAnalyst is a free profiler that can run inside Visual Studio. It captures frequency counts for functions in every process on the computer. It can optionally capture call-stack information, %CPU, and memory usage statistics. Even with the Frame Pointer Omission optimization turned off (build\internal\release_defaults.gypi; under 'VCCLCompilerTool' set 'OmitFramePointers':'false'?), the call stack capture can contain a lot of bad information, but at least the most frequent caller seems accurate in practice.
Intel's VTune 9.1 does work in the Sampling mode (using the hardware performance counters), but call graphs are unavailable in Windows 7/64. Note also that drilling down into the results for chrome.dll is extremely slow (on the order of many minutes) and may appear hung. It does work (I suggest coffee or foosball). VTune has been essentially supplanted by Intel® VTune™ Amplifier XE, which is an entirely new code base and interface, AFAIK.
Very Sleepy (http://www.codersnotes.com/sleepy) is a lightweight standalone profiler that seems to work pretty well for casual use and offers a decent set of features.
Similarly to Linux, perf is the recommended tool to profile native code on Android. First, make sure you have built the browser with the set of GYP_DEFINES described above. Then, use the following wrapper script to launch the browser and follow the instructions:
$ tools/perf/record_android_profile.py --browser=android-chromium-testshell --profiler=perf
Press enter to start profiling...
>> Starting profiler perf
Press enter or CTRL+C to stop
The script will automatically pull the profiling data from the device and print out instructions for viewing it. Note that several files will be generated: one for the browser process and one for each renderer process.
To view the profile, run:
tools/telemetry/bin/prebuilt/android/perfhost report --symfs /tmp/tmpjySSsF -n -i /tmp/tmpjySSsF/perf.browser0
To view the profile, run:
tools/telemetry/bin/prebuilt/android/perfhost report --symfs /tmp/tmpjySSsF -n -i /tmp/tmpjySSsF/perf.renderer0
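Because one file is produced per process, it can be convenient to step through all of them in one go. The loop below is a sketch that reuses the perfhost flags from the output above; the report_all helper and the PERFHOST variable are hypothetical, and the temporary directory name will differ on every run.

```shell
# Path to the prebuilt perfhost binary, relative to the Chromium source root.
PERFHOST="${PERFHOST:-tools/telemetry/bin/prebuilt/android/perfhost}"

# Run a report for every perf.* file in the directory the script pulled data into.
# $1: that directory (for example /tmp/tmpjySSsF in the output above).
report_all() {
  profile_dir="$1"
  for data in "$profile_dir"/perf.*; do
    [ -e "$data" ] || continue   # skip the unexpanded pattern when nothing matches
    echo "== $data =="
    "$PERFHOST" report --symfs "$profile_dir" -n -i "$data"
  done
}
```

Running report_all /tmp/tmpjySSsF would then show the browser profile first, followed by each renderer profile in turn.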
Both NVIDIA PerfHUD and Microsoft PIX are freely available. They may not run without minor changes to how the graphics contexts are set up; check with the chrome-gpu team for current details.
The OpenGL Profiler for OS X allows real-time inspection of the top GL performance bottlenecks, as well as call traces. In order to use it with Chrome/Mac, you must pass --disable-gpu-sandbox on the command line. Some people have had more luck attaching it to the GPU process after the fact than launching Chrome from within the Profiler; YMMV.
GPUView is a Windows tool that utilizes ETW (Event Tracing for Windows) for visualizing low-level GPU, driver and kernel interactions in a time-based viewer. It's available as part of the Microsoft Windows Performance Toolkit, in %ProgramFiles%\Microsoft Windows Performance Toolkit\GPUView. There's a README.TXT in there with basic instructions, or see http://graphics.stanford.edu/~mdfisher/GPUView.html. N.B.: There's a known bug which causes GPUView to crash when visualizing traces captured on machines with more than 8 cores. On an HP Z600, disabling hyperthreading in the BIOS is enough to work around this issue.