Predator

What's Predator?

To manually identify the culprit CL for a Chrome crash, ClusterFuzz testcase, or UMA Sampling Profiler performance regression, a sheriff has to read the crash stack traces, go over the complete list of CLs in the regression range, and narrow down the suspects by estimation, grep, git blame, and so on.

Predator automates the triage of crashes and performance changes. It recommends suspected CLs together with the reasons for each suspicion, and it draws on correlations that are hard to find in manual triage. Stability sheriffs use Predator's results to assign bug owners and components.

Who is working on it?

Sharu Jiang (katesonia@)

Who can use it?

Stability sheriffs
Anyone who triages a Chrome crash on ClusterFuzz or Fracas
Anyone investigating a performance regression or improvement detected by the UMA Sampling Profiler

What is supported?

C++ crashes on Linux, Win, and Mac
C++ and Java crashes on Android
Performance regressions and improvements
6 sanitizers (ASAN, MSAN, TSAN, SyzyASAN, UBSAN, UBSAN VTR) and ASSERTS
Crashes on ClusterFuzz & Fracas

How does it work?

Currently, Predator is based on Git blame and heuristics from manual triage experience.

Once a regression is handed to Predator, the analysis follows this simplified flow:

0. Input:

Raw stacktraces recorded during a crash or from a profiler
The relevant Chromium revision or Chrome version
Regression range (or crash rates at different versions)

1. Extract stack traces from the raw data:

A Chrome crash can include more than one stack. For crashes found by the different sanitizer tools (ASAN, MSAN, etc.), some stacks matter much more than others. E.g., for MSAN the importance decreases from the creation stack to the storage stack and then to the crashing stack, while for ASAN the crashing stack is the most important one.
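The ordering described above can be sketched as a small priority table. This is only an illustration: the stack labels ("creation", "storage", "crash") and the function name are hypothetical, and only the two orderings mentioned in the text are encoded.

```python
from typing import Dict, List

# Hypothetical stack labels; real sanitizer reports name their stacks differently.
# Only the orderings stated in the text are encoded here.
STACK_PRIORITY: Dict[str, List[str]] = {
    "MSAN": ["creation", "storage", "crash"],
    "ASAN": ["crash"],
}


def order_stacks(sanitizer: str, stacks: Dict[str, str]) -> List[str]:
    """Return stack labels from most to least important for this sanitizer."""
    preferred = [
        label for label in STACK_PRIORITY.get(sanitizer, []) if label in stacks
    ]
    # Labels without a known priority keep their original order at the end.
    rest = [label for label in stacks if label not in preferred]
    return preferred + rest
```

For an unknown sanitizer the function simply keeps the order in which the stacks were reported.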

A performance change can also happen over multiple call stacks. In the case of a reported performance change, all relevant call stacks from the profiler are aggregated to form a subtree contained within the overall call tree for a process. The root of this subtree is usually the focal point from which the regression or improvement originates.
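One simple way to picture the subtree root is as the deepest frame shared by all relevant call stacks. The sketch below assumes each stack is a list of function names ordered from the outermost caller inward; that representation and the function name are assumptions made for illustration, not Predator's actual data model.

```python
from typing import List, Sequence


def common_call_path(stacks: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest caller-first prefix shared by every stack.

    The last element of the result is the root of the smallest subtree
    that contains all of the given stacks.
    """
    prefix: List[str] = []
    # zip(*stacks) walks the stacks level by level, stopping at the shortest.
    for frames_at_depth in zip(*stacks):
        if all(frame == frames_at_depth[0] for frame in frames_at_depth):
            prefix.append(frames_at_depth[0])
        else:
            break
    return prefix
```

Frames above the last shared frame are common to everything in the subtree, and frames below it are where the individual stacks diverge.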

When Predator extracts the stack traces it needs, it records the file path and line number of each frame, along with the frame's index in the stack.
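As a minimal sketch of this extraction step, the code below parses frames of the hypothetical form "#<index> 0x<address> in <function> <path>:<line>[:<column>]". The frame format, the regular expression, and the StackFrame type are assumptions for illustration; real crash reports come in several formats.

```python
import re
from typing import List, NamedTuple


class StackFrame(NamedTuple):
    index: int
    function: str
    file_path: str
    line_number: int


# Assumed frame format: "#0 0x7f1 in ns::Func path/to/file.cc:120:5"
FRAME_RE = re.compile(
    r"^\s*#(?P<index>\d+)\s+0x[0-9a-fA-F]+\s+in\s+(?P<function>.+?)\s+"
    r"(?P<path>[^\s:]+):(?P<line>\d+)(?::\d+)?\s*$"
)


def parse_frames(raw_stack: str) -> List[StackFrame]:
    """Extract frame index, function, file path and line number per frame."""
    frames = []
    for line in raw_stack.splitlines():
        match = FRAME_RE.match(line)
        if match:  # Lines that are not frames (headers, blanks) are skipped.
            frames.append(
                StackFrame(
                    index=int(match.group("index")),
                    function=match.group("function"),
                    file_path=match.group("path"),
                    line_number=int(match.group("line")),
                )
            )
    return frames
```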

2. Detect dependency regression ranges:

Quite often, Chrome regressions happen in dependencies like v8, pdfium, skia, etc.

With a regression range, Predator will detect dependency regression ranges so that CLs in dependencies will also be checked.
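A rough sketch of the idea, assuming the pinned dependency revisions at both ends of the regression range are already available as dictionaries mapping a dependency path to a revision. How those dictionaries are obtained is outside this sketch, and the function name is hypothetical.

```python
from typing import Dict, Optional, Tuple

Range = Tuple[Optional[str], Optional[str]]


def dependency_ranges(
    old_deps: Dict[str, str], new_deps: Dict[str, str]
) -> Dict[str, Range]:
    """Map each dependency whose pinned revision changed to (old, new).

    A dependency that was added or removed inside the range has None
    on the missing side.
    """
    ranges: Dict[str, Range] = {}
    for path in sorted(set(old_deps) | set(new_deps)):
        old_rev = old_deps.get(path)
        new_rev = new_deps.get(path)
        if old_rev != new_rev:
            ranges[path] = (old_rev, new_rev)
    return ranges
```

Dependencies whose revision is the same at both ends drop out, so only the rolled ones need their CLs checked.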

3. Pull change logs from Gitiles:

Predator checks the extracted stacktraces to determine which dependencies' change logs are needed, and pulls those change logs from the corresponding Gitiles repositories.
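The selection could look roughly like the sketch below: each frame's file path is matched to the dependency with the longest matching root, and a Gitiles log URL is built for the dependencies that appear in the stack. The helper names and the dependency roots are hypothetical, and the "+log/OLD..NEW?format=JSON" URL shape should be checked against the Gitiles instance in use.

```python
from typing import Iterable, List, Optional, Set


def dep_for_path(file_path: str, dep_roots: Iterable[str]) -> Optional[str]:
    """Return the most specific dependency root containing file_path."""
    matches = [root for root in dep_roots if file_path.startswith(root)]
    # The longest root wins, so "src/v8/" beats "src/" for files under v8.
    return max(matches, key=len) if matches else None


def deps_in_stack(frame_paths: Iterable[str], dep_roots: List[str]) -> Set[str]:
    """Return the dependencies that own at least one frame in the stack."""
    owners = (dep_for_path(path, dep_roots) for path in frame_paths)
    return {dep for dep in owners if dep is not None}


def gitiles_log_url(repo_url: str, old_rev: str, new_rev: str) -> str:
    """Build a change-log URL for the revision range old_rev..new_rev."""
    return f"{repo_url}/+log/{old_rev}..{new_rev}?format=JSON"
```

Only the dependencies returned by deps_in_stack need their change logs fetched, which keeps the number of requests small when the regression range spans many dependency rolls.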

4. Heuristics-based Analysis:

The heuristics for regression analysis are fairly involved; the most important ones are:

Only analyze the top 7 frames for crashes: crashes are usually related to the top frames, while some stacks have 50+ frames.
Only analyze a subtree for performance changes: performance changes usually occur in function call subtrees, that is, groups of call stacks that all originate from a single function call. Stack frames outside of this subtree are usually irrelevant and can be ignored.
Distance between the lines in the stacktrace and the lines changed by a CL: the closer the change, the more suspect the CL.
Index of frame: a changed file that appears in a higher frame is more suspect than one in a lower frame.
Number of changed files in a CL: the more files changed, the more suspect the CL.
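To make the crash heuristics concrete, here is a toy scoring function that combines three of them: the top-7 cutoff, line distance, and frame index. The weights (1 / (1 + index) and 1 / (1 + distance)) are made up for illustration and are not Predator's actual formula; the changed-files heuristic is left out.

```python
from typing import Dict, List, Tuple

TOP_N_FRAMES = 7  # Only the top frames are analyzed for crashes.

# A frame is (index, file_path, line_number); index 0 is the top frame.
Frame = Tuple[int, str, int]


def score_cl(frames: List[Frame], changed_lines: Dict[str, List[int]]) -> float:
    """Score a CL against a crash stack; a higher score means more suspect.

    changed_lines maps each file touched by the CL to its changed line numbers.
    """
    score = 0.0
    for index, path, line in frames:
        if index >= TOP_N_FRAMES:
            continue  # Deep frames are ignored.
        touched = changed_lines.get(path)
        if not touched:
            continue  # The CL does not change this frame's file.
        distance = min(abs(line - changed) for changed in touched)
        frame_weight = 1.0 / (1 + index)  # Higher frames count more.
        distance_weight = 1.0 / (1 + distance)  # Closer changes count more.
        score += frame_weight * distance_weight
    return score
```

Ranking all CLs in the regression range by such a score gives an ordered list of suspects, with the matched frames serving as the justification for each one.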
