Findit

What is Findit?

Findit (https://findit-for-me.appspot.com/) identifies culprits for compile/test/flake failures on Chromium Waterfall and Commit Queue.

This dashboard of failures shows culprits identified for compile and test failures.
This dashboard of analyzed flakes shows regressions & culprits identified for flaky tests.

Who is working on it?

Shuotao Gao (TL, stgao@), Chan Li (chanli@), Jeff Li (lijeffrey@), Prasad Vuppalapu (prasadv@), Roberto Carrillo (robertocn@), Yuke Liao (liaoyuke@).

Who can use it?

Chromium tree sheriffs (Findit is integrated with Sheriff-o-Matic)
Anyone who triages a compile/test failure or a flaky test

What is supported?

Compile/test failures:
- Compile failures, Swarmed gtests, Layout tests, and Android Instrumentation tests
- All build/test configurations (except iOS) in the 9 tree-closer masters on Chromium Waterfall
Flaky tests:
- Swarmed gtests, Layout tests, and Android Instrumentation tests
- Any test configuration on Chromium Waterfall and Commit Queue

What is not supported yet?

Non-swarmed tests
Telemetry-based tests
JUnit tests

Flow Overview

How does it work for compile/test failures?

Findit takes two complementary approaches to identify the culprits or suspects:

Heuristic-based analysis: correlate a CL with error messages in the failure log.
- Near-instant: 1 to 2 minutes
- May yield false positives
- Requires manual verification of the identified suspects
Try-job-based analysis: rerun the failed compile or tests and search for the culprit in the regression range.
- Reliable: culprits are verified by rerunning the failures.
- Fast:
  - Compile failures: median = 14 minutes
  - Swarmed gtest failures: median = 24 minutes
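
The heuristic-based approach above can be illustrated with a minimal sketch. It is not Findit's actual implementation: it assumes that "correlating a CL with error messages" means checking whether files touched by a CL are mentioned in the failure log. The names `files_in_log`, `rank_suspects`, the path pattern, and the revision labels are all hypothetical.

```python
import re

# Hypothetical pattern: source-file paths as they might appear in a
# compiler error line, e.g. "../../base/foo.cc:12:3: error: ...".
PATH_RE = re.compile(r"(?:[\w-]+/)*[\w-]+\.(?:cc|cpp|h|py|java|mm)\b")


def files_in_log(log_text):
    """Return the set of source-file paths mentioned in a failure log."""
    return set(PATH_RE.findall(log_text))


def rank_suspects(cls, log_text):
    """Rank CLs by how many of their touched files appear in the log.

    `cls` maps a revision label to the list of files that CL touched.
    Returns (revision, score) pairs, highest score first; CLs with no
    overlap are dropped. A high score marks a suspect, not a confirmed
    culprit, so the result still needs manual verification.
    """
    log_files = files_in_log(log_text)
    scored = [(rev, len(set(files) & log_files)) for rev, files in cls.items()]
    suspects = [(rev, score) for rev, score in scored if score > 0]
    return sorted(suspects, key=lambda item: -item[1])


# Illustrative data only; the log line and revisions are made up.
example_log = (
    "FAILED: obj/base/foo.o\n"
    "../../base/foo.cc:12:3: error: unknown type name 'Bar'\n"
)
example_cls = {
    "r101": ["net/socket.cc"],
    "r102": ["base/foo.cc", "base/foo.h"],
    "r103": ["docs/readme.md"],
}
print(rank_suspects(example_cls, example_log))
```

A file-overlap score like this is cheap to compute, which fits the 1 to 2 minute turnaround, but it can flag an innocent CL that merely touched a file named in the log. That is why the try-job-based rerun is the complementary, more reliable step.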

How does it work for flaky tests?

Given a flaky test at a specific build cycle, Findit reuses post-submit build artifacts from the same test configuration on Chromium Waterfall to bypass compile. It triggers Swarming tasks directly to rerun the flaky test N times (currently up to 400) at different revisions, following a variant of exponential search, and narrows the regression range down to a single build cycle on Chromium Waterfall. Once a regression range is identified with high confidence, a series of try-jobs is run to compile binaries and identify the exact culprit.

In particular, for a flaky test in a try-bot build on Chromium Commit Queue reported by Chromium-Try-Flakes, Findit first maps the CQ try-bot to the corresponding Waterfall buildbot before running the above flow.
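
The flaky-test search described above can be sketched as follows. This is a simplified illustration, not Findit's actual variant: it uses a plain exponential step-back followed by bisection, and it assumes the test stays flaky in every build from the culprit onward. The function `find_regression_range` and the callback `is_flaky` are hypothetical; in practice `is_flaky` would stand in for rerunning the test many times on Swarming at a given build and checking whether the results are mixed.

```python
def find_regression_range(flaky_build, is_flaky, lowest_build=0):
    """Narrow a flaky test down to two adjacent builds.

    `flaky_build` is a build number where the test is known to be flaky.
    `is_flaky(build)` is a caller-supplied check (hypothetical here).
    Returns (last_stable_build, first_flaky_build), or None if no stable
    build is found at or above `lowest_build`.
    """
    first_flaky = flaky_build
    stable = None
    step = 1

    # Phase 1: walk backwards with doubling strides until a stable
    # build is found, so recent regressions are located with few reruns.
    while first_flaky > lowest_build:
        candidate = max(lowest_build, first_flaky - step)
        if is_flaky(candidate):
            first_flaky = candidate
            step *= 2
        else:
            stable = candidate
            break

    if stable is None:
        return None

    # Phase 2: bisect between the stable and flaky builds until they
    # are adjacent, i.e. the range is a single build cycle.
    while first_flaky - stable > 1:
        mid = (stable + first_flaky) // 2
        if is_flaky(mid):
            first_flaky = mid
        else:
            stable = mid

    return stable, first_flaky


# Simulated history: the test became flaky at build 73 (made-up number).
print(find_regression_range(100, lambda build: build >= 73))
```

With the simulated history, the search probes builds 99, 97, 93, 85, and 69 while stepping back, then bisects through 77, 73, 71, and 72 to arrive at the adjacent pair (72, 73). The commits inside that single build cycle are what the follow-up try-jobs would then examine.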

File a bug or feature request

Click this link to file a bug for Findit

Other project info

Code locations: findit appengine app, recipes, and recipe modules