Pipelined Builds

Context

Large system

Teams and source code organized around different sub-systems

Description

Continuous integration jobs are divided into pipelines organized in levels. Pipelines in one level trigger the pipelines downstream of them. Materials are configured so that an upstream pipeline's materials do not trigger downstream pipelines directly. A downstream pipeline is triggered directly only by materials specific to it. The output of an upstream pipeline acts as a material for the downstream pipeline. This helps decouple the builds of different sub-systems from each other.
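The trigger rules described above can be sketched as a small model. This is only an illustration: the Pipeline class, the directory names and the two helper functions are hypothetical and do not correspond to any particular CI server's configuration format.

```python
from dataclasses import dataclass


@dataclass
class Pipeline:
    name: str
    # Source paths whose commits trigger this pipeline directly (assumed layout).
    materials: tuple = ()
    # Upstream pipelines whose output acts as a material for this pipeline.
    upstreams: tuple = ()


# Hypothetical setup: two first-level pipelines and one downstream pipeline.
PIPELINES = [
    Pipeline("Pipeline-1", materials=("subsystem-a/",)),
    Pipeline("Pipeline-2", materials=("subsystem-b/",)),
    Pipeline("Pipeline-11", materials=("integration/",),
             upstreams=("Pipeline-1", "Pipeline-2")),
]


def triggered_by_commit(changed_path):
    """Names of pipelines started directly by a commit touching changed_path."""
    return [p.name for p in PIPELINES
            if any(changed_path.startswith(m) for m in p.materials)]


def triggered_by_output(finished_pipeline):
    """Names of pipelines started because an upstream pipeline produced output."""
    return [p.name for p in PIPELINES if finished_pipeline in p.upstreams]


print(triggered_by_commit("subsystem-a/main.c"))  # ['Pipeline-1']
print(triggered_by_output("Pipeline-1"))          # ['Pipeline-11']
```

A commit under subsystem-a/ starts only Pipeline-1. Pipeline-11 starts later, when Pipeline-1 produces output, or directly when its own material (integration/ in this sketch) changes.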

In practice, one might run into the shared code problem.

Source code is organized around sub-systems, but there is shared (cross sub-system) code as well. A commit to shared code can therefore trigger multiple pipelines. Since build times differ between pipelines, a downstream pipeline might then be triggered several times for the same commit. This can lead to false negatives, that is, failures not caused by a real defect. For example:

09.00 Commit

09.01 Pipeline-1 starts

09.01 Pipeline-2 starts

09.15 Pipeline-1 passes

09.16 Pipeline-11 starts (triggered by Pipeline-1's output)

09.20 Pipeline-2 passes

09.26 Pipeline-11 fails (it hasn't got the output from Pipeline-2, created from same commit)

09.27 Pipeline-11 starts (triggered by Pipeline-2's output)

09.38 Pipeline-11 passes
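The timeline above can be reproduced with a short simulation. It is a sketch under stated assumptions: times are minutes after 09.00, every upstream completion triggers one downstream run, a run starts one minute after its trigger, only one downstream run executes at a time, and Pipeline-11 always takes 10 minutes (so the second run ends at minute 37 rather than 38). The function name downstream_runs is hypothetical.

```python
def downstream_runs(upstream_finish, duration, delay=1):
    """Return (start, end, has_all_outputs) for each downstream run.

    upstream_finish maps an upstream pipeline name to the minute at which
    it passes for the commit. Each completion triggers one downstream run.
    """
    runs = []
    free_at = 0  # earliest minute at which the downstream pipeline is idle
    for _, finished in sorted(upstream_finish.items(), key=lambda kv: kv[1]):
        start = max(finished + delay, free_at)
        end = start + duration
        # Upstream outputs for this commit that exist when the run starts.
        available = {n for n, t in upstream_finish.items() if t <= start}
        runs.append((start, end, available == set(upstream_finish)))
        free_at = end + delay
    return runs


# Pipeline-1 passes at 09.15 and Pipeline-2 at 09.20.
print(downstream_runs({"Pipeline-1": 15, "Pipeline-2": 20}, duration=10))
# [(16, 26, False), (27, 37, True)]
```

The first run starts at minute 16 without the output of Pipeline-2, which is the run that fails in the timeline. Only the second run sees the output of both upstream pipelines.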

This pattern is suitable for creating a chain of continuous integration builds. Breaking up a single build this way, however, can have serious disadvantages. To understand the difference, continuous integration has to be viewed in the context of team organization. A continuous integration build should create output that downstream activities can consume. In the example above, if a downstream activity (say, exploratory testing) cannot start once Pipeline-1 succeeds, and useful output emerges only after Pipeline-11 succeeds, then these are not really multiple chained builds but a single build split into pieces. This distinction is important, because breaking up a single build like this can have the following consequences.

  • Build time shoots up because it equals greater_of(time(Pipeline-1), time(Pipeline-2)) + time(Pipeline-11). The committer has to wait for all downstream pipelines to succeed before the commit is considered successful. In the example above, the commit is successful only when Pipeline-11 is green.
  • It becomes difficult for committers to track their changes through the various pipelines. This tracking is needed to find out whether their commit was successful.
  • The continuous integration build differs from the local build. A local build does not have the flexibility of tying pipelines together, so the continuous integration build cannot be verified in the committer's environment. This might not be an issue if the transfer of output between pipelines is designed carefully enough to be less brittle.
  • Applying this pattern to break up a single build suffers from material synchronization issues, similar to the multiple single jobs build pattern.
  • The shared code problem, as illustrated above.
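The build time formula in the first consequence can be worked through with durations read off the timeline: roughly 14 minutes for Pipeline-1 (09.01 to 09.15), 19 minutes for Pipeline-2 (09.01 to 09.20) and 11 minutes for Pipeline-11 (09.27 to 09.38). The function name below is hypothetical, and the calculation ignores trigger delays and the failed first run of Pipeline-11.

```python
def commit_feedback_minutes(upstream_minutes, downstream_minutes):
    """Minutes until the commit is green: slowest upstream plus downstream."""
    return max(upstream_minutes) + downstream_minutes


# Durations taken from the example timeline (approximate).
print(commit_feedback_minutes([14, 19], 11))  # 30
```

This is a lower bound. In the timeline itself the commit lands at 09.00 and Pipeline-11 turns green at 09.38, because the spurious first run and the trigger delays add to the wait.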